DeepSeek V3.2 Officially Released: Company Says Its Reasoning Rivals GPT-5
Phoenix New Media (iFeng) · 2025-12-03 09:04
On December 1, DeepSeek officially released its new-generation open-source large model DeepSeek-V3.2 and its long-thinking enhanced variant, DeepSeek-V3.2-Speciale. The official web interface, app, and API have all been updated to V3.2.

According to official figures, DeepSeek-V3.2's reasoning performance reaches GPT-5 level on public reasoning benchmarks and approaches Gemini-3.0-Pro, while its output length is significantly shorter than Kimi-K2-Thinking's, reducing compute overhead. The V3.2-Speciale variant incorporates the theorem-proving capabilities of DeepSeek-Math-V2 and achieved gold-medal results in international competitions including the IMO, CMO, ICPC, and IOI, with its ICPC result equivalent to second place among human contestants.

The new release is the first to combine thinking mode with tool calling, allowing the model to invoke external tools mid-reasoning. Using a large-scale agent-training data-synthesis method, the model was trained with reinforcement learning across more than 1,800 environments and over 85,000 complex instructions, improving its generalization. DeepSeek says the model achieves the highest agent-benchmark scores among open-source models, further narrowing the gap with closed-source models.

The earlier experimental release, DeepSeek-V3.2-Exp, came out two months ago; user feedback and testing showed that its DSA sparse-attention mechanism caused no significant performance degradation across scenarios. Sp ...
Talking DeepSeek, AI Hardware, and Competitors: OpenAI's Chief Research Officer Gives an Information-Dense Interview
36Kr · 2025-12-03 07:46
Core Insights
- OpenAI's Chief Research Officer Mark Chen discussed the company's strategic vision amid intense AI competition and rapid technological advancement, addressing concerns about talent retention and the pursuit of AGI [1]

Group 1: Talent Acquisition and Retention
- OpenAI faces aggressive talent poaching from competitors like Meta, which reportedly invests billions annually in recruitment, yet most OpenAI employees have chosen to stay [2]
- Despite competitive salary pressure, OpenAI does not engage in salary wars, relying instead on a shared vision of achieving AGI to retain talent [2]

Group 2: Resource Allocation and Project Management
- OpenAI is running roughly 300 concurrent research projects, prioritizing those most likely to advance AGI and emphasizing exploratory research over trend-chasing [3]
- The company maintains a transparent, strict resource-allocation process, allowing secondary projects but clearly defining their subordinate status to preserve efficiency [3]

Group 3: Competitive Landscape and Model Development
- OpenAI monitors competitor releases, such as Google's Gemini 3, but keeps its own development pace, expressing confidence in internal progress rather than reacting to external pressure [4]
- The company is refocusing on pre-training capabilities, which had been deprioritized, believing there is still significant room for improvement in this area [5]

Group 4: AGI Development and Future Goals
- Mark Chen believes AI capabilities will change significantly within the next two years, with goals for AI to participate in research processes and eventually conduct end-to-end research autonomously [7]
- Demand for computational power is expected to remain high; Chen states that even a threefold increase in resources would be quickly absorbed [8]

Group 5: Hardware Development and Future Interactions
- OpenAI is collaborating with designer Jony Ive to develop next-generation AI hardware aimed at richer user interaction through continuous learning and memory capabilities [9]
- The goal is to evolve AI from a passive assistant into a more intelligent entity that remembers user interactions and improves over time [9]

Group 6: Strategic Focus Amid Competition
- In response to open-source models like DeepSeek, OpenAI emphasizes maintaining its research pace and innovation focus rather than being swayed by competitive pressure [10]
DeepSeek V3.2 Released! Hands-On Results Impress, and Low Price Is Its Biggest Advantage
36Kr · 2025-12-03 03:57
Core Insights
- DeepSeek has launched its V3.2 version, which reportedly matches the inference capabilities of OpenAI's GPT-5 while being significantly cheaper [1][22]
- V3.2 comes in two variants: a free version for users and a Speciale version with enhanced reasoning capabilities, available via API [2][22]

Performance Enhancements
- DeepSeek V3.2-Speciale achieved gold-medal results in IMO 2025, CMO 2025, the ICPC World Finals 2025, and IOI 2025, outperforming GPT-5 High in all of these tests [4][22]
- The DeepSeek Sparse Attention (DSA) mechanism fundamentally improves attention efficiency, reportedly cutting computational costs by over 60% and increasing inference speed roughly 3.5-fold [6][12]

Cost Efficiency
- DSA sharply reduces the cost of processing long sequences: per million tokens, pre-fill drops from $0.7 to $0.2 and decoding from $2.4 to $0.8 [12][22]
- This cost reduction positions DeepSeek V3.2 among the most affordable models in its class for long-text inference [12][22]

Tool Utilization
- DeepSeek V3.2 lets the model call tools during its reasoning process without additional training, improving general performance and compatibility with user-created tools [13][22]
- The model can break down complex tasks and use different tools effectively, showcasing its decision-making capabilities [20][22]

Market Impact
- The release challenges the notion that open-source models lag behind closed-source counterparts, offering competitive performance at a fraction of the cost [22][23]
- DSA's cost revolution is expected to significantly affect the commercialization of AI models, making advanced AI applications more accessible to smaller enterprises and consumers [22][23]
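The quoted price cuts are easy to sanity-check. The minimal Python sketch below uses the per-million-token rates reported in the article; the request sizes (100k pre-fill tokens, 5k decode tokens) are hypothetical assumptions chosen for illustration, not official figures:

```python
# Back-of-envelope cost comparison using the per-million-token prices quoted
# in the article (pre-fill $0.7 -> $0.2, decode $2.4 -> $0.8).
# The token counts below are illustrative assumptions only.

PRICES = {
    "before": {"prefill": 0.7, "decode": 2.4},  # USD per 1M tokens (quoted)
    "after":  {"prefill": 0.2, "decode": 0.8},
}

def request_cost(version: str, prefill_tokens: int, decode_tokens: int) -> float:
    """Cost in USD of one request under the quoted per-million-token rates."""
    p = PRICES[version]
    return (prefill_tokens * p["prefill"] + decode_tokens * p["decode"]) / 1_000_000

# Hypothetical request: 100k-token context producing a 5k-token answer.
old = request_cost("before", 100_000, 5_000)
new = request_cost("after", 100_000, 5_000)
print(f"old: ${old:.4f}, new: ${new:.4f}, saving: {1 - new / old:.0%}")
# prints: old: $0.0820, new: $0.0240, saving: 71%
```

Under these assumed sizes the per-request cost falls by roughly 70%, consistent with the article's claim of a 60%-plus reduction in computational cost.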
If You Insist on Using DeepSeek to See a Doctor, Here's How (Detailed Prompt Template Included)
36Kr · 2025-12-03 03:23
Have you had DeepSeek "see" you as a patient yet?

Open it, describe what's bothering you or upload a photo of your test results, and within seconds you get a diagnosis and treatment suggestions. Keep asking what the disease is or how to take the medication, and it gives even more detailed, plain-language explanations, answering every question.

It costs nothing, there's no scrambling for appointment slots, and it's far more patient than a doctor. So is DeepSeek all you need for medical care from now on? If you ask DeepSeek itself, it will answer:

DeepSeek's own answer on whether it can practice medicine | DeepSeek screenshot

If you actually have DeepSeek handle a case, you will find a notice box at the end of its reply:

3. Order several additional tests to distinguish diseases with similar presentations and confirm the diagnosis;

This notice box generally does not appear for other kinds of questions | DeepSeek screenshot

"Cannot," "should not," "for reference only": is DeepSeek being overly modest, or is there something special about medical care? Below, we look at whether you can really use DeepSeek to see a doctor, and how to use it to get better care (detailed prompt template included).

Can AI practice medicine? As the expert, no; as an assistant, very much yes.

One way people use AI assistants like DeepSeek for medical care is to treat the reply as a confirmed diagnosis and start taking medication on the AI's advice, as if they had just consulted a medical specialist. But when a specialist sees a patient, they rarely give a definitive diagnosis from a few sentences of description or a single test report; what may come next includes ...
DeepSeek Fights Its Way Through: Chinese Large Models' Breakout Is No Fluke
36Kr · 2025-12-03 03:21
As 2025 draws to a close, the technical spotlight in the global large-model race has largely been reclaimed by Google. Gemini 3 Pro burst onto the scene, surpassing all open-source models on several authoritative benchmarks and re-establishing the closed-source camp's technical high ground. Doubts quickly resurfaced across the industry: have open-source models hit their limit? Has the scaling law really hit a wall? A mood of stagnation spread through the open-source community.

But DeepSeek did not stay silent. On December 1 it released two heavyweight models in one go: DeepSeek-V3.2, whose reasoning performance is benchmarked against GPT-5, and the Speciale variant, which is exceptionally strong in math, logic, and multi-turn tool calling. This was not only a concentrated demonstration of technical capability but also a head-on answer to the closed-source camp's new ceiling, delivered without an advantage in compute resources.

This is no routine model update. DeepSeek is trying to chart a new path for the post-scaling era: how can architectural redesign compensate for a pre-training gap? How can "chains of thought during tool use" deliver agent performance at low token counts and high efficiency? And most critically, why has the agent gone from an accessory feature to the core engine of leaps in model capability?

This article analyzes three threads: how DeepSeek broke through under technical bottlenecks; why it was the first in the open-source camp to bet heavily on agents; and whether this means open-source models still have a path through the closed-source moat. Behind this ...
DeepSeek Releases New Models! ChiNext 50 ETF (159949) Up 0.48%; Institutions Remain Bullish on AI Supply-Chain Investment Opportunities
Sina Finance · 2025-12-03 02:33
Core Viewpoint
- The news highlights the performance of the ChiNext 50 ETF (159949), up 0.48% to 1.467 CNY amid broader market fluctuation, indicating ongoing investor interest and activity in the growth sector [1][6]

Market Performance
- As of 10:20 AM on December 3, the ChiNext 50 ETF (159949) was trading at 1.467 CNY, with a trading volume of 4.22 billion CNY and a turnover rate of 1.66% [1][6]
- The ETF has seen cumulative trading of 323.05 billion CNY over the last 20 trading days (averaging 16.15 billion CNY per day) and 3,205.79 billion CNY over 222 trading days this year (averaging 14.44 billion CNY per day) [7][10]

Top Holdings
- The top ten holdings of the ChiNext 50 ETF (159949) include leading companies such as CATL, Zhongji Xuchuang, Dongfang Caifu, Xinyi Technology, Sungrow Power, Shenghong Technology, Huichuan Technology, Mindray, Yiwei Lithium Energy, and Tonghuashun [3][8]

Industry Insights
- Longcheng Securities reports that the continued rollout of AI applications will accelerate computing-infrastructure build-out, particularly in the AIDC industry chain (optical modules, PCBs, and main equipment manufacturers), indicating strong demand release and potential for performance and valuation growth [10]
- Demand for edge computing modules is expected to rise steadily as AI applications develop, transitioning from traditional data-transmission modules to intelligent computing modules [10]

Investment Recommendations
- The ChiNext 50 ETF (159949) is presented as a convenient, efficient tool for investors seeking long-term exposure to China's technology sector, with dollar-cost averaging or phased investment recommended to mitigate short-term volatility [10]
AI Industry Briefing: From DeepSeek V3
2025-12-03 02:12
Summary of Key Points from the Conference Call

Industry and Company Overview
- The conference call discusses advancements in the AI industry, specifically the DeepSeek-V3.2 model developed by DeepSeek, which shows significant improvements in reinforcement learning and inference efficiency [1][3][5]

Core Insights and Arguments
- **Model Architecture and Mechanisms**: DeepSeek-V3.2 introduces the DeepSeek Sparse Attention (DSA) mechanism in place of dense attention over the previous Multi-head Latent Attention (MLA) setup. DSA optimizes computational efficiency by focusing attention on the most relevant keys, particularly in complex tasks [3][5]
- **Performance Enhancements**: The C9 version of DeepSeek-V3.2 uses approximately 10% of the pre-training computational resources to significantly enhance its performance on complex tasks, such as code debugging, reaching a globally leading level [1][3]
- **Context Management Strategy**: The model employs an efficient context-management strategy that intelligently handles frequent task switching, multi-turn dialogue, and ambiguous inputs, effectively reducing inference costs [1][3]
- **Synthetic Data Utilization**: Training incorporates a substantial amount of high-difficulty synthetic data, doubled compared to previous versions. This data is crucial for the subsequent reinforcement-learning phase and requires significant computational resources [1][6]
- **Open Source Innovations**: DeepSeek has advanced open-source capability by completing a comprehensive post-training process and supporting agent invocation, potentially leveling the playing field with closed-source models [7]

Additional Important Insights
- **Reinforcement Learning Developments**: The evolution of reinforcement-learning techniques has been marked by the introduction of rubric-based human prompts, enhancing the model's ability to think and execute simultaneously and improving overall efficiency [8][9]
- **Future of Model Pricing**: By 2026, model costs are anticipated to drop significantly, potentially to one-fifth of current prices, driven by technological advances and competitive pricing among vendors [2][20]
- **Impact of Sparsity Techniques**: Sparsity techniques are expected to lower training compute requirements while raising the upper limits of model training, encouraging more startups to engage in large-model development [2][19]
- **Vertical Scene Task Solutions**: The application of reinforcement learning on e-commerce platforms illustrates the model's ability to adapt recommendations based on user feedback through multi-turn dialogue, improving user satisfaction [12]

Conclusion
- The advancements in DeepSeek-V3.2 highlight a significant shift in the AI landscape, emphasizing efficient computational mechanisms, the role of synthetic data, and the potential for open-source models to compete with proprietary solutions. The expected decrease in model costs and the rise of new startups indicate a dynamic, evolving market [1][2][20]
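For readers unfamiliar with attention sparsification, the toy sketch below illustrates the general idea behind mechanisms like DeepSeek's DSA: each query attends only to a small top-k subset of keys rather than the full sequence, so the softmax and value mixing cost O(k) per query instead of O(sequence length). This is a generic illustrative sketch using score-based top-k selection, not DeepSeek's actual DSA implementation (which selects keys with a learned indexer); all sizes and values here are made up.

```python
import math
import random

def topk_sparse_attention(q, k, v, top_k=4):
    """q, k, v: lists of d-dimensional vectors (lists of floats).

    Each query attends only to the top_k highest-scoring keys -- the
    "sparse" step that bounds per-query cost regardless of sequence length.
    """
    d = len(q[0])
    out = []
    for qi in q:
        # Raw scaled dot-product scores against every key.
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        # Keep only the indices of the top_k largest scores.
        top = sorted(range(len(scores)), key=lambda j: scores[j])[-top_k:]
        # Numerically stable softmax over the selected scores only.
        m = max(scores[j] for j in top)
        w = [math.exp(scores[j] - m) for j in top]
        z = sum(w)
        # Output is the weighted mix of the selected values.
        out.append([sum(w[t] / z * v[top[t]][dim] for t in range(len(top)))
                    for dim in range(d)])
    return out

random.seed(0)
L, d = 16, 8  # made-up toy sizes
q = [[random.gauss(0, 1) for _ in range(d)] for _ in range(L)]
keys = [[random.gauss(0, 1) for _ in range(d)] for _ in range(L)]
vals = [[random.gauss(0, 1) for _ in range(d)] for _ in range(L)]
out = topk_sparse_attention(q, keys, vals, top_k=4)
print(len(out), len(out[0]))  # -> 16 8
```

With top_k fixed, the attention work per query no longer grows with the sequence length, which is the source of the long-context cost savings discussed in the call.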
DeepSeek Launches Two New Models; Computer ETF (159998) Ranked First Among Peer Products by Yesterday's Turnover; Institutions: Global AI Industry Enters a Resonance Phase
21st Century Business Herald · 2025-12-03 01:24
Group 1: A-Share Market Performance
- The A-share market fluctuated on December 2, with the Shenzhen Component Index and the ChiNext Index both dropping over 1% at one point [1]
- The CSI Computer Theme Index fell 1.38%, while stocks such as Guolian Co., Ltd., Zhongke Xingtu, and Aerospace Information posted gains [1]
- The CSI Hong Kong-Shenzhen Cloud Computing Industry Index decreased 0.17%, with stocks like Xinyisheng, Zhongji Xuchuang, and Alibaba-W leading gains [1]

Group 2: ETF Performance
- The Computer ETF (159998) recorded trading volume exceeding 64 million yuan with a turnover rate of 2.58%, ranking first among similar products [1]
- The Tianhong Cloud Computing ETF (517390) added nearly 170 million units year-to-date, a growth rate of 351.26%, also ranking first among similar products [1]

Group 3: Industry Insights
- The Computer ETF tracks the CSI Computer Theme Index, which spans both hardware and software and reflects the overall performance of the computer industry [1]
- Key areas of certainty in AI development include hardware for edge AI, consumer software expanding overseas, B-end enterprise services, and G-end private deployment of large models [1]
- The Tianhong Cloud Computing ETF closely follows the CSI Hong Kong-Shenzhen Cloud Computing Industry Index, providing access to competitive cloud computing assets across A-shares and Hong Kong [1]

Group 4: Quantum Computing and AI Developments
- DeepSeek released two official model versions, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, enhancing agent capabilities [2]
- The first domestic photonic quantum computer manufacturing plant was inaugurated in Shenzhen, Guangdong, on November 24, entering small-scale production [2]
- The space computing industry is approaching a critical point, with competition expected to create new opportunities [2]
- The integration of quantum computing, blockchain, and AI with cloud computing is anticipated to expand market boundaries, with projected growth of over 20% in the next five years, potentially exceeding 3 trillion yuan by 2030 [2]
DeepSeek's "Minor" Update Batters OpenAI and Catches Up to Gemini
36Kr · 2025-12-03 00:58
Core Insights
- DeepSeek has launched two new models, DeepSeek V3.2 and DeepSeek-V3.2-Speciale, designed to compete with leading models like GPT-5 and Gemini [1][5][20]

Model Performance
- DeepSeek V3.2 has shown competitive performance across benchmarks, achieving scores close to or surpassing those of GPT-5 and Gemini in several tests [6][20]
- AIME 2025: DeepSeek V3.2 scored 93.1, and DeepSeek V3.2-Speciale scored 96.0 [6]
- HMMT Feb 2025: DeepSeek V3.2 scored 92.5, and DeepSeek V3.2-Speciale scored 99.2 [6]
- Overall, DeepSeek V3.2-Speciale is noted for its ability to compete effectively with Gemini 3 [20][27]

Technological Innovations
- DeepSeek has implemented DeepSeek Sparse Attention (DSA) in its models, enabling more efficient processing of longer texts by reducing computational complexity [9][13]
- The company has focused on enhancing post-training for open-source models, investing over 10% of total training compute to improve performance on challenging tasks [17][21]
- DeepSeek V3.2 Speciale encourages longer reasoning without penalizing the model for extended thought processes, improving its ability to tackle complex problems [18][20]

Cost Efficiency
- Despite higher token consumption than competitors, DeepSeek offers a more cost-effective solution, with a significant price advantage over models like Gemini [32][33]
- For example, a run consuming 8,077 tokens on DeepSeek costs approximately $0.0032, while a run consuming 4,972 tokens on Gemini costs around $0.06, roughly a 20-fold price difference [33]

Industry Context
- The gap between open-source and closed-source models is reportedly widening, but DeepSeek is actively working to close it through innovative approaches and cost-saving measures [35][36]
- The company's strategy emphasizes algorithmic improvements over merely increasing computational power, aligning with industry views on the importance of efficient model training [38][39]
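The per-task figures in the summary above can be turned into implied per-million-token rates with a quick back-of-envelope calculation. The token counts and dollar totals below are the ones quoted in the article; the derived rates are rough approximations for comparison, not official price sheets:

```python
# Implied per-million-token rates from the article's quoted per-task figures.
# These are derived approximations, not published pricing.

deepseek_tokens, deepseek_cost = 8077, 0.0032  # quoted in the article
gemini_tokens, gemini_cost = 4972, 0.06        # quoted in the article

def per_million(tokens: int, cost: float) -> float:
    """Implied USD cost per 1M tokens given one task's totals."""
    return cost / tokens * 1_000_000

print(f"DeepSeek ~ ${per_million(deepseek_tokens, deepseek_cost):.2f}/M tokens")
print(f"Gemini   ~ ${per_million(gemini_tokens, gemini_cost):.2f}/M tokens")
print(f"per-task price ratio ~ {gemini_cost / deepseek_cost:.1f}x")
```

The quoted totals imply a per-task ratio of about 19x, in line with the article's rounded "20-fold" figure, even though the DeepSeek run used more tokens.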
DeepSeek-V3.2 Official Release and High-Compute Version Launched
Xinhua · 2025-12-02 12:14
Core Insights
- DeepSeek has officially launched two models: DeepSeek-V3.2 and a high-performance version, DeepSeek-V3.2-Speciale [1]
- The DeepSeek-V3.2 model balances exceptional reasoning capabilities and agent performance with high computational efficiency [1]

Company Overview
- DeepSeek, officially known as Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd., was established in July 2023 [1]
- The company focuses on the research and development of large language models and multimodal AI technologies [1]