DeepSeek
DeepSeek V3.2 Released! Real-World Results Impress, and Low Price Is Its Biggest Advantage
36Kr· 2025-12-03 03:57
Core Insights
- DeepSeek has launched its V3.2 version, which reportedly matches the inference capabilities of OpenAI's GPT-5 while being significantly cheaper [1][22]
- The V3.2 release includes two variants: a free version for users and a Speciale version with API access and enhanced reasoning capabilities [2][22]

Performance Enhancements
- DeepSeek V3.2-Speciale has demonstrated superior performance in competition settings, achieving gold-medal results at IMO 2025, CMO 2025, the ICPC World Finals 2025, and IOI 2025, and outperforming GPT-5 High in all tests [4][22]
- The DeepSeek Sparse Attention (DSA) mechanism fundamentally improves the efficiency of attention in AI models, reducing computational costs by over 60% and increasing inference speed by roughly 3.5x [6][12]

Cost Efficiency
- DSA sharply reduces the cost of processing long sequences: from $0.7 to $0.2 per million tokens in the pre-fill phase, and from $2.4 to $0.8 in the decoding phase [12][22]
- This cost reduction positions DeepSeek V3.2 as one of the most affordable models in its class for long-text inference [12][22]

Tool Utilization
- DeepSeek V3.2 can call tools during its reasoning process without requiring additional training, enhancing its general performance and compatibility with user-created tools [13][22]
- The model can break down complex tasks and use different tools effectively, showcasing its decision-making capabilities [20][22]

Market Impact
- The release challenges the notion that open-source models lag behind closed-source counterparts, as it offers competitive performance at a fraction of the cost [22][23]
- DSA's cost revolution is expected to significantly affect the commercialization of AI models, making advanced AI applications more accessible to smaller enterprises and consumers [22][23]
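The per-phase prices above imply a simple cost model. The sketch below is a back-of-the-envelope illustration using the article's figures ($0.7→$0.2 per million pre-fill tokens, $2.4→$0.8 per million decode tokens); the request mix (100K prompt tokens, 10K output tokens) is an assumption chosen for illustration, not a figure from the article.

```python
# Illustrative only: per-million-token prices quoted in the article,
# before and after DeepSeek's DSA sparse attention.
PRICES = {
    "prefill": {"before": 0.70, "after": 0.20},  # USD per 1M input tokens
    "decode":  {"before": 2.40, "after": 0.80},  # USD per 1M output tokens
}

def request_cost(prompt_tokens: int, output_tokens: int, when: str) -> float:
    """USD cost of one request under the 'before' or 'after' price table."""
    return (prompt_tokens / 1e6 * PRICES["prefill"][when]
            + output_tokens / 1e6 * PRICES["decode"][when])

# Hypothetical long-context request: 100K prompt tokens, 10K output tokens.
before = request_cost(100_000, 10_000, "before")
after = request_cost(100_000, 10_000, "after")
print(f"before: ${before:.4f}, after: ${after:.4f}, saving: {1 - after / before:.0%}")
```

For this prompt-heavy mix the implied saving is around 70%, consistent with the article's "over 60%" claim; the exact figure depends on the prompt/output ratio.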
DeepSeek Fights Its Way Through: China's Homegrown LLM Breakout Is No Stroke of Luck
36Kr· 2025-12-03 03:21
As 2025 draws to a close, the technical spotlight in the global large-model race has largely been reclaimed by Google. Gemini 3 Pro burst onto the scene, surpassing every open-source model on multiple authoritative benchmarks and re-establishing the closed-source camp's technical high ground. For a while, doubts resurfaced across the industry: have open-source models hit their ceiling? Has the Scaling Law truly hit a wall? A mood of stagnation spread through the open-source community.

But at precisely this moment, DeepSeek chose not to stay silent. On December 1 it released two heavyweight models in one go: DeepSeek-V3.2, whose reasoning performance benchmarks against GPT-5, and the Speciale version, which is exceptionally strong in math, logic, and multi-turn tool calling. This was not only a concentrated display of technical capability, but also a head-on answer to the closed-source camp's new ceiling, delivered without any advantage in compute resources.

This is no routine model update. DeepSeek is trying to chart a new path for the post-Scaling era: How can architectural redesign compensate for a pre-training gap? How can a "chain of thought during tool use" deliver token-efficient agent performance? And, most critically, why has the Agent gone from an auxiliary feature to the core engine of a model's capability leap?

This article analyzes three threads: How did DeepSeek break through under technical bottlenecks? Why was it the first in the open-source camp to bet heavily on Agents? And does this mean open-source models still have a path through the closed-source moat? Behind all this ...
DeepSeek Releases New Models! ChiNext 50 ETF (159949) Up 0.48%; Institutions Remain Bullish on AI Value-Chain Investment Opportunities
Sina Finance· 2025-12-03 02:33
Core Viewpoint
- The ChiNext 50 ETF (159949) rose 0.48% to 1.467 CNY amid broader market fluctuation, indicating ongoing investor interest and activity in the growth sector [1][6]

Market Performance
- As of 10:20 AM on December 3, the ChiNext 50 ETF (159949) was trading at 1.467 CNY, with a trading volume of 4.22 billion CNY and a turnover rate of 1.66% [1][6]
- The ETF has recorded a cumulative trading amount of 323.05 billion CNY over the last 20 trading days (averaging 16.15 billion CNY per day), and 3,205.79 billion CNY over 222 trading days this year (averaging 14.44 billion CNY per day) [7][10]

Top Holdings
- The top ten holdings of the ChiNext 50 ETF (159949) include leading companies such as CATL, Zhongji Xuchuang, Dongfang Caifu, Xinyi Technology, Sungrow Power, Shenghong Technology, Huichuan Technology, Mindray, Yiwei Lithium Energy, and Tonghuashun [3][8]

Industry Insights
- Longcheng Securities reports that the continued rollout of AI applications will accelerate computing infrastructure, particularly the AIDC industry chain (optical modules, PCBs, and main equipment manufacturers), pointing to strong demand release and potential for performance and valuation growth [10]
- The report suggests demand for edge computing modules will rise steadily as AI applications develop, transitioning from traditional data-transmission modules to intelligent and computing modules [10]

Investment Recommendations
- The ChiNext 50 ETF (159949) is presented as a convenient, efficient tool for investors seeking exposure to the long-term growth of China's technology sector, with dollar-cost averaging or phased investment recommended to mitigate short-term volatility [10]
DeepSeek's Minor Update Beats OpenAI and Catches Up with Gemini
36Kr· 2025-12-03 00:58
Core Insights
- DeepSeek has launched two new models, DeepSeek V3.2 and DeepSeek-V3.2-Speciale, designed to compete with leading models like GPT-5 and Gemini [1][5][20]

Model Performance
- DeepSeek V3.2 has shown competitive performance across benchmarks, with scores close to or surpassing those of GPT-5 and Gemini in several tests [6][20]
- On AIME 2025, DeepSeek V3.2 scored 93.1 and DeepSeek V3.2-Speciale scored 96.0; on HMMT Feb 2025, V3.2 scored 92.5 and V3.2-Speciale scored 99.2 [6]
- Overall, DeepSeek V3.2-Speciale is noted for its ability to compete effectively with Gemini 3 [20][27]

Technological Innovations
- DeepSeek has implemented Sparse Attention (DSA) in its models, enabling more efficient processing of longer texts by reducing computational complexity [9][13]
- The company has focused on enhancing post-training for open-source models, investing over 10% of total training compute to improve performance on challenging tasks [17][21]
- DeepSeek V3.2-Speciale encourages longer reasoning without penalizing the model for extended thought processes, enhancing its ability to tackle complex problems [18][20]

Cost Efficiency
- Despite higher token consumption than competitors, DeepSeek is markedly more cost-effective: 8,077 tokens on DeepSeek cost roughly $0.0032, while 4,972 tokens on Gemini cost around $0.06, nearly a 20-fold price difference [32][33]

Industry Context
- The gap between open-source and closed-source models is reportedly widening, but DeepSeek is actively working to close it through innovative approaches and cost-saving measures [35][36]
- The company's strategy emphasizes algorithmic improvements over merely increasing computational power, aligning with industry views on the importance of efficient model training [38][39]
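The "20-fold" claim above can be checked directly from the article's two data points. This sketch back-computes the effective per-million-token rates implied by those figures; the rates are derived quantities, not published price lists, so treat them as rough.

```python
# Figures as reported in the article:
# one answer cost 8,077 tokens ≈ $0.0032 on DeepSeek, 4,972 tokens ≈ $0.06 on Gemini.
def per_million_rate(total_usd: float, tokens: int) -> float:
    """Effective USD per 1M tokens implied by a single answer's bill."""
    return total_usd / tokens * 1_000_000

deepseek_rate = per_million_rate(0.0032, 8077)  # ≈ $0.40 per 1M tokens
gemini_rate = per_million_rate(0.06, 4972)      # ≈ $12 per 1M tokens

# DeepSeek used ~62% more tokens, yet the per-answer bill is ~19x lower.
print(f"implied rates: DeepSeek ${deepseek_rate:.2f}/1M, Gemini ${gemini_rate:.2f}/1M")
print(f"per-answer price ratio: {0.06 / 0.0032:.1f}x")
```

The exact ratio of the quoted bills is 18.75x, which the article rounds to "a 20-fold price difference"; the per-token gap is wider still (roughly 30x), since DeepSeek's answer also consumed more tokens.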
The DeepSeek V3.2 Technical Report: It's Still the Foreign Readers Who Study It Most Closely
QbitAI· 2025-12-03 00:11
Core Insights
- The article discusses the launch of two open-source models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, which have drawn significant attention in Silicon Valley, signaling a shift in the competitive landscape of AI models [2][6]

Group 1: Model Performance
- DeepSeek-V3.2 has reached the highest level among current open-source models, significantly narrowing the gap with top closed-source models [6]
- The standard version of DeepSeek-V3.2 performs comparably to GPT-5, while the Speciale version surpasses GPT-5 and competes closely with Gemini-3.0-Pro on mainstream reasoning tasks [7][8]
- DeepSeek-V3.2-Speciale won gold medals in several competitions, demonstrating its advanced capabilities [9]

Group 2: Technical Innovations
- The model uses DSA sparse attention to address efficiency issues with long contexts, laying the groundwork for subsequent long-sequence reinforcement learning [14]
- By introducing scalable reinforcement learning and allocating over 10% of pre-training compute to post-training, the model significantly enhances general reasoning and agent capabilities [15]
- The Speciale version allows extended reasoning chains, enabling deeper self-correction and exploration, which unlocks stronger reasoning without increasing pre-training scale [16][17]

Group 3: Economic Implications
- In output-token cost, DeepSeek-V3.2 is roughly 24 times cheaper than GPT-5 and 29 times cheaper than Gemini 3 Pro [29][30]
- Generating extensive content with DeepSeek-V3.2 is far cheaper, making it an economically attractive option relative to its competitors [31][32]
- Deployment on domestic compute (e.g., Huawei, Cambricon) could further reduce inference costs, posing a challenge to established players like Google and OpenAI [36]

Group 4: Market Impact
- The success of DeepSeek-V3.2 challenges the notion that open-source models lag behind closed-source ones, indicating a potential shift in market dynamics [10][26]
- The gap between DeepSeek and top models is now more an economic issue than a technical one, suggesting that with sufficient resources, open-source models can compete effectively [26]
Long Interview with OpenAI Chief Research Officer Mark Chen: Zuckerberg Personally Brought Soup Over to Poach Our People, So We Took Soup to Meta Ourselves
QbitAI· 2025-12-03 00:11
Core Insights
- The interview with OpenAI Chief Research Officer Mark Chen reveals the competitive landscape in AI talent acquisition, particularly between OpenAI and Meta, highlighting the lengths companies will go to attract top talent, including sending homemade soup [4][9][11]
- OpenAI maintains a strong focus on AI research, with a core team of roughly 500 people and around 300 ongoing projects, emphasizing pre-training and the development of next-generation models [4][20][27]
- Mark Chen expresses confidence in OpenAI's ability to compete with Google's Gemini 3, stating that internal models have already matched its performance and that further advances are imminent [4][26][119]

Talent Acquisition and Competition
- Meta's aggressive recruitment strategy has led to a "soup war," with both companies courting talent through unconventional means [4][11]
- Despite Meta's efforts, many OpenAI employees have chosen to stay, indicating strong belief in OpenAI's mission and future [10][14]
- Competition for talent is intense, as companies recognize that attracting the best people is necessary to build effective AI labs [9][10]

Research Focus and Model Development
- OpenAI's research strategy prioritizes exploratory research over merely replicating existing benchmarks, aiming to discover new paradigms in AI [22][27]
- The company has invested heavily in pre-training, believing it still holds significant potential, contrary to claims that scaling has reached its limits [118][119]
- Mark Chen stresses keeping a clear focus on core research priorities and communicating them effectively to the team [24][20]

Response to Competitors
- OpenAI aims to avoid being reactive to competitors, focusing on long-term research goals and breakthroughs rather than short-term updates [26][28]
- The company has already developed models that can compete with Gemini 3, showcasing its confidence in upcoming releases [34][119]
- Mark Chen highlights the significance of reasoning capabilities in language models, which OpenAI has been developing for over two years [26][116]

Company Culture and Management
- OpenAI's culture remains rooted in its original mission as a pure AI research organization, despite its growth and the introduction of product lines [27][28]
- Mark Chen's management style emphasizes collaboration and open communication, fostering a strong sense of community among researchers [101][104]
- The company has navigated internal challenges, including leadership changes, by promoting unity and a shared vision among its team [98][102]
OpenAI’s ‘code red’ memo lays bare pressure from Google, DeepSeek and its $1.4 trillion AI bet
CNBC Television· 2025-12-02 18:31
MacKenzie Sigalos joins us now. What does this mean? Does this now put Google in a position where they have an opportunity to beat OpenAI by any stretch? >> It certainly seems to signal that. So this code red warning comes from a leaked memo cited by the Journal and The Information, and in it Sam Altman tells staff to pause work on ads, health, and shopping agents, and then shift focus back to their core ChatGPT experience: faster responses, better pe ...
OpenAI's ‘code red' memo lays bare pressure from Google, DeepSeek and its $1.4 trillion AI bet
Youtube· 2025-12-02 18:31
Core Insights
- The article discusses the competitive landscape in the AI sector, particularly OpenAI and Google, highlighting the pressure OpenAI is facing from Google and other competitors [2][3]

Group 1: Competitive Dynamics
- OpenAI has issued a "code red" warning, signaling a shift back to enhancing its core ChatGPT experience under competitive pressure from Google [2]
- Google's Gemini 3 has reportedly surpassed ChatGPT on key benchmarks, contributing to a jump in Gemini's monthly users from 450 million to 650 million [2]
- DeepSeek has also introduced two new models that reportedly match both ChatGPT and Gemini 3 in benchmark tests, underscoring a rapidly changing competitive landscape in AI [3]

Group 2: Strategic Responses
- OpenAI is under pressure to improve its offerings: a memo from CEO Sam Altman directs staff to prioritize faster responses, better personalization, and more reliable answers [2]
- The competitive environment is evolving in weeks rather than months or years, emphasizing the urgency for companies to adapt [3]
- OpenAI remains committed to a substantial long-term investment of $1.4 trillion in AI infrastructure, a commitment that reassured investors more when ChatGPT led the market [3]
Good Grief! DeepSeek Drops 2 New Models in One Go
程序员的那些事· 2025-12-02 13:49
Reposted from: QbitAI | WeChat account QbitAI

A surprise attack! On ChatGPT's third release anniversary, DeepSeek suddenly put out two models: DeepSeek-V3.2 and DeepSeek-V3.2-Speciale.

The former focuses on balanced practicality, suited to everyday Q&A, general Agent tasks, and tool calling in real-world applications. Its reasoning reaches GPT-5 level, slightly below Gemini-3.0-Pro. The latter pushes reasoning to the extreme, with benchmark reasoning performance rivaling Gemini-3.0-Pro. It also swept gold medals at IMO 2025, CMO 2025, the ICPC World Finals 2025, and IOI 2025. Key point: it reached the level of the second-place human contestant at ICPC and tenth place at IOI.

Specifically, DeepSeek-V3.2 balances reasoning ability against output length to reduce compute overhead. DeepSeek's official account wrote that "the DeepSeek-V3.2 model reaches the highest level among current open-source models on Agent evaluations." Other highlights of the model:

- Reasoning on par with GPT-5;
- Substantially shorter outputs than Kimi-K2-Thinking, reducing user wait time;
- DeepSeek's first model to integrate thinking into tool call ...

[Chart: scores of DeepSeek-V3.2 and other models on various Agent tool-calling benchmark suites]
Sam Altman Declares Code Red
Seeking Alpha· 2025-12-02 11:57
Listen on the go! A daily podcast of Wall Street Breakfast will be available by 8:00 a.m. on Seeking Alpha, iTunes, and Spotify.

Good morning! Here is the latest in trending:

Sweetened offer: Warner Bros. Discovery (WBD) received a mostly cash offer from Netflix (NFLX), which is arranging a bridge loan worth tens of billions of dollars for its bid.

Tariff refund: Costco (COST) sued the U.S. government to ensure it gets a full refund of tariffs if the Supreme Court rules against President Trump's levie ...