The AI Three-Kingdoms Battle: OpenAI Grinds On, DeepSeek Gets Crowned, but Mistral Steals the Show?
36Kr · 2025-12-03 11:55
Core Insights
- Mistral has launched two significant products: the Mistral Large 3 model and the Ministral 3 series, both of which are open-source, multimodal, and designed for practical applications [1][3]

Mistral Large 3
- Mistral Large 3 uses an MoE architecture with 41 billion active parameters out of 675 billion total, showing advanced image understanding and multilingual capabilities and ranking 6th among open-source models [3][6]
- It has achieved a high ELO score, placing it in the top tier of open-source models, comparable to Kimi K2 and slightly behind DeepSeek v3.2 [6][10]
- The model performs on par with larger models like DeepSeek 37B and Kimi K2 127B across various foundational tasks, indicating its competitive strength [8][10]
- Mistral has partnered with NVIDIA to enhance the model's stability and performance by optimizing the underlying inference pathways, making it faster and more cost-effective [10][16]

Ministral 3 Series
- The Ministral 3 series includes 3B, 8B, and 14B models, all capable of running on a range of devices from laptops to drones, and optimized for performance [11][18]
- The instruct versions of the Ministral 3 models show significant gains, scoring 31 (14B), 28 (8B), and 22 (3B), surpassing the previous generation [11][29]
- The 14B version has demonstrated superior performance in reasoning tasks, outperforming competitors such as Qwen 14B on multiple benchmarks [25][28]

Strategic Positioning
- Mistral aims to address enterprise needs with customizable AI solutions that are cost-effective and reliable, in contrast to the high costs of proprietary models from competitors like OpenAI and Google [29][33]
- The company is evolving into a platform that offers not only models but also integrated functionality such as code execution and structured reasoning through its Mistral Agents API [33][37]
- Mistral's approach reflects a shift toward a more decentralized AI model, emphasizing accessibility and usability across devices and environments, which could reshape the global AI landscape [37][39]
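The MoE figures above imply a small per-token activation fraction, which is what makes the model cheaper to run than its total size suggests. A back-of-the-envelope sketch (routine arithmetic on the reported numbers; the 2-FLOPs-per-active-parameter rule of thumb is a common approximation, not a Mistral figure):

```python
# Reported Mistral Large 3 figures: 41B active parameters per token,
# 675B total. Total parameters drive memory; active parameters drive
# per-token compute.
total_params = 675e9
active_params = 41e9

activation_fraction = active_params / total_params
# Rough forward-pass compute: ~2 FLOPs per active parameter per token.
flops_per_token = 2 * active_params

print(f"activation fraction: {activation_fraction:.1%}")  # 6.1%
print(f"forward FLOPs/token: {flops_per_token:.1e}")      # 8.2e+10
```

So only about 6% of the weights participate in any one token, while the full 675B must still be held in (possibly distributed) memory.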
Zhu Xiaohu: DeepSeek's Impact on Human History Is Underestimated | 《未竟之约》
Xin Lang Cai Jing · 2025-12-03 10:40
Sina disclaimer: all conference transcripts are compiled from on-site stenography and have not been reviewed by the speakers; Sina publishes this article to convey more information, which does not imply endorsement of its views or confirmation of its descriptions. Editor in charge: Liang Bin SF055

The first in-depth interview of 《未竟之约》, a general-finance-and-humanities dialogue program built by Sina Finance and Weibo and co-produced by Weibo Finance and the 语言即世界 studio, is about to go live. Host Zhang Xiaojun talks with Zhu Xiaohu, managing partner of GSR Ventures (金沙江创投), confronting the rapids and hidden reefs of the AI wave.

Zhu Xiaohu: DeepSeek's impact on human history is underestimated. ...
Foreigners Baffled: Even When Asked in English, DeepSeek Insists on Thinking in Chinese
36Kr · 2025-12-03 09:14
Core Insights
- DeepSeek has launched two new models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, with significant improvements in reasoning capabilities: DeepSeek-V3.2 competes directly with GPT-5, and Speciale performs comparably to Gemini-3.0-Pro [1]
- There is a notable phenomenon where, even when queries are made in English, the model sometimes reverts to Chinese during its reasoning process, confusing overseas users [3][5]
- The prevalent explanation is that Chinese characters have higher information density, expressing the same textual meaning more efficiently than English [5][9]

Model Performance and Efficiency
- Research indicates that reasoning in non-English languages can cut token consumption by 20-40% without sacrificing accuracy; for DeepSeek R1, token reductions range from 14.1% (Russian) to 29.9% (Spanish) [9]
- A study titled "EfficientXLang" supports the idea that reasoning in non-English languages improves token efficiency, which translates to lower reasoning costs and reduced computational resource requirements [6][9]
- Another study, "One ruler to measure them all," finds that English is not the best-performing language for long-context tasks, ranking sixth among 26 languages, with Polish taking the top spot [10][15]

Language and Training Data
- That models trained on substantial Chinese datasets frequently reason in Chinese is considered normal, as seen with the new version of the AI programming tool Cursor [17]
- Models like OpenAI's o1-pro occasionally use Chinese during reasoning despite the higher proportion of English data in their training, which raises questions about how large models select a reasoning language [20]
- As Chinese training data grows richer, models may exhibit more characteristics associated with Chinese language processing [25]
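The token-reduction figures above are simple relative savings against an English baseline. A sketch of the calculation (the counts below are hypothetical, chosen only to illustrate the formula; they are not figures from the cited studies):

```python
def token_reduction_pct(baseline_tokens: int, alt_tokens: int) -> float:
    """Percent fewer tokens used by an alternative reasoning language
    relative to an English baseline for the same task."""
    return 100.0 * (baseline_tokens - alt_tokens) / baseline_tokens

# Hypothetical counts for illustration only:
print(token_reduction_pct(1000, 700))  # 30.0 -> inside the reported 20-40% band
```

Since API pricing is per token, a 30% token reduction translates directly into a 30% cut in the reasoning cost of that trace.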
DeepSeek V3.2 Officially Released: Reasoning Said to Rival GPT-5
Feng Huang Wang · 2025-12-03 09:04
On December 1, DeepSeek (深度求索) officially released its next-generation open-source large model DeepSeek-V3.2 and its long-thinking enhanced version DeepSeek-V3.2-Speciale. The official web app, mobile app, and API have all been updated to V3.2.

According to official data, on public reasoning benchmarks DeepSeek-V3.2's reasoning reaches GPT-5 level and approaches Gemini-3.0-Pro, while its output length is significantly shorter than Kimi-K2-Thinking's, reducing computational overhead. The V3.2-Speciale version incorporates the theorem-proving capability of DeepSeek-Math-V2 and achieved gold-medal results in international competitions including the IMO, CMO, ICPC, and IOI; its ICPC result matched the second-place human contestant.

The new version is the first to fuse thinking mode with tool calling, supporting external tool invocation during the thinking process. Using a large-scale agent training-data synthesis method, the model underwent reinforcement-learning training across more than 1,800 environments and over 85,000 complex instructions, improving generalization. DeepSeek says it reaches the highest level among current open-source models on agent evaluations, further narrowing the gap with closed-source models.

The earlier experimental version, DeepSeek-V3.2-Exp, was released two months ago; per user feedback and testing, its DSA sparse-attention mechanism showed no significant performance degradation across scenarios. Sp ...
On DeepSeek, AI Hardware, and Competitors: OpenAI's Chief Research Officer Gives a Remarkably Information-Dense Interview
36Kr · 2025-12-03 07:46
Core Insights
- OpenAI Chief Research Officer Mark Chen discussed the company's strategic vision amid intense AI competition and rapid technological advancement, addressing concerns about talent retention and the pursuit of AGI [1]

Group 1: Talent Acquisition and Retention
- OpenAI faces aggressive talent poaching from competitors like Meta, which reportedly invests billions annually in recruitment efforts, yet most OpenAI employees have chosen to stay [2]
- Despite competitive salary pressures, OpenAI does not engage in salary wars, relying instead on a shared vision of achieving AGI to retain talent [2]

Group 2: Resource Allocation and Project Management
- OpenAI is managing approximately 300 concurrent research projects, prioritizing those most likely to advance AGI and emphasizing exploratory research over following trends [3]
- The company maintains a transparent and strict resource-allocation process, allowing secondary projects while clearly defining their subordinate status to preserve efficiency [3]

Group 3: Competitive Landscape and Model Development
- OpenAI monitors competitor releases, such as Google's Gemini 3, but maintains its own development pace, expressing confidence in internal progress rather than reacting to external pressures [4]
- The company is refocusing on pre-training capabilities, which had been deprioritized, believing significant room for improvement remains in this area [5]

Group 4: AGI Development and Future Goals
- Mark Chen expects significant changes in AI capabilities within the next two years, with goals for AI to participate in research processes and eventually conduct end-to-end research autonomously [7]
- Demand for computational power is expected to remain high; Chen states that even a threefold increase in resources would be quickly consumed [8]

Group 5: Hardware Development and Future Interactions
- OpenAI is collaborating with designer Jony Ive to develop next-generation AI hardware that aims to enhance user interaction through continuous learning and memory capabilities [9]
- The goal is to evolve AI from a passive assistant into a more intelligent entity that remembers user interactions and improves over time [9]

Group 6: Strategic Focus Amid Competition
- In response to the emergence of open-source models like DeepSeek, OpenAI emphasizes maintaining its research pace and innovation focus rather than being swayed by competitive pressures [10]
DeepSeek V3.2 Is Out! Impressive in Hands-On Tests, and Low Price Is Its Biggest Advantage
36Kr · 2025-12-03 03:57
Core Insights
- DeepSeek has launched its V3.2 version, which reportedly matches the inference capabilities of OpenAI's GPT-5 while being significantly cheaper [1][22]
- V3.2 ships in two variants: a free version for users and a Speciale version with enhanced reasoning capabilities that supports API access [2][22]

Performance Enhancements
- DeepSeek V3.2-Speciale achieved gold-medal results at IMO 2025, CMO 2025, the ICPC World Finals 2025, and IOI 2025, outperforming GPT-5 High in all of these tests [4][22]
- The DeepSeek Sparse Attention (DSA) mechanism fundamentally improves attention efficiency, cutting computational costs by over 60% and increasing inference speed by approximately 3.5x [6][12]

Cost Efficiency
- DSA sharply reduces the cost of processing long sequences: per million tokens, the pre-fill phase drops from $0.7 to $0.2 and the decoding phase from $2.4 to $0.8 [12][22]
- This cost reduction positions DeepSeek V3.2 among the most affordable models in its class for long-text inference [12][22]

Tool Utilization
- DeepSeek V3.2 lets the model call tools during its reasoning process without additional training, enhancing general performance and compatibility with user-created tools [13][22]
- The model can break down complex tasks and use different tools effectively, demonstrating strong decision-making capabilities [20][22]

Market Impact
- The release challenges the notion that open-source models lag behind closed-source counterparts, offering competitive performance at a fraction of the cost [22][23]
- DSA's cost revolution is expected to significantly affect the commercialization of AI models, making advanced AI applications accessible to smaller enterprises and consumers [22][23]
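The core idea behind a sparse-attention scheme like DSA can be sketched in a few lines: score every key cheaply, then run the expensive softmax and value aggregation over only the top-k highest-scoring entries. The NumPy sketch below is a minimal illustration of that idea, not DeepSeek's actual DSA implementation (which also uses a lightweight indexer so the selection step itself stays cheap); the function name and shapes are assumptions for illustration.

```python
import numpy as np

def topk_sparse_attention(q, K, V, top_k):
    """Single-query attention that attends only to the top_k
    highest-scoring keys: softmax and value aggregation run over
    top_k entries instead of all n."""
    scores = K @ q / np.sqrt(q.shape[0])            # (n,) scaled dot products
    idx = np.argpartition(scores, -top_k)[-top_k:]  # indices of top-k keys
    w = np.exp(scores[idx] - scores[idx].max())     # stable softmax over top-k
    w /= w.sum()
    return w @ V[idx]                               # (d,) weighted values

rng = np.random.default_rng(0)
n, d = 4096, 64
q, K, V = rng.normal(size=d), rng.normal(size=(n, d)), rng.normal(size=(n, d))
out = topk_sparse_attention(q, K, V, top_k=256)  # attends to 256 of 4096 keys
print(out.shape)  # (64,)
```

With top_k equal to the sequence length the sketch reduces to dense attention, which makes it easy to sanity-check; the reported cost savings come from running the aggregation over a small fraction of keys on long sequences.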
If You Must Use DeepSeek to See a Doctor, Here's How (Detailed Prompt Templates Included)
36Kr · 2025-12-03 03:23
Have you used DeepSeek to see a doctor yet?

Open it, describe your discomfort or upload a photo of your test results, and within seconds you get a diagnosis and treatment suggestions. Ask follow-up questions about the condition or how to take a medicine, and it gives even more detailed, easy-to-understand explanations, answering every question.

No fees, no fighting for appointment slots, and far more patient than a doctor. So can we all just see DeepSeek from now on? Ask DeepSeek itself, and it answers:

[Screenshot: DeepSeek's answer on whether it can practice medicine]

If you actually have DeepSeek "see" you once, you will find a notice box at the end of its reply:

3. Order several additional tests to distinguish conditions with similar presentations and confirm the diagnosis;

[Screenshot: this notice box generally does not appear for other kinds of questions]

"Cannot", "should not", "for reference only": is DeepSeek being overly modest, or is there something special about seeing a doctor?

Below, we look at whether you can really use DeepSeek for medical care, and how to use it to get better care (detailed prompt templates included).

Can AI practice medicine? As the expert, no; as an assistant, very much yes

One way people use AI assistants like DeepSeek for medical care is to treat the reply as a confirmed diagnosis and start taking medication on its advice, as if they had just seen a medical expert.

But a medical expert rarely gives a definitive diagnosis from a few sentences of description or a single test report; the next steps might include ...
DeepSeek Fights Its Way Out: China's Large-Model Breakout Is Not Down to Luck
36Kr · 2025-12-03 03:21
As 2025 draws to a close, the technical spotlight of the global large-model race has been largely recaptured by Google. Gemini 3 Pro arrived out of nowhere, surpassing all open-source models on multiple authoritative benchmarks and re-establishing the closed-source camp's technical high ground. Doubts quickly resurfaced about whether open-source models have reached their limit and whether the Scaling Law has truly hit a wall, and a mood of stagnation spread through the open-source community.

But at precisely this moment, DeepSeek chose not to stay silent. On December 1 it released two heavyweight models in one go: DeepSeek-V3.2, whose reasoning performance benchmarks against GPT-5, and the Speciale version, which is exceptionally strong in mathematics, logic, and multi-turn tool calling. This was not only a concentrated display of technical capability but also a head-on answer to the closed-source camp's new ceiling, delivered without an advantage in compute resources.

This is no ordinary model update. DeepSeek is trying to chart a new path in the post-Scaling era: how can architectural redesign compensate for a pre-training gap? How can a "chain of thought during tool use" deliver agent performance with few tokens and high efficiency? And most critically, why has the Agent gone from an accessory feature to the core engine of a leap in model capability?

This article analyzes three threads: how did DeepSeek break through under technical bottlenecks? Why did it bet heavily on Agents first within the open-source camp? And does this mean open-source models still have a path through the closed-source moat?

Behind this ...
DeepSeek Releases a New Model! ChiNext 50 ETF (159949) Up 0.48%; Institutions Stay Bullish on AI Industry-Chain Investment Opportunities
Xin Lang Cai Jing · 2025-12-03 02:33
Core Viewpoint
- The ChiNext 50 ETF (159949) rose slightly by 0.48% to 1.467 CNY amid broader market fluctuation, indicating ongoing investor interest and activity in the growth sector [1][6]

Market Performance
- As of 10:20 AM on December 3, the ChiNext 50 ETF (159949) was trading at 1.467 CNY, with a trading volume of 4.22 billion CNY and a turnover rate of 1.66% [1][6]
- The ETF's cumulative turnover over the last 20 trading days reached 323.05 billion CNY, averaging 16.15 billion CNY per day; over 222 trading days this year it totaled 3,205.79 billion CNY, averaging 14.44 billion CNY per day [7][10]

Top Holdings
- The top ten holdings of the ChiNext 50 ETF (159949) include leading companies such as CATL, Zhongji Xuchuang, Dongfang Caifu, Xinyi Technology, Sungrow Power, Shenghong Technology, Huichuan Technology, Mindray, Yiwei Lithium Energy, and Tonghuashun [3][8]

Industry Insights
- Longcheng Securities reports that the continued rollout of AI applications will accelerate computing infrastructure, particularly the AIDC industry chain (optical modules, PCBs, and main equipment manufacturers), indicating strong demand release and room for performance and valuation growth [10]
- The report expects demand for edge computing modules to rise steadily as AI applications develop, transitioning from traditional data-transmission modules to intelligent computing modules [10]

Investment Recommendations
- The ChiNext 50 ETF (159949) is presented as a convenient, efficient vehicle for investors seeking exposure to the long-term growth of China's technology sector, with dollar-cost averaging or phased investment strategies recommended to mitigate short-term volatility [10]
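The per-day averages quoted in the summary follow directly from the cumulative turnover figures, which makes them easy to sanity-check (units as reported in the summary):

```python
# Reported cumulative turnover and trading-day counts for ETF 159949.
cum_20d, days_20 = 323.05, 20      # last 20 trading days
cum_ytd, days_ytd = 3205.79, 222   # year to date

avg_20d = cum_20d / days_20    # expected ~16.15 per day
avg_ytd = cum_ytd / days_ytd   # expected ~14.44 per day
print(round(avg_20d, 2), round(avg_ytd, 2))  # 16.15 14.44
```

Both averages match the figures in the summary, so the reported numbers are internally consistent.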
AI Industry Express: From DeepSeek V3
2025-12-03 02:12
Summary of Key Points from the Conference Call

Industry and Company Overview
- The conference call discusses advances in the AI industry, focusing on the DeepSeek-V3.2 model developed by DeepSeek, which shows significant improvements in reinforcement learning and inference efficiency [1][3][5]

Core Insights and Arguments
- **Model Architecture and Mechanisms**: DeepSeek V3.2 introduces the DeepSeek Sparse Attention (DSA) mechanism in place of the previous Multi-head Latent Attention (MLA) mechanism; DSA optimizes computational efficiency by focusing on key attention entries, particularly in complex tasks [3][5]
- **Performance Enhancements**: The C9 version of DeepSeek V3.2 uses approximately 10% of the pre-training computational resources to significantly boost performance on complex tasks such as code debugging, reaching a globally leading level [1][3]
- **Context Management Strategy**: The model employs an efficient context-management strategy that intelligently handles frequent task switching, multi-turn dialogues, and ambiguous inputs, effectively reducing inference costs [1][3]
- **Synthetic Data Utilization**: Training incorporates a substantial amount of high-difficulty synthetic data, doubled from previous versions; this data is crucial for the subsequent reinforcement-learning phase and requires significant computational resources [1][6]
- **Open-Source Innovations**: DeepSeek has advanced its open-source capabilities by completing a comprehensive post-training process and supporting agent invocation, potentially leveling the playing field with closed-source models [7]

Additional Important Insights
- **Reinforcement Learning Developments**: The evolution of reinforcement-learning techniques has been marked by the introduction of rubric-based human prompts, enhancing the model's ability to think and execute simultaneously and improving overall efficiency [8][9]
- **Future of Model Pricing**: By 2026, model costs are anticipated to fall significantly, potentially to one-fifth of current prices, driven by technological advances and competitive pricing among vendors [2][20]
- **Impact of Sparsity Techniques**: Sparsity techniques are expected to lower training compute requirements while raising the ceiling on model training, encouraging more startups to engage in large-model development [2][19]
- **Vertical Scene Task Solutions**: The application of reinforcement learning on e-commerce platforms illustrates the model's ability to adapt recommendations to user feedback through multi-turn dialogue mechanisms, enhancing user satisfaction [12]

Conclusion
- The advances in DeepSeek V3.2 mark a significant shift in the AI landscape, underscoring the importance of efficient computational mechanisms, the role of synthetic data, and the potential for open-source models to compete with proprietary solutions. The expected decrease in model costs and the rise of new startups point to a dynamic, evolving market [1][2][20]
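The "one-fifth of current prices" projection is straightforward to apply. A sketch with a hypothetical current price (the $2.40 figure is illustrative only, not a number from the call):

```python
# Hypothetical current API price, per million output tokens (illustrative).
current_price = 2.40
projected_2026 = current_price / 5  # "one-fifth of current prices"
print(projected_2026)  # 0.48
```

At that rate, a workload costing $10,000 per month today would cost about $2,000 if usage stayed flat; in practice, cheaper tokens tend to induce more usage.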