Closed-Source Models - filings, earnings calls, financial reports, news - Reportify

Closed-Source Models

Is the AI Bubble About to Burst? A Titan's Perception-Upending View Is Here!

Ge Long Hui· 2025-12-04 07:29

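The decisive battle among large models is growing fiercer. Google's rise has alarmed OpenAI, which is preparing a major new move: OpenAI has sounded the alarm and is delaying its revenue-generating advertising business in order to go all in on improving ChatGPT. The AI world today feels like the eve of a Star Wars battle; out of fear, everyone has a finger on the trigger. Amid this turmoil, Joe Tsai (蔡崇信), in a fireside chat at the University of Hong Kong, offered a highly counterintuitive view: Americans currently define who wins the AI race purely in terms of large language models, and "we do not look at the AI race as America defines it." While everyone is fixated on whose model has more parameters and whose compute is stronger, Tsai argues that the deciding factor lies elsewhere. If not the model, what determines the winner of this trillion-dollar bet? Does China still hold any cards? Reading on, it turns out that the world as the big players see it is quite different from the one we see. 1. China's real advantage in AI. How does Silicon Valley currently keep score on large models? Simple: whose "large language model" is stronger, smarter, and has more parameters. Today OpenAI is far ahead; tomorrow Anthropic releases a new version and catches up; the day after, Google makes headlines. Everyone is competing on models, as if whoever's model gains a few IQ points will rule the world. In Tsai's view, that is not necessarily so. In his talk he made this incisive remark: "The winner is not about who has the bes ...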

As Closed-Source Models Pull Further Ahead, How DeepSeek V3.2 Blazes a New Trail for Open-Source Models

深思SenseAI· 2025-12-03 09:51

Core Viewpoint
- The article emphasizes that closed-source models are increasingly outperforming open-source models in complex tasks, with the performance gap widening over time [1].
Group 1: Key Issues with Open-Source Models
- Open-source models face three critical issues: reliance on vanilla attention limits computational efficiency in long-sequence scenarios; insufficient compute during post-training restricts performance on difficult tasks; and generalization and instruction-following capabilities lag significantly behind closed-source systems [2].
Group 2: DeepSeek's Innovations
- DeepSeek introduced two new models, DeepSeek V3.2 and DeepSeek V3.2 Speciale, which address these issues through three improvements: an efficient attention mechanism called DSA (DeepSeek Sparse Attention) that reduces computational complexity; a stable and scalable reinforcement learning protocol that allows post-training compute to be increased significantly; and a new data pipeline that enhances generalization and instruction following in AI agent scenarios [2][3].
Group 3: DSA Mechanism
- DSA reduces the complexity of core attention from O(L^2) to O(L*k), where L is the sequence length and k, the number of selected tokens, is much smaller than L, while maintaining model performance even in long-context scenarios [11]. It employs a two-stage sparsification mechanism that replaces full computation with selective computation [7][10].
Group 4: Reinforcement Learning Strategy
- DeepSeek V3.2 allocates a post-training compute budget exceeding 10% of its pre-training cost and employs a mixed reinforcement learning approach to optimize performance [12][14]. This strategy combines reasoning, agent, and human alignment tasks into a single RL phase to mitigate the catastrophic forgetting common in traditional multi-stage training [14].
Group 5: Impact on Open-Source Ecosystem
- DeepSeek's advancements demonstrate that significant improvements in model performance can be achieved without relying on closed-source systems, suggesting a shift back to a more research-driven era in large model development. The company sets a precedent for the open-source community on how to innovate within limited budgets and reshape agent systems [16].
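The two-stage idea summarized under "DSA Mechanism" (first select a small set of positions, then run ordinary attention only over them) can be illustrated with a minimal pure-Python sketch. This is a generic top-k sparse attention, not DeepSeek's implementation: the function names `sparse_attention` and `core_pairs` and the precomputed `index_scores` input are hypothetical stand-ins introduced here for illustration. In this sketch the selection stage still looks at every visible position; only the core attention step is reduced to at most k keys per query.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sparse_attention(queries, keys, values, index_scores, k):
    """Causal top-k attention (illustrative sketch, not DeepSeek's code).

    index_scores[t][s] is an assumed, cheaply computed relevance score of
    key position s for query position t.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs = []
    for t, q in enumerate(queries):
        # Stage 1: keep the k highest-scoring positions among those visible (s <= t).
        selected = sorted(range(t + 1), key=lambda s: index_scores[t][s], reverse=True)[:k]
        # Stage 2: ordinary softmax attention, restricted to the selected keys.
        weights = softmax([dot(q, keys[s]) * scale for s in selected])
        outputs.append(
            [sum(w * values[s][d] for w, s in zip(weights, selected)) for d in range(dim)]
        )
    return outputs


def core_pairs(seq_len, k):
    """Number of query-key dot products computed in stage 2."""
    return sum(min(t + 1, k) for t in range(seq_len))
```

With `k` at least as large as the sequence length, the function reduces to dense causal attention, and `core_pairs(L, L)` grows quadratically in L, while `core_pairs(L, k)` for fixed `k` grows linearly.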

Artificial Intelligence

DeepSeek V3.2 Speciale

DeepSeek Fights Its Way Out: Chinese Large Models Don't Break Through by Luck

36Kr · 2025-12-03 03:21

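As 2025 draws to a close, Google has all but reclaimed the technical spotlight in the global large-model race. Gemini 3 Pro burst onto the scene, surpassing every open-source model on multiple authoritative benchmarks and re-establishing the closed-source camp's technical high ground. Doubts resurfaced across the industry about whether open-source models have reached their limit and whether the Scaling Law has truly hit a wall, and a sense of stagnation spread through the open-source community. DeepSeek, however, did not stay silent. On December 1 it released two major models at once: DeepSeek-V3.2, whose reasoning performance is benchmarked against GPT-5, and the Speciale version, which performs exceptionally strongly in mathematics, logic, and multi-turn tool calling. This is both a concentrated display of technical capability and, given that DeepSeek holds no advantage in compute resources, a direct response to the closed-source camp's "new ceiling." It is not a simple model update. DeepSeek is trying to find an entirely new path for the post-Scaling era: How can architectural redesign make up for the pre-training gap? How can "chain of thought within tool use" deliver agent performance that is efficient and uses few tokens? And, more crucially, why has the Agent gone from an auxiliary feature to the core engine of leaps in model capability? This article analyzes three main threads: How did DeepSeek break through under technical bottlenecks? Why was it the first in the open-source camp to bet heavily on agents? And does this mean open-source models still have a path through the closed-source moat? Behind this ...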

Seek .(US:SKLTY)

Artificial Intelligence

Strongest Open-Source Model! "Punching GPT 5" and "Kicking Gemini-3.0": Why Has DeepSeek V3.2 Improved So Much?

Wallstreetcn · 2025-12-02 04:21

Core Insights
- DeepSeek has released two official models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, with the former achieving performance levels comparable to GPT-5 and the latter winning gold medals in four international competitions [1][3].
Model Performance
- DeepSeek-V3.2 has reached the highest level of tool invocation capabilities among current open-source models, significantly narrowing the gap with closed-source models [2].
- In various benchmark tests, DeepSeek-V3.2 achieved a 93.1% pass rate in AIME 2025, closely trailing GPT-5's 94.6% and Gemini-3.0-Pro's 95.0% [20].
Training Strategy
- The model's significant improvement is attributed to a fundamental change in training strategy, moving from a simple "direct tool invocation" to a more sophisticated "thinking + tool invocation" mechanism [9][11].
- DeepSeek has constructed a new large-scale data synthesis pipeline, generating over 1,800 environments and 85,000 complex instructions specifically for reinforcement learning [12].
Architectural Innovations
- The introduction of the DeepSeek Sparse Attention (DSA) mechanism has effectively addressed efficiency bottlenecks in traditional attention mechanisms, reducing complexity from O(L²) to O(Lk) while maintaining model performance [6][7].
- The model's architecture allows for better context management, retaining relevant reasoning content during tool-related messages, thus avoiding inefficient repeated reasoning [14].
Competitive Landscape
- The release of DeepSeek-V3.2 signals a shift in the competitive landscape, indicating that the absolute technical monopoly of closed-source models is being challenged by open-source models gaining first-tier competitiveness [20][22].
- This development has three implications: lower costs and greater customization for developers, reduced reliance on overseas APIs for enterprises, and a shift in the industry focus from "who has the largest parameters" to "who has the strongest methods" [22].
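To get a feel for what moving from O(L²) to O(Lk) means in practice, a rough worked ratio helps. The values of L and k below are assumptions chosen purely for illustration (they are not stated in the summary), and the estimate ignores constant factors, causal masking, and the cost of selecting the k tokens.

```latex
\[
\frac{\text{dense core attention}}{\text{sparse core attention}}
\approx \frac{L^{2}}{L\,k} = \frac{L}{k},
\qquad
L = 131{,}072,\quad k = 2{,}048
\;\Longrightarrow\;
\frac{L}{k} = 64 .
\]
```

Under these assumed values, the core attention step would touch roughly 64 times fewer query-key pairs, and the saving grows linearly with context length for a fixed k.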

Artificial Intelligence

DeepSeek Releases Again! Models Take On Google Head-On While Admitting the Open-Source/Closed-Source Gap Is Widening

Di Yi Cai Jing· 2025-12-01 23:13

Core Insights
- DeepSeek has launched two new models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, which are positioned to compete with leading proprietary models like GPT-5 and Gemini 3.0, showcasing significant advancements in reasoning capabilities [1][4].
Model Overview
- DeepSeek-V3.2 aims to balance reasoning ability and output length, making it suitable for everyday applications such as Q&A and general intelligence tasks. It has achieved performance levels comparable to GPT-5 and is slightly below Google's Gemini 3 Pro in public reasoning tests [4].
- DeepSeek-V3.2-Speciale is designed to push the limits of reasoning capabilities, integrating enhanced long-thinking features and theorem-proving abilities from DeepSeek-Math-V2. It has surpassed Gemini 3 Pro in several reasoning benchmarks, including prestigious math competitions [4][5].
Benchmark Performance
- In various benchmarks, DeepSeek models have shown competitive results:
  - AIME 2025: DeepSeek-V3.2 scored 93.1, while GPT-5 and Gemini-3.0 scored 94.6 and 95.0 respectively [5].
  - Harvard-MIT Mathematics Tournament (HMMT): DeepSeek-V3.2 scored 92.5, trailing Gemini 3 Pro's 97.5 [5].
  - International Math Olympiad: DeepSeek-V3.2 scored 78.3, close to Gemini 3 Pro's 83.3 [5].
Limitations and Future Plans
- Despite these achievements, DeepSeek acknowledges limitations compared to proprietary models, including narrower world knowledge and lower token efficiency. The team plans to enhance pre-training and optimize reasoning chains to improve model performance [6][7].
- DeepSeek has identified three key areas where open-source models lag behind proprietary ones: reliance on standard attention mechanisms, insufficient computational resources during post-training, and gaps in generalization and instruction-following capabilities [7].
Technological Innovations
- DeepSeek has introduced a sparse attention mechanism (DSA) to reduce computational complexity without sacrificing long-context performance. This innovation has been integrated into the new models, contributing to significant performance improvements [7].
Availability
- The official website, app, and API for DeepSeek-V3.2 have been updated, while the enhanced Speciale version is currently available only through a temporary API for community evaluation [8].
Community Reception
- The release has been positively received on social media, with users noting that DeepSeek's models have effectively matched the capabilities of GPT-5 and Gemini 3 Pro, highlighting the importance of rigorous engineering design over sheer parameter size [9].

Seek .(US:SKLTY)

Sparse Attention Mechanism (DSA)

Artificial Intelligence

DeepSeek-V3.2-Speciale

Strongest Open-Source Model! "Punching GPT 5" and "Kicking Gemini-3.0": Why Has DeepSeek V3.2 Improved So Much?

美股IPO· 2025-12-01 22:29

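V3.2 reaches the highest level of tool-calling capability among current open-source models, substantially narrowing the gap between open-source and closed-source models. As DeepSeek's first model to integrate thinking into tool use, V3.2 still supports tool calls in "thinking mode." Using a large-scale method for synthesizing agent training data, the company constructed reinforcement learning tasks spanning more than 1,800 environments and more than 85,000 complex instructions, substantially improving the model's performance on agent evaluations. As the large-model race gradually shifts from a "parameter race" to a "capability race," a notable change is under way: open-source models are beginning to approach, and even challenge, top closed-source models on a growing number of key capability dimensions. On December 1, DeepSeek simultaneously released two official models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale. The former reaches GPT-5 level on reasoning tests, only slightly below Gemini-3.0-Pro, while the latter won gold medals in four top international competitions, including IMO 2025. V3.2 reaches the highest level of tool-calling capability among current open-source models, substantially narrowing the gap between open-source and closed-source models. According to the company, V3.2 is DeepSeek's first model to integrate thinking into tool use and still supports tool calls in "thinking mode." Using a large-scale method for synthesizing agent training data, the company constructed more than 1,800 environments and more than 85,000 complex instructions ...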

DeepSeek-V3.2-Speciale

DeepSeek Releases Again! Models Take On Google Head-On While Admitting the Open-Source/Closed-Source Gap Is Widening

Di Yi Cai Jing· 2025-12-01 13:31

Core Insights
- DeepSeek has launched two new models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, which are leading in reasoning capabilities globally [1][3].
Model Overview
- DeepSeek-V3.2 aims to balance reasoning ability and output length, suitable for everyday use such as Q&A and general intelligence tasks. It has reached the level of GPT-5 in public reasoning tests, slightly below Google's Gemini 3 Pro [3].
- DeepSeek-V3.2-Speciale is designed to push the reasoning capabilities of open-source models to the extreme, combining features from DeepSeek-Math-V2 for theorem proving, and excels in instruction following and logical verification [3][4].
Performance Metrics
- Speciale has surpassed Google's Gemini 3 Pro in several reasoning benchmark tests, including the American Invitational Mathematics Examination (AIME), the Harvard-MIT Mathematics Tournament (HMMT), and the International Mathematical Olympiad [4].
- In various benchmarks, DeepSeek's performance is competitive, with specific scores noted in a comparative table against GPT-5 and Gemini-3.0 [5].
Technical Limitations
- Despite these achievements, DeepSeek acknowledges limitations compared to proprietary models like Gemini 3 Pro, particularly in knowledge breadth and token efficiency [6].
- The company plans to enhance pre-training computation and optimize reasoning chains to improve model efficiency and capabilities [6][7].
Mechanism Innovations
- DeepSeek introduced a Sparse Attention Mechanism (DSA) to reduce computational complexity, which has proven effective in enhancing performance without sacrificing long-context capabilities [7][8].
- Both new models incorporate this mechanism, making DeepSeek-V3.2 a cost-effective alternative that narrows the performance gap with proprietary models [8].
Community Reception
- The release has been positively received in the community, with users noting that DeepSeek's models are now comparable to GPT-5 and Gemini 3 Pro, marking a significant achievement in open-source model development [8].

Seek .(US:SKLTY)

Sparse Attention Mechanism (DSA)

Artificial Intelligence

"The Balance of Power Has Shifted: Chinese AI Is Increasingly Becoming a Technical Cornerstone of Silicon Valley"

Guan Cha Zhe Wang· 2025-12-01 00:19

Core Viewpoint - The article discusses the increasing adoption of Chinese open-source AI models by Silicon Valley startups, highlighting their competitive advantages over traditional closed-source models from American companies like OpenAI and Anthropic. This shift raises questions about the sustainability of the closed-source model approach in the U.S. AI industry [1][4][10]. Group 1: Adoption of Chinese AI Models - Many U.S. AI startups are increasingly utilizing Chinese open-source AI models due to their lower costs, higher customization, and strong privacy protection, with some models performing comparably to leading American models [1][4][6]. - Reflection AI, a startup founded by Misha Laskin, aims to provide American alternatives to these high-performance Chinese models, reflecting a growing trend in the industry [2][4]. - The acceptance of Chinese models is seen as a potential challenge to the U.S. AI industry, as investors have heavily backed American companies, raising doubts about the actual advantages of U.S. models [4][10]. Group 2: Performance and Cost Efficiency - Chinese models like DeepSeek and Alibaba's Tongyi Qianwen have made significant technological advancements, closing the performance gap with American closed-source models [5][9]. - Companies like Exa have reported that running Chinese models on their own hardware can be faster and cheaper than using models from OpenAI or Google [4][5]. - The cost-effectiveness of open-source models is crucial for startups, with some users preferring local processing for privacy reasons, further driving the adoption of Chinese models [6][7]. Group 3: Ecosystem and Community Support - The growing ecosystem around Chinese open-source models is attracting more developers, as these models are often accompanied by extensive training resources and community support [7][8]. 
- Platforms like Kilo Code show a preference for Chinese models among developers, indicating a shift in the default starting point for model customization [8][9]. - The rapid release cycle of Chinese models, with Alibaba launching new models approximately every 20 days, contrasts with the slower pace of American companies, highlighting a competitive edge [9][10]. Group 4: U.S. Response and Future Outlook - The U.S. government has recognized the need to encourage the development of open-source AI models, as evidenced by the release of the AI Action Plan and new open-source initiatives from companies like OpenAI and the Allen Institute [12][13]. - The ATOM initiative aims to reclaim the U.S. leadership position in open-source models, emphasizing the importance of maintaining a competitive edge in the AI landscape [13].

Artificial Intelligence

Chinese open-source AI models

OpenAI closed-source flagship models

Technology First: Why Is Alibaba's Qwen App Accelerating Faster in the Consumer Market?

Sou Hu Cai Jing· 2025-11-24 18:24

Core Insights - The article discusses the emerging narrative of "catching up" in the AI large model sector between China and the US, highlighting the competitive dynamics between Google and Alibaba [2][6] - Both companies are pursuing a "full-stack" approach, integrating cloud computing, chips, large models, and applications to create a comprehensive ecosystem [4][6] Group 1: Company Strategies - Google was initially perceived as lagging in AI, but the release of Gemini 3 has garnered positive feedback from industry leaders [3][6] - Alibaba's Qwen series models have achieved significant success, with the Qwen app surpassing 10 million downloads in its first week, breaking previous records [4][7] - Both companies are focusing on building robust foundational technologies before launching consumer-facing applications, demonstrating strategic patience [8][10] Group 2: Market Dynamics - The AI landscape is characterized by instability, with user engagement fluctuating significantly among competing applications [10][11] - Alibaba's Qwen model has become the most widely downloaded open-source large model globally, indicating a shift in developer preferences towards open-source solutions [12][13] - The competition between open-source and closed-source models is highlighted, with Alibaba favoring an open-source approach to foster a developer ecosystem, while Google maintains a closed-source strategy to protect its core assets [11][12] Group 3: Future Outlook - The article suggests that the ultimate goal for AI applications is to create a "business closed loop" that continuously generates value for users [19][21] - Alibaba's strategy includes leveraging its AI capabilities to enhance existing business operations, creating a seamless integration of AI across its services [22][23] - The full-stack approach adopted by both companies is expected to yield higher value elasticity and resilience in the face of market fluctuations [23]

Alibaba Qwen App

As Chinese and U.S. Large Models Diverge, Enterprises Also Stand at a Crossroads

Fortune China · 2025-11-22 13:09

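Chan Yip Pang, Executive Director for Southeast Asia and India at Vertex Ventures, believes a company should choose its route based on purpose: is the model for boosting internal productivity, or for building AI-native applications? In the former case, a company that wants to test whether an AI solution really improves productivity will usually adopt a closed-source model, which delivers return on investment quickly. Over time, however, the fees mount, and at some point the company will switch to open source to cut costs. For a startup developing AI applications to sell as a service, an open-source model is the better choice, because open source gives the company full control of its technology stack, keeps costs manageable, and removes dependence on the giants behind the large models. Closed-source models, by contrast, can raise prices at any time or even change model behavior, and client companies have no recourse. Cynthia Siantar, General Manager and Head of Investor Relations at fintech company Dyna.AI, notes that her field is heavily regulated: regulators do not ask whether a company's model is open or closed source but how its decisions are made. Companies must be able to explain this, and that is where open-source models show their advantage. Will Liang, CEO of Amplify AI Group, whose clients are mostly in financial services, says that when AI is used for matters bearing on a company's competitive advantage and confidential information, open source is in most cases the safer choice, because the company can deploy it itself and ...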

Artificial General Intelligence (AGI)
