Open-Source Large Models
Zuckerberg Bets on Alibaba's Qwen; the Global AI Race Has Changed
Sou Hu Cai Jing· 2025-12-12 04:19
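The rise and wide adoption of Qwen shows that, at the software and algorithm level, China can now stand toe to toe with Silicon Valley, and is even slightly ahead in the open-source ecosystem.
By Fei Xue
Nobody expected that Mark Zuckerberg, founder of Meta, the first among the "Magnificent Seven" US tech stocks, would one day become a supporter of Chinese AI models.
On December 10, Bloomberg reported that Meta, once the global leader in open source, chose to distill Alibaba's open-source Qwen models for its new model project "Avocado".
According to the report, Zuckerberg is closely following the newly formed TBD Lab team, whose training of the Avocado model distilled several open-source models. Besides Google's Gemma and OpenAI's gpt-oss, the team unexpectedly also chose Tongyi Qianwen (Qwen) from the Chinese tech giant Alibaba.
This marks a 180-degree turn in Zuckerberg's attitude toward increasingly strong Chinese open-source models. He had repeatedly called for supporting American models, but with the failure of Meta's Llama 4 this year and the strong rise of Chinese models, Zuckerberg has also turned to Alibaba's Qwen.
So the question is: why has Zuckerberg, once regarded as an "America First" open-source hardliner in Silicon Valley, now begun choosing a major Chinese AI company as the base for his own model?
Open source and openness, full-stack AI
Open source can be considered the biggest variable in the AI war that began in 2023.
Once upon a time, Meta, with its Llama series of models, almost ...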
The Open-Source Approach Is Reshaping the Industrial Competitive Landscape
Jing Ji Ri Bao· 2025-12-10 22:38
Core Insights
- The open-source ecosystem in China is rapidly expanding, with over 3 million active projects and 2.27 million developers expected by the end of 2024, indicating a diverse and large talent pool [1]
- The openEuler operating system has seen significant growth, with an expected installation base of over 16 million units by the end of 2025, making it a leading choice in various industries [1]
- Open-source initiatives are driving technological breakthroughs and high-quality development, particularly in the AI sector, where China is positioned as a leader with projects like Qwen and DeepSeek [1]
Group 1
- The open-source community has grown significantly, with over 2,100 member organizations and more than 23,000 global contributors, alongside a user base exceeding 5.5 million [1]
- The open-source model is reshaping the AI competitive landscape, as demonstrated by the recent success of 360 Group's FG-CLIP2 model, which surpassed major competitors in benchmark tests [2]
- The UBML project, part of Inspur's low-code platform, aims to lower the barriers for small and medium enterprises to adopt open-source technologies, facilitating efficient technology transfer across the industry [2]
Group 2
- Beijing E-Town is establishing itself as a hub for high-tech industries, implementing policies to support open-source projects and creating the first AI open-source root community in China [3]
- The Open Atom Open Source Foundation is enhancing its services for project incubation and talent development, promoting open-source culture through various channels [3]
- The transition of open-source communities towards intelligent development communities is seen as a necessary evolution to meet technological and industry demands [3][4]
The Gap Between Open-Source and Closed-Source Models Is Widening: The Harsh Truth Revealed by DeepSeek's Paper
36Kr· 2025-12-06 00:03
Core Insights
- DeepSeek's V3.2 technical report indicates that the performance gap between open-source models and closed-source models is not narrowing but rather widening, based on extensive empirical data [1][2].
Performance Comparison
- In benchmark tests, DeepSeek V3.2 scored 85.0 in MMLU-Pro, while GPT-5 scored 87.5 and Gemini 3.0 Pro achieved 90.1. In the GPQA Diamond test, the scores were 82.4 for DeepSeek, 85.7 for GPT-5, and 91.9 for Gemini 3.0 Pro [2][3].
- The most significant gap was observed in the HLE test, where DeepSeek V3.2 scored 25.1, compared to GPT-5's 26.3 and Gemini 3.0 Pro's 37.7, indicating a substantial performance disparity [3][4].
Structural Issues Identified
- The report identifies three structural issues limiting the capabilities of open-source models in complex tasks:
  1. **Architectural Limitations**: Open-source models rely on traditional vanilla attention mechanisms, which are inefficient for long sequences, hindering scalability and effective post-training [6].
  2. **Resource Investment Gap**: The post-training budget for DeepSeek V3.2 exceeds 10% of its pre-training costs, while most open-source models allocate less than 1%, leading to significant performance differences [7].
  3. **AI Agent Capability Lag**: Open-source models show inferior generalization and instruction-following abilities in real-world applications, as evidenced by lower scores in key agent evaluation benchmarks [8].
DeepSeek's Strategic Innovations
- DeepSeek has implemented fundamental technical innovations across three core dimensions:
  1. **Architectural Changes**: Introduction of the DSA (DeepSeek Sparse Attention) mechanism, which reduces computational complexity from O(L²) to O(L×k), significantly lowering inference costs while maintaining performance [10].
  2. **Increased Resource Allocation**: DeepSeek has made an unprecedented decision to allocate substantial resources for post-training, training expert models in six key areas with a total of 943.7 billion tokens during the pre-training phase [12].
  3. **Enhanced Agent Capabilities**: Development of a systematic task synthesis process, creating over 1,800 diverse environments and 85,000 complex prompts, which has improved performance in agent-related tests [13].
Conclusion
- DeepSeek V3.2 demonstrates a viable path for open-source AI to compete with closed-source models through innovative architecture and strategic resource allocation, suggesting that technological innovation may be the key to survival in the competitive AI landscape [14].
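The complexity claim for DSA in the summary above (from O(L²) to O(L×k)) can be illustrated with a minimal sketch of generic top-k sparse attention. This is not DeepSeek's implementation: the function names, the plain dot product used as a stand-in index score, and the example values of L and k are all assumptions made for illustration, and causal masking is omitted for brevity. In this sketch the selection step still scores every key, so only the attention step itself shrinks to L×k query-key pairs.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, keys, values):
    """Softmax attention of one query over the given keys and values."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, key) / scale for key in keys])
    dim = len(values[0])
    return [sum(w * val[j] for w, val in zip(weights, values)) for j in range(dim)]


def dense_attention(q, k, v):
    """Every query attends to every key: L * L query-key pairs."""
    return [attend(qi, k, v) for qi in q]


def topk_sparse_attention(q, k, v, top_k):
    """Each query attends only to its top_k best-scoring keys: L * top_k pairs."""
    out = []
    for qi in q:
        # Hypothetical index score (assumption): the plain dot product.
        index_scores = [dot(qi, kj) for kj in k]
        order = sorted(range(len(k)), key=lambda j: index_scores[j], reverse=True)
        chosen = order[:top_k]
        out.append(attend(qi, [k[j] for j in chosen], [v[j] for j in chosen]))
    return out


def attention_pairs(seq_len, top_k=None):
    """Number of query-key pairs touched in the attention step."""
    return seq_len * (seq_len if top_k is None else min(top_k, seq_len))
```

With a hypothetical sequence length of L = 131,072 and k = 2,048, `attention_pairs` shows the attention step touching 64 times fewer query-key pairs than the dense version; setting `top_k` equal to the sequence length reproduces the dense result.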
Daily Report Highlights - 20251205
GUOTAI HAITONG SECURITIES· 2025-12-05 13:30
Group 1: DeepSeek-V3.2 Series Release
- The release of DeepSeek-V3.2 marks a significant advancement in open-source large models, achieving performance levels comparable to top closed-source models[3]
- The Speciale version of DeepSeek-V3.2 has excelled in international competitions, ranking second in the ICPC and winning gold medals in the IMO, demonstrating its potential to reach human-level intelligence[4]
- DeepSeek-V3.2 integrates thinking modes with tool invocation, enhancing the model's generalization and execution capabilities across complex scenarios[5]
Group 2: Market Trends and Predictions
- The 2025 Winter FORCE Conference is set to focus on Agentic AI, with significant updates expected for the Doubao model family and AI application capabilities[9]
- Doubao model's daily token usage surged from 120 billion in May 2024 to over 30 trillion by September 2025, indicating a 253-fold increase in usage[10]
- The report predicts that the 2026 monetary policy will emphasize "wide credit" rather than merely "wide loans," aligning with fiscal measures to support economic growth[35]
Group 3: Company Coverage and Financial Projections
- Faway Automobile Components (600742) is rated "Overweight" with a target price of RMB 14.10, based on stable automotive parts business and expansion into robotics and low-altitude economy[13]
- Projected revenues for Faway are RMB 208.72 million, RMB 220.62 million, and RMB 231.65 million for 2025, 2026, and 2027 respectively, with net profits of RMB 6.30 million, RMB 6.99 million, and RMB 7.75 million[13]
- The company is actively developing humanoid robots and eVTOL interior designs, leveraging its automotive parts manufacturing expertise[15]
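The Doubao usage figures quoted in Group 2 above can be sanity-checked with a few lines of arithmetic. The sketch below uses only the numbers in the summary; treating "over 30 trillion" as a lower bound of exactly 30 trillion is an assumption, and the variable and function names are illustrative.

```python
# Figures quoted in the report summary.
start_tokens = 120 * 10**9   # 120 billion tokens per day, May 2024
end_tokens = 30 * 10**12     # "over 30 trillion" per day, September 2025 (assumed lower bound)
reported_fold = 253          # fold increase stated in the summary


def fold_increase(start: int, end: int) -> float:
    """Ratio of the later daily volume to the earlier one."""
    return end / start


# Growth multiple if the later figure were exactly 30 trillion.
lower_bound_fold = fold_increase(start_tokens, end_tokens)

# Daily volume that the reported 253-fold figure would imply.
implied_end_tokens = reported_fold * start_tokens

print(f"fold increase at exactly 30 trillion: {lower_bound_fold:.0f}x")
print(f"daily tokens implied by 253x: {implied_end_tokens / 10**12:.2f} trillion")
```

Exactly 30 trillion would be a 250-fold increase, so the reported 253-fold figure corresponds to about 30.36 trillion tokens per day, which is consistent with the "over 30 trillion" wording.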
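A Huge Windfall! China's First Domestic GPU Stock Lists, Rising Over 500% at Its Peak; One Winning Lot Earns 270,000 Yuan! Investors: "I'm Green with Envy..."
Xueqiu· 2025-12-05 07:52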
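The market kept climbing in the afternoon. At the close, the Shanghai Composite was up 0.7%, the Shenzhen Component up 1.08%, and the ChiNext Index up 1.36%.
Turnover on the Shanghai and Shenzhen exchanges was 1.73 trillion yuan, up 176.8 billion yuan from the previous trading day. Gainers outnumbered losers, with nearly 4,400 stocks rising across the market.
By sector, insurance, precious metals, Fujian-related stocks, and commercial aerospace led the gains, while banks, traditional Chinese medicine, and film and cinema chains led the declines.
The most notable event of the day was the listing of Moore Threads, which rose more than 500% at its intraday high; selling one winning lot at the open earned about 270,000 yuan. Seeing such a lucrative allotment, many Xueqiu app users said they were green with envy...
01 Moore Threads lists; one winning lot earns 270,000 yuan
On December 5, Moore Threads, dubbed "the first domestic GPU stock", listed on the STAR Market.
Moore Threads opened 468% higher in the opening auction and at one point surged more than 500%, reaching an intraday high of 688 yuan before pulling back. At the close, the stock stood at 600.50 yuan per share, for a total market capitalization of 282.2 billion yuan. An investor who won one lot and sold at the open could have made a profit of about 270,000 yuan.
[Order-book screenshot residue: ask levels 4 to 1; visible values 114.28, 0.00%, 600.50] ...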
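The "about 270,000 yuan per winning lot" figure above can be reproduced with simple arithmetic. The sketch below is only a rough check under stated assumptions: the issue price of 114.28 yuan and the 500-share lot size are not stated explicitly in the article text and are assumed here, the opening price is derived from the quoted 468% opening gain, and fees and taxes are ignored.

```python
ISSUE_PRICE = 114.28    # yuan per share; assumed offer price
SHARES_PER_LOT = 500    # assumed number of shares in one winning lot
OPENING_GAIN = 4.68     # opening auction 468% above the issue price (from the article)
CLOSE_PRICE = 600.50    # closing price in yuan (from the article)


def lot_profit(sell_price, issue_price=ISSUE_PRICE, shares=SHARES_PER_LOT):
    """Gross profit in yuan from selling one allotted lot at sell_price."""
    return (sell_price - issue_price) * shares


# Opening price implied by the quoted opening gain.
open_price = ISSUE_PRICE * (1 + OPENING_GAIN)

profit_at_open = lot_profit(open_price)
profit_at_close = lot_profit(CLOSE_PRICE)

print(f"implied opening price: {open_price:.2f} yuan")
print(f"profit if sold at the open: {profit_at_open:,.0f} yuan")
print(f"profit if sold at the close: {profit_at_close:,.0f} yuan")
```

Under these assumptions, selling at the open yields roughly 267,000 yuan per lot, in line with the article's "about 270,000", while holding to the close would have yielded about 243,000 yuan.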
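Guotai Haitong | Computers: DeepSeek-V3.2 Series Released: Reasoning on Par with Top Closed-Source Models, Open-Source Ecosystem Leads Application Deployment
Guotai Haitong Securities Research· 2025-12-04 12:46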
Core Insights
- The release of DeepSeek-V3.2 and its enhanced version V3.2-Speciale marks a significant advancement in open-source large models, achieving top-tier performance and practicality, particularly in reasoning capabilities and tool integration [2][3].
Group 1: Performance and Innovation
- DeepSeek-V3.2 series has reached a breakthrough in core reasoning capabilities, matching the performance of top closed-source models and significantly outperforming some open-source models focused on long contexts [2].
- The Speciale version has excelled in international competitions, achieving gold medals in events like the International Mathematical Olympiad (IMO) and the International Collegiate Programming Contest (ICPC), where it ranked second among human competitors [2].
- The model innovatively integrates thinking modes with tool invocation, enhancing the agent's generalization and execution capabilities in complex scenarios [3].
Group 2: Technical Advancements
- DeepSeek-V3.2 is the first open-source model to systematically incorporate chain-of-thought reasoning into the tool invocation process, utilizing a unique large-scale agent training data synthesis method [3].
- The model has undergone reinforcement learning across over 85,000 complex instructions in more than 1,800 environments, achieving the highest level among open-source models in untrained tool invocation assessments [3].
Group 3: Ecosystem and Market Impact
- The comprehensive upgrade of DeepSeek-V3.2's open-source and API services is expected to accelerate technological penetration and drive a transformation in industrial application paradigms [4].
- The open strategy, combining performance and ecosystem openness, significantly lowers the application barriers for enterprises and developers, potentially leading to a large-scale, practical deployment of open-source models [4].
- This approach is anticipated to attract numerous developers to build vertical applications based on DeepSeek, forming a robust open-source application ecosystem centered around it [4].
DeepSeek V3.2 Official Release: Company Says Its Reasoning Rivals GPT-5
Feng Huang Wang· 2025-12-03 09:04
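On December 1, DeepSeek officially released its new-generation open-source large model DeepSeek-V3.2 and the long-thinking enhanced version DeepSeek-V3.2-Speciale. The official web app, mobile app, and API have all been updated to V3.2.
According to official data, on public reasoning benchmarks DeepSeek-V3.2 reaches GPT-5-level reasoning ability and is close to Gemini-3.0-Pro, while its output length is significantly shorter than that of Kimi-K2-Thinking, which reduces compute overhead. The V3.2-Speciale version incorporates the theorem-proving capability of DeepSeek-Math-V2 and achieved gold-medal results in international competitions including the IMO, CMO, ICPC, and IOI; its ICPC result was on par with the second-place human contestant.
For the first time, the new version integrates thinking mode with tool calling, supporting calls to external tools during the thinking process. Using a large-scale agent training-data synthesis method, the model underwent reinforcement learning in more than 1,800 environments and on more than 85,000 complex instructions, which improved its generalization. DeepSeek says the model reaches the highest level among current open-source models in agent evaluations, further narrowing the gap with closed-source models.
The earlier experimental version, DeepSeek-V3.2-Exp, was released two months ago; testing based on user feedback showed that its DSA sparse attention mechanism caused no significant performance degradation across scenarios. Sp ...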
From the Strongest Open-Source Model to Challenging the World's Best: DeepSeek's New Models Offer an Answer
Guan Cha Zhe Wang· 2025-12-02 11:38
Core Insights
- DeepSeek has released two official models: DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, with the former focusing on balancing reasoning ability and output length for everyday use, while the latter enhances long-form reasoning and mathematical proof capabilities [1][2][4]
- The open-source large model ecosystem has seen significant growth, with DeepSeek's advancements posing a challenge to closed-source models, particularly in light of the recent release of Google Gemini 3.0, which has raised the competitive bar [2][15]
- DeepSeek's models are positioned to bridge the gap between open-source and closed-source models through innovative architecture and training strategies, despite limitations in computational resources compared to industry giants [8][15][16]
Model Performance
- DeepSeek-V3.2 has achieved performance levels comparable to GPT-5 and is slightly below Google’s Gemini 3 Pro, demonstrating its effectiveness in reasoning tasks [6][7]
- The Speciale version has outperformed Gemini 3 Pro in several reasoning benchmarks, including the American Mathematics Invitational Exam (AIME) and the Harvard-MIT Mathematics Tournament (HMMT) [7][8]
- Speciale's design focuses on rigorous mathematical proof and logical verification, making it a specialized tool for complex reasoning tasks [6][8]
Technological Innovations
- DeepSeek employs a novel DSA (DeepSeek Sparse Attention) mechanism to optimize computational efficiency, allowing for effective long-context processing without sacrificing performance [8][12]
- The concept of "Interleaved Thinking" has been integrated into DeepSeek's models, enhancing the interaction between reasoning and tool usage, which is crucial for AI agents [9][12]
- The focus on agent capabilities signifies a strategic shift towards creating actionable AI, moving beyond traditional chat-based interactions to more complex task execution [13][14]
Industry Context
- The competitive landscape is shifting, with DeepSeek acknowledging the widening gap between open-source and closed-source models, particularly in complex task performance [15][16]
- DeepSeek aims to address its limitations by increasing pre-training computational resources and optimizing model efficiency, indicating a clear path for future improvements [16][19]
- The release of DeepSeek-V3.2 has been seen as a significant achievement in the open-source community, suggesting that the gap with leading closed-source models is narrowing [16][19]
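Third National Industry and Information Technology Skills Competition Held, with a Generative AI Application Event for the First Time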
Xin Jing Bao· 2025-11-28 04:55
Group 1
- The third National Industrial and Information Technology Skills Competition was held in Chongqing from November 26 to 28, featuring 408 teams and 834 participants competing in six advanced categories [1][2]
- This year's competition introduced a new category for Generative Artificial Intelligence System Application, focusing on practical problem-solving skills in humanoid robot navigation and precise grasping [1]
- The event is co-hosted by multiple government bodies, emphasizing the theme of "integration of industry and talent" and targeting key areas such as new energy vehicles, industrial robots, smart chips, industrial big data, digital transformation of manufacturing, and industrial internet [1][2]
Group 2
- The competition expanded the number of categories to six, nearly doubling the previous editions, with new categories like Mixed Integrated Circuit Assembly Worker focusing on RISC-V architecture [2]
- The event reflects three key improvements: alignment with technological frontiers, enhanced industry characteristics, and a more robust management system [2]
- The competition serves as a platform for talent discovery, practical skills training, and promoting the integration of industry and talent, contributing to the goal of new industrialization [3]
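AI Industry Tracker: Alibaba's First AI Glasses Are Packed with Hard-Core Technology; China Has Become the World's Largest Provider of Open-Source AI Large Models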
GUOTAI HAITONG SECURITIES· 2025-11-24 08:15
Investment Rating
- The report does not explicitly state an investment rating for the AI industry
Core Insights
- The AI industry is witnessing significant advancements, with China emerging as the largest provider of open-source AI models globally, as highlighted by a statement from an academic expert [16]
- Major companies like Alibaba and BMW are making strides in AI technology, with Alibaba launching the Qwen App and BMW introducing its self-developed AI platform "GAIA" in China [7][15]
- The AI sector is experiencing a surge in applications, including AI-powered educational tools and digital assistants, indicating a trend towards commercialization of AI technologies [10][14]
Summary by Sections
AI Industry Dynamics
- Zhejiang Commercial Bank has formed a strategic partnership with Alibaba to enhance financial services through AI and cloud technology [6]
- BMW has launched its self-developed AI platform "GAIA" in China, aiming to democratize AI capabilities across its organization [7]
- The Kimi K2 Thinking model has been integrated into the Perplexity AI search application, marking a significant achievement for domestic AI models [8]
- The "2025 AI+" conference was held in Beijing, leading to the establishment of the Beijing AI Association to foster collaboration and innovation in the AI sector [9]
AI Application Insights
- Shanghai Steel Union has developed the "Xiao Gang" digital assistant using AIGC technology, marking a step towards the commercialization of AI models [10]
- Youdao has upgraded its audio and video translation capabilities with a new AI workflow system [11]
- The "HKChat" app has been launched in Hong Kong, providing comprehensive AI-driven life services [13]
- Zebra has introduced its first AI foreign teacher product, enhancing English learning for children [14]
AI Large Model Insights
- Alibaba's Qwen App has been launched, achieving over 600 million downloads and showcasing competitive performance against top global AI models [15]
- China is recognized as the largest provider of open-source AI models, with notable models like Qwen and DeepSeek ranking highly in evaluations [16]
- Gartner's report indicates that Volcano Engine leads the global challengers quadrant in AI application development platforms, with Alibaba Cloud and Tencent Cloud also recognized [17]
Technology Frontiers
- Alibaba's first AI glasses, the Quark AI Glasses S1, have been pre-sold over 6,000 units, featuring advanced optical technologies [20]
- Ant Group has open-sourced the Awex framework for high-performance reinforcement learning, addressing key challenges in parameter synchronization [22]
- The launch of the "Hongjun" humanoid robot by Yifei Technology showcases advancements in robotics with multi-modal capabilities [23]
- The CHASING L1 Ultra, a smart pool cleaning robot, represents a significant technological leap in the cleaning industry by integrating AI vision and lidar [24]