Open Source Model
In Depth | A Year-End Look at Large Models: How Should a "Good Model" Be Defined in 2025?
Z Potentials· 2025-12-17 12:00
When benchmark scores stop being sexy, the industry goes looking for a new yardstick. The AI world of 2025 is suffused with a kind of "benchmark fatigue." Deeper than that, though, is an industry consensus taking shape at speed: open models are moving from "an option" to "the default choice." Interconnects.ai devoted substantial space to this trend in its annual review, "2025: Open Models Year in Review," with DeepSeek, Qwen, and Kimi as the open models on the front line. The standard for measuring top models is undergoing a profound shift: the industry mindset is moving from "picking champions, talent-show style" to "finding partners, infrastructure style." In this new paradigm, a model's base capability is only the ticket to entry; the "trust" built along three dimensions, evaluation, deployment, and delivery, is the pass that lets AI truly embed itself in business workflows. This year-end review deconstructs the "rules of trust" the AI industry is forming, starting from those three most pragmatic dimensions. The signal: from "one-off trials" to "retention," the productivity inflection point has arrived. In the past, every model release was like a firework: attention spiked in an instant and fell back to zero, as developers "tried it and left" without generating real usage. The charts of 2025 show us, for the first time, a very different curve. Authoritative AI mod ...
On ChatGPT's Third Anniversary, DeepSeek Lands a Heavy Blow: a 23-Page Technical Report Holds All the Secrets of Open Source Reaching the Top
36Kr· 2025-12-02 09:19
DeepSeek V3.2 ships new breakthrough tech. Source | APPSO (ID: appsolution); Cover image | Unsplash. On ChatGPT's third anniversary, DeepSeek delivered a "birthday gift." On December 1, DeepSeek released two models at once: DeepSeek-V3.2 and DeepSeek-V3.2-Speciale. Not only do the two models approach GPT-5 and Gemini-3.0-Pro in reasoning ability, but, more importantly, they tackle a problem that has long plagued open-source models. Over the past few months, a clear trend has emerged in the AI world: closed models keep accelerating while open models struggle to keep pace. The DeepSeek team's analysis found three core bottlenecks for open models on complex tasks: architecture, resource allocation, and agentic capability. To address these three problems, DeepSeek delivers three major moves. If you have used AI models on very long documents, you may have noticed them getting slower and slower, or even stalling outright; that is the fault of the traditional attention mechanism. How can an AI both think deeply and use tools fluently? The short version of the new models: DeepSeek-V3.2 (standard): built for cost-effectiveness and everyday use, with reasoning at GPT-5 level; shorter, faster, and cheaper output than Kimi-K2-Thinking, and the first to achieve "thinking wh ...
On ChatGPT's Third Anniversary, DeepSeek Lands a Heavy Blow: a 23-Page Technical Report Holds All the Secrets of Open Source Reaching the Top
36Kr· 2025-12-02 00:16
Slow, clumsy, dull? DeepSeek V3.2 ships new breakthrough tech. Over the past few months, a clear trend has emerged in the AI world: closed models keep accelerating while open models struggle to keep pace. The DeepSeek team's analysis found three core bottlenecks for open models on complex tasks: architecture, resource allocation, and agentic capability. To address these three problems, DeepSeek delivers three major moves. On ChatGPT's third anniversary, DeepSeek delivered a "birthday gift." Just now, DeepSeek released two models at once: DeepSeek-V3.2 and DeepSeek-V3.2-Speciale. Not only do the two models approach GPT-5 and Gemini-3.0-Pro in reasoning ability, but, more importantly, they tackle a problem that has long plagued open-source models: how can an AI both think deeply and use tools fluently? The short version of the new models follows. The weights of both models are already open-sourced on HuggingFace and ModelScope, so you can download them for local deployment. If you have used AI models on very long documents, you may have noticed them getting slower and slower, or even stalling outright; that is the fault of the traditional attention mechanism. The logic of traditional attention is that every token must compute its relevance against every token before it, so the longer the document, the more computation is required. It is like being in a room with 1,000 ...
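The quadratic blow-up described above, where every token compares itself against all earlier tokens, can be made concrete with a minimal numpy sketch of standard causal attention. This is an illustration of the traditional mechanism only, not DeepSeek's sparse variant; all names and sizes here are made up for the example:

```python
import numpy as np

def causal_attention(q, k, v):
    """Naive causal self-attention: a length-n sequence needs
    n*(n+1)/2 query-key comparisons, i.e. cost grows quadratically."""
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)              # (n, n) pairwise comparisons
    mask = np.tril(np.ones((n, n), dtype=bool))
    scores = np.where(mask, scores, -np.inf)   # each token sees only the past
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, int(mask.sum())        # output, and attended-pair count

rng = np.random.default_rng(0)
n, d = 8, 4
out, pairs = causal_attention(rng.normal(size=(n, d)),
                              rng.normal(size=(n, d)),
                              rng.normal(size=(n, d)))
print(out.shape, pairs)  # (8, 4) 36 -> 36 = 8*9/2; doubling n roughly quadruples the work
```

Doubling the document length roughly quadruples the comparison count, which is why long documents feel progressively slower under full attention.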
Kimi's Yang Zhilin: "Training Cost Is Hard to Quantify"; the Open-Source Strategy Stands
Di Yi Cai Jing· 2025-11-11 10:35
Core Insights - Kimi, an AI startup, has released its latest open-source model, Kimi K2 Thinking, with a reported training cost of $4.6 million, significantly lower than DeepSeek V3's reported $5.6 million and far below what OpenAI's flagship models reportedly cost to train [1][2] - The company emphasizes ongoing model updates and improvements, focusing on absolute performance while addressing user concerns regarding inference length and performance discrepancies [1] - Kimi's strategy includes maintaining an open-source approach and advancing the Kimi K2 Thinking model while avoiding direct competition with major players like OpenAI through innovative architecture and cost control [2][4] Model Performance and Market Position - In the latest OpenRouter model usage rankings, five Chinese open-source models, including Kimi's, are among the top twenty, indicating a growing presence in the international market [2] - Kimi's current model can only be accessed via API due to platform limitations, and the team trains on H800 GPUs with InfiniBand, working with less compute than U.S. labs that have access to higher-end GPUs [2] - The company plans to balance text model development with multi-modal model advancements, aiming to establish a differentiated advantage in the AI landscape [4]
MiniMax Apologizes Late at Night as Its Open-Source Model M2 Tops the Charts and Sparks a Global Frenzy
Di Yi Cai Jing· 2025-10-30 07:47
Core Insights - MiniMax has launched its new model MiniMax M2, which is fully open-sourced under the MIT license, allowing developers to download and deploy it via Hugging Face or access it through MiniMax's API [1] - The M2 model has quickly gained traction, achieving significant usage metrics and ranking highly on various platforms, indicating strong market demand [4][5] - M2's performance is comparable to top models like GPT-5, particularly in agent and coding scenarios, marking a significant advancement in open-source models [7] Performance and Metrics - Since the launch of the M2 API and MiniMax Agent, the platform has experienced a surge in traffic, leading to temporary service disruptions, which have since been resolved [4] - M2 ranks 5th globally in OpenRouter usage and 1st among domestic models, also appearing 2nd on the Hugging Face Trending list [5] - M2 has achieved impressive scores in various benchmarks, including 5th globally and 1st among open-source models in the Artificial Analysis (AA) rankings [7] Model Capabilities - M2 excels in balancing performance, speed, and cost, which is crucial for its rapid adoption in the market [10] - The model demonstrates strong capabilities in agent tasks, including complex toolchain execution and deep search, with notable performance in benchmarks like BrowseComp and Xbench-DeepSearch [11] - M2's programming capabilities include end-to-end development processes and effective debugging, achieving high scores in Terminal-Bench and Multi-SWE-Bench tests [10] Evolution from M1 to M2 - M2 is designed to meet the evolving needs of the agent era, focusing on tool usage, instruction adherence, and programming capabilities, contrasting with M1's emphasis on long text and complex reasoning [12][13] - The transition from M1 to M2 involved a shift from a hybrid attention mechanism to a full attention + MoE approach, optimizing for executable agent tasks [15] - M2's pricing strategy is competitive, with input costs at 
approximately $0.30 per million tokens and output costs at $1.20 per million tokens, significantly lower than competitors [15] Product Ecosystem - Alongside the M2 model, MiniMax has upgraded its Agent product, which now operates on the M2 model and offers two modes: professional and efficient [16] - The launch of M2 and the upgrade of MiniMax Agent are seen as steps towards building a comprehensive ecosystem for intelligent agents, expanding the potential applications of open-source models in enterprise settings [17]
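The per-million-token prices quoted above translate directly into per-request cost. A small arithmetic sketch, using the article's rounded M2 figures (about $0.30 per million input tokens and $1.20 per million output tokens; the example token counts are made up):

```python
def request_cost(input_tokens, output_tokens,
                 in_price_per_m=0.30, out_price_per_m=1.20):
    """Cost in dollars for one API call, given per-million-token prices."""
    return (input_tokens * in_price_per_m
            + output_tokens * out_price_per_m) / 1_000_000

# e.g. a 50k-token prompt with a 5k-token answer:
cost = request_cost(50_000, 5_000)
print(round(cost, 4))  # 0.021 -> about 2.1 cents per call
```

At these rates, even long-context agent workloads stay in the cents-per-call range, which is a large part of why low pricing drives adoption so quickly.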
Hangzhou's Reign atop the Global Open-Source Model Charts Ends: Shanghai's MiniMax M2 Is Flooded with Orders on Launch, at Just RMB 8 per Million Tokens
36Kr· 2025-10-28 02:12
Core Insights - The open-source model throne has shifted to Minimax M2, surpassing previous leaders DeepSeek and Qwen, with a score of 61 in evaluations by Artificial Analysis [1][7]. Performance and Features - Minimax M2 is designed specifically for agents and programming, boasting exceptional programming capabilities and agent performance. It operates at twice the reasoning speed of Claude 3.5 Sonnet while costing only 8% of its API price [3][4]. - The model features a high sparsity MoE architecture with a total parameter count of 230 billion, of which only 10 billion are activated, allowing for rapid execution, especially when paired with advanced inference platforms [4][6]. - M2's unique interleaved thinking format enables it to plan and verify operations across multiple dialogues, crucial for agent reasoning [6]. Competitive Analysis - In the Artificial Analysis tests, M2 ranked fifth overall and first among open-source models, evaluated across ten popular datasets [7]. - M2's pricing is significantly lower than competitors, at $0.3 per million input tokens and $1.2 per million output tokens, representing only 8% of Claude 3.5 Sonnet's costs [8][14]. Agent Capabilities - Minimax has deployed M2 on an agent platform for free, showcasing various applications, including web development and game creation [23][30]. - Users have successfully utilized M2 to create complex applications and games, demonstrating its programming capabilities [36][38]. Technical Aspects - M2 employs a hybrid attention mechanism, combining full attention and sliding window attention, although initial plans to incorporate sliding window attention were abandoned due to performance concerns [39][40]. - The choice of attention mechanism reflects Minimax's strategy to optimize performance for their specific use cases, despite ongoing debates in the research community regarding the best approach for long-sequence tasks [47].
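The sparsity figure above (230 billion total parameters, only 10 billion active) comes from top-k expert routing: a router picks a few experts per token, so most of the layer's weights sit idle on any single forward pass. A minimal numpy sketch of the idea, with made-up sizes (16 experts, top-2) rather than MiniMax's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, top_k, d = 16, 2, 8
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]  # expert weight matrices
router = rng.normal(size=(d, n_experts))                        # gating projection

def moe_forward(x):
    """Route one token through its top-k experts only."""
    logits = x @ router
    chosen = np.argsort(logits)[-top_k:]              # indices of the top-k experts
    gates = np.exp(logits[chosen])
    gates /= gates.sum()                              # softmax over chosen experts
    y = sum(g * (x @ experts[i]) for g, i in zip(gates, chosen))
    return y, chosen

x = rng.normal(size=d)
y, chosen = moe_forward(x)
active_fraction = top_k / n_experts
print(len(chosen), active_fraction)  # 2 0.125 -> only 12.5% of experts run per token
```

In M2's case the reported ratio is far more aggressive (roughly 10B of 230B, about 4%), which is what lets a very large model run with the latency and cost profile of a much smaller one.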
Investor Chamath: His Firm Is Already Using Chinese Open-Source Models
Huan Qiu Wang· 2025-10-11 11:12
Core Insights - The podcast "All in" highlights the competition between Chinese open-source AI models and American closed-source models, emphasizing the shift in demand towards models like Kimi K2 from China due to their superior performance and lower costs compared to OpenAI and Anthropic [1][3] - Chamath, founder of Social Capital, points out that while Anthropic is impressive, it is financially burdensome, indicating a trend where Chinese models are challenging the dominance of American counterparts in the AI space [1] Company Insights - Social Capital, a prominent venture capital firm, is actively transitioning its workload to Chinese AI models, particularly Kimi K2, which is noted for its strong performance and cost-effectiveness [1] - The podcast "All in," founded by influential Silicon Valley figures, has become a significant platform for discussing technology and investment trends, reflecting the growing interest in the capabilities of Chinese AI models [3]
Ant Group's 2025 Global Large-Model Open-Source Landscape Is Out: US-China Divergence in AI Development Paths, a Tooling Boom, and Other Trends Emerge
Sou Hu Cai Jing· 2025-09-14 14:39
Core Insights - The report released by Ant Group and Inclusion AI highlights the rapid development and trends in the AI open-source ecosystem, particularly focusing on large models and their implications for the industry [1] Group 1: Open-source Ecosystem Overview - The 2.0 version of the report includes 114 notable open-source projects across 22 technical fields, categorized into AI Agent and AI Infra [1] - 62% of the open-source projects in the large model ecosystem were created after the "GPT moment" in October 2022, with an average age of only 30 months, indicating a fast-paced evolution in the AI open-source landscape [1] - Approximately 360,000 global developers contributed to the projects, with 24% from the US, 18% from China, and smaller contributions from India, Germany, and the UK [1] Group 2: Development Trends - A significant trend identified is the explosive growth of AI programming tools, which automate code generation and modification, greatly enhancing programmer efficiency [1][2] - These tools are categorized into command-line tools and integrated development environment (IDE) plugins, with the former being favored for their flexibility and the latter for their integration into development processes [1] - The report notes that the average new coding tool in 2025 has garnered over 30,000 developer stars, with Gemini CLI achieving over 60,000 stars in just three months, marking it as one of the fastest-growing projects [1] Group 3: Competitive Landscape - The report outlines a timeline of major large model releases from leading companies, detailing both open and closed models, along with key parameters and modalities [4] - Key directions in large model development include a clear divergence between open-source and closed-source strategies in China and the US, a trend towards scaling model parameters under MoE architecture, and the rise of multi-modal models [4] - The evaluation methods for models are evolving, incorporating both subjective voting and 
objective assessments, reflecting the technological advancements in the large model domain [4]
The Industry Reacts to gpt-oss!
Matthew Berman· 2025-08-06 19:22
Model Release & Performance - OpenAI released a new open-source model (gpt-oss) that performs comparably to smaller frontier models like o4-mini and can run on consumer hardware such as laptops and phones [1] - The 20 billion parameter version of gpt-oss is reported to outperform models two to three times its size in certain tests [7] - Industry experts highlight the model's efficient training, with the 20 billion parameter version costing less than $500,000 to pre-train, requiring roughly 210,000 H100-hours [27] Safety & Evaluation - OpenAI conducted safety evaluations on gpt-oss, including fine-tuning it to probe potential malicious uses, and shared which external recommendations it adopted or declined [2][3] - Former OpenAI safety researchers acknowledge the rigor of OpenAI's OSS safety evaluation [2][19] - The model's inclination to "snitch" on corporate wrongdoing was tested, with the 20 billion parameter version showing a 0% snitch rate and the 120 billion parameter version around 20% [31] Industry Reactions & Implications - Industry experts suggest OpenAI's release of gpt-oss could be a strategic move to commoditize the model market, potentially forcing competitors to lower prices [22][23] - Some believe the value in AI will increasingly accrue to the application layer rather than the model layer, as the price of AI tokens converges with the cost of infrastructure [25][26] - The open-source model has quickly become the number one trending model on Hugging Face, indicating significant community interest and adoption [17][18] Accessibility & Use - Together AI supports the new open-source models from OpenAI, offering fast speeds and low prices, such as 15 cents per million input tokens and 60 cents per million output tokens for the 120 billion parameter model [12] - The 120 billion parameter model requires approximately 65 GB of storage, making it possible to store on a USB stick and run locally on consumer laptops [15] - Projects like GPTOSS Pro mode chain together multiple
instances of the new OpenAI GPT-OSS model to produce better answers than a single instance [10]
X @Sam Altman
Sam Altman· 2025-08-05 17:03
Model Release - The company released gpt-oss, an open-source model [1] - The model performs at the level of o4-mini [1] - The model can run on a high-end laptop [1] - A smaller version of the model can run on a phone [1]