StepFun Releases Open-Source Foundation Model Step 3.5 Flash; Multiple Leading Chip Vendors Complete Adaptation
Feng Huang Wang· 2026-02-02 06:32
Phoenix Tech News, February 2 — StepFun has released Step 3.5 Flash, its new-generation open-source agent foundation model. Aimed at real-time agent workflow scenarios, the model uses a sparse MoE architecture with 196 billion total parameters, activating about 11 billion parameters per token, to balance inference speed against cost. According to the company, Step 3.5 Flash reaches inference speeds of up to 350 tokens per second on single-request coding tasks. Chip vendors including Huawei Ascend, MetaX, Biren Technology, and Enflame have already completed adaptation for the model. In July 2025, StepFun joined with several chip and infrastructure vendors to launch the "Model-Chip Ecosystem Innovation Alliance," which aims to raise compute efficiency through joint optimization and speed the deployment of large models in real applications. This release is seen as a further step in that model-compute co-design effort. ...
StepFun Releases Open-Source Foundation Model Step 3.5 Flash
Jin Rong Jie· 2026-02-02 02:24
StepFun has released Step 3.5 Flash, its new-generation open-source agent foundation model. Targeting real-time agent workflow scenarios, it reaches inference speeds of up to 350 tokens per second. Step 3.5 Flash reportedly uses a sparse MoE architecture, activating only about 11 billion parameters per token (out of 96 billion total). Chip vendors including Huawei Ascend, MetaX, Biren Technology, Enflame, Iluvatar CoreX, and Alibaba T-Head have completed adaptation. ...
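The sparse-MoE property described here — a large total parameter count with only a small fraction activated per token — comes from top-k expert routing. The sketch below is a generic, illustrative router (not StepFun's implementation; all names and the scalar "expert outputs" are hypothetical simplifications):

```python
import math

def route_top_k(router_logits, expert_outputs, k=2):
    """Illustrative top-k MoE routing (generic, not any vendor's code).

    A router scores every expert for the current token, but only the
    k best-scoring experts actually run. This is how a sparse-MoE model
    can hold a very large total parameter count while activating only
    a small fraction of it per token.
    """
    # Softmax over the router's scores.
    m = max(router_logits)
    exps = [math.exp(x - m) for x in router_logits]
    z = sum(exps)
    probs = [e / z for e in exps]

    # Keep only the k highest-probability experts.
    top = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    norm = sum(probs[i] for i in top)

    # Combine just those experts' outputs, reweighted to sum to 1.
    return sum(probs[i] / norm * expert_outputs[i] for i in top)

# 8 experts but only 2 run per token -> 2/8 of expert weights "activated".
outputs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # stand-in scalar outputs
print(route_top_k([0.1, 2.0, 0.3, 1.5, 0.0, 0.2, 0.1, 0.4], outputs, k=2))
```

In a real model each "expert" is a full feed-forward block and routing happens per layer, which is why total and active parameter counts can differ by more than an order of magnitude.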
In Depth | A Year-End Look at Large Models: What Made a "Good Model" in 2025?
Z Potentials· 2025-12-17 12:00
Core Insights
- The AI industry is transitioning from a "score-based" evaluation to a "trust-based" framework, emphasizing the importance of open-source models as a default choice for businesses [1][2][3]

Group 1: Industry Trends
- The concept of "score fatigue" is prevalent in the AI sector, leading to a shift towards open-source models like DeepSeek, Qwen, and Kimi as essential tools [1]
- The industry mindset is evolving from a "championship-style" competition to a "partnership-based" approach, where foundational capabilities are merely entry tickets, and trust is built through evaluation, deployment, and delivery [2]

Group 2: Key Signals
- The AI model landscape is showing a significant change, with open-source models capturing over one-third of the total token share by the end of 2025, indicating stable demand post-launch [5]
- The usage of reasoning models has surged, accounting for over 50% of token consumption, reflecting a growing complexity in tasks assigned to AI [8][12]

Group 3: Evaluation Metrics
- The evaluation of AI models is moving towards a multi-dimensional framework, incorporating both performance and cost metrics to assess value [20]
- Kimi K2 Thinking exemplifies this trend by achieving top scores in key evaluations, gaining significant attention and trust from the community [14][18]

Group 4: Deployment and Infrastructure
- The deployability of models is becoming a critical factor, with advancements in hardware allowing for significant cost reductions and performance improvements [24]
- Cloud platforms are enhancing transparency in deployment costs, shifting from estimation to clear pricing models for token usage [24]

Group 5: Delivery and Governance
- The final step in ensuring trust involves governance, observability, and reproducibility of AI models in enterprise settings [25]
- Major cloud providers are integrating top models into their enterprise services, facilitating standardized API access and security measures [26]

Group 6: Future Directions
- The focus for 2026 will be on operational excellence, emphasizing task completion rates, stability, and alignment with real workloads [31]
- Trust in AI models is increasingly seen as a product of engineering rather than belief, highlighting the importance of reliability in achieving productivity [32]
On ChatGPT's Third Anniversary, DeepSeek Strikes: A 23-Page Technical Report Holds All the Secrets Behind Open Source Reaching the Top
36氪· 2025-12-02 09:19
DeepSeek V3.2 ships new tricks. Source | APPSO (ID: appsolution). Cover image | Unsplash. On ChatGPT's third anniversary, DeepSeek delivered a "birthday present." On December 1, DeepSeek released two models at once: DeepSeek-V3.2 and DeepSeek-V3.2-Speciale. Not only do the two models approach GPT-5 and Gemini-3.0-Pro in reasoning ability; more importantly, they tackle a problem that has long dogged open-source models. Over the past few months a clear trend has emerged in the AI world: closed-source models keep pulling ahead while open-source models struggle to keep pace. The DeepSeek team's analysis found three core bottlenecks for open-source models on complex tasks: architecture, resource allocation, and agentic capability. DeepSeek answers each with a major move. If you have used AI models on very long documents, you may have seen them slow to a crawl or freeze outright; that is the fault of the traditional attention mechanism. How can an AI both think deeply and use tools fluently? The short version of the new models: DeepSeek-V3.2 (standard edition): built for cost-effectiveness and everyday use, with reasoning at GPT-5 level; its output is shorter, faster, and cheaper than Kimi-K2-Thinking's, and it is the first to achieve "thinking-while-... ...
On ChatGPT's Third Anniversary, DeepSeek Strikes: A 23-Page Technical Report Holds All the Secrets Behind Open Source Reaching the Top
36Ke· 2025-12-02 00:16
Slow, dumb, clumsy? DeepSeek V3.2 ships new tricks. Over the past few months a clear trend has emerged in the AI world: closed-source models keep pulling ahead while open-source models struggle to keep pace. The DeepSeek team's analysis found three core bottlenecks for open-source models on complex tasks: architecture, resource allocation, and agentic capability. DeepSeek answers each with a major move. On ChatGPT's third anniversary, DeepSeek delivered a "birthday present": just now, it released two models at once, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale. Not only do the two models approach GPT-5 and Gemini-3.0-Pro in reasoning ability; more importantly, they tackle a problem that has long dogged open-source models: how can an AI both think deeply and use tools fluently? The short version of the new models follows. Weights for both are already open-sourced on HuggingFace and ModelScope, so you can download and deploy them locally. If you have used AI models on very long documents, you may have seen them slow to a crawl or freeze outright; that is the fault of the traditional attention mechanism. Traditional attention works like this: every token computes a relevance score against every token before it. The longer the document, the more computation. It is like being in a room with 1,000 ...
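The quadratic cost the article alludes to ("every token computes relevance against every earlier token") shows up directly in a toy implementation of standard scaled dot-product attention. This is a minimal sketch for intuition only, not DeepSeek's mechanism:

```python
import math

def toy_attention(q, k, v):
    """Toy scaled dot-product attention over plain Python lists.

    For n tokens, the score matrix is n x n: every query row computes
    a score against every key row, so compute and memory both grow
    quadratically with sequence length -- the slowdown the article
    describes on very long documents.
    """
    n, d = len(q), len(q[0])
    out = []
    for i in range(n):
        # n scores for this row -> n * n scores in total.
        scores = [sum(q[i][t] * k[j][t] for t in range(d)) / math.sqrt(d)
                  for j in range(n)]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append([sum(w / z * v[j][t] for j, w in enumerate(weights))
                    for t in range(d)])
    return out

# Doubling the sequence length quadruples the number of pairwise scores.
for n in (1000, 2000):
    print(f"{n} tokens -> {n * n:,} pairwise scores")
```

Sparse-attention schemes (the direction DeepSeek's V3.2 line has pursued) attack exactly this n × n score matrix by limiting how many earlier positions each token scores against.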
Kimi's Yang Zhilin: "Training Costs Are Hard to Quantify," but the Open-Source Strategy Will Continue
Di Yi Cai Jing· 2025-11-11 10:35
Core Insights
- Kimi, an AI startup, has released its latest open-source model, Kimi K2 Thinking, with a reported training cost of $4.6 million, lower than DeepSeek V3's reported $5.6 million and far below the multibillion-dollar training budgets attributed to OpenAI [1][2]
- The company emphasizes ongoing model updates and improvements, focusing on absolute performance while addressing user concerns about inference length and performance discrepancies [1]
- Kimi's strategy is to maintain an open-source approach and keep advancing the Kimi K2 Thinking model while avoiding direct competition with major players like OpenAI through innovative architecture and cost control [2][4]

Model Performance and Market Position
- In the latest OpenRouter model usage rankings, five Chinese open-source models, including Kimi's, are among the top twenty, indicating a growing presence in the international market [2]
- Kimi's current model can only be accessed via API due to platform limitations; the team trains on H800 GPUs with InfiniBand, despite having fewer resources than labs with access to U.S. high-end GPUs [2]
- The company plans to balance text model development with multi-modal model advancements, aiming to establish a differentiated advantage in the AI landscape [4]
MiniMax Issues Late-Night Apology as Open-Source Model M2 Tops Charts and Sparks a Global Rush
Di Yi Cai Jing· 2025-10-30 07:47
Core Insights
- MiniMax has launched its new model MiniMax M2, fully open-sourced under the MIT license, allowing developers to download and deploy it via Hugging Face or access it through MiniMax's API [1]
- The M2 model has quickly gained traction, achieving significant usage metrics and ranking highly on various platforms, indicating strong market demand [4][5]
- M2's performance is comparable to top models like GPT-5, particularly in agent and coding scenarios, marking a significant advancement for open-source models [7]

Performance and Metrics
- Since the launch of the M2 API and MiniMax Agent, the platform has experienced a surge in traffic, leading to temporary service disruptions, which have since been resolved [4]
- M2 ranks 5th globally in OpenRouter usage and 1st among domestic models, and appears 2nd on the Hugging Face Trending list [5]
- M2 has achieved impressive scores in various benchmarks, including 5th globally and 1st among open-source models in the Artificial Analysis (AA) rankings [7]

Model Capabilities
- M2 excels at balancing performance, speed, and cost, which is crucial to its rapid market adoption [10]
- The model demonstrates strong agent capabilities, including complex toolchain execution and deep search, with notable performance on benchmarks like BrowseComp and Xbench-DeepSearch [11]
- M2's programming capabilities cover end-to-end development processes and effective debugging, achieving high scores on Terminal-Bench and Multi-SWE-Bench [10]

Evolution from M1 to M2
- M2 is designed for the evolving needs of the agent era, focusing on tool usage, instruction adherence, and programming capabilities, in contrast to M1's emphasis on long text and complex reasoning [12][13]
- The transition from M1 to M2 involved a shift from a hybrid attention mechanism to a full attention + MoE approach, optimizing for executable agent tasks [15]
- M2's pricing strategy is competitive, with input costs at approximately $0.30 per million tokens and output costs at $1.20 per million tokens, significantly lower than competitors [15]

Product Ecosystem
- Alongside the M2 model, MiniMax has upgraded its Agent product, which now runs on M2 and offers two modes: professional and efficient [16]
- The launch of M2 and the upgrade of MiniMax Agent are seen as steps towards a comprehensive ecosystem for intelligent agents, expanding the potential applications of open-source models in enterprise settings [17]
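At the quoted per-million-token prices (≈$0.30 input, $1.20 output), per-call cost is simple arithmetic. The helper below is illustrative, not an official SDK; the default rates are the article's approximate M2 figures:

```python
def call_cost_usd(input_tokens, output_tokens,
                  usd_per_m_in=0.30, usd_per_m_out=1.20):
    """Cost of one API call at per-million-token prices.

    Defaults use the article's approximate MiniMax M2 figures;
    pass your own rates to compare other models.
    """
    return (input_tokens * usd_per_m_in
            + output_tokens * usd_per_m_out) / 1_000_000

# e.g. a 100k-token prompt producing a 10k-token reply:
print(round(call_cost_usd(100_000, 10_000), 4))  # 0.03 + 0.012 = 0.042
```

Output tokens dominate at these rates (4x the input price), which is why agent workloads with long tool-use transcripts are especially price-sensitive.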
Hangzhou's Grip on the Global Open-Source Model Crown Ends: Shanghai's MiniMax M2 Is Swamped with Demand at Launch, at Just 8 RMB per Million Tokens
36Ke· 2025-10-28 02:12
Core Insights
- The open-source model throne has shifted to MiniMax M2, surpassing previous leaders DeepSeek and Qwen with a score of 61 in Artificial Analysis evaluations [1][7]

Performance and Features
- MiniMax M2 is designed specifically for agents and programming, with exceptional coding and agent performance; it runs at twice the inference speed of Claude 3.5 Sonnet while costing only 8% of its API price [3][4]
- The model uses a high-sparsity MoE architecture with 230 billion total parameters, of which only 10 billion are activated, allowing rapid execution, especially when paired with advanced inference platforms [4][6]
- M2's unique interleaved thinking format lets it plan and verify operations across multiple dialogues, which is crucial for agent reasoning [6]

Competitive Analysis
- In Artificial Analysis tests, M2 ranked fifth overall and first among open-source models, evaluated across ten popular datasets [7]
- M2's pricing is significantly lower than competitors', at $0.30 per million input tokens and $1.20 per million output tokens, only 8% of Claude 3.5 Sonnet's cost [8][14]

Agent Capabilities
- MiniMax has deployed M2 on an agent platform for free, showcasing applications including web development and game creation [23][30]
- Users have built complex applications and games with M2, demonstrating its programming capabilities [36][38]

Technical Aspects
- M2 ultimately ships with full attention; initial plans to incorporate sliding window attention into a hybrid mechanism were abandoned due to performance concerns [39][40]
- The attention choice reflects MiniMax's strategy of optimizing for their specific use cases, despite ongoing debate in the research community over the best approach for long-sequence tasks [47]
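The full-vs-sliding-window debate mentioned here comes down to how many past positions each token attends to. A small sketch of the two causal-attention patterns (illustrative only, not MiniMax's implementation):

```python
def attended_positions(n, window=None):
    """Which earlier positions token i attends to under causal attention.

    window=None models full attention: token i sees all i+1 positions,
    so the total number of scores is O(n^2). An integer window models
    sliding-window attention: at most `window` positions per token,
    O(n * window) total. Illustrative sketch, not any vendor's code.
    """
    spans = []
    for i in range(n):
        lo = 0 if window is None else max(0, i - window + 1)
        spans.append(range(lo, i + 1))
    return spans

full = attended_positions(8)
sliding = attended_positions(8, window=3)
print(sum(len(s) for s in full), "vs", sum(len(s) for s in sliding))  # 36 vs 21
```

The trade-off is that the windowed pattern caps cost but discards direct access to distant context, which is one reason hybrid designs (and their abandonment) remain contested for long-sequence agent tasks.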
Investor Chamath: His Firm Is Already Using Chinese Open-Source Models
Huan Qiu Wang· 2025-10-11 11:12
Core Insights
- The podcast "All-In" highlights the competition between Chinese open-source AI models and American closed-source models, emphasizing a shift in demand toward models like Kimi K2 from China due to their strong performance and lower costs compared to OpenAI and Anthropic [1][3]
- Chamath Palihapitiya, founder of Social Capital, points out that while Anthropic is impressive, it is financially burdensome, indicating a trend where Chinese models are challenging the dominance of American counterparts in the AI space [1]

Company Insights
- Social Capital, a prominent venture capital firm, is actively transitioning its workload to Chinese AI models, particularly Kimi K2, which is noted for its strong performance and cost-effectiveness [1]
- The "All-In" podcast, founded by influential Silicon Valley figures, has become a significant platform for discussing technology and investment trends, reflecting growing interest in the capabilities of Chinese AI models [3]
Ant Group's Open-Source 2025 Global Large Model Landscape Released: Trends Include Diverging China-US AI Development Paths and a Tooling Boom
Sou Hu Cai Jing· 2025-09-14 14:39
Core Insights
- The report released by Ant Group and Inclusion AI highlights the rapid development and trends in the AI open-source ecosystem, particularly focusing on large models and their implications for the industry [1]

Group 1: Open-Source Ecosystem Overview
- The 2.0 version of the report includes 114 notable open-source projects across 22 technical fields, categorized into AI Agent and AI Infra [1]
- 62% of the open-source projects in the large model ecosystem were created after the "GPT moment" in October 2022, with an average age of only 30 months, indicating a fast-paced evolution in the AI open-source landscape [1]
- Approximately 360,000 developers worldwide contributed to the projects, with 24% from the US, 18% from China, and smaller contributions from India, Germany, and the UK [1]

Group 2: Development Trends
- A significant trend is the explosive growth of AI programming tools, which automate code generation and modification, greatly enhancing programmer efficiency [1][2]
- These tools fall into command-line tools and integrated development environment (IDE) plugins, the former favored for flexibility and the latter for integration into development workflows [1]
- The average new coding tool in 2025 has garnered over 30,000 stars from developers, with Gemini CLI reaching over 60,000 stars in just three months, making it one of the fastest-growing projects [1]

Group 3: Competitive Landscape
- The report outlines a timeline of major large model releases from leading companies, detailing both open and closed models, along with key parameters and modalities [4]
- Key directions in large model development include a clear divergence between open-source and closed-source strategies in China and the US, a trend towards scaling model parameters under MoE architecture, and the rise of multi-modal models [4]
- Evaluation methods for models are evolving to combine subjective voting with objective assessments, reflecting technological advances in the large model domain [4]