Open-Source Models
100 Trillion Tokens Reveal This Year's AI Trends! This Silicon Valley Report Is Going Viral
量子位· 2025-12-08 11:36
Core Insights
- The report titled "State of AI: An Empirical 100 Trillion Token Study with OpenRouter" analyzes the usage of over 300 models on the OpenRouter platform from November 2024 to November 2025, focusing on real token consumption rather than benchmark scores [3][6][8].

Group 1: Open Source vs. Closed Source Models
- Open source models (OSS) have evolved from being seen as alternatives to closed source models to finding their own positioning, becoming the preferred choice in specific scenarios [9].
- The relationship between open source and closed source models is now more complementary, with developers often using both types simultaneously [10].
- The usage share of open source models is expected to reach approximately one-third by the end of 2025, with Chinese models growing from 1.2% to 30% of weekly usage [12][13].

Group 2: Market Dynamics and Model Diversity
- DeepSeek's dominance as the largest contributor to open source model usage is diminishing as more models enter the market, leading to a diversified landscape [16].
- By the end of 2025, no single model is expected to hold over 25% of token usage, with the market likely to be shared among 5 to 7 models [17][18].
- The report indicates a shift towards medium-sized models, which are gaining market favor, while small models are losing traction [20][21].

Group 3: Evolution of Model Functionality
- Language models are transitioning from dialogue systems to reasoning and execution systems, with reasoning token usage surpassing 50% [22].
- Tool calling by models is increasing, indicating a more competitive and diverse ecosystem [29][31].
- AI models are evolving into "intelligent agents" capable of independently completing tasks rather than just responding to queries [43].

Group 4: Usage Patterns and User Retention
- The complexity of tasks assigned to AI has increased, with users now requiring models to analyze extensive documents or codebases [35].
- The average input to models has quadrupled, reflecting a growing reliance on contextual information [36].
- The "glass slipper effect" describes how certain users become highly attached to models that perfectly meet their needs upon release, leading to high retention rates [67][70].

Group 5: Regional Insights and Market Trends
- The share of paid usage in Asia has more than doubled, from 13% to 31%, indicating a shift in the global AI landscape [71].
- North America's AI market share has declined to below 50%, while English remains dominant at 82%, with Simplified Chinese holding nearly 5% [80].
- The impact of model pricing on usage is less significant than expected, with a 10% price drop resulting in only a 0.5%-0.7% increase in usage [80].
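The pricing figure quoted in the summary above can be read as a price elasticity of demand. The following is a minimal sketch of that arithmetic, assuming the simple point-elasticity definition (percentage change in usage divided by percentage change in price). The function name is illustrative and does not come from the report.

```python
def price_elasticity(usage_change_pct: float, price_change_pct: float) -> float:
    """Point elasticity: % change in usage divided by % change in price."""
    if price_change_pct == 0:
        raise ValueError("price change must be non-zero")
    return usage_change_pct / price_change_pct


# Figures quoted in the summary: a 10% price drop and a 0.5%-0.7% usage increase.
low = price_elasticity(0.5, -10.0)
high = price_elasticity(0.7, -10.0)
print(f"implied elasticity: {high:.2f} to {low:.2f}")
```

An elasticity whose absolute value is far below 1 means demand is highly inelastic, which is consistent with the summary's claim that price matters less than expected.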
The Truth Revealed by a16z's 100 Trillion Token Study: Chinese Players Are Reshaping the Global AI Landscape
36Kr · 2025-12-08 08:33
Core Insights
- The report titled "State of AI: An Empirical 100 Trillion Token Study" by a16z analyzes over 100 trillion tokens from real-world applications on the OpenRouter platform, revealing the actual usage landscape of large language models (LLMs) [3].
- The AI field is undergoing three fundamental shifts: moving from single model competition to a diversified ecosystem, transitioning from simple text generation to intelligent reasoning paradigms, and evolving from a Western-centric to a globally distributed innovation landscape [3].

Group 1: Key Findings
- The rise of open-source models, particularly from China, is notable, with market share increasing from 1.2% at the end of 2024 to nearly 30% in certain weeks by late 2025 [4][9].
- Over half of the usage of open-source models is directed towards creative dialogue scenarios such as role-playing and story creation [4].
- The volume of tokens processed by reasoning models has reached 50% of the total token volume [4].

Group 2: Technological Advancements
- The release of OpenAI's reasoning model o1 on December 5, 2024, marks a pivotal point in AI development, shifting from text prediction to machine reasoning [6].
- The introduction of multi-step reasoning and iterative optimization in the o1 model significantly enhances capabilities in mathematical reasoning, logical consistency, and multi-step decision-making [6].

Group 3: Open-Source Ecosystem
- The open-source model ecosystem is becoming increasingly diverse, with no single model expected to hold more than 25% of the market by the end of 2025 [11].
- The total token usage by various model developers shows a significant shift towards a more balanced distribution among multiple competitors [11][12].

Group 4: User Engagement and Application
- More than half of the open-source model usage is directed towards role-playing and creative tasks, indicating a strong demand for emotional connection and creative expression [15][17].
- Programming-related queries are projected to grow steadily, with their share of total token volume increasing from approximately 11% at the beginning of 2025 to over 50% by the end of the year [17].

Group 5: Global Trends
- Asia's share of global AI usage has risen from about 13% to 31%, reflecting accelerated adoption of AI technologies and the maturation of local innovation ecosystems [23].
- Chinese open-source models like DeepSeek and Qwen are gaining international recognition, contributing to the global AI landscape [24].

Group 6: Market Dynamics
- The AI market exhibits a complex value stratification rather than a simple cost-driven model, with high-end models maintaining significant usage despite high costs [29][30].
- Open-source models are exerting pressure on closed-source providers, compelling them to justify their pricing through enhanced integration and support [32].

Group 7: User Retention
- The "Cinderella Glass Slipper" effect describes how users become deeply integrated with models that meet their high-value workload needs, leading to strong retention rates [33][35].
- The DeepSeek model demonstrates a "boomerang effect," where users return after exploring other options, indicating its unique advantages in certain capabilities [35].

Group 8: Future Outlook
- The emergence of reasoning as a service is reshaping AI infrastructure requirements, emphasizing the need for long-term dialogue management and complex functionality [22][36].
- The report serves as a reference for future technological evolution, product design, and strategic planning based on real-world data [36].
"It Takes the US Three Years to Build a Data Center, While China..."
Guan Cha Zhe Wang· 2025-12-07 13:00
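[By Chen Sijia, Guancha.cn] In recent years, as the artificial intelligence (AI) industry has kept expanding, demand for data centers has reached unprecedented levels, and US tech giants such as Nvidia have set off a data center construction boom. But because the United States faces aging infrastructure and slow construction, building those data centers has not gone smoothly.

According to a December 6 report on the website of the US magazine Fortune, Nvidia CEO Jensen Huang said in a recent conversation with John Hamre, head of the US think tank Center for Strategic and International Studies (CSIS), that although the United States still leads in AI, its infrastructure construction capacity falls far short of China's, and it could be overtaken by China in the "AI race."

Huang simplified the AI industry into a "five-layer cake": energy, chips, infrastructure, models, and applications. He pointed out that at the bottom layer, energy, "China has twice the energy the United States has." He said the US government is pushing for manufacturing to return home, "but without energy, how are we supposed to build chip factories, supercomputer factories, and AI data centers?"

Huang did not miss the chance to flatter US President Trump, claiming that Trump recognizes the importance of energy growth and is "withstanding pressure to restore America's energy." But he also acknowledged that US energy costs remain far higher than China's.

On chips, Huang said the United States currently holds the advantage in this area, with US tech companies such as Nvidia possessing leading AI chip technology, but China has shown enormous potential. He said: "We have leading technology, but we cannot be complacent. ...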
Observation | 100 Trillion Tokens: AI Is Undergoing a Sea Change You Cannot See
未可知人工智能研究院· 2025-12-07 03:02
Core Insights
- The report reveals that AI is undergoing a significant revolution, characterized by a shift from traditional models to reasoning models that can think and plan in multiple steps [3][11][12].

Group 1: OpenRouter and Its Importance
- OpenRouter is likened to the "Meituan" of the AI world, connecting more than 5 million developers to more than 300 AI models, which makes its data highly credible [5][6].
- OpenRouter's daily token processing volume has surpassed 1 trillion, and its annual volume grew roughly tenfold, to approximately 100 trillion tokens, between early 2024 and mid-2025 [8][6].

Group 2: Reasoning Revolution
- The report identifies a "reasoning revolution," where AI models evolve from simple response machines to complex reasoning machines capable of multi-step thinking [11][12].
- The launch of OpenAI's o1 reasoning model (codename Strawberry) is a pivotal event, as it incorporates internal reasoning processes that enhance its problem-solving capabilities [18][19].
- Users are increasingly engaging in complex tasks, leading to longer prompts and more dialogue rounds, indicating a shift towards training AI for intricate tasks [20][21][23].

Group 3: Agentic AI
- Agentic AI represents a transformation where AI can autonomously plan, execute, and verify tasks, moving from passive response to active engagement [27][30].
- The report highlights that agentic reasoning is the fastest-growing behavior on OpenRouter, indicating a shift in user expectations from simple answers to task completion [34][35].

Group 4: Rise of Open Source Models
- Open source models, particularly those from Chinese teams such as DeepSeek R1 and Kimi K2, are rapidly gaining market share, challenging the dominance of closed-source models [44][47].
- DeepSeek R1 offers significant cost advantages, with a cost of $0.003 per 1K tokens compared to $0.03 for GPT-4, making it attractive for developers [52].

Group 5: Real-World AI Usage
- The primary applications driving token usage are creative writing and programming, with AI becoming indispensable for developers [71][72].
- Users are not merely relying on AI for content generation but are engaging in co-creation, indicating a shift in the role of AI from a tool to a creative partner [77][78].

Group 6: Model Personality
- Users' choices of AI models are influenced by the "personality" of the models, which affects user retention and engagement [88][95].
- The report suggests that models with unique personalities can outperform those with higher benchmark scores in terms of user loyalty [96][100].

Group 7: Implications for the Chinese AI Industry
- The success of Chinese models like DeepSeek R1 and Kimi K2 in the global market indicates that they have competitive capabilities [109].
- The report emphasizes the importance of focusing on reasoning and agentic capabilities as key technological directions for the Chinese AI industry [115].
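The per-token prices quoted in Group 4 of the summary above ($0.003 versus $0.03 per 1K tokens) translate into workload cost as follows. This is a rough sketch of the arithmetic only: the workload size is a hypothetical value chosen for illustration, and real pricing usually distinguishes input from output tokens, which the summary does not break out.

```python
def cost_usd(tokens: int, price_per_1k_usd: float) -> float:
    """Cost of a workload given a flat price per 1,000 tokens."""
    return tokens / 1000 * price_per_1k_usd


WORKLOAD_TOKENS = 1_000_000  # hypothetical workload size, for illustration only

deepseek_r1 = cost_usd(WORKLOAD_TOKENS, 0.003)  # price quoted in the summary
gpt4 = cost_usd(WORKLOAD_TOKENS, 0.03)          # price quoted in the summary

print(f"DeepSeek R1: ${deepseek_r1:.2f}, GPT-4: ${gpt4:.2f}")
print(f"price ratio: {gpt4 / deepseek_r1:.1f}x")
```

At the quoted prices the ratio is tenfold regardless of workload size, since both costs scale linearly with the token count.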
Joe Tsai | Full Record of His HKU Talk: China's AI Will Surpass the US Because It Holds Four Trump Cards
Sou Hu Cai Jing· 2025-12-05 18:41
Group 1: AI Development in China
- China is making significant strides in AI, with models like DeepSeek-V3.2 performing at levels comparable to GPT-5 and surpassing other international models in competitions [1].
- Alibaba's Qwen3-VL and Qwen2.5-VL models have excelled in spatial reasoning benchmarks, outperforming top models like Gemini 3 and GPT-5.1 [1].
- The Chinese government has set ambitious goals for AI adoption, aiming for a 90% penetration rate by 2030, which reflects a pragmatic and goal-oriented approach [21][46].

Group 2: Alibaba's Strategic Position
- Alibaba has transformed from a B2B e-commerce platform to a leading player in AI and cloud computing, leveraging its infrastructure to support AI applications [9][30].
- The company emphasizes an open-source strategy for its AI models, allowing broader access and adoption, which contrasts with the proprietary models of competitors like OpenAI [26][30].
- Alibaba's cloud computing business is positioned to benefit from the growing demand for AI infrastructure, as companies increasingly rely on cloud services for AI model training and deployment [30][38].

Group 3: Competitive Advantages in AI
- China has a significant advantage in energy supply for AI development, with lower electricity costs and substantial investments in clean energy infrastructure [22].
- The country produces a large number of STEM graduates, providing a robust talent pool for AI engineering and development [23].
- The open-source approach adopted by many Chinese AI companies accelerates innovation and adoption, making AI tools more accessible to a wider audience [26][46].

Group 4: Future Outlook
- The future of China's economy is closely tied to technological advancements, particularly in AI, which is seen as the primary driver of growth over the next decade [45].
- The emphasis on manufacturing and technological self-reliance is expected to sustain China's position as a global manufacturing hub while fostering innovation in high-tech sectors [15][17].
- The integration of AI into various industries is anticipated to reshape business practices and consumer behavior, with AI evolving from a tool to a collaborative partner in various applications [34].
Is the AI Bubble About to Burst? A Heavyweight's Counterintuitive View
Ge Long Hui· 2025-12-04 07:29
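The decisive battle among large models is getting fiercer! Google's rise has frightened OpenAI, which is preparing a major new move! OpenAI has sounded the alarm outright, delaying its money-making advertising business in order to go all in on improving ChatGPT.

Today's AI circle feels like the eve of Star Wars: out of fear, everyone has a finger on the trigger.

In these chaotic times, Joe Tsai offered a highly counterintuitive view during a fireside chat at the University of Hong Kong:

The way Americans now define who wins the AI race is purely by looking at large language models; we do not look at the AI race as defined by the US.

While everyone is fixated on whose model has more parameters and whose compute is stronger, Joe Tsai argues that the deciding factor is not there at all.

If not the model, what does determine the winner of this trillion-dollar gamble? Does China still have cards to play?

After reading on, it turns out that the world these heavyweights see is completely different from the one we see.

1 China's Real Advantage in AI

How does Silicon Valley score large models today? Simple: by whose "large language model" is stronger, smarter, and has more parameters.

Today OpenAI is far ahead, tomorrow Anthropic draws level with a new release, and the day after Google makes big news. Everyone is racing on models, as if whoever's model gains a few IQ points rules the world.

But in Joe Tsai's view, that is not necessarily the case.

In his talk he made this penetrating remark:

"The winner is not about who has the bes ...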
As Closed-Source Models Pull Ahead Ever Faster, How DeepSeek V3.2 Blazes a New Trail for Open-Source Models
深思SenseAI· 2025-12-03 09:51
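Over the past year or so, most authoritative evaluations have kept stressing the same thing: on frontier general capability, closed-source models have the steeper curve, and it is getting harder for open source to catch up on every dimension. DeepSeek also acknowledges in its technical report that the open-source community is making progress, but closed-source models from Anthropic, Gemini, and OpenAI have a steeper performance curve, and the gap is actually widening. On complex tasks, closed-source systems show increasingly clear advantages.

Open-source models currently face three key problems:
1. First, at the architecture level, the mainstream still relies heavily on the vanilla attention mechanism, which severely limits computational efficiency in long-sequence scenarios. This inefficiency is a substantial obstacle to both large-scale deployment and effective post-training.
2. Second, in terms of resources, open-source models generally suffer from insufficient compute investment in the post-training stage, which limits their performance on hard tasks.
3. Finally, in AI Agent scenarios, open-source models lag significantly behind closed-source systems in generalization and instruction following, which weakens their effectiveness in real deployments.

On December 1, DeepSeek released two new models, DeepSeek V3.2 and DeepSeek V3.2 Speciale, and proposed three improvements targeting these three problems:
1. Introduced ...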
DeepSeek Fights Its Way Out: Chinese Large Models Are Not Breaking Through on Luck
36Kr · 2025-12-03 03:21
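As 2025 draws to a close, the technical spotlight of the global large-model race has been almost entirely recaptured by Google. Gemini 3 Pro burst onto the scene, surpassing all open-source models on multiple authoritative benchmarks and re-establishing the closed-source camp's technical high ground. For a while, doubts resurfaced in the industry about "whether open-source models have hit their limit" and "whether the Scaling Law has really hit a wall," and a mood of stagnation spread through the open-source community.

But at just this moment, DeepSeek did not stay silent. On December 1, it released two heavyweight models in one go: DeepSeek-V3.2, whose reasoning performance is benchmarked against GPT-5, and the Speciale version, which performs exceptionally strongly in math, logic, and multi-turn tool calling. This is not only a concentrated display of technical capability but also a direct response to the closed-source "new ceiling," made without any advantage in compute resources.

This is not a simple model update. DeepSeek is trying to find a new path in the post-Scaling era: How can architectural redesign make up for the pretraining gap? How can a "chain of thought within tool use" deliver agent performance that is efficient with few tokens? More importantly, why has the Agent gone from an auxiliary feature to the core engine of leaps in model capability?

This article analyzes these three threads: How did DeepSeek break through under technical bottlenecks? Why was it the first in the open-source camp to bet heavily on Agents? And does this mean open-source models still have a path through the closed-source moat?

Behind this ...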
Strongest Open-Source Model! "Punching GPT-5" and "Kicking Gemini-3.0": Why Has DeepSeek V3.2 Improved So Much?
华尔街见闻· 2025-12-02 04:21
Core Insights
- DeepSeek has released two official models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, with the former achieving performance levels comparable to GPT-5 and the latter winning gold medals in four international competitions [1][3].

Model Performance
- DeepSeek-V3.2 has reached the highest level of tool invocation capabilities among current open-source models, significantly narrowing the gap with closed-source models [2].
- In various benchmark tests, DeepSeek-V3.2 achieved a 93.1% pass rate in AIME 2025, closely trailing GPT-5's 94.6% and Gemini-3.0-Pro's 95.0% [20].

Training Strategy
- The model's significant improvement is attributed to a fundamental change in training strategy, moving from a simple "direct tool invocation" to a more sophisticated "thinking + tool invocation" mechanism [9][11].
- DeepSeek has constructed a new large-scale data synthesis pipeline, generating over 1,800 environments and 85,000 complex instructions specifically for reinforcement learning [12].

Architectural Innovations
- The introduction of the DeepSeek Sparse Attention (DSA) mechanism has effectively addressed efficiency bottlenecks in traditional attention mechanisms, reducing complexity from O(L²) to O(Lk) while maintaining model performance [6][7].
- The model's architecture allows for better context management, retaining relevant reasoning content during tool-related messages, thus avoiding inefficient repeated reasoning [14].

Competitive Landscape
- The release of DeepSeek-V3.2 signals a shift in the competitive landscape, indicating that the absolute technical monopoly of closed-source models is being challenged by open-source models gaining first-tier competitiveness [20][22].
- This development has three implications: lower costs and greater customization for developers, reduced reliance on overseas APIs for enterprises, and a shift in the industry focus from "who has the largest parameters" to "who has the strongest methods" [22].
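The O(L²) to O(Lk) claim in the summary above can be illustrated with a generic top-k sparse attention sketch: each query attends to only k selected positions instead of all L. This is not DeepSeek's DSA implementation. The selector below (reusing the raw dot product as an index score), the function names, and the sizes are all assumptions made for illustration.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query over the given keys."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    return [sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))]


def topk_attend(query, keys, values, index_scores, k):
    """Attend only to the k positions with the highest index_scores."""
    keep = sorted(range(len(keys)), key=lambda i: index_scores[i],
                  reverse=True)[:k]
    return attend(query, [keys[i] for i in keep], [values[i] for i in keep])


def pair_count(seq_len, k=None):
    """Query-key pairs scored in the main attention: L*L dense, L*k sparse."""
    return seq_len * (seq_len if k is None else min(k, seq_len))


random.seed(0)
L, DIM, K = 64, 8, 8  # illustrative sizes, not DeepSeek's settings
keys = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(L)]
values = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(L)]
query = [random.gauss(0, 1) for _ in range(DIM)]
# Hypothetical selector: reuse the raw dot product as the index score.
index_scores = [dot(query, key) for key in keys]

sparse_out = topk_attend(query, keys, values, index_scores, K)
print(pair_count(L), "dense pairs vs", pair_count(L, K), "sparse pairs")
```

Note that the selector in this sketch still scans all L positions per query, so the saving applies only to the main attention computation. A practical design would need the selection step to be much cheaper than full attention for the overall cost to drop.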
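DeepSeek Launches Again! Its Models Take On Google Head-On While Admitting the Gap Between Open Source and Closed Source Is Widening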
Di Yi Cai Jing· 2025-12-01 23:13
Core Insights
- DeepSeek has launched two new models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, which are positioned to compete with leading proprietary models like GPT-5 and Gemini 3.0, showcasing significant advancements in reasoning capabilities [1][4].

Model Overview
- DeepSeek-V3.2 aims to balance reasoning ability and output length, making it suitable for everyday applications such as Q&A and general intelligence tasks. It has achieved performance levels comparable to GPT-5 and is slightly below Google's Gemini 3 Pro in public reasoning tests [4].
- DeepSeek-V3.2-Speciale is designed to push the limits of reasoning capabilities, integrating enhanced long-thinking features and theorem-proving abilities from DeepSeek-Math-V2. It has surpassed Gemini 3 Pro in several reasoning benchmarks, including prestigious math competitions [4][5].

Benchmark Performance
- In various benchmarks, DeepSeek models have shown competitive results:
  - AIME 2025: DeepSeek-V3.2 scored 93.1, while GPT-5 and Gemini-3.0 scored 94.6 and 95.0 respectively [5].
  - Harvard-MIT Mathematics Tournament: DeepSeek-V3.2 scored 92.5, below Gemini 3 Pro's 97.5 [5].
  - International Math Olympiad: DeepSeek-V3.2 scored 78.3, close to Gemini 3 Pro's 83.3 [5].

Limitations and Future Plans
- Despite these achievements, DeepSeek acknowledges limitations compared to proprietary models, including narrower world knowledge and lower token efficiency. The team plans to enhance pre-training and optimize reasoning chains to improve model performance [6][7].
- DeepSeek has identified three key areas where open-source models lag behind proprietary ones: reliance on standard attention mechanisms, insufficient computational resources during post-training, and gaps in generalization and instruction-following capabilities [7].

Technological Innovations
- DeepSeek has introduced a sparse attention mechanism (DSA) to reduce computational complexity without sacrificing long-context performance. This innovation has been integrated into the new models, contributing to significant performance improvements [7].

Availability
- The official website, app, and API for DeepSeek-V3.2 have been updated, while the enhanced Speciale version is currently available only through a temporary API for community evaluation [8].

Community Reception
- The release has been positively received on social media, with users noting that DeepSeek's models have effectively matched the capabilities of GPT-5 and Gemini 3 Pro, highlighting the importance of rigorous engineering design over sheer parameter size [9].