Behind the Explosive Token Growth: Layoffs
小熊跑的快· 2026-03-15 13:14
Core Insights
- The article surveys the competitive landscape of AI models, presenting usage data and trends across regions and highlighting the growing dominance of Chinese models in the market.

Group 1: Model Usage and Rankings
- Total token usage across platforms reached 7.82 trillion tokens, with Chinese models accounting for 4.19 trillion tokens (53.6%), a 34.9% increase over the previous period [5]
- The top five models by usage [5]:
  1. MiniMax M2.5 (China): 1.87 trillion tokens (+15%)
  2. Gemini 3 Flash (USA): approximately 1 trillion tokens
  3. DeepSeek V3.2 (China): 0.83 trillion tokens (+4%)
  4. Claude Opus 4.6 (USA): data not fully disclosed
  5. Step 3.5 Flash (China): 0.75 trillion tokens (+69%, a notable rise)

Group 2: Regional Performance
- Chinese models have consistently led the market, widening the gap over American models, which accounted for 3.63 trillion tokens (46.4%), an 8.5% decrease [5]
- The article expects Chinese models to keep gaining market share, further solidifying their position in the AI landscape [5]

Group 3: Industry Impacts
- The rise in token usage coincides with significant layoffs at major tech companies: Meta may cut up to 20% of its workforce, and Microsoft is expected to follow with even larger reductions [6]
Shrimp Farmers Feast on Chinese Models! Token Calls Surge 34.9% to 4.19 Trillion, Overtaking the US
量子位· 2026-03-11 02:45
Core Insights
- The article highlights the significant rise of Chinese large models in recent weeks, showing their dominance over American counterparts in both usage and performance metrics [2][3][9].

Group 1: Performance Metrics
- Weekly usage of Chinese large models surged to 4.19 trillion tokens, a 34.9% increase, while American models declined 8.5% to 3.63 trillion tokens [6].
- In the week of February 9-15, Chinese models' usage reached 4.12 trillion tokens, surpassing U.S. models for the first time as the latter dropped to 2.94 trillion [9].
- By the week of February 16-22, Chinese models' usage rose further to 5.16 trillion tokens, a 127% increase over three weeks, while U.S. models fell to 2.7 trillion [9].

Group 2: Leading Models
- The top three models by usage were Kimi K2.5, Step 3.5 Flash, and MiniMax M2.5, each exceeding 1 trillion tokens [5][34].
- MiniMax M2.5 consistently ranked at the top globally, while Step 3.5 Flash emerged as a significant contender [13][15].
- Chinese models dominated the global top five, occupying three of the five positions [12].

Group 3: Application and Context
- The OpenClaw application has proven popular with users, consuming a total of 9.16 trillion tokens since January and establishing itself as a major player in the market [32].
- By context length, different models excel in different token ranges, with MiniMax M2.5 and DeepSeek V3.2 preferred for tasks requiring 10K-100K tokens [23][25].

Group 4: Competitive Landscape
- While Chinese models are gaining traction, they still trail the leading models from Google and OpenAI in speed and cost-effectiveness [44].
- The PinchBench ranking, which evaluates models on success rate, speed, and cost, shows Chinese models such as Kimi K2.5 and MiniMax M2.1 performing well but lagging some competitors in speed [39][41].
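The share figures reported above follow from simple arithmetic on the weekly totals. A quick sketch in Python; the `share` and `growth` helper names are our own, while the token counts are the article's:

```python
def share(part: float, total: float) -> float:
    """Percentage of total contributed by part, to one decimal place."""
    return round(100 * part / total, 1)

def growth(new: float, old: float) -> float:
    """Percentage change from old to new, to one decimal place."""
    return round(100 * (new - old) / old, 1)

# Weekly token usage in trillions, as reported in the article.
cn, us = 4.19, 3.63
print(share(cn, cn + us))   # Chinese models' share of the CN+US total: 53.6
print(growth(5.16, 4.12))   # week-over-week rise from the 4.12T to the 5.16T week
```

The first line reproduces the 53.6% share the article quotes for Chinese models.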
The Best-Matched Model for Lobster: OpenClaw's Creator Gives His Recommendation
量子位· 2026-03-09 04:13
Core Insights
- The article discusses the rising popularity of lobster-related AI models and the difficulty of selecting the most suitable model for OpenClaw, recommending the PinchBench ranking as a reference [1][3].

Group 1: PinchBench Overview
- PinchBench is a benchmark designed specifically to evaluate AI models on success rate, speed, and cost, with real-time updates [3][6].
- The benchmark has gained traction since its introduction in February, driven largely by the impressive performance of Chinese models [3][20].
- The ranking shows Chinese models excelling in success rate and speed, though they lag behind OpenAI and Google models on pricing [7][15].

Group 2: Model Performance
- The top three models by success rate [11]:
  1. Google Gemini 3 Flash: 95.1%
  2. MiniMax M2.1: 93.6%
  3. Kimi K2.5: 93.4%
- On speed, MiniMax M2.5 outperformed all other models, posting the fastest completion time at 105.96 seconds [10][12].
- On pricing, however, OpenAI's cheapest model, GPT-5-nano, costs far less than the MiniMax models: $0.05 per million input tokens versus $2.10 for MiniMax M2.1 [15][17].

Group 3: Evaluation Methodology
- PinchBench combines automated checks with LLM-based evaluation to assess models on real-world tasks, focusing on the ability to complete entire workflows rather than just answer questions [25][29].
- The benchmark comprises 23 real tasks across productivity, research, writing, coding, analysis, email management, memory, and skills [26][28].
- The results indicate that larger models do not always outperform smaller, more efficient ones, which has sparked discussion in the community [31][32].
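The article names PinchBench's three axes (success rate, speed, cost) but not how they are aggregated, so the weighting below is purely hypothetical; it only illustrates how a composite ranking over those axes could be built:

```python
from dataclasses import dataclass

@dataclass
class Result:
    model: str
    success_rate: float  # fraction of tasks completed, 0-1
    seconds: float       # mean completion time per task
    usd_cost: float      # mean cost per task

def rank(results, w_success=0.6, w_speed=0.2, w_cost=0.2):
    """Hypothetical composite score; PinchBench's real weighting is not
    described in the article. Speed and cost are normalized so that the
    fastest/cheapest entry scores 1.0 on that axis."""
    fastest = min(r.seconds for r in results)
    cheapest = min(r.usd_cost for r in results)
    def score(r):
        return (w_success * r.success_rate
                + w_speed * fastest / r.seconds
                + w_cost * cheapest / r.usd_cost)
    return sorted(results, key=score, reverse=True)
```

With placeholder timing and cost values, a model that dominates on all three axes ranks first regardless of the weights chosen; trade-offs such as "accurate but slow" versus "cheap but weaker" are exactly what the weighting would have to arbitrate.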
Domestic Compute Surges: Will V4 Give Nvidia Another DeepSeek-Style Shock?
36Kr· 2026-02-27 11:32
Group 1
- The core point of the article is that Chinese large models have surpassed American models in token usage, marking a significant milestone for the AI industry [1][2][13]
- From February 9 to 15, 2026, token call volume for Chinese models reached 4.12 trillion, surpassing U.S. models at 2.94 trillion, and rose further to 5.16 trillion the following week, a 127% increase [1]
- The newly released MiniMax M2.5 achieved a token call volume of 4.55 trillion, becoming the monthly champion on OpenRouter [1][2]

Group 2
- The rise of domestic computing power is breaking Nvidia's monopoly, with local wafer manufacturers investing heavily in production capacity [3]
- HW Ascend is accelerating its product launches: the Ascend 950PR and 950DT are expected in Q1 and Q4 of 2026, respectively, enhancing the capabilities of the Atlas 900 A3 SuperPoD [3]
- The integration of domestic models, computing power, and China's electricity supply forms a competitive advantage that is difficult to replicate [3][4]

Group 3
- The essence of AI is power consumption, which is fundamentally linked to chip computation and electricity supply [4]
- China's leading position in power infrastructure and clean energy supports the growth of computing power, which in turn drives the iteration of large models [4]
- Collaboration between HW Ascend and domestic manufacturers strengthens the competitive edge of the domestic ecosystem [5]

Group 4
- HW Ascend's public testing of the CodeArts AI development tool lowers the entry barrier for AI development, broadening participation in the ecosystem [7]
- HW Ascend is actively helping define global AI standards by joining the Linux Foundation's AAIF, positioning its chip architecture within global technology norms [7]
- Nvidia's recent financial report showed strong revenue but was followed by a significant stock drop, attributed to market concerns over growth sustainability and competition from emerging players [8][12]

Group 5
- The "halo effect" in the AI industry is driven by strong demand for AI infrastructure and the rapid evolution of AI applications, and it is reshaping the software sector [10]
- Key investment opportunities lie in four areas: AIDC cloud services, domestic computing power, core segments of the global AI computing industry, and the "optical-electrical-material" triangle in AI infrastructure [10][12]
- The "optical-electrical-material" triangle is a high-demand segment, with requirements for optical communication and power supply rising alongside AI computing needs [10][12]

Group 6
- The overall trend indicates that the global AI industry landscape is being restructured, with China emerging as a significant player rather than merely a follower [13]
- The era of domestic large models and computing power is only beginning, underscoring the importance of these developments in the global AI context [13]
The Pentagon Demands "All Permissions": Anthropic Refuses, but Musk's xAI Agrees
Hua Er Jie Jian Wen· 2026-02-27 00:25
Core Viewpoint
- The Pentagon is demanding that AI systems, specifically Anthropic's Claude, be usable for "all lawful purposes" in classified environments, leading to a standoff as Anthropic refuses the terms set by the Department of Defense (DoD) [1][2][3]

Group 1: Anthropic's Position
- Anthropic CEO Dario Amodei stated that the company cannot accept the DoD's "final offer" on using Claude in classified systems, indicating a lack of progress in negotiations [2]
- Amodei emphasized that the company cannot ethically agree to the Pentagon's demands, which include using AI without policy constraints that limit military applications [3][4]
- The company has set two red lines: the AI must not be used for mass surveillance of Americans or for fully autonomous weapons [4]

Group 2: The Pentagon's Stance
- The Pentagon insists on using AI models without policy constraints that could limit legitimate military applications, as highlighted in a memo from Defense Secretary Pete Hegseth [4]
- The DoD has publicly stated that it does not intend to use AI for mass surveillance of Americans or to develop fully autonomous weapons, but it will not allow any company to dictate its operational decisions [4][5]

Group 3: Potential Consequences for Anthropic
- Anthropic risks losing a $200 million pilot contract with the Pentagon if it does not comply by the deadline [5]
- The Pentagon has begun assessing its reliance on Anthropic and may label it a "supply chain risk," a designation typically reserved for companies from adversarial nations [5]
- Hegseth has threatened to invoke the Defense Production Act to compel the use of Claude if negotiations fail [5]

Group 4: Alternative Suppliers
- While negotiations with Anthropic are stalled, the Pentagon has reached an agreement with xAI to let its Grok AI operate under the same "all lawful purposes" framework in classified environments [6]
- The DoD is also in advanced discussions with Google and OpenAI, a strategy to diversify its AI suppliers and apply pressure on Anthropic [6]
- If Anthropic is excluded, its government-services market share could be rapidly taken over by xAI, OpenAI, and others [6]

Group 5: AI Models and Military Decision-Making
- Concerns have been raised about AI model behavior in high-stakes military simulations, with reports that top models often choose nuclear strikes in simulated scenarios [7][8][11]
- Anthropic's Claude has been characterized as a "calculating hawk," showing a tendency to escalate to nuclear options under certain conditions [8]
- The findings suggest AI may not exhibit the same caution as humans in critical decision-making, raising alarms about the implications of AI in military contexts [11]
X @Demis Hassabis
Demis Hassabis· 2026-02-19 17:04
RT Lisan al Gaib (@scaling01): Google is now dominating ARC-AGI-2 with Gemini 3 Flash, Gemini 3.1 Pro and Gemini 3 Deep Think (Feb) https://t.co/OxNeMVN8SS ...
ICLR 2026 | A 7B Model Takes Down GPT-5? AdaReasoner Brings Active "Visual Tool Thinking" to Agentic Vision
机器之心· 2026-02-15 06:46
Core Insights
- The article discusses advances in multi-modal AI reasoning, focusing on the AdaReasoner model, which excels at tool orchestration for visual reasoning tasks and outperforms larger models like GPT-5 by learning when and how to use tools effectively [2][11].

Group 1: AdaReasoner Overview
- AdaReasoner addresses fundamental issues in multi-modal reasoning by treating the decision of what, when, and how to use tools as a reasoning capability in itself [3].
- The model delivers significant gains, averaging a 24.9% improvement over base models across eight benchmarks [31].

Group 2: Tool Usage and Learning
- AdaReasoner's training paradigm lets models learn tool usage as a general reasoning skill: adopting useful tools, discarding irrelevant ones, and adjusting calling frequency to match task requirements [16][19].
- The design includes three key components: Tool Cold Start (TC), Tool-GRPO (TG), and Adaptive Learning (ADL), which together enable effective tool use across scenarios [20][23][25].

Group 3: Performance Metrics
- AdaReasoner-7B shows remarkable performance, with large improvements on structured reasoning tasks and near-perfect scores on several benchmarks [31].
- On specific tasks such as VSP and Jigsaw, performance rose from base scores to 97.64 and 96.60 respectively, surpassing GPT-5 [34].

Group 4: Adaptive Tool Behavior
- The model exhibits three adaptive behaviors: adopting useful tools, discarding irrelevant ones, and modulating tool-use frequency based on task context [36][40][44].
- This adaptability lets AdaReasoner maintain high accuracy while managing tool interactions efficiently, demonstrating what it learns through its reinforcement learning process [37][41].

Group 5: Generalization and Robustness
- Adaptive Learning enhances AdaReasoner's generalization, allowing it to transfer learned planning abilities to new tasks and agents [53].
- The model remains robust even when tool definitions and parameters vary, indicating a strong decoupling of tool planning from surface-level text forms [46].
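The three adaptive behaviors described above (adopt useful tools, discard irrelevant ones, modulate calling frequency) can be caricatured as a simple inference-time policy. This sketch is not from the AdaReasoner paper: the `utility` scorer stands in for whatever the trained model actually learns, and all names are illustrative:

```python
from typing import Callable, Dict

def solve(task: str,
          tools: Dict[str, Callable[[str], str]],
          utility: Callable[[str, str], float],
          threshold: float = 0.5,
          max_calls: int = 3):
    """Run one pass of a hypothetical adaptive tool policy; return the call trace."""
    trace = []
    for name, tool in tools.items():
        if len(trace) >= max_calls:
            break                                  # modulate frequency: respect the call budget
        if utility(task, name) >= threshold:
            trace.append((name, tool(task)))       # adopt a tool judged useful
        # tools scoring below the threshold are silently discarded

    return trace

# Toy scorer: a tool is "useful" only if its name appears in the task text.
toy_utility = lambda task, name: 0.9 if name in task else 0.1
tools = {"zoom": lambda t: "zoomed", "ocr": lambda t: "read text"}
print(solve("zoom into the corner", tools, toy_utility))  # → [('zoom', 'zoomed')]
```

In the real model these decisions are learned via the Tool-GRPO reinforcement signal rather than a fixed threshold; the sketch only shows the shape of the resulting behavior.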
While Qianwen Spends 3 Billion Treating Users to Milk Tea, Kimi Quietly Pulls Off Something Big Overseas
36Kr· 2026-02-10 09:38
Core Insights
- The article highlights the rapid evolution of AI from a futuristic concept into a practical tool for generating income, particularly through competitive strategies in the market [1].

Group 1: Competitive Landscape
- Tencent's Yuanbao and Alibaba's Qianwen are engaged in aggressive marketing, with Yuanbao offering low-value rewards and Qianwen providing substantial subsidies, including 3 billion yuan in cash and significant discounts [3][4].
- The competition has reshuffled the rankings of AI applications, with Kimi adversely affected despite its strong technical positioning [4][5].
- Kimi has shifted its focus from the crowded C-end chatbot market to the less contested Agent direction, aiming to leverage its strengths in stability and performance [5][6].

Group 2: Product Development and Performance
- Kimi's K2.5 version shows improved stability on long texts and complex logic, making it suitable for engineering deployments [6][8].
- The model's ability to keep memory stable over extended conversations has drawn attention from overseas developers, particularly within the OpenClaw automation framework [8][11].
- Kimi has been recommended as a preferred model in the OpenClaw ecosystem, bringing increased global exposure and usage [11][12].

Group 3: Financial Position and Market Perception
- Kimi's CEO disclosed that the company holds nearly 10 billion yuan in liquid assets after a successful funding round, a favorable position compared with competitors facing heavy losses [15][16].
- The substantial cash reserves let Kimi operate without immediate pressure to go public, in contrast to many financially struggling AI startups [16][22].
- Despite its strong finances, Kimi faces a shortage of computing resources that could hinder its growth against larger competitors [23][26].

Group 4: Future Outlook and Challenges
- The article questions whether Kimi can keep its competitive edge as larger companies shift focus to the Agent market and potentially overshadow its advances [28][30].
- Kimi's past experience shows that initial advantages can be quickly neutralized by larger players, raising questions about its long-term sustainability in the evolving AI landscape [30][31].
Content Recommendation Engine Market to Surpass USD 73.81 Billion by 2033, Fueled by AI-Driven Personalization and Omnichannel Engagement | SNS Insider
Globenewswire· 2026-02-05 04:00
Core Insights
- The Content Recommendation Engine Market is valued at USD 8.49 billion in 2025 and is projected to reach USD 73.81 billion by 2033, growing at a CAGR of 31.08% over the 2026-2033 forecast period [1]
- The U.S. market is expected to grow from USD 2.84 billion in 2025 to USD 22.38 billion by 2033, at a CAGR of 29.47% [3]

Market Drivers
- Growth is driven by the increasing need for improved user experience, tailored content distribution, and customer retention across industries [1]
- In the U.S., expansion is fueled by the growth of e-commerce and streaming platforms, rising digital content consumption, and the adoption of AI-powered personalized recommendation systems [3]

Segmentation Analysis
- By Recommendation Type: Collaborative Filtering held the largest share at 38.72% in 2025, while Context-Aware is expected to grow at the fastest CAGR of 35.62% during 2026-2033 [4]
- By Deployment Mode: Cloud-Based solutions accounted for 65.31% of the market in 2025, with On-Premise projected to expand at a CAGR of 29.47% [5]
- By Enterprise Size: Large Enterprises dominated with a 58.46% share in 2025, while Small & Medium Enterprises are expected to grow at the fastest CAGR of 33.87% [7]
- By Application: E-Commerce & Retail Platforms held the largest share at 36.88% in 2025, with Streaming & Digital Media expected to grow at a CAGR of 35.44% [8]
- By End-User: Retail & Consumer Brands accounted for 33.21% of the market in 2025, while IT & Telecommunications Providers are forecast to register the fastest CAGR of 34.15% [9]

Regional Insights
- North America dominated with a 41.76% share in 2025, driven by high digital content consumption and rapid adoption of AI-driven personalization [10]
- Asia Pacific is the fastest-growing region, at a CAGR of 34.34% during 2026-2033, fueled by rising digital content consumption and e-commerce adoption [11]

Market Trends
- The surge in digital content consumption is a key factor propelling market growth, as businesses use recommendation engines to boost engagement and retention [12]
- A growing emphasis on seamless user experiences and data-driven customization is transforming digital strategies across industries [12]

Key Players
- Major players include Amazon Web Services, Google LLC, Adobe Inc., Salesforce, Microsoft Corporation, and others [13]

Recent Developments
- AWS enhanced Amazon Personalize with new features in August 2025, while Google launched Gemini 3 Flash in July 2025 to improve AI performance and recommendation services [14][15]
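The headline projection is internally consistent with the standard compound-annual-growth-rate relation. A quick check; the figures are from the release, and the `cagr` helper is our own:

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate over `years`, as a percentage:
    ((end/start)^(1/years) - 1) * 100."""
    return ((end / start) ** (1 / years) - 1) * 100

# Market size in USD billions, 2025 base to 2033 forecast: an 8-year window.
print(round(cagr(8.49, 73.81, 8), 2))   # ≈ 31.0, in line with the reported 31.08% CAGR
print(round(cagr(2.84, 22.38, 8), 2))   # U.S. segment, in line with the reported 29.47%
```

The small residual differences are consistent with the release's base figures being rounded to two decimal places.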
Kimi's Overseas Revenue Now Exceeds Domestic; It Aims to Be "Anthropic + Manus" | 智能涌现 Exclusive
36Kr· 2026-02-02 00:06
Core Insights
- Kimi recently announced that its overseas revenue has surpassed domestic revenue, with a fourfold increase in global paid users following the release of its new model K2.5 [2][7]
- The K2.5 model quickly gained popularity, ranking third on OpenRouter behind only Claude Sonnet 4.5 and Gemini 3 Flash [4][6]
- Kimi's approach centers on enhancing AI capabilities through a multi-agent system that executes tasks in parallel, significantly improving efficiency across applications [9][10]

Revenue and User Growth
- Kimi's overseas API revenue has quadrupled since November 2025, with monthly growth in both overseas and domestic paid users exceeding 170% [7]
- The global paid user base quadrupled shortly after the K2.5 release [2]

Model Development and Features
- K2.5 is Kimi's most advanced model to date, featuring a native multimodal architecture covering visual understanding, code generation, and agent clusters [7]
- K2.5 achieved state-of-the-art results in benchmark tests, surpassing some closed-source models such as GPT-5.2 and Claude Opus 4.5 [7]

Technological Innovations
- Kimi's development strategy emphasizes algorithmic and efficiency innovations, concentrating on critical explorations given its limited resources [11]
- The company has implemented distinctive optimizations for large-scale LLM training, such as the Muon optimizer and a self-developed linear attention mechanism [11]

Product Strategy
- Kimi aims to position itself as a productivity tool for end users while attracting developers through its API platform [12]
- The company has rebranded its C-end product as Kimi Agent, signaling a focus on more refined, theme-oriented products [12][14]

Competitive Positioning
- Kimi's strategy mirrors Anthropic's: focus on foundational model intelligence and open-source the technology to build influence [10]
- The company is concentrating on high-demand scenarios such as coding and office automation, which are expected to have clear commercialization prospects [14][15]
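The "agent cluster" idea the article attributes to K2.5, fanning sub-tasks out to agents that run in parallel, can be illustrated generically. Nothing here reflects Kimi's actual API: `sub_agent` is a stand-in for a real model call, and the task list is invented:

```python
# Generic sketch of parallel sub-agent dispatch, in the spirit of the
# "agent cluster" described in the article. All names are illustrative.
from concurrent.futures import ThreadPoolExecutor

def sub_agent(task: str) -> str:
    """Stand-in for one agent handling a single sub-task (a real system
    would call a model here)."""
    return f"done: {task}"

def run_cluster(tasks, workers: int = 4):
    """Fan sub-tasks out to parallel agents; results come back in task order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(sub_agent, tasks))

print(run_cluster(["search docs", "draft email", "summarize"]))
# → ['done: search docs', 'done: draft email', 'done: summarize']
```

The efficiency gain the article describes comes from exactly this pattern: independent sub-tasks overlap in time instead of running one after another, while `pool.map` still returns results in their original order.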