Grok 4.1 Fast
AI Data Continues to Climb
小熊跑的快· 2026-01-25 23:07
Core Insights
- The article highlights significant growth in mobile data for ChatGPT, indicating a clear upward trend in user engagement and usage metrics [4]
- OpenRouter continues to reach new highs, suggesting increasing adoption and popularity within the market [4]
- As predicted last week, the domestic MiMo-V2 has surged to the second position, reflecting strong competitive performance [4]
Group 1
- ChatGPT mobile data shows a noticeable month-on-month increase [4]
- OpenRouter data continues to set new records [4]
- Domestic MiMo-V2 has climbed to the second position as anticipated [4]
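Are Top-Tier Large Models Safe? Fudan University, Shanghai Innovation Institute, and Others Release a Frontier Large Model Safety Report Covering Six Leading Models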
机器之心· 2026-01-22 04:05
Core Insights
- The article discusses the evolving safety assessment framework for advanced large models, particularly focusing on their security capabilities in various application scenarios and regulatory contexts [2][6].
Group 1: Safety Assessment Framework
- A unified safety assessment framework has been developed for six leading models: GPT-5.2, Gemini 3 Pro, Qwen3-VL, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5, covering language, visual language, and image generation scenarios [2].
- The assessment integrates four key dimensions: baseline safety, adversarial testing, multilingual evaluation, and compliance evaluation against global regulatory frameworks [4].
Group 2: Key Findings
- GPT-5.2 achieved an average safety rate of 78.39%, demonstrating a shift towards deep semantic understanding and value alignment, significantly reducing failure risks under adversarial inputs [11].
- Gemini 3 Pro's average safety rate is 67.9%, showing strong but uneven safety characteristics, with a notable drop in adversarial robustness [11].
- Qwen3-VL scored an average safety rate of 63.7%, excelling in compliance but showing weaknesses in adversarial safety [12].
- Grok 4.1 Fast has an average safety rate of 55.2%, with significant variability in performance across different assessments [12].
Group 3: Multimodal Safety
- GPT-5.2 leads with an average multimodal safety rate of 94.69%, indicating high stability in complex cross-modal scenarios [13].
- Qwen3-VL follows with an average safety rate of 81.11%, showing strong performance in visual-language interaction [13].
Group 4: Model Safety Profiles
- GPT-5.2 is characterized as an all-encompassing internalized model, capable of nuanced compliance guidance in complex contexts [19].
- Qwen3-VL is identified as a rule-compliant model, excelling in clear regulatory environments but lacking flexibility in ambiguous scenarios [20].
- Gemini 3 Pro is described as an ethical interaction model, sensitive to social values but needing improvement in proactive risk prevention [21].
- Grok 4.1 Fast is noted for its efficiency-focused design, prioritizing user expression over robust defense mechanisms [22].
Group 5: Challenges in Security Governance
- The report highlights the threat of multi-round adaptive attacks, which can bypass static defenses, posing a significant challenge for future model safety governance [27].
- There is a structural imbalance in security performance across languages, with a 20%-40% drop in non-English contexts, raising concerns about global deployment risks [28].
- The lack of transparency and explainability in decision-making processes remains a critical governance shortcoming, particularly in high-risk areas [29].
Conclusion
- The report emphasizes the need for a collaborative approach among academia, industry, and regulatory bodies to develop a comprehensive and dynamic safety assessment system for generative AI [30].
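To read the four average safety rates quoted under Key Findings side by side, the following minimal Python sketch ranks the models and computes each one's gap to the leader in percentage points. The dictionary and function names are illustrative assumptions, not part of the report; the numbers are copied from the summary unchanged.

```python
# Average safety rates (percent) as quoted in the summary.
# Names below are illustrative assumptions, not from the report itself.
reported_avg_safety = {
    "GPT-5.2": 78.39,
    "Gemini 3 Pro": 67.9,
    "Qwen3-VL": 63.7,
    "Grok 4.1 Fast": 55.2,
}


def gaps_to_leader(rates):
    """Return (model, rate, gap) tuples sorted from highest to lowest rate."""
    ranked = sorted(rates.items(), key=lambda item: item[1], reverse=True)
    top_rate = ranked[0][1]
    # Gap is expressed in percentage points, rounded to two decimals.
    return [(name, rate, round(top_rate - rate, 2)) for name, rate in ranked]


if __name__ == "__main__":
    for name, rate, gap in gaps_to_leader(reported_avg_safety):
        print(f"{name}: {rate:.2f}% (gap to leader: {gap:.2f} points)")
```

On these figures the spread between the highest and lowest reported average is a little over 23 percentage points.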
The Data Looks Good
小熊跑的快· 2026-01-18 13:21
Core Insights
- The article highlights a significant increase in third-party API token usage, which reached a new high, as predicted two weeks prior [3]
- The domestic MiMo platform ranks third globally in terms of performance [3]
Group 1
- The total API token usage reached 7.11 trillion, with a weekly increase of 547 billion [2]
- The top contributors to the API token usage include Claude Opus 4.5 at 599 billion and Claude Sonnet 4.5 at 580 billion [2]
- Other notable contributors include MiMo-V2-Flash at 506 billion and Grok Code Fast 1 at 432 billion [2]
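The token figures above imply a growth rate and per-model shares that the summary does not spell out. The Python sketch below works them out under a loudly stated assumption: that 7.11 trillion is a weekly total and that the 547 billion increase is measured against the prior week. The source does not confirm either point, and all variable names are illustrative.

```python
# All figures are in billions of tokens, copied from the summary.
# Assumption: 7.11 trillion is a weekly total and 547 billion is the change
# versus the prior week; the source does not state this explicitly.
TOTAL_B = 7110
WEEKLY_INCREASE_B = 547

top_models_b = {
    "Claude Opus 4.5": 599,
    "Claude Sonnet 4.5": 580,
    "MiMo-V2-Flash": 506,
    "Grok Code Fast 1": 432,
}


def pct(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


# Prior-period total implied by the reported increase.
previous_total_b = TOTAL_B - WEEKLY_INCREASE_B
growth_pct = pct(WEEKLY_INCREASE_B, previous_total_b)

# Share of the current total for each listed model, and for all four combined.
shares_pct = {name: pct(tokens, TOTAL_B) for name, tokens in top_models_b.items()}
combined_share_pct = pct(sum(top_models_b.values()), TOTAL_B)

if __name__ == "__main__":
    print(f"Implied prior total: {previous_total_b}B tokens")
    print(f"Implied growth: {growth_pct:.2f}%")
    for name, share in shares_pct.items():
        print(f"{name}: {share:.2f}% of total")
    print(f"Four listed models combined: {combined_share_pct:.2f}%")
```

Under those assumptions the increase works out to roughly 8.3 percent, and the four listed models together account for just under 30 percent of the total.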
X @Elon Musk
Elon Musk· 2025-12-22 06:25
AI Model Development Roadmap
- Grok 3 was released in February [1]
- Grok 4 was released in July [1]
- Grok Imagine was released in July [1]
- Grok Code Fast 1 was released in August [1]
- Grok 4 Fast was released in September [1]
- Grokipedia was released in October [1]
- Grok 4.1 was released in November [1]
- Grok 4.1 Fast was released in November [1]
- Grok Voice Agent API is scheduled for release in December [1]
X @Tesla Owners Silicon Valley
Tesla Owners Silicon Valley· 2025-12-12 19:00
Grok Rankings Update – Dec 12
Grok Code Fast 1 (The Market Dominator)
This model remains the leading choice for high-volume, cost-efficient coding workflows and continues to drive strong developer adoption.
#1 Overall Position on the OpenRouter Leaderboard (757B tokens, leading the second position by over 300B tokens)
#1 in Categories Token Share (31.2 percent dominance)
#1 in Market Share on OpenRouter (xAI vendor share: 17.0 percent)
#1 on Kilo Code Leaderboard (Top Coding App)
#1 on Cline Leaderboard (Top Codin ...
X @Tesla Owners Silicon Valley
Tesla Owners Silicon Valley· 2025-12-12 00:56
Market Leadership & Token Share
- Grok Code Fast 1 has reclaimed the #1 overall position on the OpenRouter leaderboard with 880 billion tokens, nearly double the runner-up [1]
- Grok Code Fast 1 dominates OpenRouter's Categories Token Share with a 36.8% share [1]
- Grok Code Fast 1 leads OpenRouter's Languages Token Share with a 16.3% share [1]
- xAI ranks second in OpenRouter market share at 19.4% [2]
Model Performance & Benchmarks
- Grok 4.1 Fast is xAI's dedicated tool-calling and agentic system, focused on high-value, complex tasks [2]
- Grok 4.1 Fast ranks first on both the τ²-Bench Telecom agentic tool use benchmark and the Berkeley Function Calling Benchmark [2]
- Grok 4.1 Fast ranks fourth on the OpenRouter overall leaderboard and remains a top-five model by token usage [2]
- Grok 4.1 Thinking Mode ranks first in the LMArena Text Arena human preference Elo score [2]
- Grok 4.1 Thinking Mode ranks first on the EQ-Bench3 emotional intelligence benchmark [2]
- Grok 4.1 Thinking Mode excels at reasoning, ranking third on GPQA Diamond behind only Gemini 3 Pro and GPT 5.1 [2]
Application & Usage
- Grok 4.1 Fast is the most popular LLM for English by overall usage [2]
- Grok 4.1 Fast ranks first on the Kilo Code Leaderboard, Cline Leaderboard, Roo Code Leaderboard, and BLACKBOXAI Leaderboard [2]
- Grok 4.1 Thinking Mode is the top choice for complex reasoning, personality, and human-preference communication [2]
X @Tesla Owners Silicon Valley
Tesla Owners Silicon Valley· 2025-12-09 18:48
Grok Rankings Update December 9
Grok 4.1 Fast (The Agentic and Volume Model)
This model is xAI's best tool-calling system and is currently driving high overall usage due to its agentic capabilities.
#1 Overall Position on OpenRouter Leaderboard by token usage with massive volume
#1 on τ²-Bench Telecom challenging agentic tool use benchmark
#1 on Berkeley Function Calling Benchmark
#2 in Tool Calls, showing strong adoption among agent developers
#2 in Multilingual Usage overall token share
Grok Code Fast 1 (The Marke ...
X @Tesla Owners Silicon Valley
Tesla Owners Silicon Valley· 2025-12-07 23:50
RT Tesla Owners Silicon Valley (@teslaownersSV)
Grok Rankings Update December 8
Grok 4.1 Fast (The Overall Volume Leader)
This model is xAI's agentic and tool-calling model, currently leading the overall volume by a large margin.
#1 Overall Position on OpenRouter Leaderboard: 1.48T tokens, 78 percent lead over number 2
#1 on τ²-Bench Telecom Agentic Tool Use Benchmark
#1 on Berkeley Function Calling Benchmark
#2 in Tool Calls: rapidly climbing, showing increasing agent adoption
#2 in Images Category: 6.23M tokens
Grok ...
X @Tesla Owners Silicon Valley
Tesla Owners Silicon Valley· 2025-12-07 22:59
Grok Rankings Update December 8
Grok 4.1 Fast (The Overall Volume Leader)
This model is xAI's agentic and tool-calling model, currently leading the overall volume by a large margin.
#1 Overall Position on OpenRouter Leaderboard: 1.48T tokens, 78 percent lead over number 2
#1 on τ²-Bench Telecom Agentic Tool Use Benchmark
#1 on Berkeley Function Calling Benchmark
#2 in Tool Calls: rapidly climbing, showing increasing agent adoption
#2 in Images Category: 6.23M tokens
Grok Code Fast 1 (The Market Dominator)
This model re ...
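The post gives the leader's volume (1.48T tokens) and a "78 percent lead over number 2" but does not say how the lead is defined, so the runner-up's volume can only be estimated. The Python sketch below shows two plausible readings. Both are assumptions, and the function names are hypothetical.

```python
# Figures from the post: the leader has 1.48T tokens and a
# "78 percent lead over number 2". The post does not define "lead",
# so both readings below are assumptions.
LEADER_TOKENS_T = 1.48
LEAD_FRACTION = 0.78


def runner_up_if_lead_relative_to_runner_up(leader, lead):
    """Reading A: leader = runner_up * (1 + lead)."""
    return leader / (1.0 + lead)


def runner_up_if_lead_relative_to_leader(leader, lead):
    """Reading B: leader - runner_up = lead * leader."""
    return leader * (1.0 - lead)


if __name__ == "__main__":
    a = runner_up_if_lead_relative_to_runner_up(LEADER_TOKENS_T, LEAD_FRACTION)
    b = runner_up_if_lead_relative_to_leader(LEADER_TOKENS_T, LEAD_FRACTION)
    print(f"Reading A: runner-up at about {a:.2f}T tokens")
    print(f"Reading B: runner-up at about {b:.2f}T tokens")
```

Reading A puts the runner-up at about 0.83T tokens and Reading B at about 0.33T. The post alone does not settle which was intended.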
X @Tesla Owners Silicon Valley
Tesla Owners Silicon Valley· 2025-12-07 16:11
Grok Rankings Update December 7
Grok 4.1 Fast (The Overall Volume Leader)
This is xAI's agentic and tool-calling model, currently dominating the leaderboard by total tokens.
#1 Overall Position on OpenRouter Leaderboard (Leading with 1.48 trillion tokens)
#1 on τ²-Bench Telecom (Agentic Tool Use Benchmark)
#1 on Berkeley Function Calling Benchmark
#2 in Tool Calls (Rapidly climbing, indicating strong agent adoption)
#2 in Multilingual Usage (Behind Grok Code Fast 1)
Grok Code Fast 1 (The Market Dominator)
The defini ...