Grok 4.1 Fast
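Challenging GPT-5.1 at Low Cost, Musk Charges into AI Agents
36Kr· 2025-11-20 08:56
The model climbed four places on the Artificial Analysis Intelligence Index (AII) to reach sixth, just behind fifth-place Grok 4. On the τ²-Bench Telecom leaderboard, an agentic tool-use benchmark, it ranks first with a score of 93.3%, outperforming models such as GPT-5.1 (high) and Gemini 3 Pro at lower cost and improving on Grok 4 Fast by 27 points. xAI also says Grok 4.1 Fast is more accurate on factuality, with a hallucination rate half that of Grok 4 Fast.
▲ AII rankings (image source: Artificial Analysis)
Zhidongxi reported on November 20 that Musk's xAI today launched two major updates to the xAI API: Grok 4.1 Fast, a fast, low-cost, agent-centric new model, and the xAI Agent Tools API, a set of agent tools. Grok 4.1 Fast is xAI's best-performing tool-calling model to date, with a context window of 2 million tokens. It reasons accurately and quickly and completes agentic tasks, and it is especially suited to complex real-world scenarios such as customer support and finance.
▲ An application built on Grok 4.1 Fast that lets users change their bookings (image source: xAI)
The Agent Tools API gives agents access to real-time X data, web search, remote code execution, and more ...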
X @Elon Musk
Elon Musk· 2025-11-20 07:51
AI Model Performance Improvement
- Grok 4.1 Fast demonstrates an approximate 3X increase in long-context accuracy, addressing the challenge of AI models losing focus in extended conversations [1]
- Accuracy improved from 21% to 57%, representing an approximate 2.8x increase [2]
- Reliability increased from 22% to 67%, showing an approximate 3x improvement [2]

Key Features of Grok 4.1 Fast
- Grok 4.1 Fast is specifically trained with "Long-Horizon" learning to maintain focus across 2-million-token conversations [1]
- Grok 4.1 Fast aims to provide more reliable performance in tasks such as building support agents, debugging codebases, or writing novels [1]
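The improvement factors quoted in this summary are plain ratios of the before and after percentages. A minimal Python sketch of that arithmetic, using only the rounded figures from the post (the function name `improvement_factor` is illustrative, not from the source):

```python
def improvement_factor(before_pct: float, after_pct: float) -> float:
    """Return how many times larger after_pct is than before_pct."""
    if before_pct <= 0:
        raise ValueError("before_pct must be positive")
    return after_pct / before_pct


accuracy = improvement_factor(21, 57)      # accuracy: 21% -> 57%
reliability = improvement_factor(22, 67)   # reliability: 22% -> 67%
print(f"accuracy: {accuracy:.2f}x, reliability: {reliability:.2f}x")
```

With these rounded inputs the accuracy ratio comes out closer to 2.7x than the quoted 2.8x. The gap may simply reflect rounding in the published percentages.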
X @Elon Musk
Elon Musk· 2025-11-20 07:50
AI Model Performance
- Grok 4.1 Fast is dominating agentic workflows with a 93% accuracy score on the τ²-Bench for Telecom (Agentic Tool Use) [1]
- Grok 4.1 Fast is speed-running the entire industry right now [1]

Product Development
- The Grok 4.20 upgrade, which is a major improvement, might be ready by Christmas [1]

Technology Focus
- Tool calling is the whole game for AI agents [1]
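Overtaking Gemini 3: Musk Releases a Fast-Reasoning Version of Grok 4.1, and a New $15 Billion Funding Round Comes to Light
36Kr· 2025-11-20 07:09
Grok 4.1 had barely finished topping the leaderboards when Gemini 3 overtook it, and Musk clearly cannot sit still! Although he graciously sent Gemini 3 his congratulations, Musk has meanwhile begun urgently raising money. According to the latest report from The Wall Street Journal, xAI is planning a new funding round of $15 billion (about 106.7 billion yuan), which would bring the company's valuation to $230 billion (about 1.6 trillion yuan). That $230 billion figure is more than double the $113 billion valuation Musk disclosed in March this year (the valuation after xAI merged with X). Commenters have remarked that Musk is simply too far ahead for ordinary mortals to follow. Even next to a growth monster like OpenAI, the speed at which xAI's valuation has soared is startling. After all, OpenAI has a global hit in ChatGPT, whose subscription fees alone bring OpenAI more than $200 million in revenue every month. xAI's core product, Grok, is still deeply tied to the ecosystem of X (formerly Twitter), and its user base and commercial influence are clearly not in the same league. That says a great deal about the AI boom.
An overview of xAI's funding
First, the newly reported funding round. According to The Wall Street Journal, the funding details were disclosed by Musk's wealth manager, Jared Birchall, but it is not yet clear whether the $230 billion is a pre-money or post-money valuation, and the use of the funds was not explained. Interestingly, around last week ...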
Taking Aim at Gemini 3! OpenAI Releases GPT-5.1-Codex-Max
QbitAI· 2025-11-20 07:01
Core Insights
- The article discusses the competitive landscape of AI programming models, highlighting OpenAI's release of GPT-5.1-Codex-Max, which aims to outperform Gemini 3 and other models on the market [1][34].

Model Performance
- GPT-5.1-Codex-Max sets a new state of the art (SOTA) on METR's evaluation: its 50% time horizon, meaning the length of software engineering task (measured in human working time) that it completes with a 50% success rate, is 2 hours and 42 minutes, about 25 minutes longer than its predecessor's [11][12].
- The new model executes software engineering tasks such as PR creation and code review more efficiently, and it is the first OpenAI model trained to operate in Windows environments [16][18].

Long-Running Tasks
- GPT-5.1-Codex-Max can work independently for over 24 hours, continuously processing millions of tokens, a significant advance for long-duration tasks that must not lose context [25][21].
- By compressing the dialogue when it approaches the context window limit, the model stays coherent over extended tasks, which suits it to analyzing lengthy documents without information loss [22][27].

Competitive Landscape
- Other AI models such as Claude are also evolving; Claude Code is described as faster in execution than OpenAI's offerings [32][31].
- The rapid advances in AI programming models point to a highly competitive environment, with multiple companies releasing new versions and features in quick succession [34][13].

Additional Releases
- OpenAI has also introduced GPT-5.1 Pro, which reportedly excels at instruction following, although details are limited [36][38].
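The idea of compressing dialogue near the context limit can be illustrated with a deliberately naive sketch. Nothing here reflects OpenAI's actual mechanism: the whitespace-based token count, the `summarize` placeholder, and the `compact` function are all assumptions made for illustration.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for real tokens.
    return len(text.split())


def summarize(messages: List[str]) -> str:
    # Hypothetical placeholder: a real system would ask a model for a summary.
    return f"[summary of {len(messages)} earlier messages]"


def compact(history: List[str], limit: int, keep_recent: int = 2,
            summarizer: Callable[[List[str]], str] = summarize) -> List[str]:
    """Fold older messages into one summary once history exceeds `limit` tokens."""
    total = sum(count_tokens(m) for m in history)
    if total <= limit or len(history) <= keep_recent:
        return history
    split = len(history) - keep_recent
    older, recent = history[:split], history[split:]
    return [summarizer(older)] + recent


history = ["read the repo layout", "run the unit tests now",
           "fix the failing parser test", "open a pull request"]
compacted = compact(history, limit=10)
print(compacted)
```

The sketch keeps the most recent messages verbatim and replaces everything older with a single summary entry, so the working history stays under the budget while the latest steps remain intact.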
X @Tesla Owners Silicon Valley
Tesla Owners Silicon Valley· 2025-11-20 04:23
Summary written by Grok
Tesla Owners Silicon Valley (@teslaownersSV):
Summary of the xAI Announcement (Grok 4.1 Fast Release)
xAI just dropped Grok 4.1 Fast, a new high-performance model optimized for tool-calling and agentic workflows.
Key highlights:
- Massive 2M token context window (perfect for long conversations, deep research, or complex https://t.co/ngsUHuWqrS ...
X @Elon Musk
Elon Musk· 2025-11-20 04:23
RT TΞTSUØ (@tetsuoai)
Grok 4.1 Fast just dropped on the API 🔥
Until Dec 3rd:
✦ Grok 4.1 Fast = FREE on OpenRouter
✦ All Agent Tools = 100% FREE
This is the agent model we've all been waiting for:
✦ Tool calling: #1 on Berkeley leaderboard
✦ Real-world: 100% on τ²-bench Telecom
✦ Context: 2M tokens, stays solid on long tasks
✦ Hallucinations: cut in half vs Grok 4 Fast
✦ Still matches Grok 4 speed
✦ Price: $0.20/M input, $0.50/M output
Who’s building? 👀 ...
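The per-million-token prices quoted in the post translate into a request cost by simple multiplication. A minimal Python sketch, assuming a flat per-token rate with no tiering, caching discounts, or tool fees (none of which the post addresses); the constant and function names are illustrative:

```python
INPUT_USD_PER_MILLION = 0.20    # input price quoted in the post
OUTPUT_USD_PER_MILLION = 0.50   # output price quoted in the post


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate a request's cost in USD at the quoted flat rates."""
    return (input_tokens * INPUT_USD_PER_MILLION
            + output_tokens * OUTPUT_USD_PER_MILLION) / 1_000_000


# Example: a prompt filling the 2M-token context plus a 10,000-token reply.
cost = estimate_cost_usd(2_000_000, 10_000)
print(f"${cost:.3f}")
```

Under these assumptions, filling the whole 2M-token window costs roughly 40 cents of input, so output tokens dominate only for very long generations.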
X @Tesla Owners Silicon Valley
Tesla Owners Silicon Valley· 2025-11-20 00:43
🚀 BIG NEWS from xAI!
Grok 4.1 Fast just dropped and it's absolutely dominating the leaderboards:
- #1 on GPQA Diamond (88.2%)
- #1 on AIME 2025 (94.5%)
- #1 on Harvard-MIT Math (96.7%)
- #1 on USAMO25 (61.9%)
- #1 on ARC-AGI-2 (44.4% – nearly double the next best!)
- #1 on Humanity’s Last Exam (24.9%)
- #1 on LiveCodeBench (79.4%)
- #1 overall on Artificial Analysis Intelligence Index
This is the smartest model in the world right now. Period.
Available today for SuperGrok & Premium+ users on https://t.co/KaH5w8Ke4N an ...
X @Tesla Owners Silicon Valley
Tesla Owners Silicon Valley· 2025-11-20 00:10
Summary of the xAI Announcement (Grok 4.1 Fast Release)
xAI just dropped Grok 4.1 Fast, a new high-performance model optimized for tool-calling and agentic workflows.
Key highlights:
- Massive 2M token context window (perfect for long conversations, deep research, or complex multi-turn tasks).
- Trained with long-horizon reinforcement learning on diverse simulated environments → state-of-the-art on real-world enterprise tasks like customer support, coding agents, and research.
- Blazing-fast inference + very c ...