Just In: Musk Launches Grok 4! Tops Every Leaderboard, Annual Fee Soars Past 20,000
创业邦 (Cyzone) · 2025-07-10 10:43

Core Viewpoint
- The article covers the launch of xAI's next-generation AI model, Grok 4, which xAI claims surpasses human reasoning capabilities and sets new highs on a range of benchmarks [3][4][6].

Group 1: Model Capabilities
- Grok 4 reportedly achieves a perfect score on the SAT and near-perfect scores on the GRE, which xAI presents as evidence of its reasoning ability [6].
- The model's reasoning capability is said to have improved tenfold compared with the earlier Grok 2, attributed to greater computational power and reinforcement learning training [9].
- Grok 4 achieved an overall performance score of 73, ahead of other leading models such as o3 and Gemini 2.5 Pro [22].

Group 2: Benchmark Testing
- On Humanity's Last Exam (HLE), Grok 4 initially scored 35%, rising to 45% with reasoning techniques, while other state-of-the-art (SOTA) models scored at most 41% [12][13].
- Grok 4 Heavy scored 44.4% on HLE and, given more time and tool use, could potentially reach 50.7% [16].
- Grok 4 also set new SOTA results on several other benchmarks, including GPQA and AIME25 [18].

Group 3: Technological Advancements
- Grok 4's voice mode is twice as fast as its predecessor's, and daily user engagement has grown tenfold [31].
- The model now supports five different voices, and the iOS version introduces two new characters, Eve and Sal [33].
- Grok 4 shows a marked improvement in general reasoning, scoring 15.9% on the ARC-AGI benchmark, nearly double the previous commercial SOTA [35].

Group 4: Future Developments
- xAI plans to release additional models, including coding models and multimodal agents, and is aiming for a rapid, roughly monthly release cadence [47].
- Grok 4 is available to paying users, priced at $300 per year for SuperGrok and $3,000 per year for SuperGrok Heavy [50].