AI Industry Tracking: Musk Releases Grok-4 as Large Models Continue to Break New Ground
Changjiang Securities·2025-07-17 14:45

Investment Rating
- The industry investment rating is "Positive" (maintained) [6]

Core Insights
- On July 10, xAI released Grok-4 in two versions: Grok 4 (single-agent) and Grok 4 Heavy (a more powerful multi-agent version), priced at $30/month (SuperGrok) and $300/month (SuperGrok Heavy) respectively. It is currently available through the xAI API and the X platform and supports a 256k-token context window. Grok 4's multi-agent collaboration, deep tool integration, and interdisciplinary capabilities have pushed past the limits of traditional tests, shifting the future challenge from "passing human exams" to "inventing new technologies through physical validation" [2][3]

Summary by Sections

Event Description
- On July 10, xAI launched Grok-4, which consists of Grok 4 (single-agent) and Grok 4 Heavy (multi-agent), priced at $30/month and $300/month respectively. The product is accessible via the xAI API and the X platform and supports a 256k-token context window [3]

Event Commentary
- Grok-4 achieved significant breakthroughs on multiple benchmarks, outperforming previous models. On Humanity's Last Exam (HLE), Grok 4 scored 25.4% and 44.4%, surpassing Gemini 2.5 Pro's previous records. On the academic benchmark GPQA, Grok 4 scored 87.5% and 88.9%, exceeding Gemini 2.5 Pro's 86.4%. On the Harvard-MIT Math Tournament, Grok 4 scored 96.7%, well above Gemini 2.5 Pro's 82.5% [8]
- Training scale has taken a leap, supported by a top-tier supercomputing cluster of 100,000 H100 GPUs. Training volume increased 100-fold from Grok 2 to Grok 4, and the compute devoted to reinforcement learning (RL) is more than 10 times that of any other model on the market. This has led to a qualitative change in the model's reasoning capabilities [8]
- On the product side, voice mode has improved significantly, with latency cut by 50% and active users up tenfold within eight weeks. SuperGrok Heavy is now available, allowing users to deploy multi-agent research assistants [8]
- The model's capabilities are expected to keep strengthening throughout 2025, with planned releases of coding models, multi-modal agents, and video generation models. The current focus is on optimizing visual capabilities and supporting enterprise-level physical simulation toolchains [8]
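
The benchmark and pricing comparisons quoted in the Event Commentary come down to simple arithmetic. The sketch below is one way to check them. The scores and prices are copied from the text above, while the dictionary layout and the function name `margins_in_points` are illustrative assumptions, not part of the report. Humanity's Last Exam is left out because the report does not quote Gemini 2.5 Pro's score on it.

```python
# Scores (in %) as quoted in the report. "HMMT" is shorthand used here for the
# Harvard-MIT Math Tournament. The data layout is an illustrative assumption.
GEMINI_2_5_PRO = {"GPQA": 86.4, "HMMT": 82.5}
GROK_4 = {"GPQA": [87.5, 88.9], "HMMT": [96.7]}


def margins_in_points(grok_scores, baseline):
    """Return Grok 4's lead over the baseline in percentage points."""
    return {
        bench: [round(score - baseline[bench], 1) for score in scores]
        for bench, scores in grok_scores.items()
    }


margins = margins_in_points(GROK_4, GEMINI_2_5_PRO)

# Monthly price of SuperGrok Heavy relative to SuperGrok, as quoted above.
price_ratio = 300 / 30

print(margins)
print(price_ratio)
```

On these figures the GPQA lead is small (1.1 and 2.5 percentage points), the Harvard-MIT Math Tournament lead is much wider (14.2 points), and the Heavy tier costs ten times the standard tier.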