Frontier AI
Kimi K2 Thinking is CRAZY... (HUGE UPDATE)
Matthew Berman· 2025-11-07 21:36
Model Performance & Benchmarks
- Kimi K2 Thinking outperforms GPT-5 on the "Humanity's Last Exam" benchmark, scoring 44.9% versus GPT-5's 41.7% [1]
- On BrowseComp, an agentic search benchmark, Kimi K2 Thinking scores 60.2% versus 54.9% for GPT-5 [1][2]
- Kimi K2 Thinking achieves 83.1% on LiveCodeBench v6, a competitive programming benchmark [1]
- The model can execute 200 to 300 sequential tool calls without human intervention [1][2]
- Its 60.2% BrowseComp score is also well above the human baseline of 29.2% [2]

Model Architecture & Training
- The base Kimi K2 model used 2.8 million H800 hours with 14.8 trillion tokens, costing approximately $5.6 to $6 million [3]
- Kimi K2 Thinking has a trillion parameters across 384 experts, with 32 billion parameters active during inference [5][6]
- Kimi K2 Thinking has a vocabulary size of 160,000 [5]

Market & Industry Impact
- China is emerging as a key player in open-source, open-weights frontier AI models [9][10]
- The cost of training frontier models is decreasing rapidly [3][4]

Use Cases & Capabilities
- Kimi K2 Thinking can solve PhD-level mathematics problems using 23 tool calls in its chain of thought [1]
- The model can create component-heavy websites and math explainer visualizations from single prompts [1]
- Kimi K2 Thinking can analyze the relationship between population density and healthcare facility accessibility, generating interactive maps and charts [11][12][13][14][15]
X @xAI
xAI· 2025-09-25 16:02
Announcing an expansion to xAI For Government – making industry leading Frontier AI accessible to United States Federal Government users.
1) All federal agencies and departments will get access to our Frontier AI models (Grok 4, Grok 4 Fast) for $0.42 per department for a period of 18 months starting today.
2) We are committing a team of Grok Engineers to help the government harness our AI to its fullest potential.
We’re also growing our team and are hiring mission driven engineers who want to join the cause. ...