Numina Math 7B
Search documents
Kimi新模型数学反超DeepSeek!北大校友刘征瀛等领衔
量子位· 2025-07-11 07:20
Core Insights - The new Kimi model has surpassed DeepSeek-Prover-V2 in theorem proving, achieving state-of-the-art (SOTA) performance with 72 billion parameters compared to DeepSeek's 671 billion parameters [1][4]. Group 1: Model Development - The Kimi model is a collaboration between the Numina organization and the Kimi team, which previously won the progress award in the AI-MO competition [2][36]. - The Kimi theorem proving model is based on Qwen2.5-72B and utilizes the Kimi k1.5 reinforcement learning training process [8]. - Two simplified versions of the model, Kimina-Prover-Distill-8B and 1.7B, are also developed based on Qwen3-8B and Qwen3-1.7B respectively [10]. Group 2: Technical Innovations - The model introduces two major technical innovations: a trainable agent proof framework and a targeted error correction method [9][12]. - The testing-time reinforcement learning (TTRL) search framework allows the model to autonomously discover, combine, and reuse multiple intermediate lemmas, enhancing its problem-solving capabilities [13][24]. - The TTRL framework includes three components: reinforcement learning training, sub-lemma generation, and negation filtering [19]. Group 3: Performance Metrics - In the miniF2F benchmark test, Kimina-Prover achieved a pass rate of 84.0% at pass@32, which increased to 86.4% after an additional round of error correction [31]. - The final pass rate reached 92.2% after applying the complete TTRL search framework [33]. - Comparative performance metrics show that Kimina-Prover-72B achieved a pass rate of 63.9% at pass@1 and 87.7% at pass@1024, outperforming DeepSeek-Prover-V2 in several categories [34]. Group 4: Team and Support - The Numina team is a non-profit organization focused on advancing human and AI mathematics, supported by various institutions including MistralAI and Meta [36][37]. - The project involved 16 team members, including researchers from diverse backgrounds, contributing to the development of the Kimi model [39][40].