错误修复机制 - filings, earnings calls, financial reports, news

错误修复机制

Search documents

量子位· 2025-07-11 07:20

Core Insights - The new Kimi model has surpassed DeepSeek-Prover-V2 in theorem proving, achieving state-of-the-art (SOTA) performance with 72 billion parameters compared to DeepSeek's 671 billion parameters [1][4]. Group 1: Model Development - The Kimi model is a collaboration between the Numina organization and the Kimi team, which previously won the progress award in the AI-MO competition [2][36]. - The Kimi theorem proving model is based on Qwen2.5-72B and utilizes the Kimi k1.5 reinforcement learning training process [8]. - Two simplified versions of the model, Kimina-Prover-Distill-8B and 1.7B, are also developed based on Qwen3-8B and Qwen3-1.7B respectively [10]. Group 2: Technical Innovations - The model introduces two major technical innovations: a trainable agent proof framework and a targeted error correction method [9][12]. - The testing-time reinforcement learning (TTRL) search framework allows the model to autonomously discover, combine, and reuse multiple intermediate lemmas, enhancing its problem-solving capabilities [13][24]. - The TTRL framework includes three components: reinforcement learning training, sub-lemma generation, and negation filtering [19]. Group 3: Performance Metrics - In the miniF2F benchmark test, Kimina-Prover achieved a pass rate of 84.0% at pass@32, which increased to 86.4% after an additional round of error correction [31]. - The final pass rate reached 92.2% after applying the complete TTRL search framework [33]. - Comparative performance metrics show that Kimina-Prover-72B achieved a pass rate of 63.9% at pass@1 and 87.7% at pass@1024, outperforming DeepSeek-Prover-V2 in several categories [34]. Group 4: Team and Support - The Numina team is a non-profit organization focused on advancing human and AI mathematics, supported by various institutions including MistralAI and Meta [36][37]. - The project involved 16 team members, including researchers from diverse backgrounds, contributing to the development of the Kimi model [39][40].

测试时强化学习（TTRL）搜索框架

错误修复机制

Artificial Intelligence

Artificial Intelligence

Kimina-Prover

DeepSeek-Prover-V2

Numina Math 7B