Mathematical Theorem Proving - filings, earnings calls, financial reports, news

Mathematical Theorem Proving

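121K hard math problems give models a big performance boost, covering FIMO/Putnam-level top-competition difficulty, from Tencent and Shanghai Jiao Tong University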

QbitAI · 2025-06-06 00:58

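Contributed by the DeepTheorem team to QbitAI | WeChat official account QbitAI

121K IMO-level math "special training problems" teach AI to derive mathematical proofs the way humans do! After this "special training", theorem-proving performance rises sharply: a 7B model matches or surpasses existing open-source models and commercial models such as Claude3.7.

The problems come from DeepTheorem, the first natural-language-based framework and dataset for mathematical theorem proving, jointly released by Tencent AI Lab and a Shanghai Jiao Tong University team. The team says theorem proving is an important part of the mathematical frontier, but the mathematical reasoning of current large language models (LLMs), especially when trained with reinforcement learning (RL), usually requires answers that can be verified automatically. As a result, the models cannot prove theorems in natural language the way mathematicians do.

Figure (b) shows the performance of the RL-trained DeepTheorem-7B model, which matches or surpasses existing open-source and commercial models (Gemini2.0-flash, Qwen2.5-72B-Instruct, Claude3.7, etc.) and trails only the strong reasoning models o1, o3, and Gemini2.5-pro.

DeepTheorem-121K

1. Scale and difficulty: built for "extreme challenges"

The DeepTheorem training set is notable for its large scale and high difficulty. It contains 121K ...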

TENCENT(HK:00700)

Mathematical Theorem Proving

Reinforcement Learning

Artificial Intelligence

DeepTheorem

DeepSeek's new math model smashes records! A 7B small model independently discovers new skills that the 671B model lacks

QbitAI · 2025-05-01 03:53

Core Insights
- DeepSeek has launched a new model, DeepSeek-Prover-V2, which significantly improves performance in mathematical theorem proving, setting a record of 49 problems solved on the Putnam test [2][36].
- The model demonstrates unique capabilities, particularly in solving problems that previous models, including a larger 671B model, could not address [9][10].

Model Development
- The DeepSeek-Prover series has evolved through several iterations: Prover-V1, Prover-V1.5, and now Prover-V2, with each version introducing enhancements in methodology and model architecture [11][12][14].
- Prover-V2 integrates a unified approach for formal and informal mathematical proofs, utilizing a more advanced base model, DeepSeek-V3, which enhances context handling and natural language reasoning [15][18].

Training Methodology
- The training process for Prover-V2 involves a two-phase approach, focusing on both non-CoT and CoT (Chain of Thought) generation modes to improve reasoning capabilities [28][29].
- The model employs a reinforcement learning strategy with a binary reward based on the correctness of generated proofs, which enhances its ability to connect informal reasoning with formal proof construction [32][33].

Performance Metrics
- Prover-V2 achieved a pass rate of 88.9% on the miniF2F test and solved 49 problems on the Putnam test, showcasing its advanced capabilities compared to previous models [36][40].
- The model's performance is further validated on a benchmark dataset, ProverBench, which includes 325 formalized problems from various mathematical domains [38][39].

Community Response
- The release of Prover-V2 has garnered significant attention within the research community, with rapid engagement on platforms like GitHub and social media, indicating strong interest in its contributions to formal mathematics [51][57].
- Notable figures in the field have praised the advancements made by DeepSeek, highlighting the competitive landscape of formal mathematics and the model's impact on the state of the art [59].
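The binary reward and pass-rate ideas mentioned in the summary can be illustrated with a minimal Python sketch. This is not DeepSeek's implementation: `toy_verifier` is a hypothetical stand-in for a real formal proof checker, and the function names and the "solved if any sampled proof is accepted" rule are assumptions made for illustration.

```python
from typing import Callable, Dict, List


def binary_reward(proof: str, verifier: Callable[[str], bool]) -> float:
    """Return 1.0 if the verifier accepts the proof, otherwise 0.0."""
    return 1.0 if verifier(proof) else 0.0


def toy_verifier(proof: str) -> bool:
    # Hypothetical stand-in for a real proof checker; it only looks at the
    # final token and verifies nothing mathematical.
    return proof.strip().endswith("qed")


def pass_rate(problems: Dict[str, List[str]],
              verifier: Callable[[str], bool]) -> float:
    """Fraction of problems with at least one accepted sampled proof."""
    if not problems:
        return 0.0
    solved = sum(
        any(binary_reward(p, verifier) == 1.0 for p in samples)
        for samples in problems.values()
    )
    return solved / len(problems)


if __name__ == "__main__":
    attempts = {
        "problem_1": ["incomplete attempt", "full argument qed"],
        "problem_2": ["incomplete attempt"],
    }
    print(pass_rate(attempts, toy_verifier))
```

A reward of this shape carries no partial credit: a proof either checks or it does not, which is why it pairs naturally with a mechanical verifier.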

Mathematical Theorem Proving

Reinforcement Learning

Artificial Intelligence

DeepSeek-Prover-V2

DeepSeek-Prover-V1

DeepSeek-Prover-V1.5

Has the ceiling for AI math arrived? DeepSeek quietly open-sources a new model, and netizens exclaim: R2 is just around the corner!

Hua Er Jie Jian Wen· 2025-04-30 12:52

Just as everyone was waiting for DeepSeek to officially announce its R2 large model, the company unexpectedly dropped another technical bombshell on the eve of the May Day holiday.

On April 30, DeepSeek quietly open-sourced its latest model on the Hugging Face platform: DeepSeek-Prover-V2-671B, a large language model focused on mathematical theorem proving and optimized specifically for formal mathematical proof tasks.

DeepSeek-Prover-V2-671B uses the DeepSeek-V3 architecture, with 671 billion parameters, a MoE (Mixture of Experts) design, 61 Transformer layers, and a 7168-dimensional hidden layer.

[Screenshot: Hugging Face model page for deepseek-ai/DeepSeek-Prover-V2-671B] ...
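For readers unfamiliar with formal proof tasks, the following is a minimal illustration of a machine-checkable theorem and proof. It assumes Lean 4 as the proof assistant, which the summary above does not name, and the theorem is a textbook example chosen for brevity, not one taken from the model's benchmarks.

```lean
-- A formal statement: addition on natural numbers is commutative.
-- The proof assistant accepts the theorem only if the proof term checks.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b
```

In this setting a prover model generates the part after `:= by`, and the checker either accepts or rejects it.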

Seek .(US:SKLTY)

Artificial Intelligence

Mathematical Theorem Proving

DeepSeek-Prover-V2-671B

DeepSeek R2
