DeepSeek

Major Internet Companies Rush to Open-Source New Models Before May Day; With Differing Strategies, Who Will Keep a Seat at the Table?
Nan Fang Du Shi Bao· 2025-05-01 14:12
Core Insights
- Major domestic AI model companies are rapidly open-sourcing their models ahead of the May Day holiday, with Alibaba releasing Qwen3, Xiaomi launching Xiaomi MiMo, and DeepSeek introducing DeepSeek-Prover-V2 [1][2][5]

Alibaba
- Alibaba's Qwen3 features two MoE models with 30B and 235B parameters, and six dense models ranging from 0.6B to 32B, achieving state-of-the-art performance in its category [2]
- Qwen3 is the first "hybrid reasoning model" in China, integrating fast and deep thinking capabilities and significantly reducing computational power consumption [5]
- Alibaba has consistently open-sourced various models this year, including a 14B video generation model and a 7B multimodal model, aiming to leverage open-source models for AI applications while monetizing its cloud services [6]

Xiaomi
- Xiaomi's MiMo model, with only 7B parameters, outperformed OpenAI's closed-source model o1-mini in public benchmarks for mathematical reasoning and coding competitions [6]
- This marks Xiaomi's first foray into open-sourcing its models, developed by its newly established Core team [6]

DeepSeek
- DeepSeek has released two versions of DeepSeek-Prover-V2, focusing on mathematical theorem proving and achieving significant performance improvements in benchmark tests [8]
- The new models support extensive context inputs and build on previous versions, showing a continued commitment to stronger reasoning capabilities [8]

Industry Trends
- The open-sourcing of models by these companies is seen as a strategic move to compete with closed-source models from companies like OpenAI and Anthropic, which still hold a slight performance edge [9][10]
- Industry experts predict consolidation in the AI model sector, with DeepSeek, Alibaba, and ByteDance emerging as the leading players in China, while the U.S. market remains competitive with companies like xAI and OpenAI [10][11]
- Open-source models are expected to democratize AI technology, making it more accessible and promoting innovation across various industries [9][10]
Top AI Leaderboard's Dark Side Exposed: Is Meta's Benchmark Cheating Confirmed?
虎嗅APP· 2025-05-01 13:51
This article is from the WeChat public account 新智元 (author: 新智元; editor: ZJH); original title: "A Shocking Scandal in the AI World: Is Meta's Cheating Confirmed? A Top Leaderboard's Dark Side Exposed, Denounced by Stanford and MIT"; header image: AI-generated.

More and more people are finding that LMArena, the large-model leaderboard, may have been gamed by the big AI companies. Recently, researchers from Cohere, Princeton, Stanford, Waterloo, MIT, Ai2, and other institutions jointly published a new paper laying out detailed evidence and accusing AI companies of cheating on LMArena to climb past their competitors.

Paper: https://arxiv.org/abs/2504.20879

At the same time, AI heavyweight and OpenAI founding member Andrej Karpathy weighed in with a firsthand account. A while ago, the Gemini model briefly ranked first on LMArena, far ahead of second place, but after switching to it, Karpathy felt it was worse than the model he had been using. Conversely, around the same time, his personal experience was that Claude 3.5 was the best, yet it ranked low on LMArena.

[Truncated LMArena leaderboard table: Rank (UB), Model, Arena score, 95% CI, Votes, Organization, License]
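For context on how such rankings arise: LMArena aggregates pairwise human votes into an Elo-style rating, which is what selective submission strategies can inflate. A minimal, illustrative Elo update from one vote (standard constants, not LMArena's exact method):

```python
def elo_update(r_a: float, r_b: float, a_wins: bool, k: float = 32.0):
    """One Elo-style rating update from a single pairwise vote (model A vs. B)."""
    expected_a = 1 / (1 + 10 ** ((r_b - r_a) / 400))  # predicted win prob. for A
    score_a = 1.0 if a_wins else 0.0
    delta = k * (score_a - expected_a)
    return r_a + delta, r_b - delta  # zero-sum: B loses what A gains

ra, rb = elo_update(1000.0, 1000.0, a_wins=True)
print(ra, rb)  # 1016.0 984.0
```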
Tech Evening Report, AI Express: Today's Tech Highlights | May 1, 2025
Xin Lang Cai Jing· 2025-05-01 13:24
Group 1: AI and Technology Developments
- Nvidia CEO Jensen Huang urged the Trump administration to revise AI chip export regulations, highlighting that China's AI technology is rapidly catching up and that current restrictions harm U.S. competitiveness [1]
- OpenAI's GPT-4o faced criticism for being overly agreeable, prompting a rollback to address concerns about AI's emotional responses and the risk of misinformation [2]
- Microsoft launched the Phi-4 reasoning model series, which includes three versions designed for complex reasoning tasks, outperforming some larger models in various tests [3]

Group 2: Legal and Regulatory Challenges
- A U.S. federal judge ruled that Apple violated a 2021 court order by not allowing external payment options in its App Store, indicating potential adjustments in Apple's payment policies to mitigate legal risks [1]
- Google CEO Sundar Pichai warned that a proposed antitrust measure requiring the sharing of search data could have devastating effects on Google's search business, potentially stifling innovation and compromising user privacy [4]

Group 3: Market Dynamics and Employment Trends
- Shopify's CEO announced a mandate for all employees to utilize AI, marking a significant shift toward AI-driven operations and potentially leading to job cuts, as the U.S. white-collar job market faces its lowest recruitment levels in 12 years [4]
- Ele.me entered the competitive food-delivery landscape with a substantial subsidy plan, aiming to regain market share amid aggressive competition from JD and Meituan [5]

Group 4: Advancements in AI Models
- DeepSeek released the DeepSeek-Prover-V2 mathematical reasoning model, showcasing significant improvements in reasoning capabilities and marking a shift toward structured logical reasoning in AI [6]
Teslin's 2024 Revenue Tops 1.8 Billion Yuan; Upgrades Across Three Business Segments Release New Growth Momentum
21 Shi Ji Jing Ji Bao Dao· 2025-05-01 05:08
Core Viewpoint
- Teslin, established in 2015, is a key player in China's AIoT industry, focusing on technology-driven industrial upgrades and spatial intelligence for sustainable development [1]

Financial Performance
- Teslin's revenue for 2024 is projected to be 1.843 billion yuan, an 83.2% increase over 2023 [1][2]
- Revenue for 2022 and 2023 was 738 million yuan and 1.006 billion yuan, respectively, giving a compound annual growth rate (CAGR) of 58.0% from 2022 to 2024 [1][2]
- The company's expense ratio (sales, management, and R&D) fell from 76.9% in 2023 to 45.0% in 2024, indicating effective cost control [3]

Market Position
- Teslin has become one of the fastest-growing companies in the AI industry, outperforming peers such as SenseTime and Horizon Robotics, which reported revenue growth rates of 10.8% to 53.6% in 2024 [3]
- The company has built a comprehensive AIoT technology product system over nine years, positioning itself as a leading enterprise in the rapidly growing AIoT market [2]

Market Expansion
- As of December 31, 2024, Teslin's products have been deployed by over 800 clients across 160 cities globally, with a total order amount of 2.3 billion yuan [4]
- The number of clients increased from 224 in 2022 to 342 in 2024, reflecting an optimized customer structure [4]

Strategic Focus
- Teslin is focusing on three strategic directions: AIoT models, AIoT infrastructure, and AIoT agents, which are expected to drive future business growth [6]
- The company is responding to rising market demand by restructuring its internal teams to enhance efficiency and innovation [3]

Industry Context
- The global AIoT market is growing rapidly, with a projected CAGR of over 31.7% for the next five years [2]
- China's AI market is also expanding, with spending reaching 14.8 billion USD in 2023, making it the second-largest AI market globally [7]
- Teslin's technology strategy aligns with China's push for self-sufficiency in AI, reducing reliance on external technologies and enhancing the resilience of the industrial chain [7]
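The CAGR figure quoted above follows from the standard formula (end/start)^(1/n) − 1; a quick check in Python, using the article's revenue figures in millions of yuan:

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate: (end/start)**(1/years) - 1."""
    return (end / start) ** (1 / years) - 1

# Teslin revenue: 738M yuan (2022) -> 1,843M yuan (2024), i.e. 2 years
growth = cagr(738, 1843, 2)
print(f"{growth:.1%}")  # 58.0%
```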
DeepSeek's New Math Model Smashes Records! The 7B Model Independently Discovers New Skills the 671B Model Lacks
量子位· 2025-05-01 03:53
DeepSeek drops a bombshell! Its new model focuses on mathematical theorem proving and dramatically raises the bar on several difficult benchmarks. On the Putnam benchmark, the new model DeepSeek-Prover-V2 pushed the record straight to 49 problems solved.

The previous first-place entry solved only 10 of the 657 problems: Kimina-Prover, a collaboration between Kimi and Numina, the AIME 2024 champion team. DeepSeek-R1, which is not optimized for theorem proving, solved only 1. That makes the still-unreleased R2 all the more anticipated.

PutnamBench Lean leaderboard (out of 657):

| # | Model | num-solved | compute |
| --- | --- | --- | --- |
| 1 | Kimina-Prover-7B-Distill | 10 | pass@192 |
| 2 | Self-play Theorem Prover | 8 | pass@3200 |
| 3 | Goedel-Prover-SFT | 7 | pass@512 |
| 4 | ABEL | 7 | pass@596 |
| 5 | InternLM2.5-StepPr... | … | … |
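In the compute column, pass@N means N candidate proofs are sampled per problem, and a problem counts as solved if any one of them verifies. A minimal sketch of that counting rule, on hypothetical per-problem attempt results:

```python
def solved_at_k(attempt_results: list[list[bool]], k: int) -> int:
    """Count problems solved within the first k sampled attempts.

    attempt_results[i][j] is True if attempt j on problem i verified.
    """
    return sum(any(attempts[:k]) for attempts in attempt_results)

# Hypothetical results for 4 problems, up to 3 attempts each.
results = [
    [False, True, False],   # solved on attempt 2
    [False, False, False],  # never solved
    [True],                 # solved immediately
    [False, False, True],   # solved on attempt 3
]
print(solved_at_k(results, 1), solved_at_k(results, 3))  # 1 3
```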
DeepSeek Open-Sources Prover-V2, a Strong Reasoning Model; Netizens: Olympiad Math Has Never Been This Easy
机器之心· 2025-05-01 02:11
Core Insights
- DeepSeek has released DeepSeek-Prover-V2, an open-source large language model specifically designed for formal theorem proving, achieving industry-leading performance on theorem-proving tasks [1][3][4]

Model Overview
- Two versions of DeepSeek-Prover-V2 have been released, with 7 billion and 671 billion parameters. The larger model is based on DeepSeek-V3-Base, while the smaller one is built on DeepSeek-Prover-V1.5-Base and supports a maximum context length of 32,000 tokens [3][4]
- DeepSeek-Prover-V2 is tailored for Lean 4, the programming language for formalized mathematics, and focuses on formal theorem proving [3][4]

Technical Implementation
- The model uses a recursive theorem-proving process to generate cold-start training data: DeepSeek-V3 decomposes complex problems into manageable sub-goals and formalizes the reasoning steps [9][11]
- Training involves two modes: a non-CoT (non-Chain-of-Thought) mode for rapid formal proof generation and a CoT mode for detailed reasoning steps, enhancing transparency and logical progression [17][19]

Performance Metrics
- The DeepSeek-Prover-V2-671B model achieved an 88.9% pass rate on the MiniF2F test and successfully solved 49 of 658 problems in the PutnamBench dataset [15][23]
- The model was evaluated against various benchmarks, demonstrating accuracy and efficiency ahead of other advanced models in the industry [20][23]

Dataset Release
- DeepSeek has also introduced ProverBench, a benchmark dataset containing 325 problems, including 15 from recent AIME math competitions, aimed at comprehensive evaluation of models on high-school and undergraduate mathematics [25][26]
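For readers unfamiliar with the target language: a Lean 4 proof states a theorem and then discharges it in a `by` tactic block. A toy example of the kind of artifact such a prover must emit (not model output; it closes the goal with the core-library lemma `Nat.add_comm`):

```lean
-- Commutativity of addition on natural numbers, proved via a library lemma.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b
```

A verifier like Lean's kernel accepts or rejects such a proof mechanically, which is what makes the "retain only verified proofs" training loops described in these articles possible.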
Just In! DeepSeek-Prover-V2-671B Released; Netizens: DS Is the Holiday Terminator
程序员的那些事· 2025-05-01 02:04
Core Viewpoint
- DeepSeek has launched DeepSeek-Prover-V2-671B, marking a significant advance in AI mathematical reasoning capabilities, particularly in automated theorem proving [2][4]

Group 1: Model Overview
- DeepSeek-Prover-V2-671B is a next-generation automated theorem-proving expert model with 671 billion parameters, optimized for proof generation and verification in the Lean 4 framework [4][6]
- The model employs a mixture-of-experts (MoE) architecture, activating approximately 37 billion parameters per inference, enhancing computational efficiency while maintaining strong reasoning capabilities [4][6]

Group 2: Key Breakthroughs
- The release signifies three major milestones, including the potential for innovation across various application domains [6]
- The model's specifications include a context length of approximately 128,000 tokens, allowing it to handle complex reasoning chains and lengthy proofs [6][7]
- The attention mechanism is likely multi-head latent attention (MLA), which compresses the key-value (KV) cache and significantly reduces memory requirements [6][7]

Group 3: Applications and Impact
- The model supports formal verification in areas such as cryptographic security proofs and chip design validation, enabling rigorous mathematical checks in automated processes [7]
- It helps mathematicians formalize theorems, explore new conjectures, and prove complex mathematical problems, potentially accelerating mathematical research [7]
- It can serve as an interactive educational tool, guiding students in mastering rigorous mathematical proof methods [7]
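The MoE figures above (671B total parameters, roughly 37B active per inference) follow from top-k routing: a gate scores every expert for each token, but only the few highest-scoring experts actually run. A simplified, illustrative sketch with linear "experts", not DeepSeek's implementation:

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Route a token through only the top_k highest-scoring experts.

    Because just top_k of len(experts) expert networks execute per token,
    the active parameter count is a small fraction of the total.
    """
    scores = x @ gate_w                     # one gating score per expert
    top = np.argsort(scores)[-top_k:]       # indices of the best experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                # softmax over selected experts only
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
x = rng.normal(size=d)
gate_w = rng.normal(size=(d, n_experts))
# Each "expert" here is just a linear map, standing in for a full FFN.
expert_mats = [rng.normal(size=(d, d)) for _ in range(n_experts)]
experts = [lambda v, m=m: v @ m for m in expert_mats]

y = moe_forward(x, gate_w, experts, top_k=2)
print(y.shape)  # (8,)
```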
Stocks Rose in January: "That's Trump's Market!" Stocks Fell in April: "That's Biden's Market!" Trump's First 100 Days Slammed as a Failure; Walmart Backs Down, Absorbing the Full 145% Tariff!
雪球· 2025-05-01 01:32
[Stock quote panel: Super Micro Computer (SMCI), closed 2025-04-30 Eastern time at 31.86, down 4.14 (-11.50%); open 29.12, high 32.00, low 28.78; volume 98.23 million shares, turnover 2.981 billion; market cap 19.014 billion; P/E (TTM) 13.16; after-hours 31.95, +0.28%.]
Founder "Ran Off"? Jishi Auto (极石汽车) Responds: Reports Untrue; Meituan Waives Delivery-Locker Fees for Riders; 30% of Microsoft's Code Is Written by AI | Bang Morning Report
创业邦· 2025-05-01 01:03
[Apple Reorganizes Its Global Affairs and Music Divisions] According to people familiar with the matter, Apple is reshuffling the management of its global affairs and music divisions, continuing a recent series of changes at the iPhone maker. The global affairs reorganization includes changes to the management of government-affairs teams in Europe, India, China, and the rest of Asia. The sources asked not to be named because the personnel changes have not been announced. Meanwhile, Apple Music will have a new leadership structure, with two co-heads reporting to Oliver Schusser, an Apple senior vice president who previously led the division. (Cailian Press)

[OpenAI Responds to GPT-4o's Overly Sycophantic Personality After Update: Rolled Back to the Previous Version] OpenAI CEO Sam Altman said on social media that the rollback of GPT-4o's latest update began last night; the rollback is now 100% complete for the free tier, and the paid tier will be updated again once its rollback finishes. Additional fixes to the model's personality are expected later, with more information to be shared in the coming days. Earlier, Altman had written that the last few GPT-4o updates made its personality too sycophantic and annoying (despite some very good parts), and that fixes were underway as quickly as possible. (Sohu)

[Screenshot: Sam Altman (@sama) post on X, 6 hours earlier]
DeepSeek Open-Sources a New Model with Greatly Improved Mathematical Reasoning
Hu Xiu· 2025-05-01 00:48
Core Insights
- DeepSeek has officially released DeepSeek-Prover-V2 on Hugging Face, continuing its open-source momentum with two versions launched [1][4]
- The training core of DeepSeek-Prover-V2 combines "recursion + reinforcement learning," enabling the model to break complex theorems down into sub-goals and reasoning paths [3][8]

Model Specifications
- DeepSeek-Prover-V2-7B is based on the previous V1.5 model and supports a maximum context input of 32K [4]
- DeepSeek-Prover-V2-671B is trained on DeepSeek-V3-Base and delivers the strongest reasoning performance [4]

Training Process
- The training process consists of two phases: the first focuses on a rapid mode using an "expert iteration" method, in which successful answers refine the model [5]
- The second phase trains more complex logical reasoning capabilities, incorporating mathematical knowledge from DeepSeek-V3 and formal data [6]

Reinforcement Learning
- The GRPO reinforcement learning algorithm is introduced to enhance reasoning capabilities, allowing the model to learn autonomously to select optimal solutions from multiple candidates [8]
- The system generates 32 different proof schemes for each theorem, retaining only those verified as correct by the Lean verification system [9]

Model Distillation
- After developing the powerful 671B model, the team distilled its capabilities into a smaller 7B model, allowing users to obtain near-equivalent mathematical reasoning abilities on resource-limited devices [10][11]

Reasoning Modes
- The rapid mode (non-CoT) prioritizes speed, generating concise Lean code answers without showing the thought process, suitable for handling large numbers of problems [12]
- The logical mode (CoT) details each step of the reasoning process, ensuring clarity and transparency [12]

Performance Evaluation
- In the final performance assessment, DeepSeek-Prover-V2-671B achieved an 88.9% pass rate on the MiniF2F test and successfully solved 49 problems from the PutnamBench dataset [17]

New Dataset
- DeepSeek introduced a new formal mathematical dataset, ProverBench, containing 325 problems across various mathematical domains, including number theory, algebra, and calculus [18][19]

Comparison and Trends
- The comparison shows a significant trend: the performance gap between large language models on "informal mathematical reasoning" and "formal mathematical reasoning" is narrowing [21]
- The evolution of model structure and training strategies enables models to produce rigorous, verifiable mathematical proofs [22]

Future Directions
- DeepSeek-Prover-V2 indicates a shift in focus from merely generating content to generating structured logic, which may touch on the foundational structure of general artificial intelligence [33][34]
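The sample-then-verify loop described above (32 candidate proof schemes per theorem, retaining only those the Lean verifier accepts) can be sketched generically; the generator and checker below are hypothetical stand-ins, not DeepSeek's pipeline:

```python
import random

def collect_verified(generate, verify, n_samples=32):
    """Sample n_samples candidate proofs and keep only the verified ones,
    mirroring the 'generate 32 schemes, retain Lean-verified' loop."""
    candidates = [generate() for _ in range(n_samples)]
    return [c for c in candidates if verify(c)]

# Hypothetical stand-ins: a random 'proof' generator and a trivial checker.
random.seed(0)
gen = lambda: random.choice(["valid_proof", "broken_proof"])
check = lambda c: c == "valid_proof"

kept = collect_verified(gen, check, n_samples=32)
print(all(check(c) for c in kept))  # True
```

Only the surviving candidates would feed back into training, which is what makes the expert-iteration loop self-improving.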