BMW China Announces DeepSeek Integration: Has BMW Compromised?
36Kr · 2025-05-02 02:21
Core Viewpoint
- BMW China is embracing local AI technology by integrating DeepSeek, marking a significant step in its digital transformation strategy and enhancing its AI capabilities in the Chinese market [1][3][6]

Group 1: BMW's AI Integration
- BMW has announced the integration of DeepSeek into its operations, which will enhance the BMW Intelligent Personal Assistant and improve human-machine interaction in new models starting from Q3 2025 [1][2]
- The collaboration with DeepSeek follows BMW's earlier partnership with Alibaba to develop AI language models, showcasing BMW's commitment to local AI ecosystem development [1][3]

Group 2: Strategic Importance of Local AI
- This move signifies BMW's recognition of the importance of local AI technologies and its willingness to adapt to the rapidly evolving Chinese automotive market [3][4]
- BMW's previous initiatives, such as the launch of a 360-degree AI strategy and the development of intelligent systems like "Car Expert" and "Travel Companion," reflect its ongoing efforts to enhance its smart vehicle offerings [3][4]

Group 3: Challenges and Opportunities
- Despite its historical strengths in manufacturing and brand image, BMW faces challenges in keeping pace with the increasing demand for smart and connected vehicles [4][5]
- The partnership with DeepSeek is seen as a strategic decision to accelerate BMW's digital transformation and leverage the advanced technologies and innovative models of Chinese tech companies [4][6]
DeepSeek Open-Sources a New Model with Major Gains in Mathematical Reasoning
Huxiu · 2025-05-01 00:48
Core Insights
- DeepSeek has officially released DeepSeek-Prover-V2 on Hugging Face, continuing its open-source momentum with two versions launched [1][4]
- The training core of DeepSeek-Prover-V2 combines "recursion + reinforcement learning," enabling the model to break down complex theorems into sub-goals and reasoning paths [3][8]

Model Specifications
- DeepSeek-Prover-V2-7B is based on the previous V1.5 model and supports a maximum context input of 32K [4]
- DeepSeek-Prover-V2-671B is trained on DeepSeek-V3-Base, showcasing the strongest reasoning performance [4]

Training Process
- The training process consists of two phases: the first phase focuses on rapid mode using an "expert iteration" method, where successful answers refine the model [5]
- In the second phase, more complex logical reasoning capabilities are trained, incorporating mathematical knowledge from DeepSeek-V3 and formal data [6]

Reinforcement Learning
- The GRPO reinforcement learning algorithm is introduced to enhance reasoning capabilities, allowing the model to autonomously learn to select optimal solutions from multiple candidates [8]
- The system generates 32 different proof schemes for each theorem, retaining only those verified as correct by the Lean verification system [9]

Model Distillation
- After developing the powerful 671B model, the team distilled its capabilities into a smaller 7B model, allowing users to achieve near-equivalent mathematical reasoning abilities on resource-limited devices [10][11]

Reasoning Modes
- The rapid mode (non-CoT) focuses on speed, generating concise Lean code answers without showing the thought process, suitable for handling large volumes of problems [12]
- The logical mode (CoT) details each step of the reasoning process, ensuring clarity and transparency [12]

Performance Evaluation
- In the final performance assessment, DeepSeek-Prover-V2-671B achieved an 88.9% pass rate on the MiniF2F test and successfully solved 49 problems from the PutnamBench dataset [17]

New Dataset
- DeepSeek introduced a new formal mathematical dataset, ProverBench, containing 325 problems across various mathematical domains, including number theory, algebra, and calculus [18][19]

Comparison and Trends
- The comparison shows a significant trend: the performance gap between large language models' "informal mathematical reasoning" and "formal mathematical reasoning" is narrowing [21]
- The evolution of model structure and training strategies enables models to produce rigorous, verifiable mathematical proofs [22]

Future Directions
- DeepSeek-Prover-V2 indicates a shift in focus from merely generating content to generating structured logic, which may touch upon the foundational structure of general artificial intelligence [33][34]
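The sample-and-verify loop described above — 32 candidate proofs per theorem, keeping only those the Lean checker accepts — can be sketched as follows. This is a minimal illustration, not DeepSeek's actual pipeline: `sample_candidates` stands in for the prover model and `lean_verifies` stands in for the Lean 4 compiler; both are hypothetical placeholders.

```python
import random

def sample_candidates(theorem: str, n: int = 32) -> list[str]:
    """Stand-in for the prover model: emit n candidate proof scripts.
    (The real system samples 32 Lean proofs per theorem.)"""
    rng = random.Random(0)  # fixed seed so the sketch is deterministic
    return [
        f"-- candidate {i} for {theorem}\n"
        + ("exact proof" if rng.random() < 0.3 else "sorry")
        for i in range(n)
    ]

def lean_verifies(proof: str) -> bool:
    """Stand-in for the Lean verifier: in the real pipeline each candidate
    is compiled by Lean and accepted only if it type-checks. Here a proof
    containing `sorry` (an admitted hole) is treated as a failure."""
    return "sorry" not in proof

def verified_proofs(theorem: str) -> list[str]:
    # Keep only candidates the verifier accepts; in expert iteration these
    # verified proofs become the positive training signal for the next round.
    return [p for p in sample_candidates(theorem) if lean_verifies(p)]

kept = verified_proofs("add_comm")
print(f"kept {len(kept)} of 32 candidates")
```

In the actual system the retained proofs feed back into supervised fine-tuning, while GRPO compares the 32 candidates against one another to reinforce the relatively better ones.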
US-Ukraine Landmark Deal Signed: Minerals Development + Reconstruction Fund; Dow Logs Third Straight Monthly Decline, US Crude Falls Over 3%; CSRC Vice Chairman Wang Jianjun Under Investigation; DeepSeek Open-Sources a New Model | NBD Morning Brief
National Business Daily · 2025-04-30 23:00
NBD editors: Chen Pengcheng, Yuan Dong

1. Overnight Markets
The three major U.S. stock indices closed mixed: the Nasdaq fell 0.09% (up 0.85% for April); the S&P 500 rose 0.15% (down 0.76% for April); the Dow rose 0.35% (down 3.17% for April), with the S&P 500 and the Dow now down for three consecutive months. Large tech stocks mostly declined: Tesla fell over 3%; Amazon and Intel fell over 1%; Google and Meta edged lower; Netflix, Apple, and Microsoft edged higher; Super Micro Computer fell over 11%. Chinese ADRs were mixed: the Nasdaq Golden Dragon China Index fell 0.95% (down 9.79% for April); Beike fell over 2%; Baidu and Zeekr fell over 1%; Full Truck Alliance, Tencent Music, JD.com and others edged lower; Kingsoft Cloud rose over 9%, GDS rose more than 5%, Miniso rose over 2%, Pinduoduo and Kanzhun rose over 1%, and Alibaba and Li Auto edged higher.

Data released by the U.S. Commerce Department on April 30 showed that U.S. GDP contracted at an annualized rate of 0.3% in the first quarter of 2025, after growing at an annualized 2.4% in the fourth quarter of 2024.

International oil prices fell sharply: the main U.S. crude contract dropped 3.64% to $58.22 per barrel, and the main Brent contract dropped 3.37% to $61.15 per barrel. For April, U.S. crude fell 18.55% and Brent fell 18.22%, the largest monthly declines in nearly three and a half years.

Spot gold fell 0.85% to $3,288.2 ...
Has AI Hit the Math Ceiling? DeepSeek Quietly Open-Sources a New Model; Netizens Exclaim: R2 Can't Be Far Off!
Wallstreetcn · 2025-04-30 12:52
Just as everyone was awaiting DeepSeek's official announcement of the R2 large model, the company unexpectedly dropped another technical bombshell on the eve of the May Day holiday.

On April 30, DeepSeek quietly open-sourced its latest model on the Hugging Face platform: DeepSeek-Prover-V2-671B, a large language model dedicated to mathematical theorem proving and specifically optimized for formal proof tasks.

DeepSeek-Prover-V2-671B uses the DeepSeek-V3 architecture, with 671 billion parameters, an MoE (mixture-of-experts) design, 61 Transformer layers, and a hidden dimension of 7168.

[Screenshot: the deepseek-ai/DeepSeek-Prover-V2-671B model page on Hugging Face]
Huawei's Guo Zhenxing: After the DeepSeek Wave, AI Will Rapidly Unleash Huge Productivity Dividends in Manufacturing | Frontline
36Kr · 2025-04-30 09:48
Group 1
- Huawei hosted the AI + Manufacturing Industry Summit 2025 in Guangzhou, focusing on accelerating industry intelligence, with over 900 attendees from various manufacturing sectors [1]
- Huawei introduced a "three-layer, five-step, eight-phase" methodology and shared 20 solutions across seven key scenarios in the manufacturing sector [1]
- The company emphasized its full-stack AI infrastructure, which can adapt flexibly to multiple manufacturing scenarios, lowering the threshold for AI adoption [1]

Group 2
- In the automotive sector, Huawei's collaboration with GAC Group has significantly reduced the vehicle development cycle from 36 months to 18 months using AI models and development toolchains [1]
- Huawei's software development cycle has improved from 9-18 months to one month per release by integrating over 13 million high-value documents and 850+ open-source code repositories into its data platform [2]
- By 2025, over 300 enterprises are expected to have plans for large model deployment, indicating a surge in demand for AI capabilities in manufacturing [2]

Group 3
- Huawei has adapted its DeepSeek solution across various scenarios, including pre-training and reinforcement learning, to help clients complete secondary training quickly [3]
- The company has optimized the performance of various models in the Ascend environment, with over 100 manufacturing partners already utilizing the DeepSeek solution [3]
- Huawei aims to provide end-to-end full-stack infrastructure to support enterprises' digital transformation by focusing on data management and intelligent connectivity [3]
From DeepSeek to Hard Tech: Guozhong Capital's New Investment Horizons | Investors: Quick Answers 2025
Sohu Finance · 2025-04-30 06:29
Preface: 2025 arrived catching everyone off guard, with dizzying changes across defense, technology, culture, and international politics. Looking back over the past few years, we have lived through unprecedented global turbulence. Pandemic, war, recession, geopolitics... these keywords have filled our field of view and profoundly rewritten the rules by which the world operates. Yet crisis often coexists with opportunity, and turbulence also incubates new hope.

In 2025, as investors, we feel the impact of the tides of the era more keenly than anyone, and sense the first warmth of spring most directly. Standing at the front end of industry and the frontier of innovation, what expectations and insights do investors hold for the 2025 that has already arrived?

We invited a number of top investors to sketch, with their keen observation and distinctive insight, a roadmap for the economy and investment ahead. This article is the eleventh installment of Rongzhong's special series "Investors: Quick Answers 2025."

In today's fast-moving technological era, breakthroughs in artificial intelligence and hard tech are profoundly reshaping the global industrial landscape. As a leader in China's venture capital industry, Guozhong Capital has always stood at the industry frontier, tracking technological change and market opportunity. From the AI innovation triggered by DeepSeek's rise to sustained deep work in hard-tech tracks such as new energy vehicles, semiconductors, and healthcare, Guozhong Capital has not only witnessed and propelled the growth of China's technology industry, but has also formed a distinctive value philosophy and investment logic in practice.

At the key juncture of 2025, Guo ...
Qwen3 Stuns Overnight: Alibaba Releases Eight Large Models in One Go, Outperforming DeepSeek R1 and Taking the Open-Source Crown
36Kr · 2025-04-29 09:53
Core Insights
- The release of Qwen3 marks a significant advancement in open-source AI models, featuring eight hybrid reasoning models that rival proprietary models from OpenAI and Google, and surpass the open-source DeepSeek R1 model [4][24]
- Qwen3-235B-A22B is the flagship model with 235 billion parameters, demonstrating superior performance in various benchmarks, particularly in software engineering and mathematics [2][4]
- The Qwen3 series introduces a unique dual reasoning mode, allowing the model to switch between deep reasoning for complex problems and quick responses for simpler queries [8][21]

Model Performance
- Qwen3-235B-A22B achieved a score of 95.6 in the ArenaHard test, outperforming OpenAI's o1 (92.1) and DeepSeek's R1 (93.2) [3]
- Qwen3-30B-A3B, with 30 billion parameters, also shows strong performance, scoring 91.0 in ArenaHard, indicating that smaller models can still achieve competitive results [6][20]
- The models have been trained on approximately 36 trillion tokens, nearly double the data used for the previous Qwen2.5 model, enhancing their capabilities across various domains [17][18]

Model Architecture and Features
- Qwen3 employs a mixture-of-experts (MoE) architecture, activating only about 10% of its parameters during inference, which significantly reduces computational costs while maintaining high performance [20][24]
- The series includes six dense models ranging from 0.6 billion to 32 billion parameters, catering to different user needs and computational resources [5][6]
- The models support 119 languages and dialects, broadening their applicability in global contexts [12][25]

User Experience and Accessibility
- Qwen3 is open-sourced under the Apache 2.0 license, making it accessible to developers and researchers [7][24]
- Users can easily switch between reasoning modes via a dedicated button on the Qwen Chat website or through commands in local deployments [10][14]
- The models have received positive feedback from users for their quick response times and deep reasoning capabilities, with notable comparisons to other models like Llama [25][28]

Future Developments
- The Qwen team plans to focus on training models capable of long-term reasoning and executing real-world tasks, indicating a commitment to advancing AI capabilities [32]
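The roughly 10% activation figure can be read straight off the model name: "235B-A22B" denotes 235 billion total parameters with 22 billion active per token. A back-of-the-envelope sketch, assuming (as is conventional for decoder FLOP estimates, though not stated in the article) roughly 2 FLOPs per active parameter per generated token:

```python
TOTAL_PARAMS = 235e9   # Qwen3-235B-A22B: total parameter count
ACTIVE_PARAMS = 22e9   # parameters activated per token (the "A22B" suffix)

# MoE routing touches only the selected experts, so per-token decode compute
# scales with active rather than total parameters.
activation_ratio = ACTIVE_PARAMS / TOTAL_PARAMS
flops_saving = 1 - activation_ratio  # vs. a dense model of the same total size

print(f"active fraction: {activation_ratio:.1%}")
print(f"per-token compute saving vs dense: {flops_saving:.0%}")
```

This is where the "about 10%" claim comes from: 22/235 ≈ 9.4% of parameters do the work of each forward pass, cutting per-token compute by roughly 91% relative to a dense 235B model.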
DeepSeek-R2 Release Imminent: Parameter Count Doubled, Huawei Ascend Chip Utilization Hits 82%!
Sohu Finance · 2025-04-29 07:17
Core Insights
- The next-generation AI model DeepSeek-R2 is reportedly set to be released, featuring advanced parameters and architecture [1][5]
- DeepSeek-R2 will utilize a hybrid expert model (MoE) with an intelligent gating network, significantly enhancing performance for high-load inference tasks [5]
- The total parameter count for DeepSeek-R2 is expected to reach 1.2 trillion, doubling the 671 billion parameters of DeepSeek-R1 and making it comparable to GPT-4 Turbo and Google's Gemini 2.0 Pro [5]

Cost Efficiency
- DeepSeek-R2's unit inference cost is projected to decrease by 97.4% compared to GPT-4, costing approximately $0.07 per million tokens, while GPT-4 costs $0.27 per million tokens [8]
- The model's cost efficiency is attributed to the use of Huawei's Ascend 910B chip cluster, which achieves a computational performance of 512 PetaFLOPS at an 82% resource utilization rate [7][8]

Hardware and Infrastructure
- DeepSeek-R2's training framework is based on Huawei's Ascend 910B chip cluster, which has been validated to deliver 91% of the performance of NVIDIA's previous A100 training cluster [7]
- The introduction of Huawei's Ascend 910C chip, now entering mass production, may provide a domestic alternative to NVIDIA's high-end AI chips, enhancing hardware autonomy in China's AI sector [10]
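The reported cluster figures imply a rough effective throughput, under the assumption (not stated in the report, whose numbers are themselves unconfirmed) that the 82% utilization rate means the fraction of peak compute actually sustained:

```python
PEAK_PFLOPS = 512   # reported peak compute of the Ascend 910B training cluster
UTILIZATION = 0.82  # reported resource utilization rate

# Effective sustained throughput if utilization is read as delivered/peak.
effective_pflops = PEAK_PFLOPS * UTILIZATION
print(f"~{effective_pflops:.0f} PFLOPS sustained")
```

Under that reading, the cluster would sustain around 420 PFLOPS of the 512 PFLOPS peak.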
Alibaba Releases and Open-Sources Qwen3 at Just 1/3 the Cost of DeepSeek-R1
Guancha (Observer Net) · 2025-04-29 03:27
In the early hours of April 29, Alibaba open-sourced the new generation of its Tongyi Qianwen model, Qwen3 ("Qianwen 3"), with only 1/3 the parameter count of DeepSeek-R1, sharply lower cost, and performance that comprehensively surpasses leading models such as R1 and OpenAI-o1, topping the global open-source rankings.

Qianwen 3 is China's first "hybrid reasoning model," integrating "fast thinking" and "slow thinking" into a single model, greatly reducing compute consumption.

According to the official announcement, the flagship version, Qwen3-235B-A22B, performs at the same tier as DeepSeek-R1, o1, o3-mini, Grok-3, and Gemini-2.5-Pro on benchmarks covering code, mathematics, and general capability.

On AIME25, a math-olympiad-level evaluation, Qwen3-235B-A22B scored 81.5, setting a new record for open-source models; on LiveCodeBench, which assesses coding ability, it broke 70 points, even surpassing Grok 3; and on ArenaHard, which evaluates alignment with human preference, it scored 95.6, beating both OpenAI-o1 and DeepSeek-R1.

[Truncated benchmark table comparing Qwen3-235B-A22B, Qwen3-32B, OpenAI-o1, and DeepSeek-R1]
Alibaba's Qwen3 Outperforms DeepSeek-R1; US Media Report Musk's Child Count Far Exceeds 14; ChatGPT Rolls Out Shopping Feature
Guancha (Observer Net) · 2025-04-29 01:10
Group 1: Stock Market Performance
- The three major U.S. stock indices closed mixed, with the Dow Jones up 0.28% and the S&P 500 up 0.06%, while the Nasdaq fell 0.1% [1]
- Major tech stocks showed varied performance, with Intel rising over 2% while Nvidia dropped over 2% [1]

Group 2: AI and Technology Developments
- Alibaba's Qwen3 has been released as an open-source model, surpassing competitors with a total of 235 billion parameters [2]
- Apple CEO Tim Cook has reorganized the company's robotics team, indicating dissatisfaction with progress in AI and machine learning [6][5]
- OpenAI is enhancing its ChatGPT tool to facilitate online shopping, allowing users to purchase products directly through the platform [7]

Group 3: Investment and Financial Activities
- Amazon launched its first batch of satellites for the "Project Kuiper" internet initiative, aiming to deploy over 3,200 satellites for global internet coverage [7]
- Alphabet plans to issue approximately $4 billion in high-grade corporate bonds, with the longest maturity potentially yielding 1% to 1.05% above U.S. Treasury rates [7]
- Over 700 billion yuan has been allocated by local government funds toward humanoid robotics and related industries [8]

Group 4: Company Listings and IPOs
- Seres has applied for a mainboard listing in Hong Kong, with projected revenue of 145.1 billion yuan in 2024, a 305.5% year-on-year increase [9]
- Stone Technology is considering an IPO in Hong Kong to raise up to $500 million, although plans are still at an early stage [10]