DeepSeek
MiniMax releases a reasoning model to rival DeepSeek, with a compute cost of only about $530,000
Di Yi Cai Jing· 2025-06-17 07:26
Core Insights
- MiniMax, one of the "Six Little Dragons," has announced a series of updates, starting with the release of its first open-source reasoning model, MiniMax-M1 [1]
- MiniMax-M1 has shown competitive performance in benchmark tests, comparable to leading models such as DeepSeek-R1 and Qwen3 [3]
- The model's training was completed in just three weeks using 512 H800 GPUs, with a total computing cost of only $534,700, which is an order of magnitude lower than initially expected [3][8]

Performance Metrics
- MiniMax-M1's context window is 1 million tokens, eight times that of DeepSeek R1 and matching Google's Gemini 2.5 Pro, which supports strong performance in long-context understanding tasks [5]
- In the TAU-bench evaluation, MiniMax-M1 outperformed DeepSeek-R1-0528 and Google's Gemini 2.5 Pro, ranking just below OpenAI o3 and Claude 4 Opus globally [7]
- The model excels at coding, significantly surpassing most open-source models and trailing the latest DeepSeek R1 by only a slight margin [7]

Innovations and Cost Efficiency
- MiniMax-M1 uses a hybrid architecture built on a lightning attention mechanism, improving efficiency on long-text input and deep reasoning tasks [7]
- The newly introduced CISPO reinforcement learning algorithm converges faster than ByteDance's recent DAPO algorithm, contributing to the low training cost [8]
- MiniMax's pricing is tiered by input length, ranging from $0.8 to $2.4 per million tokens for input and $8 to $24 per million tokens for output, which is competitive with DeepSeek [8]

Competitive Landscape
- Concurrently, competitor Moonshot AI has released its programming model Kimi-Dev-72B, which reportedly achieved the highest score among open-source models on SWE-bench, surpassing the new DeepSeek-R1 [8]
- However, Kimi-Dev-72B faced scrutiny for potential overfitting, as it generated less code than required for certain tasks, raising questions about its performance reliability [9]
- The AI industry is seeing renewed competition among the "Six Little Dragons," with MiniMax expected to release further updates in the coming days, potentially affecting the multimodal AI landscape [9]
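The training figures above imply a per-GPU-hour cost that can be checked with simple arithmetic. The sketch below is a rough back-of-the-envelope calculation, not MiniMax's own accounting: it assumes "three weeks" means 21 full days of continuous use, and it takes DeepSeek R1's 128K context window from the LMArena summary further down this page, read as 128,000 tokens.

```python
# Back-of-the-envelope check of the training figures quoted in the summary.
GPUS = 512                 # H800 GPUs (from the summary)
TRAINING_DAYS = 21         # "three weeks", ASSUMED to mean 21 full days
TOTAL_COST_USD = 534_700   # total computing cost (from the summary)


def gpu_hours(gpus: int, days: int) -> int:
    """Total GPU-hours if every GPU runs around the clock."""
    return gpus * days * 24


def implied_rate(total_cost: float, hours: int) -> float:
    """Implied cost per GPU-hour in USD."""
    return total_cost / hours


hours = gpu_hours(GPUS, TRAINING_DAYS)
rate = implied_rate(TOTAL_COST_USD, hours)
print(f"{hours:,} GPU-hours, about ${rate:.2f} per GPU-hour")

# Context-window ratio: 1M tokens versus 128K tokens (ASSUMED 128,000).
M1_CONTEXT = 1_000_000
R1_CONTEXT = 128_000
print(f"context ratio: {M1_CONTEXT / R1_CONTEXT:.1f}x")
```

Under these assumptions the run works out to roughly 258,000 GPU-hours at about $2 per GPU-hour, and the context ratio is just under the "eight times" quoted above.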
The end of the Claude era? LMArena tests show DeepSeek R1 outscoring Opus 4 on coding, but Moonshot AI says its new model is even better
AI前线· 2025-06-17 06:56
Core Viewpoint
- The article highlights the significant advancements of the open-source AI model DeepSeek-R1 (0528), which has demonstrated competitive performance against leading proprietary models like Claude Opus 4 and GPT-4.1 in various benchmarks, marking a notable milestone in the open-source AI landscape [1][14].

Performance in Benchmarks
- DeepSeek-R1 (0528) achieved a score of 1408.84 in the WebDev Arena, surpassing Claude Opus 4's score of 1405.51, and tying with Gemini-2.5-Pro-Preview-06-05 for the top position [4][5].
- In the LMArena public benchmark tests, R1 (0528) outperformed several top closed models, showcasing its coding capabilities [3][4].
- The model ranks sixth in the Text Arena, indicating strong performance in language understanding and reasoning tasks [6].

Technical Specifications
- DeepSeek-R1 (0528) uses a mixture-of-experts (MoE) architecture with a total parameter count of 685 billion, activating approximately 37 billion parameters during inference for efficient computation [9].
- It supports a long context window of 128K tokens, enhancing its performance in long-text understanding and complex logical reasoning tasks [9].

Community Reactions
- The release of DeepSeek-R1 (0528) has sparked discussions in developer communities, with some users expressing skepticism about its performance compared to proprietary models [10][11][16].
- Users have noted the impressive coding capabilities of R1, suggesting that developers using this model could outperform those using closed models [16].

Competitive Landscape
- The article mentions the recent release of Kimi-Dev-72B, another open-source model that has achieved high scores in programming benchmarks, indicating a competitive environment in the open-source AI space [22][23].
- Kimi-Dev-72B scored 60.4% in the SWE-bench Verified programming benchmark, surpassing DeepSeek-R1 (0528) in specific coding tasks [23].

Conclusion
- The advancements of DeepSeek-R1 (0528) signify a critical moment for open-source AI, demonstrating that open models can compete with proprietary systems in terms of performance and capabilities [14].
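Two of the numbers above are easier to read with a little arithmetic: how small the MoE active-parameter share is, and how close the two WebDev Arena scores are. The sketch below is illustrative only. It assumes the arena scores sit on an Elo-like logistic scale with base 10 and a 400-point divisor; that scale is an assumption of this example and is not stated in the summary.

```python
# Figures from the summary above.
TOTAL_PARAMS_B = 685    # total parameters, in billions
ACTIVE_PARAMS_B = 37    # approximate parameters active during inference, in billions
R1_SCORE = 1408.84      # DeepSeek-R1 (0528), WebDev Arena
OPUS4_SCORE = 1405.51   # Claude Opus 4, WebDev Arena


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters that are used during inference."""
    return active / total


def expected_win_probability(score_a: float, score_b: float,
                             scale: float = 400.0) -> float:
    """Probability that A is preferred over B on an Elo-like scale.

    ASSUMPTION: logistic curve with base 10 and a 400-point scale.
    """
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / scale))


share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
p_win = expected_win_probability(R1_SCORE, OPUS4_SCORE)
print(f"active share of parameters: {share:.1%}")
print(f"expected preference rate for R1 over Opus 4: {p_win:.3f}")
```

Under that assumed scale, a gap of about 3.3 points corresponds to a head-to-head preference rate only marginally above 50%, so "surpassing" here describes a very narrow lead.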
Just in: LMArena's latest model leaderboard is out! DeepSeek-R1's web programming ability has overtaken Claude Opus 4
机器之心· 2025-06-17 00:10
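Reported by 机器之心. Editor: 杜伟

In the open-source model space, DeepSeek has delivered another surprise.

On the 28th of last month, DeepSeek shipped a minor update, upgrading its R1 reasoning model to the latest version (0528) and releasing the model and its weights.

This time, R1-0528 further improved benchmark performance, enhanced front-end capabilities, reduced hallucinations, and added support for JSON output and function calling.

Today LMArena, a well-known public benchmark platform for large models that has recently also been caught up in controversy (it was accused of favoring large models from OpenAI, Google, and Meta), published its latest performance leaderboard, in which the results of DeepSeek-R1 (0528) stand out.

| Rank (UB) | Model | Score | 95% CI (±) | Votes | Organization | License |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | gemini-2.5-pro-preview-06-05 | 1468 | +8/-6 | 8,454 | Google | Proprietary |
| 2 ...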
Foreign investment banks' outlook for China's economy and stock market in the second half of the year
淡水泉投资· 2025-06-16 13:01
Core Viewpoint
- The sentiment of foreign investors towards the Chinese market is improving, with a focus on the recovery of the domestic economy and the ongoing dynamics of Sino-U.S. relations [1][4].

Group 1: Structural Improvement in the Stock Market
- Since the second half of 2024, the Chinese stock market has been experiencing structural improvements, driven by a rebound in ROE and the rise of new technology sectors [4].
- Leading domestic companies are demonstrating operational resilience and growth momentum through measures such as shareholder returns, stock buybacks, and moderate leverage, contributing to sustainable ROE recovery and valuation uplift [4].
- Global investors express a willingness to increase their allocation to Chinese stocks, acknowledging that their current allocation is 2.4 percentage points below the MSCI Emerging Markets benchmark, indicating potential for increased investment [4][6].

Group 2: Interest in AI and Technology
- Foreign investors are increasingly interested in AI, technology-related themes, and new consumption trends, recognizing missed opportunities in China's technological advancements since 2021-2022 [6].
- Concerns about China's competitiveness in global technology have shifted, with breakthroughs in AI and advancements in electric vehicles and robotics prompting a reevaluation of investment strategies [6].

Group 3: Key Topics of Interest
- The recovery of the domestic economy remains a focal point for foreign investment banks, with challenges to sustainable growth still present [9].
- Catalysts for market observation include fiscal policy timing and scale, export resilience, real estate market stabilization, and the evolution of Sino-U.S. tariffs [10][12].
- The divergence between A-shares and H-shares is of interest, attributed to differences in industry composition and the concentration of high-ROE sectors in the Hong Kong market [12].

Group 4: Investment Strategy Consensus
- In the context of structural improvements in the Chinese stock market and the clear intent of foreign investors to increase allocations, a balanced approach with selective stock picking is a common consensus among institutions [15].
Citi: Global Semiconductors: GDDR7 to drive global DRAM demand higher in 2H25
花旗· 2025-06-16 03:16
Investment Rating
- The report reiterates a Buy rating on SK Hynix and Samsung Electronics due to expected demand growth in the DRAM market driven by GDDR7 and LPDDR5X [1][6].

Core Insights
- The global memory supply shortage is anticipated to intensify in the second half of 2025, primarily due to rising demand for GDDR7 driven by advancements in AI inference models and edge AI devices [1][5].
- GDDR7 is expected to significantly enhance performance with a 2x increase in data rates, reaching 48Gbps per pin, and doubling bandwidth capacity to 192GB/s per device [2].
- The demand for GDDR7 is projected to contribute an additional 4.03 billion Gb to global DRAM demand in 2H25, representing a 24% increase in graphics DRAM demand and a 2.4% increase in overall global DRAM demand [4][7].

Summary by Sections

GDDR7 Technology
- GDDR7 features advanced PAM3 signaling, improving data density by 50% per clock cycle compared to GDDR6, while operating at a lower voltage of 1.1-1.2V [2].
- The architecture of GDDR7 uses four 8-bit channels, enhancing parallel processing capabilities and reducing latency for AI workloads [2].

AI Inference Demand
- The emergence of AI distillation technology is expected to drive significant memory demand for AI inference, leading to increased adoption of GDDR7 as an alternative to HBM [3].

Market Projections
- The report projects GPU demand from DeepSeek to reach 2 million units in 2H25, with each GPU requiring 96GB of DRAM, contributing to the overall demand increase [4].
- The anticipated DRAM content upgrade in Apple's iPhone 17 series is expected to add an additional 3.2% to global DRAM demand in 2H25 [4].
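The bandwidth and demand figures in this summary can be cross-checked with unit arithmetic. The sketch below is a rough illustration and not part of the Citi report. It assumes a device interface of 32 data pins (four 8-bit channels, as described above), 8 bits per byte, and that the 24% and 2.4% figures are measured against demand excluding the GDDR7 increment. On those assumptions, 192GB/s per device corresponds to 48 Gbps per pin.

```python
# Unit-arithmetic cross-check of the GDDR7 figures in the summary.
DATA_PINS = 4 * 8  # four 8-bit channels; ASSUMED to be the full data interface


def device_bandwidth_gbyte_per_s(pin_rate_gbps: float, data_pins: int) -> float:
    """Device bandwidth in GB/s from per-pin rate in Gbps (8 bits per byte)."""
    return pin_rate_gbps * data_pins / 8


def pin_rate_for_bandwidth(bandwidth_gbyte_per_s: float, data_pins: int) -> float:
    """Per-pin rate in Gbps needed to reach a given device bandwidth in GB/s."""
    return bandwidth_gbyte_per_s * 8 / data_pins


def implied_base(increment: float, growth_share: float) -> float:
    """Base demand implied if `increment` equals `growth_share` of that base."""
    return increment / growth_share


required_pin_rate = pin_rate_for_bandwidth(192, DATA_PINS)
print(f"per-pin rate for 192 GB/s: {required_pin_rate} Gbps")

# DeepSeek GPU projection: 2 million GPUs x 96 GB each, converted to gigabits.
gpu_demand_gbit = 2_000_000 * 96 * 8
print(f"GPU-driven demand: {gpu_demand_gbit / 1e9:.3f} billion Gb")

# Demand bases implied by the 4.03 billion Gb increment (ASSUMED reading).
graphics_base = implied_base(4.03, 0.24)
global_base = implied_base(4.03, 0.024)
print(f"implied graphics DRAM base: {graphics_base:.1f} billion Gb")
print(f"implied global DRAM base: {global_base:.1f} billion Gb")
```

Read this way, the projected DeepSeek GPU demand accounts for about 1.5 billion Gb of the 4.03 billion Gb increment, and graphics DRAM is roughly a tenth of global DRAM demand.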
Morgan Stanley: DeepSeek R2: a new heavyweight model for AI reasoning?
摩根· 2025-06-16 03:16
Investment Rating
- The report provides a cautious outlook on the technology sector in Asia Pacific, particularly focusing on the developments surrounding DeepSeek's R2 model [7].

Core Insights
- DeepSeek's R2 model is anticipated to redefine AI development, pricing, and reliance on domestic AI chip supply chains in China, serving as a potential catalyst for accelerating AI application deployment [1][2].
- The R2 model is expected to achieve significant advancements in multilingual reasoning and code generation, offering a hybrid model with lower power consumption and smaller parameter scale, while being cost-effective compared to its predecessor R1 [2][9].
- The model's efficiency is projected to lower computational requirements, facilitating AI commercialization and expanding total demand, potentially disrupting the AI market [2][10].

Summary by Sections

R2 Model Overview
- R2 represents the second major iteration of DeepSeek's reasoning model, promising improvements in multilingual reasoning and code generation, with a focus on efficiency and cost reduction [2][9].
- The model is designed to be multimodal, featuring enhanced visual capabilities and a significant reduction in operational costs compared to R1 [2][13].

Supply Chain Developments
- The R2 model is supported by a robust ecosystem of Chinese companies, leveraging Huawei's Ascend 910B chip cluster for training, which signifies a shift towards a localized supply chain [3][17].
- DeepSeek aims to reduce dependency on external chip manufacturers, contrasting with the previous reliance on NVIDIA GPUs for training the R1 model [17][20].

Market Impact
- The report suggests that DeepSeek's advancements will benefit local GPU, GDDR, and China's HBM sectors, indicating a positive outlook for these industries amidst a broader AI market recovery [20][22].
- The performance of DeepSeek's models, particularly in the context of increasing computational demands during inference, is expected to drive further innovation and resource allocation within the AI ecosystem [20][23].

Competitive Landscape
- DeepSeek's approach emphasizes software-driven resource optimization rather than hardware dependency, which could lead to significant cost reductions and efficient training of large models [23][24].
- The report highlights the competitive pressure on NVIDIA from Huawei's Ascend chips, which are designed to match NVIDIA's performance while being domestically produced [17][20].
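Building on its strength in electronics hardware manufacturing to accelerate the flourishing of AI terminals. Bao'an: front-runner of the Bay Area's AI terminal industry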
Shen Zhen Shang Bao· 2025-06-15 16:55
Core Insights
- The next "battlefield" for artificial intelligence (AI) is in terminal applications, with a significant trend towards AI integration in devices like smartphones, smart glasses, and smart home systems by 2025 [1][2]
- Shenzhen is positioned as a global leader in electronic information and hardware manufacturing, providing a solid foundation for AI terminal product production [1][3]
- The Bao'an district in Shenzhen is actively pursuing the "AI + terminal" strategy to capture the high ground in the AI terminal industry, aiming for substantial growth in AI technology applications [1][4]

Industry Growth and Projections
- The AI terminal industry is expected to experience "tsunami-like" growth, with IDC predicting a 20% increase in shipments of AI smartphones, tablets, and computers in China by 2025, and a staggering 99% increase in smart glasses and wearable devices [3]
- Shenzhen's action plan aims for the AI terminal industry to reach over 800 billion yuan by 2026, with a target of 1 trillion yuan and the production of over 150 million AI terminal products [3]

Bao'an District's Industrial Strength
- Bao'an's smart terminal industry cluster is projected to achieve an added value of 24.36 billion yuan in 2024, reflecting 8.4% year-on-year growth, with 601 enterprises in the cluster [4]
- The district is home to notable companies like YingShi Innovation and Zhaowei Machinery, covering various AI terminal fields [4][5]

Supply Chain and Innovation
- Shenzhen's supply chain advantages, particularly in PCB (printed circuit board) manufacturing, are crucial for the hardware innovation needed for AI terminals [5]
- Leading PCB manufacturers in Bao'an are benefiting from the AI terminal innovation cycle, with Pengding Holdings expected to achieve revenue of 35.14 billion yuan in 2024, a 9.59% increase [5][6]

Capital Market Engagement
- Two Bao'an companies, YingShi Innovation and SwitchBot, have recently gained attention in the capital market, showcasing the integration of AI technology in their products [6][7]

Ecosystem Development
- The Greater Bay Area's AI companies excel in their ability to implement AI in real-world scenarios, supported by a robust manufacturing base in Bao'an [8]
- Bao'an's industrial ecosystem is characterized by a diverse range of manufacturing enterprises, with over 5,601 large-scale industrial companies, accounting for nearly 40% of Shenzhen's total [8][9]

Policy and Market Synergy
- Bao'an is focusing on enhancing its industrial chain and supporting AI transformation in key sectors, including e-commerce and logistics, to strengthen its competitive edge [9]
- The district is implementing action plans to foster the development of the smart terminal industry, including establishing public service platforms and creating demonstration projects in smart home and health sectors [9]
Major announcement from Baidu! Offers under this program have no salary cap
Zheng Quan Shi Bao· 2025-06-15 05:50
Group 1
- Baidu's AIDU program aims to recruit top talent for AI technology leadership, with no salary cap for offers to 2026 graduates [1]
- The AIDU program has expanded recruitment by over 60% compared to last year, covering 23 core business areas and 11 research directions [1]
- Baidu has invested 180 billion yuan in AI research and development, with a significant portion allocated to talent cultivation [2]

Group 2
- Baidu plans to train 10 million AI talents over the next five years, building on its previous goal of training 5 million by 2024 [2]
- Other major tech companies, such as Alibaba, are also ramping up recruitment efforts, with nearly 50% of their 3,000 open positions related to AI [2]
- High salaries are being offered in the AI field, with top positions in large model companies reaching monthly salaries of 60,000 yuan [3]
Major announcement from Baidu! Offers under this program have no salary cap!
证券时报· 2025-06-15 05:32
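Baidu's recent AIDU Program OpenDay event has once again drawn attention to its class-of-2026 "AIDU Program"!

Word from the event is that, in order to recruit top campus talent, offers under Baidu's 2026 "AIDU Program" have no salary cap.

The "AIDU Program" is an elite recruitment program launched by Baidu that aims to cultivate leading AI technical talent, with particular emphasis on candidates' pure passion for and pursuit of technology. Participants get the chance to work on cutting-edge research topics and to put technology into practice in real business scenarios. Baidu will tailor an individual development path for each participant and assign a dedicated technical expert as a one-on-one mentor, helping them grow quickly into standouts in the AI field.

In May this year, Baidu's 2026 "AIDU Program" officially launched. It recruits doctoral and master's students graduating in 2026, with a preference for top students, academic stars, competition winners, and engineering experts.

However, top-tier large-model positions also come with demanding requirements. The relevant head of one large-model company said candidly that the company hopes to recruit people with the following traits: they dare to break with established experience and traditional paradigms; they can create product value and user value through technical innovation and earn positive feedback; and they identify with the company's culture and aspire to build an industry-leading company together with it.

At the AIDU Program OpenDay event, Baidu revealed that, compared with last year, openings under the 2026 "AIDU Program" have expanded by more than 60%, covering 23 of Baidu's core businesses and 11 research directions, including ...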
Major announcement from Baidu! Offers under this program have no salary cap!
证券时报· 2025-06-15 05:31
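Baidu's recent AIDU Program OpenDay event has once again drawn attention to its class-of-2026 "AIDU Program"!

The "AIDU Program" is an elite recruitment program launched by Baidu that aims to cultivate leading AI technical talent, with particular emphasis on candidates' pure passion for and pursuit of technology. Participants get the chance to work on cutting-edge research topics and to put technology into practice in real business scenarios. Baidu will tailor an individual development path for each participant and assign a dedicated technical expert as a one-on-one mentor, helping them grow quickly into standouts in the AI field.

In May this year, Baidu's 2026 "AIDU Program" officially launched. It recruits doctoral and master's students graduating in 2026, with a preference for top students, academic stars, competition winners, and engineering experts.

Word from the event is that, in order to recruit top campus talent, offers under Baidu's 2026 "AIDU Program" have no salary cap.

At the AIDU Program OpenDay event, Baidu revealed that, compared with last year, openings under the 2026 "AIDU Program" have expanded by more than 60%, covering 23 of Baidu's core businesses and 11 research directions, including large-model algorithms, large-model infrastructure, machine learning, speech technology, and agents, making this its largest recruitment drive for top AI talent.

"Baidu will train the AI navigators of the future the way pilots are trained," Baidu said. It is understood that, under the training system, participants can not only take part in core AI projects and tackle cutting-edge technical challenges, but also enjoy ...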