DeepSeek V3 to V3.1: Toward Domestic Compute Independence
虎嗅APP (Huxiu) · 2025-08-24 09:02
This article originally appeared on the WeChat public account 未尽研究 (ID: Weijin_Research), author: 未尽研究 (coverage: AI, new energy, synthetic biology, geopolitics). Header image: AI-generated.

From V3 to V3.1, DeepSeek is charting a path toward "compute independence." From hand-modified PTX to UE8M0 FP8 scale parameter precision, DeepSeek first squeezed maximum performance out of NVIDIA GPUs and is now adapting to domestic chips. This may yield new breakthroughs in hardware-software co-design, further raising training efficiency and cutting memory usage by as much as 75%, thereby reducing dependence on imported advanced GPU chips in real-world deployments.

Together with the makers of next-generation domestic GPU chips, DeepSeek is taking another step toward compute autonomy. It is precisely this prospect that has energized China's increasingly technology-driven capital market.

DeepSeek released V3.1 rather than the widely anticipated V4 or R2; even R1 has disappeared from the lineup. DeepSeek has become a hybrid reasoning architecture: a single model that supports both a thinking mode and a non-thinking mode. This follows an industry trend: a week before V3.1's release, GPT-5 launched as a "unified system" comprising a conversational model, a thinking model, and a real-time router that decides how to combine conversation with thinking.

This upgrade improved DeepSeek ...
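The headline memory saving follows directly from format widths: FP8 stores each value in one byte versus four for FP32, plus one shared scale byte per block. A minimal sketch, assuming UE8M0 is the OCP-MX-style unsigned, exponent-only 8-bit scale (a pure power of two, roughly 2^(E-127)) shared across a small block of FP8 E4M3 values; the block size of 32 and the helper name are illustrative, not DeepSeek's actual implementation:

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite magnitude in FP8 E4M3

def ue8m0_scale(block_max: float) -> float:
    """Pick a power-of-two scale (UE8M0: 8 exponent bits, no mantissa
    or sign) so the block's max magnitude fits in FP8 E4M3 range."""
    # smallest exponent e with block_max / 2**e <= FP8_E4M3_MAX
    e = int(np.ceil(np.log2(block_max / FP8_E4M3_MAX))) if block_max > 0 else 0
    e = max(-127, min(127, e))  # clamp to UE8M0's representable exponents
    return 2.0 ** e

# Values are stored as round_to_fp8(x / scale); only the 1-byte scale
# and 1-byte elements are kept. Compare per-tensor memory with FP32:
n = 1024
fp32_bytes = n * 4
fp8_bytes = n * 1 + (n // 32) * 1  # elements + one UE8M0 scale per block
savings = 1 - fp8_bytes / fp32_bytes
print(f"memory saved vs FP32: {savings:.1%}")  # ~74%, i.e. "up to 75%"
```

Because the scale is a pure power of two, applying it is an exponent shift rather than a multiply, which is part of why it suits simpler domestic accelerator datapaths.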
Core Insights
- DeepSeek is leveraging NVIDIA's GPU capabilities while adapting to domestic chips, potentially reducing memory usage by up to 75% and decreasing reliance on imported advanced GPU chips [1][34][39]
- The release of DeepSeek V3.1 marks a significant step towards the Agent era, showcasing a hybrid reasoning architecture that supports both thinking and non-thinking modes [3][10]
- The upgrade enhances DeepSeek's efficiency, allowing it to answer questions using fewer tokens and less time, improving user experience and economics [4][6]

Technical Developments
- DeepSeek V3.1 utilizes UE8M0 FP8 scale parameter precision, which significantly reduces memory and bandwidth requirements, improving training and inference efficiency [11][15][30]
- The model has undergone extensive further training on an additional 840 billion tokens, extending its context length to 128k [7][10]
- The API Beta interface now supports strict function calling, enhancing reliability and usability in enterprise applications, in line with trends at other major AI companies [8][9]

Market Implications
- The advancements in DeepSeek's technology are expected to invigorate the Chinese capital market, reflecting a growing focus on technological self-sufficiency [2]
- As domestic GPU manufacturers adopt FP8 precision, demand for NVIDIA's H20/B30 chips may decline, especially if the next generation of domestic GPUs can run large models efficiently [36][38]
- The shift toward UE8M0 and ultra-low-precision training could gradually reduce reliance on NVIDIA's ecosystem, fostering a more independent Chinese AI chip and model landscape [42]

Competitive Landscape
- Despite DeepSeek's innovations, NVIDIA maintains its competitive edge in bandwidth, interconnect capabilities, and its software ecosystem [40][41]
- The ongoing evolution of low-precision number formats, exemplified by DeepSeek's UE8M0, may accelerate the development of next-generation domestic chips [39][42]
- The industry is witnessing a potential shift toward prioritizing domestic solutions over NVIDIA's offerings, particularly in cost-sensitive scenarios [38][42]
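On strict function calling: in the OpenAI-compatible request format that DeepSeek's chat API accepts, a tool is declared with a JSON Schema, and strict mode promises that generated arguments conform to that schema exactly. A hedged sketch of what that guarantee means, using a hypothetical `get_weather` tool and a client-side checker (the `strict` flag placement follows the Beta convention and is an assumption, not confirmed API detail):

```python
import json

# Hypothetical tool definition; "strict": True asks the server to
# guarantee schema-conformant arguments (Beta feature per the article).
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "strict": True,
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city", "unit"],
            "additionalProperties": False,
        },
    },
}

def check_strict(args_json: str, schema: dict) -> bool:
    """Minimal client-side check of what strict mode promises:
    all required keys present, no extra keys, enum values respected."""
    args = json.loads(args_json)
    props = schema["properties"]
    if set(args) - set(props):                          # no extra keys
        return False
    if not set(schema.get("required", [])) <= set(args):  # none missing
        return False
    for key, value in args.items():
        enum = props[key].get("enum")
        if enum is not None and value not in enum:      # enum respected
            return False
    return True

schema = get_weather_tool["function"]["parameters"]
print(check_strict('{"city": "Beijing", "unit": "celsius"}', schema))  # True
print(check_strict('{"city": "Beijing", "unit": "kelvin"}', schema))   # False
```

The reliability gain for agents comes from removing this validation-and-retry loop from the client: downstream code can dispatch tool calls without defensive parsing.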