Ultra-Low-Precision Training



Hu Xiu App · 2025-08-24 09:02

Core Insights
- DeepSeek is advancing towards a "computing power freedom" path with its V3.1 release, optimizing the use of NVIDIA GPU power while adapting to domestic chips, potentially reducing memory usage by up to 75% [4][27].
- The V3.1 upgrade enhances DeepSeek's efficiency in reasoning and tool usage, positioning it competitively against international AI firms [8][9].

Group 1: Technological Advancements
- DeepSeek V3.1 introduces a hybrid reasoning architecture, supporting both thinking and non-thinking modes, which improves efficiency and reduces token consumption [6][8].
- The model has undergone extensive retraining with an additional 840 billion tokens, achieving a context length of 128k, which enhances performance while lowering costs [8][9].
- The API Beta interface now supports strict function calling, improving reliability and usability in enterprise applications, making it easier to replace existing solutions like GPT/Claude [9].

Group 2: Market Positioning
- DeepSeek's V3.1 is a significant milestone in its transition to the Agent era, allowing for better integration into the enterprise market, particularly with support for Anthropic API formats [9][30].
- The shift towards using UE8M0 FP8 scale data format allows DeepSeek to efficiently run large models on domestic AI chips, reducing reliance on imported GPUs [12][27].
- The potential decline in demand for NVIDIA's H20/B30 chips in China is noted, as domestic chips become more capable of handling large models with the new low-precision training methods [29][30].

Group 3: Competitive Landscape
- NVIDIA's long-standing use of low-precision formats has set a benchmark, but DeepSeek's innovations may accelerate the development of domestic chips, creating a more independent AI ecosystem in China [16][32].
- Despite the advancements by DeepSeek, NVIDIA retains advantages in bandwidth, interconnectivity, and a robust software ecosystem, which may still attract international firms [32].
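The UE8M0 FP8 scale format mentioned above can be illustrated with a minimal sketch. It assumes UE8M0 follows the E8M0 scale encoding of the OCP Microscaling (MX) specification: 8 unsigned exponent bits, no sign bit, no mantissa, a bias of 127, and code 0xFF reserved for NaN, so every valid code stands for a power of two. The function names are hypothetical, and rounding the scale up to the next power of two is an illustrative choice, not a description of DeepSeek's implementation.

```python
import math

E8M0_BIAS = 127   # assumed bias, as in the OCP MX E8M0 scale type
E8M0_NAN = 0xFF   # assumed NaN code, as in the OCP MX E8M0 scale type


def encode_ue8m0(scale: float) -> int:
    """Return the 8-bit code of the smallest power of two >= scale (hypothetical helper)."""
    if not (scale > 0.0) or math.isinf(scale):
        return E8M0_NAN  # zero, negative, NaN and infinite scales are not representable
    mantissa, exp = math.frexp(scale)  # scale == mantissa * 2**exp, 0.5 <= mantissa < 1
    # An exact power of two has mantissa 0.5; anything else rounds up to 2**exp.
    exponent = exp - 1 if mantissa == 0.5 else exp
    # Clamp to the valid code range 0..254 (255 is NaN).
    return max(0, min(254, exponent + E8M0_BIAS))


def decode_ue8m0(code: int) -> float:
    """Return the power-of-two scale that an 8-bit code represents."""
    if code == E8M0_NAN:
        return math.nan
    return math.ldexp(1.0, code - E8M0_BIAS)  # 2 ** (code - 127)


if __name__ == "__main__":
    for s in (1.0, 3.0, 0.25):
        code = encode_ue8m0(s)
        print(s, code, decode_ue8m0(code))
```

Because the scale is a pure power of two, applying it only shifts the exponent of each value, which is one reason such a format is considered inexpensive to support in hardware.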

DeepSeek V3 to V3.1: Toward Domestic Computing-Power Freedom

Hu Xiu · 2025-08-24 00:33

Core Insights
- DeepSeek is leveraging NVIDIA's GPU capabilities while adapting to domestic chips, potentially reducing memory usage by up to 75% and decreasing reliance on imported advanced GPU chips [1][34][39]
- The release of DeepSeek V3.1 marks a significant step towards the Agent era, showcasing a hybrid reasoning architecture that supports both thinking and non-thinking modes [3][10]
- The upgrade enhances DeepSeek's efficiency, allowing it to answer questions using fewer tokens and less time, which improves user experience and lowers cost [4][6]

Technical Developments
- DeepSeek V3.1 uses the UE8M0 FP8 scale format for parameter precision, which significantly reduces memory and bandwidth requirements and thus improves training and inference efficiency [11][15][30]
- The model has undergone extensive retraining with an additional 840 billion tokens, achieving a context length of 128k [7][10]
- The API Beta interface now supports strict function calling, enhancing reliability and usability in enterprise applications, in line with trends seen at other major AI companies [8][9]

Market Implications
- The advancements in DeepSeek's technology are expected to invigorate the Chinese capital market, reflecting a growing focus on technological self-sufficiency [2]
- As domestic GPU manufacturers adopt FP8 precision, the demand for NVIDIA's H20/B30 chips may decline, especially if the next generation of domestic GPUs can efficiently run large models [36][38]
- The shift towards UE8M0 and ultra-low precision training could lead to a gradual reduction in reliance on NVIDIA's ecosystem, fostering a more independent Chinese AI chip and model landscape [42]

Competitive Landscape
- Despite the innovations from DeepSeek, NVIDIA maintains its competitive edge with superior bandwidth, interconnect capabilities, and a robust software ecosystem [40][41]
- The ongoing evolution of low-precision number formats, as exemplified by DeepSeek's UE8M0, may accelerate the development of next-generation domestic chips [39][42]
- The industry is witnessing a potential shift where companies may prioritize domestic solutions over NVIDIA's offerings, particularly in cost-sensitive scenarios [38][42]
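The "up to 75%" memory figure can be sanity-checked with simple arithmetic. The sketch below assumes the comparison baseline is 32-bit floating point and that each block of FP8 values shares one 8-bit scale. The summaries above state neither the baseline nor a block size, so both are assumptions, and the function names are hypothetical.

```python
def bytes_per_param(element_bits: int, scale_bits: int = 0, block_size: int = 1) -> float:
    """Average storage per parameter when each block of values shares one scale."""
    return (element_bits + scale_bits / block_size) / 8


def memory_reduction(baseline_bits: int, element_bits: int,
                     scale_bits: int, block_size: int) -> float:
    """Fractional memory saving relative to an unscaled baseline format."""
    baseline = bytes_per_param(baseline_bits)
    quantized = bytes_per_param(element_bits, scale_bits, block_size)
    return 1.0 - quantized / baseline


if __name__ == "__main__":
    # Assumed setup: FP8 elements, one 8-bit scale per 128 values.
    print(f"vs 32-bit baseline: {memory_reduction(32, 8, 8, 128):.1%}")
    print(f"vs 16-bit baseline: {memory_reduction(16, 8, 8, 128):.1%}")
```

Under these assumptions the saving against a 32-bit baseline sits just under 75%, while against a 16-bit baseline it is just under 50%, which suggests the headline number is an upper bound tied to a 32-bit comparison.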