FP8
DeepSeek Officially Announces a Major Update! Domestic AI Chips Such as Cambricon Surge Across the Board! DeepSeek FP8 Concept Stocks Are Coming
私募排排网· 2025-08-25 07:00
Core Viewpoint
- The article discusses the significant growth and potential of domestic AI chip companies in China, particularly in light of recent advancements in AI models and the strategic shifts by international competitors such as NVIDIA [2][11].

Summary by Sections

AI Chip Market Dynamics
- The domestic AI chip market is expected to grow from 142.54 billion yuan in 2024 to 1.34 trillion yuan by 2029, a compound annual growth rate (CAGR) of 53.7% from 2025 to 2029 [12].
- The penetration rate of domestic AI chip brands is projected to reach 30% in 2024, up from 15% the previous year, indicating a strong upward trend in local market share [12].

DeepSeek's Innovations
- DeepSeek has launched its V3.1 model, which introduces the UE8M0 FP8 low-precision format, designed specifically for upcoming domestic chips to enhance computational efficiency [5][6].
- Compared with traditional formats such as FP16 and FP32, the UE8M0 FP8 format significantly reduces memory usage and computational overhead, making it suitable for large-scale neural network training and inference (see the memory sketch after this summary) [5].

Impact of NVIDIA's Strategy
- NVIDIA has told suppliers to halt production of AI chips tailored for the Chinese market, a strategic shift that may benefit domestic chip manufacturers [11].
- The company is developing a new AI chip based on its latest Blackwell architecture, which is expected to outperform the halted H20 model [11].

Key Domestic AI Chip Companies
- Major players in the domestic AI chip sector include:
  - **寒武纪 (Cambricon)**: Known for its Siyuan 220 chip, the first mass-produced edge AI acceleration chip in China; its stock price recently rose 113.61% [7].
  - **华为 (Huawei)**: The Ascend 910D chip is highlighted for its advanced architecture and its potential to surpass NVIDIA's H100 in performance [13].
  - **摩尔线程 (Moore Threads)**: Has achieved single-chip support for FP8 computing precision, with a stock increase of 57.20% [9].

Conclusion on Domestic Chip Development
- The proactive definition of low-precision formats by domestic companies such as DeepSeek marks a significant step toward reducing reliance on foreign technology, fostering a closed-loop ecosystem of domestic models, chips, and systems [6].
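As a rough illustration of the memory arithmetic behind these percentages, the sketch below compares weight storage at FP32, FP16, and FP8 for a 685B-parameter model (the parameter count cited for DeepSeek V3.1 later in this digest); the helper name and script structure are illustrative, not taken from any DeepSeek code.

```python
# A minimal sketch of the weight-memory arithmetic behind the FP8 savings
# quoted in this digest. The 685B parameter count is the figure cited for
# DeepSeek V3.1 below; real deployments mix precisions, so treat these as
# rough upper bounds rather than measured numbers.

BYTES_PER_PARAM = {"FP32": 4, "FP16": 2, "FP8": 1}
NUM_PARAMS = 685e9  # assumed parameter count (DeepSeek V3.1, per this digest)

def weight_memory_gb(precision: str, num_params: float = NUM_PARAMS) -> float:
    """Return approximate weight storage in GB for the given precision."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

if __name__ == "__main__":
    for p in ("FP32", "FP16", "FP8"):
        print(f"{p}: {weight_memory_gb(p):,.0f} GB")
    # FP8 uses 1/4 the bytes of FP32 (~75% saving) and 1/2 of FP16 (~50% saving).
```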
What Is the Difference Between DeepSeek V3.1's UE8M0 FP8 and NVIDIA's FP8 Formats?
傅里叶的猫· 2025-08-24 12:31
Core Viewpoint
- The introduction of UE8M0 FP8 by DeepSeek for upcoming domestic chips marks a strategic move to enhance compatibility and efficiency in the Chinese AI ecosystem, addressing the specific requirements of domestic hardware [5][10][12].

Group 1: UE8M0 and FP8 Concept
- FP8 is an 8-bit floating-point format that reduces memory usage by roughly 75% compared with 32-bit formats, improving computational speed and efficiency for large-model training and inference [7][13].
- UE8M0 is a specific encoding used for the block-scale factors of FP8 tensor data, designed to optimize compatibility with domestic chips; it differs from NVIDIA's E4M3 and E5M2 element formats, which balance per-element precision and dynamic range (see the decoding sketch after this summary) [9][10].
- The Open Compute Project (OCP) introduced UE8M0 as part of its MXFP8 formats, aiming to standardize FP8 usage across hardware platforms [8].

Group 2: Strategic Importance of UE8M0
- UE8M0 is intended to ensure that domestic chips can use FP8 effectively without relying on foreign standards, reducing dependence on NVIDIA's technology [12].
- By building UE8M0 into its model development process, DeepSeek aims to ensure that its models run stably on upcoming domestic chips, smoothing the transition from development to deployment [11][12].
- The goal of UE8M0 is not to outperform foreign FP8 standards but to provide a viable path for domestic chips to capture FP8's efficiency gains [14].

Group 3: Performance and Limitations
- UE8M0 can save approximately 75% of memory compared with FP32, allowing larger models or higher request throughput during inference [13].
- Inference throughput with UE8M0 can be roughly twice that of BF16, which is particularly beneficial for large-scale AI applications [13].
- UE8M0 is not a one-size-fits-all solution, however: certain computations still require higher-precision formats such as BF16 or FP16, and careful calibration is needed to avoid errors at extreme values [15].
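To make the encoding difference concrete, here is a minimal sketch that decodes an NVIDIA-style E4M3 element and a UE8M0 block scale from raw bytes, following the publicly documented OFP8 and OCP Microscaling bit layouts; the function names are my own and not part of any vendor library. E4M3 splits its 8 bits across sign, exponent, and mantissa for per-element precision, while UE8M0 spends all 8 bits on an exponent, so it can only represent powers of two and serves as a shared scale rather than as the element format itself.

```python
# Minimal decoders for two 8-bit formats discussed above.
# Bit layouts follow the public OFP8 (E4M3) and OCP Microscaling (E8M0)
# descriptions; helper names are illustrative, not from any vendor library.
import math

def decode_e4m3(byte: int) -> float:
    """Decode an OFP8 E4M3 element: 1 sign, 4 exponent (bias 7), 3 mantissa bits."""
    sign = -1.0 if (byte >> 7) & 0x1 else 1.0
    exp = (byte >> 3) & 0xF
    mant = byte & 0x7
    if exp == 0:                      # subnormal: no implicit leading 1
        return sign * (mant / 8.0) * 2.0 ** -6
    if exp == 0xF and mant == 0x7:    # the single NaN pattern; E4M3 has no infinities
        return math.nan
    return sign * (1.0 + mant / 8.0) * 2.0 ** (exp - 7)

def decode_ue8m0(byte: int) -> float:
    """Decode a UE8M0 block scale: unsigned, 8 exponent bits (bias 127), no mantissa."""
    if byte == 0xFF:                  # reserved NaN encoding
        return math.nan
    return 2.0 ** (byte - 127)        # always an exact power of two

if __name__ == "__main__":
    print(decode_e4m3(0x7E))   # 448.0, the largest finite E4M3 value
    print(decode_ue8m0(127))   # 1.0   (2**0)
    print(decode_ue8m0(130))   # 8.0   (2**3)
```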
Breaking: NVIDIA Halts Production of the H20 Chip
猿大侠· 2025-08-24 04:11
Core Viewpoint
- NVIDIA has paused production of its H20 AI chip for the Chinese market, a part originally designed to comply with U.S. export restrictions, signaling a shift in supply-chain management in response to market conditions [1][3][6].

Group 1: H20 Chip Developments
- The H20 is crucial to NVIDIA's revenue in China, accounting for 80% of its income from the region [6].
- After H20 sales to China were banned in April, NVIDIA's CEO announced a resumption of sales in July, driven by unexpectedly strong demand [7][8].
- In August, the H20 received export approval for China, with a stipulation that 15% of sales be paid to the U.S. government [9].

Group 2: Security Concerns
- Reports surfaced in late July about serious security issues with NVIDIA's chips, prompting the Chinese government to request explanations and proof from NVIDIA regarding potential backdoor vulnerabilities [10][13].
- NVIDIA responded by asserting that its chips contain no backdoors and emphasized the importance of cybersecurity [13].

Group 3: New Chip Development
- NVIDIA is developing a new AI chip for the Chinese market, tentatively named B30A, which is expected to outperform the H20 and use a single-chip design [13][14].
- The new chip will incorporate HBM high-bandwidth memory and NVLink technology for high-speed data transfer, with samples expected to be available for testing soon [16].

Group 4: Domestic Chip Advancements
- DeepSeek has released a new version of its model and indicated that the UE8M0 FP8 format is designed for soon-to-be-released domestic chips aimed at deep learning workloads [19][20].
- The UE8M0 FP8 format is expected to improve efficiency significantly, reducing memory usage by 50% and roughly doubling computation speed compared with traditional FP16 [23][24].
- This development suggests that domestic chips are closing the performance gap with NVIDIA's offerings [25].
DeepSeek Releases V3.1: Why Did the Official Comment Go Viral Instead?
Huan Qiu Wang Zi Xun· 2025-08-23 05:26
Core Viewpoint
- DeepSeek has launched DeepSeek-V3.1, featuring comprehensive upgrades in hybrid reasoning architecture, cognitive efficiency, and agent capabilities, with a notable focus on the UE8M0 FP8 technology designed for the next generation of domestic chips [1][3].

Company Insights
- The announcement of UE8M0 FP8 technology triggered a positive response in the capital market, with stocks of companies such as Cambricon and Haiguang Information rising in the short term [3].
- DeepSeek's introduction of UE8M0 FP8 signals a strategic move into chip-adjacent technology, reflecting the trend of software-defined hardware, in which software stacks and algorithm optimization are crucial for extracting hardware performance [5].

Industry Trends
- The FP8 format is emerging as a new standard for AI training and inference, with major international players such as NVIDIA and AMD also investing in the technology, indicating a potential shift in global technology direction [5].
- FP8 optimizes data storage and transmission, significantly improving chip throughput and energy efficiency, thereby enhancing the competitiveness of domestic AI chips in both domestic and international markets [4][5].
- The balance between data precision and computational efficiency remains a core issue in the evolution of artificial intelligence and high-performance computing [3].
Computing-Power and Chip Stocks Go Wild! One Line from DeepSeek Sends Domestic Chip Stocks Soaring!
是说芯语· 2025-08-22 07:49
Core Viewpoint
- The release of DeepSeek V3.1 and its mention of a new architecture and next-generation domestic chips has generated significant excitement in the AI industry, pointing to a potential shift toward improved performance and capabilities in domestic AI solutions [1][27].

Group 1: Market Reaction
- Following the announcement, domestic chip companies saw a surge in stock prices, with Cambricon closing up 20%, making it the top gainer on the STAR Market [2][24].
- The semiconductor ETF also rose substantially, gaining 10% on the same day [3][26].
- The overall market response indicates strong investor sentiment toward domestic chip manufacturers, reflecting optimism about their future prospects [26].

Group 2: Technical Insights
- The article introduces the concept of "UE8M0 FP8", an 8-bit micro-scaling format that improves the efficiency of data handling in AI applications [6][10].
- The format significantly reduces data-bandwidth requirements, cutting data traffic by 75% compared with traditional 32-bit formats, which is crucial for next-generation chip architectures [21][28].
- UE8M0 is designed to improve dynamic range and reduce information loss, making it well suited to next-generation domestic chips that are beginning to adopt FP8 capabilities (a toy block-quantization sketch follows this summary) [19][27].

Group 3: Industry Implications
- Adoption of UE8M0 FP8 by domestic chip manufacturers marks a move toward reducing reliance on foreign computing power, particularly from companies such as NVIDIA and AMD [27][29].
- This shift is expected to enhance the competitiveness of domestic chips, which will be able to run larger models more efficiently, increasing their value proposition in the market [28][29].
- The collaboration between DeepSeek and domestic chip manufacturers is likened to the historical "Wintel alliance", suggesting the potential for a robust ecosystem around domestic AI technologies [29].

Group 4: Key Players
- Several domestic companies, including Cambricon, Huawei, and others, are identified as early adopters of the UE8M0 FP8 format, with products already supporting the new precision [24][30].
- The market is watching these companies closely, particularly Cambricon, whose market capitalization has risen significantly on the back of its chip advances [24][26].
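The bandwidth saving comes from the micro-scaling layout: a block of values shares one 8-bit power-of-two scale while each element is stored in 8 bits, so per-value storage falls from 32 bits to a little over 8. The toy quantizer below illustrates that layout; the block size, helper names, and rounding policy are my own simplifications under the publicly described MX-style scheme, not DeepSeek's or any chip vendor's kernel.

```python
# Toy illustration of the micro-scaling idea behind UE8M0: each block of 32
# values shares one power-of-two scale (stored as a UE8M0 byte) while the
# elements themselves are rounded to 8-bit E4M3 values. This is a simplified
# sketch, not a production kernel.
import math

BLOCK = 32
E4M3_MAX = 448.0  # largest finite E4M3 value

def quantize_e4m3(x: float) -> float:
    """Round x to a representable E4M3 value (simplified; clamps to the finite range)."""
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    x = min(abs(x), E4M3_MAX)
    m, e = math.frexp(x)               # x = m * 2**e with 0.5 <= m < 1
    e -= 1                             # rewrite as (2m) * 2**(e-1) with 1 <= 2m < 2
    if e < -6:                         # subnormal range: fixed exponent -6, 3 fraction bits
        step = 2.0 ** -6 / 8.0
        return sign * round(x / step) * step
    frac = round((2.0 * m - 1.0) * 8)  # 3 mantissa bits
    return sign * (1.0 + frac / 8.0) * 2.0 ** e

def quantize_block(values: list[float]) -> tuple[int, list[float]]:
    """Return (ue8m0_scale_byte, dequantized_values) for one block of values."""
    amax = max(abs(v) for v in values) or 1.0
    # Choose a power-of-two scale so the largest element fits inside E4M3's range.
    scale_exp = max(-127, min(127, math.ceil(math.log2(amax / E4M3_MAX))))
    scale = 2.0 ** scale_exp
    scale_byte = scale_exp + 127       # UE8M0 encoding: unsigned exponent, bias 127
    deq = [quantize_e4m3(v / scale) * scale for v in values]
    return scale_byte, deq

if __name__ == "__main__":
    block = [(-1) ** i * (i + 1) * 3.7 for i in range(BLOCK)]
    scale_byte, deq = quantize_block(block)
    worst = max(abs(a - b) / abs(a) for a, b in zip(block, deq))
    print(f"UE8M0 scale byte: {scale_byte}, worst relative error: {worst:.2%}")
```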
Who Will Win the Prize? DeepSeek's Latest Large Model Targets Next-Generation Domestic AI Chips
机器之心· 2025-08-22 04:01
Core Viewpoint
- DeepSeek has released its upgraded model V3.1, which features a new hybrid reasoning architecture supporting both "thinking" and "non-thinking" modes, yielding significant performance improvements across a range of intelligent tasks [1][6].

Performance Improvement
- V3.1 shows substantial gains over its predecessors, scoring 66.0 on SWE-bench Verified versus 45.4 and 44.6 for previous models [2].
- In multilingual programming benchmarks, V3.1 outperformed Anthropic's Claude 4 Opus while also demonstrating a significant cost advantage [1][2].
- The model's token consumption can be reduced by 20-50% while maintaining task performance, making its effective cost comparable to GPT-5 mini [2].

Technical Innovations
- DeepSeek V3.1 uses a mechanism called UE8M0 FP8, designed for upcoming domestic chips, signaling a move toward independent innovation in FP8 technology [5][8].
- The model has 685 billion parameters and employs the FP8 format to lower storage and computational costs while maintaining numerical stability and model precision [7][10].
- The UE8M0 format uses all 8 bits for the exponent, giving it a very wide range of positive values, which makes it particularly suitable for handling large swings in data scale (see the dynamic-range sketch after this summary) [9].

Industry Context
- Adoption of FP8 is gaining traction among major players such as Meta, Intel, and AMD, pointing to a potential shift toward FP8 as a new industry standard [8].
- Domestic AI chip manufacturers, including Huawei and Cambricon, are focusing on FP8 support, which has drawn significant attention from industry and investors [9][10].
- There is speculation about whether DeepSeek V3.1 was trained on domestic chips, though this appears unlikely at this stage; the UE8M0 mechanism is more plausibly optimized for domestic inference chips [14][15].
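A quick way to see why spending all 8 bits on the exponent suits large swings in data scale is to compare dynamic ranges. The sketch below is my own back-of-the-envelope summary of the publicly documented 8-bit layouts, not something taken from the article.

```python
# Back-of-the-envelope dynamic ranges for the 8-bit formats discussed above,
# using the publicly documented OFP8 (E4M3/E5M2) and OCP MX (UE8M0) layouts.
# Values are approximate and meant only to show why an exponent-only encoding
# covers a far wider range of magnitudes.
import math

FORMATS = {
    # name: (smallest positive value, largest finite value)
    "E4M3": (2.0 ** -9, 448.0),          # subnormal min 2**-6 / 8, max 1.75 * 2**8
    "E5M2": (2.0 ** -16, 57344.0),       # subnormal min 2**-14 / 4, max 1.75 * 2**15
    "UE8M0": (2.0 ** -127, 2.0 ** 127),  # power-of-two scales only, bias 127
}

for name, (lo, hi) in FORMATS.items():
    print(f"{name:>5}: {lo:.3e} .. {hi:.3e}  (~{math.log2(hi / lo):.0f} binades)")
```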
DeepSeek Officially Releases Its New Model and Reveals Key Information About Domestic AI Chips
Xuan Gu Bao· 2025-08-21 23:22
Group 1
- DeepSeek's latest V3.1 version uses UE8M0 FP8 Scale parameter precision, designed for the upcoming generation of domestic chips [1]
- FP8 is a cutting-edge low-precision format for AI computing that significantly boosts GPU performance and reduces memory usage in large language model training [1]
- Domestic GPUs are developing rapidly, moving from the "usable" to the "user-friendly" stage, although they have not yet matched international products [1]

Group 2
- Companies such as Cambricon, Haiguang Information, and Huawei lead the A-share computing-chip market [3]
- Moore Threads provides AI training and inference cards, with its latest GPU supporting FP8 precision and significantly boosting AI computing power [1]
- Muxi offers C-series GPUs for integrated training and inference and N-series GPUs focused on cloud AI inference, showcasing strong mixed-precision computing capabilities [2]

Group 3
- The global GPU market is projected to reach 36,119.74 billion yuan by 2029, with China's GPU market expected to reach 13,635.78 billion yuan, raising its share of the global market from 30.8% in 2024 to 37.8% in 2029 [2]
- DeepSeek is driving the shift of AI applications from centralized cloud services to mass-market terminals, creating demand for cost-effective dedicated chips [2]
- Domestic chip manufacturers and application enterprises are accelerating their integration with DeepSeek, anticipating a significant increase in domestic computing power by 2025 [2]

Group 4
- Huawei's Ascend ecosystem includes companies such as Tuo Wei Information, Digital China, and Huafeng Technology [4]
- The upgraded model enhances the hybrid inference architecture and agent capabilities, showing significant improvements in tool usage and intelligent-agent tasks through Post-Training optimization [4]

Group 5
- Related companies include Dingjie Zhizhi, Fanwei Network, and Kute Intelligent [5]