Core Viewpoint
- DeepSeek has released its upgraded model V3.1, built on a new hybrid reasoning architecture that supports both a "thinking" and a "non-thinking" mode and delivers significant performance gains across a range of intelligent tasks [1][6]. (A minimal API sketch of the two modes follows this digest.)

Performance Improvement
- On SWE-bench Verified, V3.1 scores 66.0, a substantial jump from the 45.4 and 44.6 achieved by its predecessors [2].
- On multilingual programming benchmarks, V3.1 outperforms Anthropic's Claude 4 Opus while holding a significant cost advantage [1][2].
- Token consumption can be cut by 20-50% without degrading task performance, bringing the model's effective cost close to that of GPT-5 mini [2].

Technical Innovations
- DeepSeek V3.1 adopts a mechanism called UE8M0 FP8, designed for soon-to-be-released domestic chips, signaling a move toward independent innovation in FP8 technology [5][8].
- The model has 685 billion parameters and stores them in FP8 format, cutting storage and compute costs while preserving numerical stability and model precision [7][10]. (A toy FP8 quantization sketch appears after this digest.)
- The UE8M0 format devotes all 8 bits to the exponent, covering an extremely wide range of positive values, which makes it well suited to data with large dynamic range [9]. (See the UE8M0 encode/decode sketch below.)

Industry Context
- FP8 is gaining traction among major players such as Meta, Intel, and AMD, pointing to a potential shift toward the format as a new industry standard [8].
- Domestic AI chip makers, including Huawei and Cambricon, are prioritizing FP8 support, drawing significant attention from the industry and from investors [9][10].
- There is speculation that DeepSeek V3.1 was trained on domestic chips, but this appears unlikely at this stage; the UE8M0 mechanism was more plausibly optimized for domestic inference chips [14][15].
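
The "thinking" vs. "non-thinking" split is easiest to see at the API level. The sketch below is a minimal illustration assuming DeepSeek's OpenAI-compatible endpoint, with `deepseek-chat` and `deepseek-reasoner` as the non-thinking and thinking aliases; the article does not name the endpoints, so treat the base URL and model names as assumptions to be checked against current documentation.

```python
# Minimal sketch of exercising V3.1's two modes, assuming DeepSeek's
# OpenAI-compatible endpoint and the deepseek-chat / deepseek-reasoner
# model aliases (both are assumptions; verify against current API docs).
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",       # placeholder credential
    base_url="https://api.deepseek.com",   # assumed OpenAI-compatible endpoint
)

prompt = [{"role": "user", "content": "How many primes are there below 100?"}]

# Non-thinking mode: a fast, direct answer.
fast = client.chat.completions.create(model="deepseek-chat", messages=prompt)
print(fast.choices[0].message.content)

# Thinking mode: the model reasons internally before producing its answer.
slow = client.chat.completions.create(model="deepseek-reasoner", messages=prompt)
print(slow.choices[0].message.content)
```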
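On the FP8 storage point: the saving comes from holding each value in 1 byte instead of 4. The toy sketch below quantizes a weight matrix to E4M3 FP8 with a single per-tensor scale, using PyTorch's built-in `torch.float8_e4m3fn` dtype; it is a generic recipe for illustration, not DeepSeek's actual quantization pipeline.

```python
# Toy per-tensor FP8 (E4M3) quantization illustrating the 4x storage saving;
# a generic sketch, not DeepSeek's actual training/inference pipeline.
import torch

w = torch.randn(4096, 4096)                  # FP32 weights: 4 bytes per value

amax = w.abs().max()
scale = amax / 448.0                         # 448 is the largest normal E4M3 value
w_fp8 = (w / scale).to(torch.float8_e4m3fn)  # 1 byte per value in memory
w_back = w_fp8.to(torch.float32) * scale     # dequantize to measure the error

print(f"storage: {w.element_size()} B -> {w_fp8.element_size()} B per value")
print(f"max abs reconstruction error: {(w - w_back).abs().max().item():.4f}")
```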
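On UE8M0: with all 8 bits spent on the exponent and none on a mantissa, every representable value is a power of two, trading precision for dynamic range. The sketch below encodes and decodes such values, assuming the conventional bias of 127 (as in the OCP microscaling E8M0 scale format); the exact bias and reserved codes in DeepSeek's variant are not specified in the article.

```python
# Sketch of the UE8M0 idea: an unsigned 8-bit code that is ALL exponent,
# so every representable value is a power of two. The bias of 127 is an
# assumption (the convention used by the OCP microscaling E8M0 format).
import math

BIAS = 127  # assumed bias

def ue8m0_encode(x: float) -> int:
    """Round a positive scale to the nearest power of two; return its 8-bit code."""
    assert x > 0, "UE8M0 has no sign bit: only positive values are representable"
    e = round(math.log2(x))
    # Clamp to valid codes; 255 is often reserved for NaN in E8M0-style formats.
    return max(0, min(254, e + BIAS))

def ue8m0_decode(code: int) -> float:
    """Map an 8-bit code back to its power-of-two value."""
    return 2.0 ** (code - BIAS)

# All 8 bits go to the exponent, so the range is enormous (roughly 2**-127
# to 2**127), but the grid is coarse: adjacent codes differ by a factor of 2.
for v in (1e-30, 0.5, 3.0, 1e30):
    c = ue8m0_encode(v)
    print(f"{v:>8g} -> code {c:3d} -> {ue8m0_decode(c):g}")
```

The design choice the digest highlights follows directly from this layout: a normal FP8 format such as E4M3 spends bits on mantissa precision, whereas UE8M0 gives everything to range, which suits scale factors that must track large swings in data magnitude.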
Who will ultimately claim it? DeepSeek's latest large model targets the next generation of domestic AI chips
机器之心·2025-08-22 04:01