How Major Companies View DeepSeek-V3
2025-08-25 09:13
Summary of DeepSeek and the AI Chip Industry Conference Call

Industry and Company Overview
- The conference call focuses on the AI chip industry, specifically DeepSeek's new UE8M0 FP8 format and its implications for domestic AI chip development and training efficiency.

Key Points and Arguments

Introduction of the UE8M0 FP8 Format
- DeepSeek has defined the UE8M0 FP8 format to establish a new standard for domestic chips, aiming to reduce training memory usage by 20%-30% and improve training efficiency by 30%-40% [1][2]
- The new format is expected to guide the design of the next generation of domestic chips and may expand into an FP8 protocol standard through OCP [1][2]

Training and Inference Efficiency
- The UE8M0 FP8 format reduces memory usage and computational overhead by splitting weight data into smaller blocks, each with its own scale factor, enhancing training and inference efficiency while maintaining high precision [4] (see the quantization sketch after this section)
- The FP8 data format is anticipated to significantly improve the training efficiency of domestic large models, helping to close the gap with international leaders [6][7]
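To make the block-splitting idea concrete, below is a minimal NumPy sketch of block-wise weight quantization with power-of-two per-block scales, i.e. scales that a UE8M0 encoding (8 exponent bits, 0 mantissa bits) can store. The block size of 128, the E4M3 value range, and the fake_cast_e4m3 rounding helper are illustrative assumptions, not DeepSeek's published recipe.

```python
# Minimal sketch: block-wise quantization with UE8M0-style (power-of-two) scales.
# Block size, rounding, and the E4M3 range are assumptions for illustration only.
import numpy as np

E4M3_MAX = 448.0   # largest finite magnitude of FP8 E4M3
BLOCK = 128        # assumed block length


def fake_cast_e4m3(x: np.ndarray) -> np.ndarray:
    """Crude stand-in for casting to FP8 E4M3: keep ~3 mantissa bits by
    rounding each value to an 8-step grid inside its binade (subnormals
    and NaN handling are ignored for brevity)."""
    x = np.clip(x, -E4M3_MAX, E4M3_MAX)
    exponent = np.floor(np.log2(np.maximum(np.abs(x), 2.0 ** -9)))
    step = 2.0 ** (exponent - 3)          # 3 mantissa bits -> 8 steps per binade
    return np.round(x / step) * step


def quantize_blockwise(w: np.ndarray):
    """Split a 1-D weight vector into blocks; give each block a power-of-two
    scale so its values fit the E4M3 range, then 'cast' the scaled block."""
    pad = (-w.size) % BLOCK
    blocks = np.pad(w, (0, pad)).reshape(-1, BLOCK)

    amax = np.maximum(np.abs(blocks).max(axis=1, keepdims=True), 1e-12)
    scale = 2.0 ** np.ceil(np.log2(amax / E4M3_MAX))   # UE8M0-storable scale
    q = fake_cast_e4m3(blocks / scale)
    return q, scale, pad


def dequantize_blockwise(q, scale, pad, n):
    return (q * scale).reshape(-1)[:n]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.02, size=1000)
    q, s, pad = quantize_blockwise(w)
    w_hat = dequantize_blockwise(q, s, pad, w.size)
    print("max abs reconstruction error:", float(np.abs(w - w_hat).max()))
```

In real kernels the quantized blocks would be stored in an actual FP8 dtype and each power-of-two scale in a single byte, which is where the memory savings described above would come from.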
Current Challenges in Domestic AI Chips
- Domestic AI chips face challenges such as insufficient operator coverage (approximately 50%), gradient quantization errors, and immature tensor expansion [8][9]
- Full-scale application of these technologies is expected to take until Q2 or Q3 of next year [8]

Future Developments and Market Impact
- Adopting the FP8 format for inference will lower costs and is expected to be implemented in domestic chips within the next six months to a year [8]
- However, no domestic manufacturer can yet complete training tasks independently, and significant technical hurdles remain [8][10]

Mixed Precision Strategy
- DeepSeek employs a mixed precision strategy to balance performance and precision, retaining high precision for sensitive parameters while using the new UE8M0 FP8 format for less sensitive ones [5] (a small assignment sketch appears at the end of this summary)

Competitive Landscape
- DeepSeek V3.1 introduces hybrid inference capabilities and enhanced agent abilities, with the training dataset expanded by 840 billion tokens, improving understanding of long texts and code [3][25]
- Compared with international models such as GPT-5 and Claude 4, DeepSeek V3.1 ranks among the top six globally, indicating strong competitiveness [26][27]

Multi-Modal Transition
- By Q1 2026, leading domestic AI models are expected to enter the multi-modal era, requiring high-performance computing resources [30]
- Integrating different modalities will require re-training and will increase demand for training equipment [30]

Long-Term Outlook
- The adoption of new data formats and standards is a gradual process, with significant changes expected over the next year, particularly in hardware support for FP8 [10][11]
- The industry is moving toward a more standardized approach to avoid fragmentation, with major manufacturers leading the charge [10]

Additional Important Insights
- The current strategy is to maximize the potential of existing hardware while preparing for the transition to new formats [19]
- The impact of new formats on model training methods will require substantial adjustments and a phased implementation [15][16]
- The FP8 format has limitations in high-precision fields such as finance and medicine, indicating a need for careful application [23][24]

This summary encapsulates the critical insights from the conference call, highlighting the advancements and challenges within the domestic AI chip industry and the strategic direction of DeepSeek.
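As a closing illustration of the mixed precision strategy described above, the sketch below assigns a storage dtype per tensor: precision-sensitive tensors stay in a high-precision format while large matmul weights are marked for FP8. The name patterns, shape threshold, and dtype labels are hypothetical; the call did not disclose DeepSeek's exact partition.

```python
# Illustrative mixed-precision assignment: sensitive tensors stay high precision,
# large matmul weights are marked for block-quantized FP8 storage.
from typing import Dict, Tuple

# Hypothetical substrings identifying tensors usually kept in high precision.
SENSITIVE_PATTERNS = ("embed", "norm", "bias", "lm_head", "router")


def choose_dtype(name: str, shape: Tuple[int, ...]) -> str:
    if any(p in name for p in SENSITIVE_PATTERNS):
        return "bf16"            # sensitive: keep high precision
    if len(shape) == 2 and min(shape) >= 1024:
        return "fp8_e4m3"        # large matmul weight: block-quantize to FP8
    return "bf16"                # small or unusual tensors: play safe


if __name__ == "__main__":
    params: Dict[str, Tuple[int, ...]] = {
        "model.embed_tokens.weight": (102400, 4096),
        "layers.0.attn.q_proj.weight": (4096, 4096),
        "layers.0.mlp.up_proj.weight": (11008, 4096),
        "layers.0.input_norm.weight": (4096,),
        "lm_head.weight": (102400, 4096),
    }
    for name, shape in params.items():
        print(f"{name:35s} -> {choose_dtype(name, shape)}")
```

The point is only that precision is chosen per tensor rather than globally; the actual partition used in DeepSeek's training stack was not detailed on the call.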