Compute Optimization
DeepSeek "Ignites" Domestic Chips: Can FP8 Set a New Industry Standard?
Cailian Press (财联社) · 2025-08-24 04:34
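What is FP8, and what does it improve? In AI training and inference, lowering numerical precision is a common way to raise computational efficiency. Chen Zhi, AI Infra director at Moore Threads, told a STAR Market Daily (《科创板日报》) reporter that large-model training and inference used to rely on FP32 (32-bit floating point), then gradually moved to FP16 (16-bit floating point) mixed precision to cut storage and communication overhead; FP8 compresses the data width further, to 8 bits.

Domestic large-model company DeepSeek has "ignited" the capital market. DeepSeek recently announced that its new-generation model, DeepSeek-V3.1, uses the UE8M0 FP8 Scale parameter precision, and stated explicitly that this precision standard is designed for the next generation of domestic chips, which is due for release soon. The news quickly triggered a strong reaction in the capital market, and the shares of listed chip companies such as Cambricon rose together.

However, on-site interviews and observations by the STAR Market Daily reporter at the 2025 Computing Power Conference, held over the past two days, suggest a cooler mood. DeepSeek's FP8 precision standard did come up when attendees discussed domestic computing power, but industry insiders were clearly less exuberant than the capital market. Technologists care more about FP8's practical value and challenges in model training, inference, and ecosystem standardization. In the industry's view, DeepSeek's move undoubtedly gives domestic compute vendors an opportunity, and FP8 represents the right direction for compute optimization, since large-model training and inference is not just a matter of piling up hardware; but it is also not a "miracle cure ...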
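To make the precision steps above concrete, here is a minimal Python sketch. The excerpt does not define UE8M0, so the code assumes it matches the E8M0 scale format of the OCP Microscaling (MX) specification: an unsigned 8-bit exponent with no mantissa and a bias of 127, so every scale is a power of two. It also assumes the scaled values are stored as FP8 E4M3, whose largest finite value is 448. The function names and the one-billion-parameter model are hypothetical and are not taken from DeepSeek's implementation.

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite FP8 E4M3 value (assumed storage format)
E8M0_BIAS = 127       # assumed bias, following the OCP MX E8M0 scale format


def weight_bytes(n_params: int, bits: int) -> int:
    """Bytes needed to store n_params values at the given bit width."""
    return n_params * bits // 8


def ue8m0_encode(block: list[float]) -> int:
    """Return the 8-bit code of a power-of-two scale for one block of values.

    The scale is the smallest power of two such that every value divided
    by it fits within the FP8 E4M3 range.
    """
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return E8M0_BIAS  # all zeros: a scale of 1.0 is enough
    exponent = math.ceil(math.log2(amax / FP8_E4M3_MAX))
    # Clamp to the encodable range; code 255 is reserved for NaN in the MX spec.
    return min(max(exponent + E8M0_BIAS, 0), 254)


def ue8m0_decode(code: int) -> float:
    """Turn an 8-bit scale code back into the power-of-two scale it denotes."""
    return 2.0 ** (code - E8M0_BIAS)


if __name__ == "__main__":
    n = 1_000_000_000  # hypothetical model size
    for name, bits in (("FP32", 32), ("FP16", 16), ("FP8", 8)):
        print(f"{name}: {weight_bytes(n, bits) / 1e9:.0f} GB")

    block = [1000.0, -3.0, 0.25]
    code = ue8m0_encode(block)
    print(code, ue8m0_decode(code))  # 129 4.0
```

Because the scale carries no mantissa, applying it only shifts the exponent of each value, which is one plausible reason a power-of-two scale format suits hardware with simple scaling logic. The sketch computes only the scale; it does not perform the FP8 rounding of the values themselves.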
Huawei's "Digital Wind Tunnel" Rehearses 10,000-Card Cluster Plans in Hours; Ascend Helps Large Models Run "Fast and Stable"
Yicai (第一财经) · 2025-06-11 12:12
Core Viewpoint
- The article emphasizes the importance of optimizing hardware and software integration in AI model training and inference systems to avoid inefficiencies and maximize computational power [1][2][3].

Group 1: Challenges and Solutions
- The article identifies three main challenges in dynamic load demands and the hardware-software interplay, proposing a "digital wind tunnel" for pre-simulation of AI models to identify bottlenecks and optimize resource allocation [2][3].
- The "Sim2Train" framework is introduced as an efficiency engine for large-scale training clusters, addressing issues like resource allocation and communication efficiency to maintain high performance during training [3][4].

Group 2: Performance Optimization Techniques
- The "Sim2Infer" framework is presented as a performance accelerator for inference systems, utilizing dynamic optimization techniques to enhance end-to-end inference performance by over 30% [5][10].
- The article discusses a multi-level inference system modeling simulation that integrates various core functions to achieve optimal hardware utilization and low latency in AI applications [10][11].

Group 3: Reliability and Availability
- The "Sim2Availability" framework is described as a safety net for large-scale training clusters, ensuring high availability and quick recovery from hardware failures, achieving a 98% availability rate [9][11].
- The article highlights the importance of real-time monitoring and fault management in maintaining the reliability of AI computing systems [9][11].

Group 4: Future Outlook
- The article concludes with a vision for continuous innovation in system architecture to support evolving AI applications, emphasizing the need for advanced modeling and simulation techniques to enhance computational infrastructure [12].