Dynamo Inference Architecture
Computing Power Trends Through the Lens of Nvidia Rubin CPX and Oracle
2025-09-11 14:33
Conference Call Summary

Industry and Companies Involved
- **Industry**: AI Computing and Cloud Infrastructure
- **Companies**: Nvidia, Oracle, DeepSeek, Coveo, Nevo, Microsoft, SenseTime, Alibaba Cloud

Key Points and Arguments

Nvidia's Innovations
- Nvidia's Rubin CPX supports the FP4 data format, which significantly reduces compute consumption and cost while improving inference efficiency [1][2]
- The hardware now separates the Prefill and Decode stages of inference, so each stage can be optimized on its own [1][2]
- The Dynamo inference architecture enables integrated computing and automatic adjustment of computation graphs, improving model inference [1][6]

Oracle's Competitive Advantages
- Oracle builds its AI inference capability on large-scale computing clusters equipped with optimized software frameworks and vector database capabilities [1][3][10]
- The company projects that its cloud infrastructure revenue could exceed $100 billion by FY2029 [3][11]
- Oracle's model goes beyond providing bare-metal cabinets: it integrates hardware and software for better performance [8][9]

Transition from Training to Inference
- The AI industry is shifting its focus from training to inference, and large tech companies increasingly outsource to GPU cloud services to cut costs and gain flexibility [2][13]
- Hardware utilization is critical to cloud service providers' profitability: H20 clusters can typically be tuned to 75%-80% utilization, while H100 clusters reach 85%-88% [13]

Domestic Computing Cards
- Domestic computing cards currently lag behind Nvidia: they do not yet support the latest FP4 format [5]
- Adding FP4 support is the next critical step for domestic cards to close the gap with international leaders [5]

Investment Opportunities
- Nvidia's Rubin CPX inference GPU is expected to benefit related industries such as optical modules and PCBs, with companies like Industrial Fulian highlighted [14]
- Domestic companies such as Haiguang and Haiwu G are also positioned to benefit from rising demand for computing power [14]
- SenseTime is noted for its ability to build card clusters and for its comprehensive AI training and inference framework [14]

Other Important Insights
- Pairing Nvidia's Rubin CPX GPU system with the Dynamo framework improves the efficiency of long-context applications such as AI programming and video processing [7]
- Oracle's complex, high-tech infrastructure is not easily replicated by typical data centers, which gives it a significant edge in the AI inference market [10]
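
To make the FP4 point above concrete, here is a minimal back-of-envelope sketch of how weight storage shrinks as the number format narrows. It relies only on the bit widths of the formats (16, 8, and 4 bits). The 70-billion-parameter model size is a hypothetical figure chosen for illustration, not something stated on the call, and the estimate ignores scale factors, activations, and the KV cache.

```python
# Bits per parameter for each numeric format (FP4 is a 4-bit floating-point format).
BITS_PER_PARAM = {"FP16": 16, "FP8": 8, "FP4": 4}


def weight_memory_gb(num_params: float, fmt: str) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Counts raw weights only; quantization scale factors, activations,
    and the KV cache are deliberately left out of this estimate.
    """
    return num_params * BITS_PER_PARAM[fmt] / 8 / 1e9


if __name__ == "__main__":
    params = 70e9  # hypothetical model size, assumed for illustration
    for fmt in BITS_PER_PARAM:
        print(f"{fmt}: {weight_memory_gb(params, fmt):.0f} GB")
```

Under these assumptions, moving from FP16 to FP4 cuts raw weight storage by a factor of four, which is one reason the format lowers the hardware cost per inference request.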