The latest NVIDIA economics: 15x AMD's performance per dollar, and "the more you buy, the more you save" is true
量子位· 2026-01-01 04:15
Core Insights
- The article emphasizes that NVIDIA remains the dominant player in AI computing, delivering significantly better performance per dollar than AMD [1][30].
- A report from Signal65 finds that, under certain conditions, NVIDIA's cost to generate the same number of tokens is only one-fifteenth of AMD's [4][30].

Performance Comparison
- NVIDIA's platform offers 15 times AMD's performance per dollar when generating tokens [1][30].
- The report indicates that NVIDIA's advantage grows with model complexity, especially for the MoE (Mixture of Experts) architecture [16][24].

MoE Architecture
- The MoE architecture splits a model's parameters into specialized "expert" sub-networks and activates only a small subset for each token, which reduces computational cost [10][11].
- However, communication delays between GPUs can leave hardware idle, raising costs for service providers [13][14].

Cost Analysis
- Despite NVIDIA's higher pricing, its overall cost-effectiveness is better because of its superior performance: the GB200 NVL72 costs $16 per GPU per hour versus $8.60 for AMD's MI355X, putting NVIDIA's price at 1.86 times AMD's [27][30].
- The report concludes that at 75 tokens per second per user, NVIDIA's performance advantage is 28 times, which works out to a cost per token of roughly one-fifteenth of AMD's [30][35].

Future Outlook
- AMD's competitiveness is not entirely negated: its MI325X and MI355X still have applications in dense models and capacity-driven scenarios [38].
- AMD is developing a rack-scale solution, Helios, which may narrow the performance gap over the next 12 months [39].
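The cost-analysis arithmetic above can be checked directly from the figures the report cites: dividing the 28x throughput advantage by the 1.86x price premium gives the roughly 15x cost-per-token advantage. A minimal sketch, using only the prices and ratios stated in the article:

```python
# Figures as reported in the Signal65 comparison cited by the article.
nvidia_price = 16.00   # GB200 NVL72, USD per GPU-hour
amd_price = 8.60       # MI355X, USD per GPU-hour
perf_ratio = 28.0      # NVIDIA throughput advantage at 75 tokens/s per user

# NVIDIA charges more per hour...
price_ratio = nvidia_price / amd_price  # ~1.86x

# ...but cost per token is (price per hour) / (tokens per hour),
# so the relative cost per token (AMD vs NVIDIA) is:
cost_per_token_advantage = perf_ratio / price_ratio  # ~15x

print(f"price ratio:    {price_ratio:.2f}x")
print(f"cost advantage: {cost_per_token_advantage:.1f}x")
```

The point the headline compresses: a 1.86x higher hourly price is overwhelmed by a 28x higher throughput, leaving NVIDIA's cost per token at about one-fifteenth of AMD's.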