Blackwell Ultra B300

H20 Not Selling? Nvidia's Latest "China-Special" Chip Revealed
是说芯语· 2025-08-20 08:45
The chip, tentatively named B30A, will use a single-die design, meaning all of the major circuitry is integrated on one continuous piece of silicon rather than spread across multiple dies, which helps improve the chip's overall performance and stability. In terms of expected performance, the B30A's raw compute may be about half that of the dual-die configuration of Nvidia's flagship Blackwell Ultra B300. Its feature set is similar to the H20's: it is likewise equipped with high-bandwidth memory (HBM) and NVLink technology for fast data transfer between processors, meeting large-scale data-processing needs.

According to industry sources, Nvidia is developing a China-specific AI chip based on its latest Blackwell architecture, the B30A, whose performance would exceed that of the H20, the model currently permitted for sale in China. Blackwell is the new-generation compute architecture Nvidia unveiled at last year's GTC conference as the successor to Hopper. The new chip will pair HBM high-bandwidth memory with NVLink for high-speed inter-processor data transfer.

Sources also indicate that the B30A may not be the only new product aimed at the Chinese market. Nvidia is also planning the RTX 6000D, a simpler configuration priced below the H20 that uses conventional GDDR memory with roughly 1.398 TB/s of memory bandwidth, targeting mainly AI inference workloads and expected as early as September ...
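As a rough sanity check on the "half of B300" positioning: a single-die part should land near half of the dual-die package's throughput. The sketch below is back-of-envelope arithmetic only; the ~15 dense FP4 PFLOPS figure assumed for the B300 package is a commonly reported number, not an official spec for either part.

```python
# Back-of-envelope reading of "B30A ~= half of the dual-die B300".
# ASSUMPTION: ~15 dense FP4 PFLOPS for the B300 package (commonly reported,
# not confirmed here); the B30A value is a straight halving per the article.
B300_DUAL_DIE_FP4_PFLOPS = 15.0

b30a_fp4_pflops = B300_DUAL_DIE_FP4_PFLOPS / 2  # single die -> half the package
print(f"Implied B30A dense FP4: ~{b30a_fp4_pflops:.1f} PFLOPS")
```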
In-Depth Read of Jensen Huang's GTC Keynote: Optimized for Inference Across the Board, "The More You Buy, the More You Save", Nvidia Is Actually the Cheapest!
硬AI· 2025-03-19 06:03
Core Viewpoint
- Nvidia's innovations in AI inference technology, including inference-token scaling, the inference stack, Dynamo, and co-packaged optics (CPO), are expected to significantly reduce the total cost of ownership of AI systems, solidifying Nvidia's leading position in the global AI ecosystem [2][4][68].

Group 1: Inference Token Expansion
- The pace of AI model improvement has accelerated, with gains over the last six months surpassing those of the prior six. The trend is driven by three scaling laws: pre-training, post-training, and inference-time scaling [8].
- Nvidia aims for a 35-fold improvement in inference cost efficiency, supporting both model training and deployment [10].
- As AI costs fall, demand for AI capability is expected to rise, a classic instance of the Jevons paradox [10][11].

Group 2: Innovations in Hardware and Software
- CEO Jensen Huang introduced new "math" for quoting specs: FLOPs counted with sparsity, bandwidth measured bidirectionally, and GPUs counted by the number of dies in a package [15][16] (see the conversion sketch after this list).
- The Blackwell Ultra B300 and the Rubin series show significant performance gains, with the B300 delivering over a 50% increase in FP4 FLOPs density while maintaining 8 TB/s of memory bandwidth [20][26].
- The inference stack and Dynamo are expected to greatly raise inference throughput and efficiency through smart routing, GPU planning, and improved communication algorithms [53][56] (a toy routing sketch also follows below).

Group 3: Co-Packaged Optics (CPO) Technology
- CPO is expected to significantly lower power consumption and improve network scalability by enabling a flatter network topology, yielding up to 12% power savings in large deployments [75][76].
- Nvidia's CPO solutions should raise the number of GPUs that can be interconnected, paving the way for networks exceeding 576 GPUs [77].

Group 4: Cost Reduction and Market Position
- Nvidia's advances amount to a 68-fold performance increase and an 87% cost reduction relative to previous generations, with the Rubin series projected to deliver a 900-fold performance increase and a 99.97% cost reduction [69] (a per-unit-cost reading of these figures closes this section).
- The overall trend suggests that continued innovation will keep Nvidia ahead of rivals, reinforcing its position as the leader in the AI hardware market [80].
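To make the Group 2 "new math" concrete, the sketch below expresses the three quoting conventions as simple conversions. The factor-of-two relations for 2:4 structured sparsity and for bidirectional bandwidth are standard industry conventions; the example numbers are hypothetical round figures, not measured specs.

```python
# Illustration of the "new math" for quoting specs -- how the same silicon
# can be described several ways. Conversion factors are standard marketing
# conventions; the inputs below are hypothetical round numbers.

def dense_flops(sparse_flops: float) -> float:
    """2:4 structured sparsity doubles quoted FLOPs; halve to get dense."""
    return sparse_flops / 2

def unidirectional_bw(bidirectional_bw: float) -> float:
    """Bidirectional quotes count both directions; halve for one direction."""
    return bidirectional_bw / 2

def per_die_flops(package_flops: float, dies_per_package: int) -> float:
    """If a 'GPU' is counted per die, divide package totals accordingly."""
    return package_flops / dies_per_package

print(dense_flops(20.0))        # 20 sparse PFLOPS  -> 10.0 dense PFLOPS
print(unidirectional_bw(1.8))   # 1.8 TB/s bidir    -> 0.9 TB/s each way
print(per_die_flops(15.0, 2))   # dual-die package  -> 7.5 PFLOPS per die
```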
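Dynamo's "smart routing" is described only at a high level above. The toy sketch below shows the general idea behind KV-cache-aware routing: send a request to the worker that already holds the longest matching prompt prefix, falling back to the least-loaded worker. All names and the scoring rule are hypothetical; this is not Nvidia Dynamo's actual API or algorithm.

```python
# Toy KV-cache-aware router: prefer the worker whose cached prefix overlaps
# the incoming prompt the most; break ties by lighter load. Illustrative only.
from dataclasses import dataclass, field

@dataclass
class Worker:
    name: str
    cached_prefixes: list[str] = field(default_factory=list)
    queue_depth: int = 0

def prefix_overlap(prompt: str, prefix: str) -> int:
    """Length of the shared leading substring between prompt and a cached prefix."""
    n = 0
    for a, b in zip(prompt, prefix):
        if a != b:
            break
        n += 1
    return n

def route(prompt: str, workers: list[Worker]) -> Worker:
    def score(w: Worker) -> tuple[int, int]:
        best = max((prefix_overlap(prompt, p) for p in w.cached_prefixes), default=0)
        return (best, -w.queue_depth)  # more KV-cache reuse first, then less load
    return max(workers, key=score)

workers = [
    Worker("gpu-0", cached_prefixes=["You are a helpful assistant."], queue_depth=3),
    Worker("gpu-1", cached_prefixes=[], queue_depth=1),
]
print(route("You are a helpful assistant. Summarize...", workers).name)  # gpu-0
```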
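Finally, the Group 4 figures are easiest to read as per-unit cost: an X% cost reduction means the remaining cost fraction is 1 − X, so the inverse gives "how many times cheaper". The arithmetic below simply restates the keynote's claims as quoted above; the figures are not independently verified.

```python
# Translate "X% cost reduction" into a per-unit cost multiplier and its
# inverse. Inputs are the keynote's claimed figures, as quoted above.
for label, perf_gain, cost_reduction in [
    ("Blackwell vs. Hopper", 68, 0.87),
    ("Rubin (projected)", 900, 0.9997),
]:
    cost_mult = 1 - cost_reduction  # fraction of the old cost that remains
    print(f"{label}: {perf_gain}x performance, "
          f"{1 / cost_mult:,.0f}x cheaper per unit of work")
# Blackwell vs. Hopper: 68x performance, 8x cheaper per unit of work
# Rubin (projected): 900x performance, 3,333x cheaper per unit of work
```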