H20 not selling? NVIDIA's latest "China-specific" chip revealed
是说芯语· 2025-08-20 08:45
Core Viewpoint
- NVIDIA is developing a China-specific AI chip, the B30A, based on its latest Blackwell architecture; it is expected to outperform the H20, the model currently available in China [1][4].

Group 1: Chip Specifications
- The Blackwell architecture was introduced at NVIDIA's GTC conference last year as an upgrade to the Hopper architecture [3].
- The B30A will use a single-die design, integrating all major circuits on one die, which improves overall performance and stability [3].
- The B30A's raw compute performance is expected to be about half that of the flagship Blackwell Ultra B300's dual-die configuration [3].
- The Blackwell Ultra B300 is manufactured on TSMC's 4NP process, supports fifth-generation NVLink with 1.8 TB/s of chip-to-chip interconnect bandwidth, and has a typical power consumption of 1,400 W [3].

Group 2: Performance Comparison
- The B30A is likely a cut-down version of the B300A, a general product for the global market featuring 144 GB of HBM3E and roughly 4 PFLOPS of FP8 performance [4].
- The H20 has 96 GB of HBM3, 900 GB/s of memory bandwidth, and 296 TFLOPS of FP8 performance, significantly lower than both the B300A and the B300 [4].
- The B30A is expected to exceed the H20's performance and may approach that of the H100, whose FP8 performance is about 4 PFLOPS [4].

Group 3: Market Strategy
- NVIDIA is also planning to launch the RTX 6000D, priced below the H20 and designed for AI inference tasks, using conventional GDDR memory with a bandwidth of approximately 1.398 TB/s [5].
- The RTX 6000D is expected to ship to customers in small batches as early as September, further expanding NVIDIA's product lineup for the Chinese market [5].
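The FP8 figures reported above can be put side by side to see how large the gap actually is. A minimal sketch using only the article's reported numbers (these are the article's claims, not official NVIDIA specifications, and the B30A figure is not stated directly):

```python
# FP8 throughput figures as reported in the article, in TFLOPS.
# These are the article's claims, not confirmed NVIDIA specs.
specs_fp8_tflops = {
    "H20": 296,      # reported FP8 performance of the current China model
    "B300A": 4000,   # ~4 PFLOPS reported for the global B300A
    "H100": 4000,    # ~4 PFLOPS reported for comparison
}

# Print each chip's reported FP8 throughput as a multiple of the H20.
for name, tflops in specs_fp8_tflops.items():
    ratio = tflops / specs_fp8_tflops["H20"]
    print(f"{name}: {tflops} TFLOPS FP8 (~{ratio:.1f}x H20)")
```

On these figures, a B30A that "approaches the H100" would be more than an order of magnitude faster at FP8 than the H20 it replaces.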
In-depth reading of Jensen Huang's GTC keynote: everything "optimized for inference", "the more you buy, the more you save" — Nvidia claims to be the cheapest option!
硬AI· 2025-03-19 06:03
Core Viewpoint
- Nvidia's innovations in AI inference technology, including inference-token scaling, the inference stack, Dynamo, and Co-Packaged Optics (CPO), are expected to significantly reduce the total cost of ownership of AI systems, solidifying Nvidia's leading position in the global AI ecosystem [2][4][68].

Group 1: Inference Token Scaling
- AI model progress has accelerated, with improvements over the last six months surpassing those of the previous six. The trend is driven by three scaling laws: pre-training, post-training, and inference-time scaling [8].
- Nvidia aims for a 35-fold improvement in inference cost efficiency, supporting both model training and deployment [10].
- As AI costs fall, demand for AI capabilities is expected to rise, a classic instance of the Jevons Paradox [10][11].

Group 2: Innovations in Hardware and Software
- CEO Jensen Huang introduced new accounting conventions, including FLOPs figures quoted with sparsity, bandwidth measured bidirectionally, and GPU counts based on the number of dies per package [15][16].
- The Blackwell Ultra B300 and the Rubin series show significant performance gains, with the B300 delivering more than a 50% increase in FP4 FLOPs density while maintaining 8 TB/s of bandwidth [20][26].
- The inference stack and Dynamo are expected to greatly improve inference throughput and efficiency through smart routing, GPU scheduling, and improved communication algorithms [53][56].

Group 3: Co-Packaged Optics (CPO)
- CPO is expected to significantly lower power consumption and improve network scalability by enabling a flatter network topology, which can yield up to 12% power savings in large deployments [75][76].
- Nvidia's CPO solutions are expected to increase the number of GPUs that can be interconnected, paving the way for networks exceeding 576 GPUs [77].

Group 4: Cost Reduction and Market Position
- Nvidia's advances deliver a 68-fold performance increase and an 87% cost reduction compared to previous generations, with the Rubin series projected to achieve a 900-fold performance increase and a 99.97% cost reduction [69].
- As Nvidia continues to innovate, it is expected to maintain its competitive edge over rivals, reinforcing its position as the leader in the AI hardware market [80].
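The headline multipliers above can be combined into a single cost-per-unit-of-performance figure, which is the sense in which "buy more, save more" is argued. A minimal sketch using the article's reported numbers, under the assumption that the cost-reduction percentage refers to total cost at equal workload (the article does not spell out the baseline):

```python
def cost_per_perf(perf_gain: float, cost_reduction: float) -> float:
    """Relative cost per unit of performance versus the baseline generation.

    perf_gain: performance multiplier reported in the article (e.g. 68).
    cost_reduction: fractional total-cost reduction (e.g. 0.87 for 87%).
    """
    return (1 - cost_reduction) / perf_gain

# Article's reported figures: Blackwell 68x perf / 87% cost reduction,
# Rubin (projected) 900x perf / 99.97% cost reduction.
blackwell = cost_per_perf(68, 0.87)
rubin = cost_per_perf(900, 0.9997)
print(f"Blackwell: {blackwell:.6f} of baseline cost per unit of performance")
print(f"Rubin:     {rubin:.2e} of baseline cost per unit of performance")
```

Under this reading, each unit of inference performance on Blackwell costs well under 1% of the baseline, and Rubin's projection drives it several orders of magnitude lower still.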