How Extreme Hardware–Software Co-Design Is Driving the Future of AI Supercomputing
NVIDIA· 2025-11-24 16:41
Hi, I'm Jesse Clayton, product marketing manager for AI infrastructure at NVIDIA, here at Supercomputing 2025. More than ever, scientific discovery relies on converged high-performance computing and AI. But meeting the needs of today's workflows demands extreme co-design: hardware and software built in synergy to deliver optimizations across the entire stack and accelerate breakthrough science at scale. NVIDIA's platform spans GPUs, CPUs, DPUs, NICs, scale-up networking, scale-out networking, and software ...
Up 5x After Its IPO, Then Cut in Half: When Will the Stock-Price Legend of Nvidia's "Favorite Son" CRWV Return?
RockFlow Universe· 2025-11-24 10:32
Key points: ① Cloud computing is undergoing a violent changing of the guard. While traditional giants such as AWS are busy with their AI transitions, Neoclouds such as CoreWeave are becoming the "new infrastructure" of the AI era through sheer efficiency. The RockFlow research team believes Neoclouds are breaking the old order, filling the compute black hole, and becoming the most explosive alpha segment in the AI supply chain after NVIDIA's chips. ② CoreWeave is the undisputed protagonist of this drama. From crypto miner to a unicorn valued at $37 billion, its revenue has surged 100x in two years. Behind this lies Nvidia's own "game of thrones": Jensen Huang counterbalances the tech giants by backing this "favorite son," granting it priority chip allocation and a capacity backstop worth tens of billions. CoreWeave is not just Nvidia's super-distributor; it also holds the deepest moats of the AI era: scarce supply and deep ties to OpenAI and Microsoft. ③ Will CoreWeave's endgame be the next Amazon, or the Cisco of the dot-com bubble? The market is euphoric, but the logic demands cold analysis. Investing in Neoclouds is, at its core, a bet on the supply-demand cycle of compute. They have the potential to become the "super utilities" of the AI era, but they also face a life-or-death test from high leverage. RockFlow (This article is about 3,192 characters; roughly an 11-minute read.) In March 2025, when CoreWeave (CRWV) rang ...
Meta Chief AI Scientist Yann LeCun Reportedly Leaving to Start a Company; "Big Short" Investor Burry: AI Giants Use Accounting Tactics to Artificially Inflate Profits | Global Tech Morning Briefing
Mei Ri Jing Ji Xin Wen· 2025-11-11 23:57
Group 1: AMD and AI Data Center Market
- AMD CEO Lisa Su predicts that the AI data center market will exceed $1 trillion by 2030, highlighting significant growth potential in the industry [1]
- AMD plans to launch the next-generation MI400 series AI chips in 2026, which will include various models for scientific computing and generative AI [1]
- The company expects overall revenue to grow at a compound annual growth rate of approximately 35% over the next three to five years, primarily driven by its data center business [1]

Group 2: Meta's AI Leadership Changes
- Meta's Chief AI Scientist Yann LeCun plans to leave the company to start his own venture, indicating a significant shift in the AI landscape [2]
- LeCun is reportedly in early discussions with potential investors to fund his startup, which will focus on "world models" research [2]
- The departure follows other high-profile exits from Meta's AI division, including AI Research VP Joelle Pineau, and recent layoffs affecting around 600 employees [2]

Group 3: OpenAI and Copyright Issues
- A German court ruled that OpenAI infringed copyright by using lyrics from a German musician without authorization, requiring compensation to a major music copyright association [3]
- The case may set a significant precedent for copyright regulation of generative AI technologies in Europe [3]
- The lawsuit was initiated by a major music copyright collective representing around 100,000 songwriters and publishers [3]

Group 4: Microsoft's AI Investment in Europe
- Microsoft announced a $10 billion investment in AI infrastructure in Sintra, Portugal, one of the largest AI investment projects in Europe [4]
- The project will involve collaboration with developers and chip manufacturers, including Nvidia, to deploy 12,600 next-generation GPUs [4]
- The investment aims to position Portugal as a leading hub for responsible and scalable AI development in Europe [4]

Group 5: Accounting Practices of Tech Giants
- Investor Michael Burry criticized major tech companies for extending the useful life of assets to artificially inflate profits, labeling the practice a common form of fraud [5][6]
- Burry highlighted that companies such as Meta, Alphabet, Microsoft, Oracle, and Amazon are extending depreciation periods for equipment that typically has a 2-3 year lifecycle [5][6]
- He estimates that understated depreciation could inflate large tech companies' profits by $176 billion from 2026 to 2028 [5][6]
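Burry's depreciation argument comes down to simple arithmetic: stretching the depreciation schedule defers expense, which lifts reported profit without any change in cash flows. A minimal sketch of the mechanism, using hypothetical figures that are not taken from any company's filings:

```python
# Illustrative only: how stretching straight-line depreciation lowers the
# annual expense and thus raises reported profit. Figures are hypothetical.

def annual_depreciation(cost: float, useful_life_years: int) -> float:
    """Straight-line depreciation expense per year."""
    return cost / useful_life_years

equipment_cost = 90.0  # $90M of GPU servers (hypothetical)

short_life = annual_depreciation(equipment_cost, 3)  # 3-year schedule
long_life = annual_depreciation(equipment_cost, 6)   # stretched to 6 years

profit_boost = short_life - long_life  # expense deferred each year
print(f"3-year schedule: ${short_life:.1f}M/yr in depreciation")
print(f"6-year schedule: ${long_life:.1f}M/yr in depreciation")
print(f"Reported profit is ${profit_boost:.1f}M/yr higher under the longer life")
```

Note that the total expense over the asset's life is unchanged; the longer schedule only shifts it into later years, which is exactly why Burry frames it as an overstatement of near-term profits.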
Compute Power in Both China and the U.S. Is Waiting on Electricity
Xi Niu Cai Jing· 2025-11-07 08:21
Core Insights
- The token economy in both China and the U.S. is heavily reliant on electricity, with each country facing unique challenges [1][3]
- The U.S. is experiencing a power shortage due to outdated generation and grid infrastructure, limiting token production [1][2]
- China, in contrast, faces high token production costs due to relatively low-efficiency domestic hardware, which raises the overall cost of token generation [1][3]

Group 1: U.S. Challenges
- Microsoft CEO Satya Nadella emphasized that the real constraint is not a shortage of GPUs but a lack of electricity, which restricts token production and monetization [1]
- Major U.S. tech companies' race to invest in AI infrastructure has turned into a competition for electricity supply [1][2]
- Large-scale U.S. data center construction is progressing from 1 GW to 10 GW, with companies like Crusoe targeting significant capacity increases [1][2]

Group 2: Infrastructure and Policy
- Silicon Valley giants are urging the White House to support infrastructure development, particularly the power grid, so it can match the pace of AI innovation [3]
- OpenAI has suggested that the U.S. needs to add 100 GW of electricity capacity annually to compete effectively with China in AI [3]
- The U.S. added 51 GW of power capacity last year, while China added 429 GW, highlighting a significant "power gap" [3]

Group 3: China's Challenges
- China's AI infrastructure is built on domestic chips, which currently have lower efficiency, leading to increased demand for computational power [3][4]
- ByteDance's daily token calls surged from 16.4 trillion in May to 30 trillion in September, indicating rapidly growing computational needs [3]
- The annual electricity cost of 1 GW for a major Chinese cloud provider is estimated at 8-9 billion yuan, reflecting the high operating costs associated with domestic chip usage [5]

Group 4: Efficiency and Cost
- Competition in the token economy involves not just the hardware but also the software and tools, and the electricity and cooling systems required to operate them [4]
- Huawei's CloudMatrix 384 delivers a significant increase in total computational power, but at a much higher energy cost than NVIDIA's latest offerings [5][6]
- Average U.S. industrial electricity costs roughly 9.1 cents per kWh, while some regions in China have pushed costs below 4 cents per kWh, a competitive advantage for Chinese data centers [6]
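The 8-9 billion yuan figure for 1 GW can be sanity-checked with back-of-envelope arithmetic. A minimal sketch, assuming (this is my assumption, not the article's) that the facility draws its rated power around the clock; the per-kWh prices are illustrative:

```python
# Rough sanity check on the annual power bill of a 1 GW facility.
# Assumption: constant full-load draw, 24 hours a day, all year.

HOURS_PER_YEAR = 24 * 365  # 8760

def annual_energy_kwh(facility_gw: float) -> float:
    """kWh consumed per year at constant full load (1 GW = 1e6 kW)."""
    return facility_gw * 1e6 * HOURS_PER_YEAR

def annual_bill(facility_gw: float, price_per_kwh: float) -> float:
    """Annual electricity cost at a flat per-kWh price."""
    return annual_energy_kwh(facility_gw) * price_per_kwh

kwh = annual_energy_kwh(1.0)
print(f"1 GW for a year: {kwh / 1e9:.2f} billion kWh")
for price in (0.4, 0.6, 1.0):  # yuan per kWh, illustrative
    print(f"at {price:.1f} yuan/kWh: {annual_bill(1.0, price) / 1e9:.2f} billion yuan")
```

One gigawatt running continuously is 8.76 billion kWh per year, so an all-in effective price near 1 yuan/kWh lands in the 8-9 billion yuan range cited above; cheaper regional power would pull the bill well below that.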
Back to the Technology: The Fragmented Scale-Up Ecosystem
傅里叶的猫· 2025-10-18 16:01
Core Viewpoint
- The article compares Scale Up solutions in AI servers, focusing on the UALink technology promoted by Marvell and the current mainstream Scale Up approaches in the international market [1][3]

Comparison of Scale Up Solutions
- Scale Up refers to the high-speed communication network between GPUs within the same server or rack, allowing them to operate collaboratively as one large supercomputer [3]
- The Scale Up network market is projected to reach $4 billion in 2024 and, at a compound annual growth rate (CAGR) of 34%, to grow to roughly $17 billion by 2029 [5][7]

Key Players and Technologies
- NVIDIA's NVLink currently dominates the Scale Up market, enabling GPU interconnection and communication within server configurations [11][12]
- AMD is developing UALink based on its Infinity Fabric technology and aims to transition to a complete UALink solution once native switches are available [12][17]
- Google uses inter-chip interconnect (ICI) technology for TPU Scale Up, while Amazon employs NeuronLink for its Trainium chips [13][14]

Challenges in the Ecosystem
- The current Scale Up ecosystem is fragmented, with various proprietary technologies causing compatibility issues among manufacturers [10][22]
- Domestic GPU manufacturers face challenges in developing their own interconnect protocols due to system complexity and resource constraints [9]

Future Trends
- As the market matures, proprietary Scale Up networks are expected to give way to open solutions like UALink and SUE, which could gain traction by 2027-2028 [22]
- The choice between copper and optical connections is driven by cost and performance, with copper currently preferred for short distances [20][21]
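The two market figures quoted above are internally consistent, which is worth a quick check: $4 billion in 2024 compounded at a 34% CAGR for the five years through 2029 comes out at roughly $17 billion.

```python
# Verify that a 34% CAGR takes a $4B market in 2024 to ~$17B in 2029.

def project(base: float, cagr: float, years: int) -> float:
    """Value after compounding `base` at rate `cagr` for `years` periods."""
    return base * (1 + cagr) ** years

market_2029 = project(4.0, 0.34, 5)  # billions of dollars, 2024 -> 2029
print(f"Projected 2029 Scale Up market: ${market_2029:.1f}B")  # ≈ $17.3B
```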
X @郭明錤 (Ming-Chi Kuo)
Market Position & Competition
- Oracle accounted for roughly 12% of global GB200 NVL72 shipments in 2025, trailing Microsoft (~30%), Google (~16%), and Dell (~14%) [1]
- Oracle's GB200 NVL72 deliveries arrived later, around late second quarter 2025, versus the initial small-batch shipments that began in first quarter 2025 [1]

GPU Rental Business Analysis (June-August 2025)
- Oracle generated approximately $900 million from Nvidia chip rentals, with a gross profit of $125 million [1]
- Oracle experienced losses on rentals of small quantities of both newer and older Nvidia GPUs [1]
- During June-August 2025, Oracle was in the early stages of receiving and deploying GB200 NVL72 systems, leaving limited Blackwell compute capacity and a focus on prior-generation Hopper rentals [2]

Profitability Factors
- Early GB200 NVL72 deployments are unlikely to be profitable due to higher AI server costs, front-loaded infrastructure retrofit expenses, and limited initial compute/service scale [3]
- Losses largely reflect the early-phase costs of the Hopper-to-Blackwell transition, particularly for the "small quantities of both newer and older Nvidia chips" [5]

Key Takeaways from The Information Report
- The "small quantities" mentioned in the report are attributed to Blackwell/GB200 NVL72 having only recently arrived and still being prepared for service, limiting available compute capacity [4]
X @郭明錤 (Ming-Chi Kuo)
Business Overview
- The report analyzes an article by The Information on Oracle's Nvidia GPU rental business, focusing on June-August 2025 [1]
- The analysis aims to clarify the logic behind the report, without commenting on its motives or on Oracle's stock price [1]

Financial Performance
- In June-August 2025, Oracle generated approximately $900 million from Nvidia chip server rentals, with a gross profit of $125 million [1]
- Oracle experienced losses in some cases when renting out small quantities of both newer and older Nvidia chips [1]

Industry Dynamics
- Oracle's GB200 NVL72 procurement accounted for approximately 12% of global shipments in 2025, ranking below Microsoft (~30%), Google (~16%), and Dell (~14%) [2]
- Oracle's GB200 NVL72 deliveries arrived later, around the end of 2Q25, compared to the small-scale shipments that began in 1Q25 [2]
- During June-August 2025, Oracle was still acquiring and deploying GB200 NVL72, primarily offering Hopper GPU compute for its rental services [2]

Cost Analysis
- Initial GB200 NVL72 deployment was not profitable due to higher AI server costs, infrastructure upgrades, and the limited scale of compute power/services [2]
- Losses were attributed to the high initial costs of the Hopper-to-Blackwell transition, specifically for the "small quantities of both newer and older versions of Nvidia chips" [3]
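The headline figures imply a thin rental gross margin, which is the crux of the debate over this business. A quick calculation from the numbers cited above:

```python
# Gross margin implied by the figures in the report: ~$900M of Nvidia
# chip rental revenue against $125M of gross profit, June-August 2025.

revenue = 900.0       # $M
gross_profit = 125.0  # $M

margin = gross_profit / revenue
print(f"Implied gross margin: {margin:.1%}")  # ≈ 13.9%
```

A 13.9% gross margin on rentals is what the early-phase Hopper-to-Blackwell transition costs described above are meant to explain.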
Nvidia Splashes Out 60 Billion! Poaching Talent, Staging Rescues, Aiding the Needy
美股研究社· 2025-09-25 13:06
Core Insights
- Nvidia's recent $5 billion investment in Intel is its largest to date, and it boosted Intel's stock price by over 20% [4][29][32]
- Nvidia has been actively acquiring AI startups, with plans to purchase at least 11 companies between 2024 and 2025, including several founded by Chinese entrepreneurs [15][16][19]
- Nvidia's strategy includes not only large investments but also talent acquisition, as seen in its $900 million deal for Enfabrica's CEO and team, focused on advanced networking chip technology [20][21][25]

Investment and Acquisition Strategy
- The Intel investment is part of a broader strategy to enhance Nvidia's AI infrastructure and capabilities, integrating Nvidia's GPUs with Intel's x86 CPUs in future products [7][29]
- The company has committed £2 billion (approximately ¥19.3 billion) to the UK AI startup ecosystem, including a significant investment in Nscale [10][37][40]
- Nvidia's acquisition strategy has shifted toward smaller, strategic purchases, focusing on niche AI technologies and talent rather than large-scale acquisitions [16][21][22]

Market Impact
- Nvidia's total market capitalization has increased by $895.4 billion (approximately ¥6 trillion) this year, a 26.43% rise [13]
- The Intel collaboration is expected to create substantial business growth opportunities for both companies, particularly in data center and AI computing [7][29]
- Nvidia's investments in AI startups and infrastructure position it as a key player in the rapidly evolving AI market, with a focus on fostering innovation and expanding its ecosystem [42]
Another Giant Leap: The Rubin CPX Specialized Accelerator and Rack - SemiAnalysis
2025-09-11 12:11
Summary of Nvidia's Rubin CPX Announcement

Company and Industry
- **Company**: Nvidia
- **Industry**: Semiconductor and GPU manufacturing, specifically AI and machine learning hardware solutions

Key Points and Arguments
1. **Introduction of Rubin CPX**: Nvidia announced the Rubin CPX, a GPU optimized for the prefill phase of inference that emphasizes compute FLOPS over memory bandwidth, marking a significant advancement in AI processing capabilities [3][54]
2. **Comparison with Competitors**: The design gap between Nvidia and competitors like AMD has widened significantly, with AMD needing to invest heavily to catch up, particularly in developing its own prefill chip [5][6]
3. **Technical Specifications**: The Rubin CPX features 20 PFLOPS of FP4 dense compute and only 2 TB/s of memory bandwidth, using 128 GB of GDDR7 memory, which is less expensive than the HBM used in previous models [9][10][17]
4. **Rack Architecture**: The Rubin CPX expands Nvidia's rack-scale server offerings into three configurations, allowing flexible deployment options [11][24]
5. **Cost Efficiency**: By using GDDR7 instead of HBM, the Rubin CPX cuts memory costs by over 50%, making it a more cost-effective solution for AI workloads [17][22]
6. **Disaggregated Serving**: The Rubin CPX enables disaggregated serving, allowing specialized hardware to handle different phases of inference, which can improve efficiency and performance [54][56]
7. **Impact on Competitors**: The announcement is expected to force Nvidia's competitors to rethink their roadmaps and strategies, as failing to release a comparable prefill-specialized chip could leave inefficiencies in their offerings [56][57]
8. **Performance Characteristics**: The prefill phase is compute-intensive, while the decode phase is memory-bound; the Rubin CPX optimizes for prefill, reducing the waste of underutilized memory bandwidth [59][62]
9. **Future Roadmap**: The Rubin CPX is seen as a pivotal moment that could reshape the competitive landscape in AI hardware, pushing other companies to innovate or risk falling behind [56][68]

Other Important but Possibly Overlooked Content
1. **Memory Utilization**: The report highlights the inefficiency of traditional systems that process both prefill and decode on the same hardware, leading to resource wastage [62][66]
2. **Cooling Solutions**: The new rack designs incorporate advanced cooling to manage the increased power density and heat of the new GPUs [39][43]
3. **Modular Design**: The new compute trays feature a modular design that improves serviceability and reduces potential points of failure compared to previous designs [50][52]
4. **Power Budget**: The power budget for the new racks is significantly higher, reflecting the increased performance of the new hardware [29][39]

This summary captures the critical aspects of Nvidia's Rubin CPX announcement, its implications for the industry, and the technical advances that set it apart from competitors.
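The prefill/decode split in point 8 comes down to arithmetic intensity, which a roofline-style comparison makes concrete. A minimal sketch: the chip numbers are the Rubin CPX figures cited above (20 PFLOPS of dense compute, 2 TB/s of memory bandwidth), while the workload numbers (weight bytes touched per pass, FLOPs per token, an 8192-token prefill batch) are hypothetical illustrations, not measurements:

```python
# Roofline-style sketch of why prefill and decode stress different resources.
# Chip numbers are from the article; workload numbers are hypothetical.

def bound(flops_needed: float, bytes_moved: float,
          peak_flops: float, peak_bw: float) -> str:
    """Which resource limits the kernel, per the roofline model."""
    compute_time = flops_needed / peak_flops
    memory_time = bytes_moved / peak_bw
    return "compute-bound" if compute_time > memory_time else "memory-bound"

PEAK_FLOPS = 20e15   # 20 PFLOPS dense compute
PEAK_BW = 2e12       # 2 TB/s memory bandwidth

weight_bytes = 100e9      # hypothetical: 100 GB of weights read per pass
flops_per_token = 200e9   # hypothetical: 200 GFLOP per token

# Prefill: thousands of tokens amortize one sweep over the weights.
print("prefill:", bound(8192 * flops_per_token, weight_bytes, PEAK_FLOPS, PEAK_BW))
# Decode: a single token pays for the same sweep over the weights.
print("decode: ", bound(1 * flops_per_token, weight_bytes, PEAK_FLOPS, PEAK_BW))
```

Under these assumptions prefill is compute-bound (the batch amortizes the weight traffic) while decode is memory-bound, which is exactly the asymmetry the CPX's FLOPS-heavy, bandwidth-light design targets.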
Nvidia (NVDA) Company Note: Vast Long-Term Opportunity, Product Iteration Progressing Smoothly
SINOLINK SECURITIES· 2025-08-28 08:39
Investment Rating
- The report maintains a "Buy" rating for the company, indicating an expected price increase of over 15% in the next 6-12 months [5]

Core Insights
- The company reported FY26Q2 revenue of $46.743 billion, up 55.6% year-on-year and 6.1% quarter-on-quarter; GAAP gross margin was 72.4%, and GAAP net profit was $26.422 billion [2]
- The data center business continues to grow, with FY26Q2 revenue of $41.096 billion, up 56.4% year-on-year and 5.1% quarter-on-quarter; network revenue surged 98% year-on-year, driven by cabinet shipments and the Spectrum-X platform [3]
- Non-data-center businesses also grew strongly, with gaming, professional visualization, and automotive revenues of $4.287 billion, $601 million, and $586 million, up 48.9%, 32.4%, and 69.4% year-on-year, respectively [4]
- The company is expected to become a significant AI hardware platform, with downstream cloud vendors driving growth through model iteration and rising inference demand; sovereign AI demand is anticipated to contribute additional revenue [5]

Summary by Sections

Performance Review
- FY26Q2 revenue was $46.743 billion, with GAAP net profit of $26.422 billion and a GAAP gross margin of 72.4% [2]

Business Analysis
- Data center revenue for FY26Q2 was $41.096 billion, with a notable increase in network revenue due to product iteration and new customer adoption [3]
- Non-data-center segments, including gaming and automotive, also saw significant growth [4]

Profit Forecast and Valuation
- Projected GAAP net profits for FY26, FY27, and FY28 are $111.15 billion, $164.14 billion, and $188.22 billion, respectively, supporting the "Buy" rating [5]
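The reported growth rates can be inverted to recover the implied prior-period revenue, a useful consistency check on the note's figures. A small sketch using the FY26Q2 numbers above:

```python
# Back out prior-period revenue implied by the reported growth rates:
# FY26Q2 revenue of $46.743B, +55.6% YoY and +6.1% QoQ.

def prior_revenue(current: float, growth: float) -> float:
    """Revenue one period earlier implied by a growth rate."""
    return current / (1 + growth)

q2_fy26 = 46.743  # $B
print(f"Implied FY25Q2 revenue: ${prior_revenue(q2_fy26, 0.556):.2f}B")  # ≈ $30.04B
print(f"Implied FY26Q1 revenue: ${prior_revenue(q2_fy26, 0.061):.2f}B")  # ≈ $44.06B
```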