AI Inference
China: Latest Developments in the Global AI Supply Chain; Key Opportunities for Asian Semiconductors
2025-08-19 05:42
We upgraded our Industry View to Attractive going into 2H25 and reiterate our preference for AI over non-AI. With semiconductor tariff and FX concerns behind us, we expect the sector to re-rate further. It is also a good time to preview key investment themes for 2026. See our report for details: Greater China Semiconductors: AI semi strength continues; upgrading Industry View to Attractive (13 Jul 2025). August 18, 2025 10:13 AM GMT | Investor Presentation Asia P ...
DigitalOcean(DOCN) - 2025 Q2 - Earnings Call Transcript
2025-08-05 13:00
DigitalOcean (DOCN) Q2 2025 Earnings Call, August 05, 2025 08:00 AM ET. Speaker 0: Ladies and gentlemen, thank you for standing by. My name is Krista, and I will be your conference operator today. At this time, I would like to welcome everyone to the DigitalOcean Second Quarter 2025 Earnings Conference Call. All lines have been placed on mute to prevent any background noise. After the speakers' remarks, there will be a question and answer session. Thank you. And I would now like to turn the conference o ...
Flipping the Inference Stack — Robert Wachen, Etched
AI Engineer· 2025-08-01 14:30
Flipping the Inference Stack: Why GPUs Bottleneck Real-Time AI at Scale. Current AI inference systems rely on brute-force scaling (adding more GPUs for each user), creating unsustainable compute demands and spiraling costs. Real-time use cases are bottlenecked by per-user latency and cost. In this talk, AI hardware expert and founder Robert Wachen breaks down why the current approach to inference does not scale, and why rethinking hardware is the only way to unlock real-time AI at scale.
Powering AI: Google Reports Surging 2024 Electricity & Water Use
2025-07-07 00:51
Summary of Key Points from the Conference Call

Company and Industry Overview
- The call primarily discusses **Google** and its sustainability efforts, particularly electricity and water usage, in the context of the **hyperscaler** industry, which includes major tech companies like **Microsoft** and **Meta** [2][8].

Core Insights and Arguments
1. **Electricity Usage Growth**: Google's electricity use surged **27% year-over-year** in 2024 to approximately **32 terawatt-hours (TWh)**, with a **25% increase in North America** and a **32% increase internationally** [2][8].
2. **Hyperscaler Demand**: Hyperscalers are on track for their **7th consecutive year** of **25%+ year-over-year electricity demand growth**, driven by rising AI inference demand [2][8].
3. **Data Center Capacity**: Assuming **85% average data center utilization**, the collective increase in electricity usage by Google, Microsoft, and Meta implies an additional **2.3 gigawatts (GW)** of data center capacity [2][8].
4. **Carbon-Free Energy Goals**: Google aims for **100% 24/7 carbon-free energy** by **2030**; in 2024 it met **66%** of its electricity demand with carbon-free energy, up slightly from **64%** in 2023 [8][11].
5. **Regional Performance**: In 2024, **9 of 20 grid regions** exceeded **80% carbon-free energy**; the U.S. stood at **70%**, while the Middle East/Africa and Asia Pacific lagged at **5%** and **12%**, respectively [8][9].
6. **Water Usage Increase**: Google's water withdrawal and consumption rose **28%** and **27% year-over-year**, to approximately **11 billion gallons** and **8 billion gallons**, respectively [15][17].
7. **Power Usage Effectiveness (PUE)**: Google's global average PUE remained low at **1.09x** in 2024, versus **1.10x** in 2023, indicating efficient energy use in its data centers [14][17].

Additional Important Insights
1. **Challenges in Achieving Carbon-Free Energy**: Google acknowledged market barriers to sourcing carbon-free energy, particularly in Asia Pacific and parts of the U.S., including constrained transmission grids and higher clean-energy costs [11][12].
2. **Trade-offs in Cooling Methods**: Google emphasized the balance between water use and electricity use in cooling data centers, noting that water is the most efficient cooling method in many regions [17][18].
3. **Future Projections**: The U.S. Department of Energy forecasts that direct water use by data centers could grow **17-33% annually** through **2028**, excluding indirect water use from electricity generation [17][18].

This summary encapsulates the critical points discussed in the conference call, highlighting Google's sustainability efforts and the broader implications for the hyperscaler industry.
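As a rough cross-check of the capacity math in the summary, the conversion from an annual energy increase to continuous capacity is straightforward arithmetic. This is a sketch only: the 2023 baseline is backed out from the reported 27% growth, and it covers Google alone, whereas the report's 2.3 GW figure is the Google, Microsoft, and Meta combined increase.

```python
# Sketch: implied data-center capacity from an electricity-usage increase,
# under the report's 85% average utilization assumption.
HOURS_PER_YEAR = 8760

def implied_capacity_gw(delta_twh: float, utilization: float = 0.85) -> float:
    """Convert an annual energy increase (TWh) into continuous capacity (GW)."""
    return delta_twh * 1000 / (HOURS_PER_YEAR * utilization)

google_2024_twh = 32.0
google_2023_twh = google_2024_twh / 1.27     # back out the 27% YoY growth
delta_twh = google_2024_twh - google_2023_twh  # ~6.8 TWh added in 2024

print(f"Implied added capacity (Google only): {implied_capacity_gw(delta_twh):.2f} GW")
```

Google's increase alone implies roughly 0.9 GW, which is directionally consistent with the 2.3 GW combined figure once Microsoft's and Meta's growth is added.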
AAI 2025: Enterprise AI Inference – An Uber™ Success Story
AMD· 2025-07-02 17:13
AI Workloads & AMD's Solutions
- AI workloads are classified into five buckets: traditional machine learning, recommendation systems, language models, generative AI, and mixed AI-enabled workloads [7]
- AMD offers both GPUs and CPUs to cover the span of all enterprise AI needs, supported through an open ecosystem [11]
- AMD's 5 GHz EPYC processor is purpose-built as a host processor for AI accelerators, leveraging the x86 ecosystem for broad software support and flexibility [13][14]
- AMD EPYC CPUs lead with 64 cores and 5 GHz operation, suitable for robust enterprise-class workloads [15]

Performance & Efficiency
- AMD EPYC CPUs deliver a 7% to 13% performance boost versus Xeon when used as a host processor for GPUs [17]
- For generative workloads, AMD EPYC CPUs show a 28% to 33% improvement over the competition [24]
- For natural language workloads, AMD EPYC CPUs outperform the latest-generation competition by 20% to 36% [25]
- AMD EPYC processors are built for low-power, low-cost AI inference, offering fast inference, easy integration, and the ability to add AI workloads without significant additional power consumption [28]

Uber's Use Case
- Uber handles 33 million trips daily for 170 million monthly active users, requiring a robust technology stack [30]
- Uber began its cloud migration journey with GCP and OCI in late 2022, focused on accelerating innovation and optimizing costs [33]
- Uber is migrating more workloads to AMD CPUs in a multi-cloud environment, leveraging next-gen technologies like PCIe Gen 6 and CXL [37]
- Uber expects over 30% better SPECjbb2015 perf per dollar with GCP C40 chips based on Turin architecture compared to CKD [38]
Citi: Global Semiconductors: GDDR7 to Drive Global DRAM Demand Higher in 2H25
Citi · 2025-06-16 03:16
Investment Rating
- The report reiterates Buy ratings on SK Hynix and Samsung Electronics on expected DRAM demand growth driven by GDDR7 and LPDDR5X [1][6]

Core Insights
- The global memory supply shortage is anticipated to intensify in 2H25, primarily on rising GDDR7 demand driven by advancements in AI inference models and edge AI devices [1][5]
- GDDR7 is expected to significantly enhance performance, with a 2x increase in per-pin data rates to 48Gbps and a doubling of bandwidth to 192GB/s per device [2]
- GDDR7 demand is projected to add 4.03 billion Gb to global DRAM demand in 2H25, representing a 24% increase in graphics DRAM demand and a 2.4% increase in overall global DRAM demand [4][7]

Summary by Sections
GDDR7 Technology
- GDDR7 adopts PAM3 signaling, improving data density per clock cycle by 50% over GDDR6 while operating at a lower voltage of 1.1-1.2V [2]
- The GDDR7 architecture uses four 8-bit channels, enhancing parallel processing and reducing latency for AI workloads [2]

AI Inference Demand
- The emergence of AI distillation technology is expected to drive significant memory demand for AI inference, increasing adoption of GDDR7 as an alternative to HBM [3]

Market Projections
- The report projects GPU demand from DeepSeek to reach 2 million units in 2H25, with each GPU requiring 96GB of DRAM, contributing to the overall demand increase [4]
- The anticipated DRAM content upgrade in Apple's iPhone 17 series is expected to add a further 3.2% to global DRAM demand in 2H25 [4]
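The per-device bandwidth figure in the note can be cross-checked with a quick sketch. The 32-pin count follows from the note's four 8-bit channels; a 48Gbps per-pin rate is the value consistent with the quoted 192GB/s per device (JEDEC's GDDR7 spec tops out at 48Gbps per pin).

```python
# Sketch: GDDR7 per-device bandwidth from the figures in the note.
pins = 4 * 8               # four 8-bit channels per device
gbps_per_pin = 48          # per-pin data rate consistent with 192GB/s
bandwidth_gb_s = pins * gbps_per_pin / 8   # divide by 8: bits -> bytes

print(f"Per-device bandwidth: {bandwidth_gb_s} GB/s")  # 192.0 GB/s
```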
Microsoft (MSFT): Agentic Web Likely to Accelerate AI Inference Development
Huatai Financial · 2025-06-09 05:48
Investment Rating
- The investment rating is maintained at BUY with a target price of USD 564.57, implying 20% upside from the closing price of USD 470.38 as of June 6, 2025 [1][8]

Core Insights
- The company is leveraging its enterprise capabilities and Azure product advantages to build a foundational platform for the Agentic Web, which is expected to accelerate Agent application development and increase AI inference demand for its cloud business [1][2]
- The infrastructure for Agentic Web development is complete, spanning edge- and cloud-side toolchains, which enhances development capabilities and supports third-party integrations [2]
- The cloud business has shown strong growth, with Azure and other cloud services revenue up 33% year-over-year in 3QFY25, driven by AI contributions [3]
- Commercialization of AI applications in the US is accelerating, with strategic collaborations with software vendors like SAP and ServiceNow deepening reliance on the company's cloud services [4]
- Earnings forecasts project revenue of USD 278.8 billion, USD 320.2 billion, and USD 368.8 billion for FY25E/FY26E/FY27E, with EPS rising to USD 13.77, USD 16.12, and USD 18.85 [5]

Summary by Sections
Development and Infrastructure
- The launch of the GitHub Coding Agent and a complete enterprise-grade Agent customization system supports multi-Agent orchestration and flexible model selection [2]
- The introduction of Windows AI Foundry supports local Agent development, creating a more complete development ecosystem [2]

Financial Performance
- Cloud revenue growth of 33% year-over-year in 3QFY25 was significantly supported by AI, which contributed 16 percentage points of Azure's revenue growth [3]
- The company processed over 100 trillion tokens in 3QFY25, a fivefold increase year-over-year, indicating strong demand for AI inference [3]

Earnings and Valuation
- The company maintains its earnings forecasts, with projected revenue and EPS growth over the next three fiscal years reflecting confidence in its competitive edge in AI and cloud [5]
- The stock is valued at 41x FY25E PE, above the peers' average of 29.8x, justifying the BUY rating [5]
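The note's valuation arithmetic ties out, which a quick sketch confirms. All inputs (close, target, FY25E EPS) are figures quoted in the summary; the 41x multiple is the FY25E P/E implied at the target price.

```python
# Sketch: verify the note's upside and implied-P/E arithmetic.
close, target, eps_fy25 = 470.38, 564.57, 13.77

upside = target / close - 1        # upside to target from the June 6 close
pe_at_target = target / eps_fy25   # FY25E P/E implied at the target price

print(f"Upside: {upside:.0%}")                       # ~20%
print(f"FY25E P/E at target: {pe_at_target:.0f}x")   # ~41x
```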