AI Inference
The Lazy Way to Play NVIDIA’s $20B Groq Deal
Yahoo Finance· 2025-12-30 13:24
For the last two years, the market has focused on training AI, building the massive digital brains behind chatbots and data models. To understand why the VanEck Semiconductor ETF (SMH) is the aggressive choice, investors must first understand the business case behind the merger. This scenario creates the perfect storm for semiconductor exchange-traded funds (ETFs). These funds offer a backdoor entry into the trade, allowing investors to participate in the upside without the stress of managing a single stock ...
Intel Snaps Up AI Tech for Pennies on the Dollar
Yahoo Finance· 2025-12-17 17:47
Market Timing: Intel is striking while the market is fearful, picking up a unicorn-status company for a fraction of its previous valuation.
The NVIDIA Moat: NVIDIA's overwhelming dominance has starved competitors of revenue, making it difficult for second-tier startups to raise the billions needed to stay afloat.
The Capital Crunch: High interest rates have made it expensive for startups to borrow money.
In 2021, during its peak funding rounds, SambaNova Systems was valued at over $5 billion. If the deal close ...
Intel Is Eyeing an AI Acquisition. Its Track Record Isn't Great.
The Motley Fool· 2025-12-16 00:15
Core Insights
- Intel is reportedly in talks to acquire SambaNova Systems, an AI start-up previously valued at $5 billion, with a rumored acquisition price of $1.6 billion [1][9]

Company Overview
- SambaNova focuses on fast and efficient AI inference, developing custom AI chips known as Reconfigurable Dataflow Units (RDUs) [2]
- The company offers a complete rack-scale solution called SambaRack, which integrates hardware, networking, and software, along with a cloud AI platform powered by its hardware [2]

Previous Acquisition Context
- Intel's last significant AI acquisition was Habana Labs in 2019 for approximately $2 billion, which focused on AI training processors [4]
- Despite launching Gaudi 2 and Gaudi 3 under Intel, the chips failed to gain traction against Nvidia's GPUs due to an unfamiliar architecture and immature software ecosystem [5][6]

Market Dynamics
- Nvidia's CUDA platform has become the industry standard for accelerated computing, providing a competitive edge over Intel in the AI training market [7]
- SambaNova's focus on AI inference solutions positions it in a more competitive market, where efficiency is crucial [10]

Recent Developments
- SambaNova has secured deals to power sovereign AI inference clouds in Australia, Europe, and the UK, and was selected by OVHcloud for its AI Endpoints solution [11]
- The shift towards rack-scale AI solutions aligns with Intel's strategy after canceling Falcon Shores, indicating a potential acceleration in developing integrated systems [12]

Strategic Implications
- Acquiring SambaNova could help Intel gain ground in the AI infrastructure market, especially given its focus on AI inference and rack-scale solutions [13]
Credo Stock: The Smart Money AI Bet?
Forbes· 2025-12-04 11:35
Core Insights
- Credo Technology has emerged as a crucial player in the generative AI sector, with a market capitalization of approximately $33 billion following a nearly 10% rise in stock price after a strong earnings report [2][5]
- The company's stock has increased over 2.5 times year-to-date, indicating significant investor interest and confidence in its role in AI infrastructure [2]

Financial Performance
- In Q2 FY'26, Credo reported a revenue increase of 272% year-over-year, reaching $268 million, while adjusted net income surged over 10 times to $128 million ($0.67 per share) [5]
- Guidance for Q3 indicates revenue could reach up to $345 million, representing 156% growth compared to the previous year [5]
- The company maintains robust profitability with a 19% operating margin and a 21% cash-flow margin, alongside a nearly debt-free balance sheet with over half of its assets in cash [13]

Technological Edge
- Credo addresses the "interconnect bottleneck" in data centers with Active Electrical Cables (AECs) and Bluebird DSPs, which enhance signal quality and reduce latency, making them suitable for high-density GPU environments [6][10][12]
- AECs allow for thinner, longer, and faster copper connections, achieving speeds up to 1.6 terabits per second without the heat and cost associated with optical cables [10]

Market Positioning
- Credo's growth is closely tied to the capital expenditure plans of major tech companies, with total projected capex of $364 billion from Amazon, Alphabet, Microsoft, and Meta for their current fiscal years [7][8]
- The company serves as a proxy for the spending of these "Big Four" tech giants, positioning itself as a focused investment in their competitive race to expand GPU clusters [8]

Future Opportunities
- The shift towards inference in AI could provide a significant boost for Credo, as inference requires high rack density and low latency, areas where Credo's technology excels [9][12]
- Despite high valuation metrics, including approximately 26 times trailing sales and over 120 times earnings, the company's strong fundamentals and rapid revenue growth justify these multiples [9][14] (see the arithmetic sketch below)
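The growth and valuation figures above can be sanity-checked with quick arithmetic. A minimal Python sketch; the implied year-ago quarters and trailing-sales figure are back-of-the-envelope derivations from the numbers quoted in the article, not company-reported values:

```python
# Back-of-the-envelope checks on the figures quoted above.
# All derived values are approximations, not reported numbers.

q2_fy26_revenue = 268e6        # Q2 FY'26 revenue, up 272% year-over-year
q2_fy25_implied = q2_fy26_revenue / (1 + 2.72)
print(f"Implied Q2 FY'25 revenue: ${q2_fy25_implied / 1e6:.0f}M")   # ~$72M

q3_guidance = 345e6            # Q3 guidance, +156% year-over-year
q3_fy25_implied = q3_guidance / (1 + 1.56)
print(f"Implied Q3 FY'25 revenue: ${q3_fy25_implied / 1e6:.0f}M")   # ~$135M

market_cap = 33e9              # ~$33B market capitalization
ps_multiple = 26               # ~26x trailing sales
implied_trailing_sales = market_cap / ps_multiple
print(f"Implied trailing sales: ${implied_trailing_sales / 1e9:.2f}B")  # ~$1.27B
```

The quoted multiples and growth rates are internally consistent: a ~$33 billion market cap at ~26x sales implies roughly $1.3 billion of trailing revenue, in line with a company running at a ~$268M-$345M quarterly pace.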
SSSTC Launches 16TB Enterprise SATA SSD with Breakthrough IOPS Performance
Newsfile· 2025-11-27 01:30
Taipei, Taiwan--(Newsfile Corp. - November 26, 2025) - Responding to the rapidly growing demand for high-density, low-latency storage in AI servers and data centers, SSSTC has introduced its next-generation enterprise solid-state drive (SSD), the ER4 Series SATA SSD. The new series offers capacities of up to 16TB, making it one of the few SATA SSDs on the market to deliver s ...
KV Cache Acceleration of vLLM using DDN EXAScaler
DDN· 2025-11-11 16:44
AI Inference Challenges & KV Caching Solution
- AI inference faces challenges with large context windows, impacting tokenization and latency [1][2]
- Caching context tokens speeds up responsiveness, lowers latency, and allows storing larger amounts of context [4] (illustrated in the sketch below)
- Effective caching requires storage systems with low latency and large capacity at scale [5]

DDN's Solution & Performance
- DDN's EXAScaler platform enables high-performance KV caching for AI inference, improving user concurrency, responsiveness, and user experience [7]
- DDN leverages GPU Direct Storage (GDS) for its caching engine [9]
- Caching demonstrates a 10x performance improvement with larger contexts [14]
- DDN's EXAScaler can improve time to first token during inference by 10-25x [16]
- DDN improves response times, provides a larger cache repository, and delivers cost-effective performance and capacity density [17]

Capacity Implications
- KV caching accelerates the end-user experience, putting a premium on high-performance shared storage [16]
- Approximately 200,000 input characters resulted in a cache of 796 files, totaling almost 13 gigabytes [15]
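The mechanism being accelerated here is KV (key-value) cache reuse: the attention keys and values computed for a long shared prompt prefix are stored and reused instead of being recomputed for every request. As a rough illustration of the idea (not DDN's implementation), here is a minimal Python sketch using vLLM's built-in prefix caching; the model name and prompt text are placeholders:

```python
# Minimal sketch: KV-cache (prefix) reuse with vLLM.
# Model name and prompts are placeholders, not from the article.
import time
from vllm import LLM, SamplingParams

llm = LLM(
    model="facebook/opt-125m",   # placeholder model for illustration
    enable_prefix_caching=True,  # reuse KV blocks for shared prompt prefixes
)
params = SamplingParams(max_tokens=32)

shared_context = "A very long document pasted as context... " * 200
questions = ["Summarize the document.", "List the key risks."]

for q in questions:
    start = time.perf_counter()
    llm.generate([shared_context + q], params)
    # The second request should reach its first token much faster: the KV
    # cache built for shared_context is reused rather than recomputed.
    print(f"{q!r}: {time.perf_counter() - start:.2f}s")
```

DDN's contribution, per the summary above, is extending this kind of cache beyond GPU memory onto fast shared storage, so that much larger contexts survive across requests and nodes.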
Scaling AI Inference Performance in the Cloud with Nebius
NVIDIA· 2025-11-10 14:01
AI Infrastructure & Cloud Platform
- The industry faces the challenge of rapid movement in AI infrastructure development [1]
- The company's mission is to provide scalable, high-performance, and highly reliable AI cloud infrastructure [1]
- The company builds its AI platform from the ground up, focusing on core AI scenarios: training, inference, and data processing [2]

Inference & Business Needs
- Inference is a priority because of the real business needs behind it [2]
- The power of AI lies in serving real customer use cases through inference [4]
- The company aims to make inference efficient and economically pragmatic for customers to facilitate their growth [4]

Model Scaling & Performance
- Model sizes are growing, requiring more memory and performance, including multi-node utilization and networking [3]
- Nvidia continuously pushes boundaries, providing new and more performant hardware [3]
- The company gives customers flexibility through managed Kubernetes with autoscaling, so workloads can scale up or down based on demand [3][4] (see the sketch below)
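Autoscaling an inference service usually means adjusting replica counts against a demand signal such as request queue depth or GPU utilization. A minimal, hypothetical sketch of that decision loop in Python; the function name, thresholds, and metric are illustrative assumptions, not Nebius's API:

```python
# Toy autoscaling decision for an inference service.
# Names, thresholds, and the queue-depth metric are illustrative
# assumptions; this is not Nebius's actual API.
import math

def desired_replicas(queue_depth: int,
                     target_per_replica: int = 8,
                     min_replicas: int = 1,
                     max_replicas: int = 32) -> int:
    """Size the deployment so each replica serves roughly
    target_per_replica queued requests, clamped to a fixed range."""
    want = math.ceil(queue_depth / target_per_replica)
    return max(min_replicas, min(max_replicas, want))

# Demand spikes, then falls off; the replica count follows.
for queue in [10, 64, 200, 40, 0]:
    print(f"queue={queue:4d} -> replicas={desired_replicas(queue)}")
```

In a managed Kubernetes setting, this is essentially the calculation a HorizontalPodAutoscaler performs when driven by a custom queue-depth or utilization metric.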
Google's Latest AI Chip Puts the Focus on Inference
The Motley Fool· 2025-11-09 11:42
Core Insights
- Google has launched its seventh-generation Tensor Processing Unit (TPU), named Ironwood, designed specifically for AI workloads, marking a significant advancement in AI computing capabilities [1][2][3]
- The new TPU offers a 10x peak performance improvement over the previous generation and more than 4x better performance per chip for both training and inference tasks [3]
- Google is positioning itself for the "age of inference," where the focus shifts from training AI models to utilizing them for practical applications, anticipating a surge in demand for AI computing [5][9]

Product Launch and Features
- Ironwood TPUs will be available to Google Cloud customers soon, alongside new Arm-based Axion virtual machine instances that improve performance per dollar [2]
- The Ironwood TPU is optimized for high-volume AI inference workloads, which require quick response times and the ability to handle numerous concurrent requests [4] (see the batching sketch below)

Market Position and Growth
- Google Cloud generated $15.2 billion in revenue in Q3, reflecting a 34% year-over-year increase, with operating income of $3.6 billion and an operating margin of approximately 24% [8]
- The cloud computing sector is competitive, with Microsoft Azure and Amazon Web Services also expanding their AI capabilities, but Google is leveraging its decade-long experience in TPU development to gain an edge [7][9]

Strategic Partnerships
- AI companies like Anthropic are expanding their use of Google's TPUs, with a new deal granting access to 1 million TPUs, which is crucial for Anthropic's goal of reaching $70 billion in revenue by 2028 [6]
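The throughput-versus-latency tension noted above is the defining constraint of inference serving: batching requests raises accelerator utilization, but every request in a batch waits for the whole batch. A toy Python model of that tradeoff; all numbers are made-up illustrative values, not Ironwood specifications:

```python
# Toy model of the batching tradeoff in inference serving.
# All numbers are illustrative; none are Ironwood specifications.
fixed_overhead_ms = 20.0   # per-batch cost (kernel launch, weight streaming)
per_request_ms = 2.0       # marginal compute per request within a batch

for batch_size in [1, 4, 16, 64]:
    batch_latency = fixed_overhead_ms + per_request_ms * batch_size
    throughput = batch_size / (batch_latency / 1000)  # requests per second
    print(f"batch={batch_size:3d}  latency={batch_latency:6.1f} ms  "
          f"throughput={throughput:7.1f} req/s")
# Larger batches amortize the fixed cost (higher throughput), but every
# request in the batch pays the longer batch latency.
```

Inference-optimized chips aim to shift this curve so that acceptable latency is reachable at much higher throughput per chip.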
Akamai (AKAM) - 2025 Q3 - Earnings Call Transcript
2025-11-06 22:30
Financial Data and Key Metrics Changes
- Akamai reported Q3 2025 revenue of $1.055 billion, representing a 5% year-over-year increase as reported and a 4% increase in constant currency [4][20]
- Non-GAAP operating margins improved to 31%, and non-GAAP earnings per share was $1.86, up 17% year-over-year as reported and in constant currency [4][20]
- Non-GAAP net income for Q3 was $269 million, with non-GAAP EPS of $1.86, exceeding guidance by $0.20 [21][24]

Business Line Data and Key Metrics Changes
- Cloud Infrastructure Services (CIS) revenue was $81 million, up 39% year-over-year as reported and in constant currency, accelerating from a 30% growth rate in Q2 [6][19]
- Security revenue reached $568 million, up 10% year-over-year as reported and 9% in constant currency, with high-growth security products generating $77 million, an increase of 35% year-over-year [20][14]
- Delivery revenue was $306 million, down 4% year-over-year as reported and in constant currency, but showing improved trends [20]

Market Data and Key Metrics Changes
- International revenue was $525 million, up 9% year-over-year, representing 50% of total revenue in Q3 [20]
- Foreign exchange fluctuations positively impacted revenue by $4 million sequentially and $8 million year-over-year [20]

Company Strategy and Development Direction
- Akamai is transitioning from a CDN pioneer to a leader in cloud security and distributed cloud computing, with a focus on AI inference capabilities [5][10]
- The launch of Akamai Inference Cloud aims to support the growing demand for AI inference on the internet, positioning the company to leverage its distributed architecture [7][11]
- The company emphasizes reliability, aiming for five nines of uptime, which is critical for attracting major clients like banks [75] (a quick calculation of what five nines allows appears below)

Management's Comments on Operating Environment and Future Outlook
- Management expressed confidence in the growth of CIS and high-growth security solutions, anticipating continued strong demand for AI-related services [20][24]
- The company expects Q4 revenue in the range of $1.065 billion to $1.085 billion, reflecting a 4%-6% increase as reported [23]
- Management noted that the AI inference market is at a transition point, with significant growth expected as AI systems are adopted at scale [10][12]

Other Important Information
- Akamai's CapEx for Q3 was $224 million, representing 21% of revenue, as the company continues to invest in its CIS business [21]
- The company did not repurchase any shares in Q3 but has spent $800 million year-to-date on share buybacks, marking the largest annual buyback in its history [21][22]

Q&A Session Summary
Question: Guidance on security and compute growth
- Management reiterated security growth at about 10% and compute growth slightly under 15% for the year, with momentum in CIS [28]
Question: Insights on Akamai Inference Cloud
- Management indicated strong interest and demand for AI applications, with many customers looking to adopt inference capabilities [30][32]
Question: Hiring strategy for sales reps
- The company is continuing to hire sales reps to support new business sales in security and compute, with a transformation expected to be largely complete by Q2 next year [36][37]
Question: Confidence in benefiting from capacity constraints at hyperscalers
- Management highlighted Akamai's unique platform and extensive points of presence, which allow it to provide faster services compared to hyperscalers [41][42]
Question: Opportunities in API Security
- Management confirmed ongoing efforts to extend API security into new agentic protocols, with strong interest from customers [44]
Question: CapEx requirements for inference
- Management noted that CapEx will closely follow revenue and demand, with expectations for gross margins similar to current compute margins [46][47]
Question: Traffic mix and future trends
- Management indicated that video delivery currently dominates traffic, but AI applications are expected to increase traffic significantly in the future [68][70]
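For context on the five-nines target mentioned above, 99.999% availability translates into a very small annual downtime budget; a quick Python check:

```python
# Annual downtime budget implied by an availability target.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # ~525,960 minutes

for label, availability in [("three nines", 0.999),
                            ("four nines", 0.9999),
                            ("five nines", 0.99999)]:
    downtime = MINUTES_PER_YEAR * (1 - availability)
    print(f"{label} ({availability:.3%}): ~{downtime:,.1f} min/year")
# Five nines allows only about 5.3 minutes of downtime per year,
# which is why it matters to customers like banks.
```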
Supermicro Launches New 6U 20-Node MicroBlade with AMD EPYC 4005
Yahoo Finance· 2025-10-30 13:31
Core Insights
- Super Micro Computer Inc. (NASDAQ:SMCI) is recognized as a promising growth stock for the next five years, particularly following the launch of its new 6U 20-node MicroBlade system featuring AMD EPYC 4005 Series processors [1][3]

Product Launch and Features
- The new 6U MicroBlade system is designed to be a cost-effective and environmentally friendly solution for cloud service providers, achieving 3.3 times higher density than traditional 1U servers and allowing up to 160 servers and 16 Ethernet switches in a single 48U rack, for up to 2,560 CPU cores per rack [2] (the density arithmetic is worked through below)
- The system utilizes Supermicro's unique building-block architecture, providing up to 95% cable reduction, 70% space savings, and 30% energy savings compared to traditional 1U servers, which helps enterprises maximize their Total Cost of Ownership (TCO) savings [3]

Company Overview
- Super Micro Computer Inc. and its subsidiaries develop and sell server and storage solutions based on modular and open-standard architecture across the US, Asia, Europe, and internationally [4]
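The density figures quoted above are internally consistent, which a short calculation confirms. A minimal Python check; the eight-enclosures-per-rack value is inferred from 48U / 6U rather than stated in the article:

```python
# Checking the rack-density claims in the Supermicro announcement.
# The enclosures-per-rack value is inferred (48U / 6U), not quoted.
rack_units = 48
enclosure_units = 6
nodes_per_enclosure = 20

enclosures_per_rack = rack_units // enclosure_units           # 8
servers_per_rack = enclosures_per_rack * nodes_per_enclosure  # 160, as claimed
cores_per_node = 2560 / servers_per_rack                      # 16 cores per node

# Versus 1U servers: 160 nodes in 48U, against 48 one-node 1U servers.
density_ratio = servers_per_rack / rack_units                 # ~3.33x
print(servers_per_rack, cores_per_node, round(density_ratio, 2))
```

Eight 6U enclosures of 20 nodes each yield the claimed 160 servers per 48U rack, and 2,560 cores across 160 nodes implies 16-core EPYC 4005 parts; 160 nodes versus 48 traditional 1U servers gives the advertised ~3.3x density.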