AI inference performance - filings, earnings calls, financial reports, news - Reportify

AI inference performance

Extreme Co-Design for Efficient Tokenomics and AI at Scale

NVIDIA· 2026-02-12 01:49

As AI evolves toward real-time reasoning, every part of the system is stressed at once: compute, memory, networking, storage, and even software. This new generation of AI requires extreme co-design: engineering the entire stack, across the whole data center, as a single system. This shift is especially clear for state-of-the-art mixture-of-expert models like DeepSeek-R1, Kimi K2 Thinking, and gpt-oss. Reasoning MoE models generate a large number of tokens, creating higher-quality answers for users ...

Extreme Co-Design

Mixture-of-expert models

AI inference performance
