Low-Latency Inference Solutions
Over $10 Billion! OpenAI Signs a Major AI Chip Deal
Xinhuanet Finance · 2026-01-16 03:34
Core Viewpoint
- OpenAI and Cerebras are collaborating to deploy a 750 MW wafer-scale system, which will become the world's largest high-speed AI inference platform by 2028, with a project value exceeding $10 billion [1].

Group 1: Collaboration and Market Demand
- The partnership between OpenAI and Cerebras signifies strong market demand for inference computing power and highlights the increasing importance of inference speed among tech giants [1].
- Cerebras, founded in 2015, aims to create the fastest AI inference and training platform, with its CS-2 and CS-3 systems already applied in fields such as medical research and cryptography [4].

Group 2: Technological Advancements
- Cerebras' distinctive design integrates massive computing power, memory, and bandwidth onto a single giant chip, eliminating the traditional hardware bottlenecks that limit inference speed [4].
- On code and voice-chat tasks, large language models running on Cerebras technology can respond up to 15 times faster than those running on GPU-based systems [4].

Group 3: Industry Trends
- The tech industry's history shows that speed has played a crucial role in technology adoption, with leaps in processor frequency and internet connectivity driving the growth of personal computing and the modern internet [5].
- Low-latency inference solutions provide faster response times and more natural interactions, enhancing productivity in the AI-driven market [5].

Group 4: Competitive Landscape
- In December 2025, AI chip startup Groq announced a non-exclusive licensing agreement with NVIDIA, valued at $20 billion, marking NVIDIA's largest transaction to date [5].
- NVIDIA plans to integrate Groq's low-latency processors into its AI factory architecture to support a broader range of AI inference and real-time workloads [6].
Over $10 Billion! OpenAI Signs a Major AI Chip Deal
Shanghai Securities News · 2026-01-15 15:47
On January 14 local time, OpenAI and US AI chip startup Cerebras announced that they will deploy 750 MW of Cerebras wafer-scale systems. The collaboration will roll out in phases starting in 2026 and be completed in 2028; once built, it will be the world's largest high-speed AI inference platform. According to CNBC, the deal is valued at more than $10 billion.

Cerebras co-founder and CEO Andrew Feldman said that partnering with OpenAI means bringing the world's leading AI models onto the world's fastest AI processors. Real-time inference, he said, will fundamentally transform the AI field, opening up entirely new ways to build and interact with AI models.

What makes Cerebras systems unique, reportedly, is that they integrate massive computing power, memory, and bandwidth onto a single giant chip, eliminating the bottlenecks in traditional hardware that constrain inference speed. On code and voice-chat tasks, large language models running on Cerebras respond up to 15 times faster than GPU-based systems.

Typically, models focused on logical reasoning need a long time to "think" before generating a response.

But looking back at the tech industry's history, speed has played a major role in driving technology adoption. Without the leap in clock frequency from kilohertz to megahertz to gigahertz, there would be no personal computer industry; likewise, without the shift from dial-up to broadband ...