Model Inference
Why Is the AI Chip Company Spun Off from SenseTime Betting Everything on the Model Inference Market?
Nan Fang Du Shi Bao· 2025-11-25 06:45
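Sunrise (曦望) grew out of SenseTime's chip business unit. It became independent at the end of 2024 and closed its first round of outside financing, and in July this year it announced a new round of nearly 1 billion yuan. SenseTime co-founder Xu Bing serves as chairman, and co-CEOs Wang Yong and Wang Zhan both previously worked at Baidu. The company has launched three generations of inference chips so far. According to the reporter, the first-generation S1 entered mass production in 2020 as a vision inference chip, mainly serving SenseTime's computer vision (CV) business, with cumulative sales of more than 20,000 units. The second-generation S2 has been in mass production since September 2024 and uses a GPGPU (general-purpose GPU) architecture; the company claims its measured performance is close to 80% of NVIDIA's A100. The third-generation S3 was formally approved as a project in May 2025 and is expected to be powered on ("lit up") in 2026, a step that signals the chip's design and manufacturing have succeeded. The S3 is custom-optimized for large-model inference and supports the low-precision FP8 and FP4 (8-bit and 4-bit floating-point) data formats. According to Yan Yan, the S3 will carry more than 200 GB of memory and enough bandwidth to meet inference needs. The company's goal is to bring its cost of deploying large-model inference close to that of NVIDIA's next-generation Rubin-architecture chips. Compared with training chips, inference chips are relatively less difficult to design and handle a smaller scale of data, which has made them a fiercely contested field among China's AI chip companies. Meanwhile, the accelerating adoption of AI applications is driving up market demand for inference compute. This is also why Sunrise is betting fully on ...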
Tencent Cloud's Qiu Yuepeng: The Model Industry's Center of Gravity Has Shifted to Inference
Di Yi Cai Jing· 2025-09-16 02:52
Core Viewpoint - The focus of the model industry is shifting towards inference, with a predicted inflection point in 2025 when inference demand will surpass training demand [1] Group 1 - Monthly active users (MAU) of Tencent's IMA have grown 80-fold in six months [1] - The "Yuanbao" service has answered high school entrance exam questions 150 million times [1] - By 2028, it is expected that 33% of enterprise software will include Agentic AI [1] Group 2 - The domestic enterprise-level Agent (intelligent agent) market is projected to exceed $27 billion [1]
Tencent Cloud's Qiu Yuepeng: The Model Industry's Center of Gravity Has Shifted to Inference
Di Yi Cai Jing· 2025-09-16 02:28
Core Insights - Tencent's IMA (Intelligent Model Assistant) has seen a significant increase in monthly active users, growing 80 times within six months [1] - The focus of the model industry is shifting towards inference, with expectations that by 2025, the demand for inference will surpass that for training [1] - The IMA has answered high school entrance exam questions 150 million times, indicating its widespread usage and utility [1] - By 2028, it is projected that 33% of enterprise software will incorporate Agentic AI, with the domestic enterprise-level Agent market expected to exceed $27 billion [1]
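Lenovo Applies for a Patent on a Model Inference Method and Electronic Device That Determines the Inference Result from a Third Vector and a Target Vector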
Jin Rong Jie· 2025-08-06 12:32
Group 1 - Lenovo (Beijing) Co., Ltd. has applied for a patent titled "A Model Inference Method and Electronic Device," with publication number CN120430411A, and the application date is April 2025 [1] - The patent abstract reveals a method for model inference that includes determining first and second vectors based on the acquired text, processing the first vector to determine a third vector with a smaller dimension and a greater head count than the first vector [1] - The company was established in 1992 and is primarily engaged in the manufacturing of computers, communications, and other electronic devices, with a registered capital of 565 million Hong Kong dollars [1] Group 2 - Lenovo (Beijing) Co., Ltd. has made investments in 107 companies and participated in 5,000 bidding projects [1] - The company holds 1,751 trademark records and 5,000 patent records, along with 238 administrative licenses [1]
Zhou Hongyi: 360 Has Recently Been Buying Huawei Chips; Domestic Chips Offer Good Value for Money
Nan Fang Du Shi Bao· 2025-07-23 14:03
Group 1 - The gap between domestic chips and Nvidia is acknowledged, but the necessity to use domestic products is emphasized for improvement [1] - 360 Group has recently procured Huawei's chip products, indicating a shift towards domestic technology [1] - Nvidia's H20 chip has been approved for sale to China, which is more suitable for model inference, providing opportunities for domestic AI chips [2] Group 2 - DeepSeek has contributed significantly to the popularity of inference models, although it recently experienced a decline in monthly active users [2] - The decline in DeepSeek's application traffic is not solely negative, as many cloud vendors still rely on DeepSeek's model services [2] - The performance enhancement of open-source models has laid the foundation for the booming AI agents this year, which are seen as key to AI implementation [3] Group 3 - AI coding has emerged as a hot vertical direction for AI agents, with a focus on engineering capabilities like context and prompt engineering [3] - The development of specialized AI agents tailored to different industries is recommended to create unique technical barriers [3] - The potential disruptive future of AI agents has led to significant changes in operational strategies within companies, with a push for efficiency through AI utilization [3]
AI Compute Demand Surges Toward Model Inference, and Domestic Chips Step Into the Arena
Di Yi Cai Jing· 2025-05-28 07:22
Core Insights - The Chinese data center accelerator card market is experiencing a significant shift, with domestic computing power expected to exceed 40% in the first half of 2024, up from approximately 30% last year [1][2] - NVIDIA's CEO highlighted the ongoing AI investment trend, indicating that the demand for AI computing power is evolving, particularly with the rise of inference chips [1][8] - The introduction of DeepSeek has led to a notable increase in the demand for inference chips, which are expected to constitute over 57.6% of the market by 2024 [8][11] Market Dynamics - The construction of data centers is accelerating, with a projected 97.3% year-on-year growth in China's accelerated computing server market in 2024 [4] - The number of successful bids for intelligent computing centers in China has increased significantly, indicating a robust demand for computing resources [4] - Universities and enterprises are increasingly seeking computing power, with many opting for cloud solutions or purchasing their own computing cards [5][6] Technological Shifts - The demand for inference capabilities is reshaping the chip composition in the market, allowing domestic chips to gain traction as they are suitable for inference tasks [11][12] - The performance requirements for inference chips are lower, enabling a broader range of domestic chips to compete effectively against NVIDIA [10][11] - Companies like Tencent are adapting to the changing landscape by increasing their focus on inference needs, indicating a shift in AI application strategies [9][13] Competitive Landscape - NVIDIA's market share in China's data center accelerator card market has decreased from 95% to around 65.2%, while domestic chip manufacturers are gaining ground [11][13] - The introduction of export controls on NVIDIA's chips has prompted the company to consider launching a new AI chip tailored for the Chinese market [13] - Domestic AI chip manufacturers, such as Cambricon, are beginning to report profitability, reflecting a positive trend in the domestic chip market [12]