Large Model Inference Systems
24-Hour Global Political and Economic News Roundup | February 27
Gelonghui APP · 2026-02-27 00:40
Market Overview
- Major global stock indices were mixed: the Dow Jones Industrial Average rose 17.05 points (0.03%) to 49,499.2, while the Nasdaq fell 273.7 points (-1.18%) to 22,878.38 [1]
- The S&P 500 fell 37.27 points (-0.54%) to 6,908.86, and the European Stoxx 50 fell 11.76 points (-0.19%) to 6,161.56 [1]
- In Asia, the Hang Seng Index fell 220.46 points (-2.44%) to 8,814.29, while the Nikkei 225 rose 170.27 points (0.29%) to 58,753.39 [1]

Geopolitical Developments
- Tensions escalated between Pakistan and Afghanistan after intense border clashes, with Pakistan conducting airstrikes in response to casualties and territorial losses [2]
- Significant progress was reported in US-Iran nuclear negotiations, with further talks planned in Vienna to de-escalate regional tensions [3]
- The US Treasury proposed new rules to cut Swiss MBaer Bank off from the US financial system over its alleged support for illegal activities related to Russia and Iran [3]

Technology and Innovation
- DeepSeek, together with Tsinghua University and Peking University, released a paper on the DualPath model inference system, which improves offline inference by up to 1.87 times and online serving by 1.96 times [4]
- ASML confirmed that its next-generation EUV lithography machines are ready for mass production, a step expected to have a significant effect on advanced semiconductor manufacturing [5]
- Broadcom announced shipment of the industry's first 3.5D face-to-face computing SoC, built on 2nm technology to support next-generation AI computing needs [6]

Corporate Actions
- Netflix declined to raise its offer for Warner Bros. Discovery, citing financial concerns, and announced a stock buyback plan; its shares rose 10% in after-hours trading [5]
Enterprises Should Focus on Large Model Fine-Tuning and Inference to Integrate Technology with Business Scenarios
Zhongguo Zhengquan Bao (China Securities Journal) · 2025-10-29 21:10
Core Insights
- The core argument is that companies need "model fine-tuning" and "model inference application" to achieve high-quality development through AI technology [1][2].

Group 1: Model Development and Application
- The AI model lifecycle includes data acquisition, preprocessing, training, fine-tuning, and inference, with model training being the most critical phase [1].
- Because foundational models lack specialized domain data, they require "model fine-tuning" to adapt to specific industry needs, turning general capabilities into specialized applications for sectors such as healthcare, finance, and manufacturing [1].

Group 2: Efficient Implementation Strategies
- Companies are advised to build on existing foundational models from specialized tech firms such as DeepSeek and Huawei, rather than investing heavily in initial data acquisition and training [2].
- The AI PC architecture, centered on GPUs, offers significant computational advantages and enables personalized AI assistants for individuals [2].

Group 3: AI's Role in Business Transformation
- AI is positioned as core infrastructure rather than a mere IT tool, with the next competitive battleground for companies being the integration of data algorithms and computational efficiency [3].
- AI serves as a second engine for growth, reshaping products, services, and operational models, thereby enhancing revenue and profit margins [3].
- By optimizing internal processes and reducing operational costs, AI creates significant competitive advantages and barriers to entry for businesses [3].
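Three Releases Across Device and Cloud! Infinigence AI (无问芯穹) Open-Sources Large Model Inference Acceleration Tools, Stepping Up Its Build-Out of Next-Generation Device and Cloud Inference Systems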
Jiqizhixin (机器之心) · 2025-04-29 09:14
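Published by Jiqizhixin | Jiqizhixin Editorial Team

The AI field is currently developing along a "device-cloud concurrent" path: on-device and cloud-side large models each play to their strengths, and together they push out the boundaries of intelligence and real-world deployment. On-device models deliver real-time local responses at millisecond latency, while cloud-side models rely on powerful compute to support complex, large-scale inference. Both depend on an efficient inference system.

At GTC 2025, NVIDIA CEO Jensen Huang stressed that large model computing is shifting from pretraining to an inference-optimization phase. As industry adoption accelerates, demand for inference compute is growing explosively. Balancing performance, cost, and response speed has become a key engineering challenge, and the inference system is central to solving it.

Recently, Infinigence AI (无问芯穹) held an inference-system open-source festival and open-sourced three inference projects in a row: SpecEE, which speeds up on-device inference; Semi-PD, a new semi-disaggregated PD (prefill-decode) scheduling mechanism that separates computation while sharing storage; and FlashOverlap, a new computation-communication overlap method that is minimally intrusive to computation and orthogonal to communication. Together they support efficient inference system design at multiple levels. The three projects are walked through one by one below.

Day 1 | SpecEE: a speculation-based Early Exiting mechanism that makes AI PC inference take off

As DeepSeek and other open-source models show increasingly strong performance, demand for deploying large models locally on PCs keeps growing. Although in many cases using the cloud ...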