AI Inference
Major AI news! Huawei's "black tech" has arrived
China Fund News (Zhong Guo Ji Jin Bao) · 2025-08-12 07:40
Core Insights
- Huawei has officially launched UCM (Unified Cache Manager), an AI inference technology aimed at the challenges of inference efficiency and user experience [1]
- The AI industry is shifting its focus from maximizing model capabilities to optimizing the inference experience, which directly affects user satisfaction and commercial viability [1]

Group 1: UCM Technology Overview
- UCM is a KV Cache-centered inference acceleration suite that integrates several caching algorithms to manage KV Cache memory data during inference, raising throughput and reducing latency [2]
- Growing demand for AI inference has pushed KV Cache capacity beyond GPU memory limits, which calls for solutions such as UCM [2][3]
- UCM's core value lies in faster inference responses and longer inference sequences, two limitations of current AI models [2]

Group 2: Performance Improvements
- UCM supports dynamic KV offloading and positional-encoding extension, achieving a tenfold increase in the inference context window [3]
- The technology lets data flow on demand across different storage media (HBM, DRAM, SSD), improving TPS (tokens per second) by 2 to 22 times and thereby reducing the cost per token [4]
- Mainstream AI models in China currently output tokens significantly more slowly than their international counterparts, which highlights the need for UCM's capabilities [4]

Group 3: Practical Applications
- Huawei's AI inference acceleration solution, developed with China UnionPay, is being piloted in three business scenarios: customer voice, marketing planning, and office assistant [5]
- The office assistant application supports user inputs exceeding 170,000 tokens, overcoming challenges associated with long-sequence models [5]
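The on-demand movement of KV Cache data across HBM, DRAM and SSD described in Group 2 can be illustrated with a toy tiered cache. This is a minimal sketch under stated assumptions, not Huawei's UCM implementation or API: the class name `TieredKVCache`, the slot-based capacities, and the least-recently-used demotion policy are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


class TieredKVCache:
    """Toy three-tier cache: hot entries stay in 'HBM', colder ones are
    demoted to 'DRAM' and then 'SSD'.

    Hypothetical illustration only. Capacities count entries, not bytes,
    and the 'SSD' tier is treated as unbounded.
    """

    TIERS = ("HBM", "DRAM", "SSD")

    def __init__(self, hbm_slots, dram_slots):
        self.capacity = {"HBM": hbm_slots, "DRAM": dram_slots, "SSD": None}
        # Each tier keeps its entries in least-recently-used-first order.
        self.tiers = {name: OrderedDict() for name in self.TIERS}

    def put(self, key, value):
        """Insert or refresh an entry in the fastest tier, demoting overflow."""
        for tier in self.tiers.values():
            tier.pop(key, None)
        self.tiers["HBM"][key] = value
        self._demote_overflow()

    def get(self, key):
        """Return the value and promote it to HBM; return None on a miss."""
        for tier in self.tiers.values():
            if key in tier:
                value = tier.pop(key)
                self.tiers["HBM"][key] = value
                self._demote_overflow()
                return value
        return None

    def location(self, key):
        """Name of the tier currently holding the key, or None."""
        for name in self.TIERS:
            if key in self.tiers[name]:
                return name
        return None

    def _demote_overflow(self):
        # Walk from the fastest tier down, pushing the least recently used
        # entries of an over-full tier into the next slower tier.
        for upper, lower in zip(self.TIERS, self.TIERS[1:]):
            cap = self.capacity[upper]
            while cap is not None and len(self.tiers[upper]) > cap:
                old_key, old_value = self.tiers[upper].popitem(last=False)
                self.tiers[lower][old_key] = old_value
```

With two HBM slots and two DRAM slots, inserting five blocks leaves the two newest in HBM, the next two in DRAM and the oldest on SSD; reading the oldest block promotes it back to HBM and pushes a colder block down. A real system would also have to account for transfer bandwidth and block sizes, which this sketch ignores.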
Major AI news! Huawei's "black tech" has arrived
China Fund News (中国基金报) · 2025-08-12 07:37
Core Viewpoint
- Huawei has officially launched the AI inference technology "UCM" (Inference Memory Data Manager) to address challenges in AI inference efficiency and user experience [2][4].

Group 1: AI Inference Development
- The AI industry is shifting its focus from maximizing model capabilities to optimizing the inference experience, which directly affects user satisfaction and commercial viability [4].
- Huawei plans to open-source UCM in September, releasing it first in the ModelEngine (魔擎) community and then gradually contributing it to mainstream inference engine communities [5].

Group 2: UCM Technology and Benefits
- UCM is a KV Cache-centered inference acceleration suite that integrates several cache acceleration algorithms to manage KV Cache memory data during inference, raising throughput and reducing latency [7].
- UCM enables longer inference sequences by offloading cache to external storage, achieving a tenfold increase in the inference context window [8].

Group 3: Cost Efficiency and Performance
- UCM dynamically places data across HBM, DRAM, and SSD according to how heavily it is accessed, improving TPS (tokens per second) by 2 to 22 times and thus lowering the cost per token [11].
- Mainstream AI models in China currently output fewer than 60 tokens per second with a latency of 50 to 100 ms, while leading models abroad reach 200 tokens per second with a latency of 5 ms [11].

Group 4: Practical Applications
- Huawei's AI inference acceleration solution, which combines UCM with OceanStor A series technology, is being piloted with China UnionPay in three business scenarios: Voice of Customer, Marketing Planning, and Office Assistant [12].
- In the Office Assistant scenario, the solution supports user input of over 170,000 tokens for long-sequence inference, addressing the limitations of long-sequence models [15].
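The speed and latency figures in Group 3 can be cross-checked with simple arithmetic. The sketch below assumes that the quoted latency is the time per output token for a single user stream, and that cost per token scales inversely with throughput on fixed hardware; neither assumption is stated in the source, and the function names are hypothetical.

```python
def per_token_latency_ms(tokens_per_second):
    """Milliseconds per output token at a given single-stream output speed."""
    return 1000.0 / tokens_per_second


def throughput_tokens_per_s(latency_ms):
    """Single-stream output speed implied by a per-token latency."""
    return 1000.0 / latency_ms


def relative_cost_per_token(speedup):
    """Cost per token relative to baseline, assuming fixed hardware cost
    per unit of time (an assumption, not a figure from the source)."""
    return 1.0 / speedup


print(per_token_latency_ms(200))      # 5.0 ms, matching the quoted 5 ms
print(throughput_tokens_per_s(50))    # 20.0 tokens/s
print(throughput_tokens_per_s(100))   # 10.0 tokens/s
print(relative_cost_per_token(2))     # 0.5
print(relative_cost_per_token(22))    # roughly 0.045
```

Under these assumptions, 200 tokens per second corresponds exactly to the quoted 5 ms, and a latency of 50 to 100 ms corresponds to 10 to 20 tokens per second, which is consistent with "fewer than 60 tokens per second". A 2x to 22x TPS gain would cut the cost per token to between one half and roughly one twenty-second of the baseline.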
Huawei releases AI inference technology UCM: high-throughput, low-latency inference at a lower cost per token
Sina Tech (Xin Lang Ke Ji) · 2025-08-12 07:22
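According to Huawei, UCM (Inference Memory Data Manager), the AI inference technology released this time, is a KV Cache-centered inference acceleration suite. It combines multiple types of cache acceleration algorithms and tools and manages the KV Cache memory data generated during inference in tiers, enlarging the inference context window to deliver high-throughput, low-latency inference and lower the inference cost per token.

Editor in charge: Guo Xutong

Sina Tech, afternoon of August 12: At the 2025 Forum on the Deployment and Development of Financial AI Inference Applications, Huawei and China UnionPay jointly released UCM (Inference Memory Data Manager), an AI inference technology that delivers high-throughput, low-latency inference.

AI is advancing rapidly in today's digital era. The boom in large-model training has not yet subsided, but the inference experience has quietly become the key to AI applications. A white paper released by CSC Financial (中信建投) during WAIC 2025 notes that AI is growing rapidly as it undergoes a structural shift from training to inference. Against this backdrop, the inference experience matters more than ever.

The inference experience directly shapes how users feel when interacting with AI, including response latency, answer accuracy, and the ability to reason over complex context. Available data show that mainstream models abroad have reached single-user output speeds in the range of 200 tokens/s (latency of 5 ms), while models in China are generally below 60 tokens/s (latency of 50 to 100 ms). How to solve the problems of inference efficiency and user experience is ...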
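Zhang Yidong: Volatility is the battery that charges Hong Kong stocks' long-term rally! Hang Seng Tech ETF Fund (513260) and China Universal Hong Kong Stock Connect Tech ETF (520980) keep attracting inflows during the pullback!
Sina Finance (Xin Lang Cai Jing) · 2025-08-12 06:57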
Market Overview
- The Hong Kong stock market experienced a collective decline, with the Hang Seng Tech ETF (513260) dropping by 0.43% despite attracting over 640 million yuan in net inflows over the past 10 days [1]
- The financing balance for the Hang Seng Tech ETF has exceeded 130 million yuan, with a recent financing purchase amounting to 39.57 million yuan [1]

Sector Performance
- The technology sector in Hong Kong showed mixed results, with notable gains from Huahong Semiconductor (up over 4%), SMIC (up over 3%), and BYD Electronics (up over 2%) [4]
- Conversely, Kuaishou saw a significant drop of over 8%, while Alibaba and Tencent experienced slight declines [4]

Company Insights
- Huawei is set to unveil breakthrough technology in AI inference at a forum on August 12, which may reduce reliance on HBM technology and enhance the performance of domestic AI models [5]
- The performance of major tech companies is expected to be a catalyst for market movements, with a focus on their mid-year earnings reports [8]

Investment Sentiment
- Analysts from Xinyi Securities maintain a bullish long-term outlook for Hong Kong stocks, emphasizing the strengthening position of Hong Kong as an international financial center and the positive feedback loop from quality companies listing in Hong Kong [6]
- The market is anticipated to experience a phase of consolidation, with a focus on mid-year earnings and value propositions [6][8]

Long-term Outlook
- The long-term outlook for Hong Kong stocks remains optimistic, driven by improving supply-demand dynamics and the potential for economic recovery from a "passive destocking" phase [8]
- The technology sector is viewed as a key driver for economic transformation, with AI playing a significant role in future growth [9]
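Huawei releases AI inference technology UCM
People's Financial News (人民财讯), August 12: On August 12, Huawei officially released UCM (Inference Memory Data Manager), an AI inference technology. UCM is a KV Cache-centered inference acceleration suite that combines multiple types of cache acceleration algorithms and tools and manages the KV Cache memory data generated during inference in tiers. This enlarges the inference context window, delivers high-throughput, low-latency inference, and lowers the inference cost per token. The technology has been piloted first at China UnionPay, as a smart-finance AI inference acceleration application in three business scenarios ("Voice of the Customer", "Marketing Planning" and "Office Assistant"), and has already produced results. ...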
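Huawei to unveil breakthrough "black tech" in AI inference; supply-demand imbalance may push Q3 DDR4 contract prices up 85%-90% quarter over quarter (Investment Morning Reference)
National Business Daily (Mei Ri Jing Ji Xin Wen) · 2025-08-12 01:01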
Market News
- The three major US stock indices experienced slight declines, with the Dow Jones down 0.45%, the Nasdaq down 0.3%, and the S&P 500 down 0.25%. Major tech stocks mostly fell, including Apple, Microsoft, Nvidia, Google, Amazon, Meta, and AMD, while Intel dropped over 3% and Tesla rose over 2% [1]
- Chinese concept stocks mostly declined, with the Nasdaq China Golden Dragon Index down 0.29%. Notable declines included TAL Education down over 3%, Li Auto down nearly 3%, and Baidu and Alibaba down over 1% [1]
- Metal futures generally fell, with COMEX gold futures down 2.80% at $3393.7 per ounce and COMEX silver futures down 2.33%. International oil prices saw slight increases, with WTI crude up 0.19% at $64.00 per barrel and Brent crude up 0.15% at $66.69 per barrel [1]

Industry Insights
- Huawei held a forum titled "AI Rise, Opening a New Chapter in Smart Finance," discussing the importance of the AI inference experience and the launch of AI inference acceleration technology, which aims to reduce reliance on HBM technology and enhance AI model performance in China [2]
- TrendForce reported that the DDR4 market will face sustained supply shortages and price increases in the second half of 2025, driven by strong server orders that squeeze the supply available for computers and end-users. The price of Consumer DDR4 contracts surged by 60% to 85%, leading to a significant upward revision of third-quarter prices by 85% to 90% [3][4]
- The Hangzhou Municipal Justice Bureau is seeking public opinion on a draft regulation to promote the development of embodied intelligent robots, focusing on enhancing computing resource efficiency and reducing costs, with an emphasis on core technologies in the field [5][6]
- The market for embodied intelligence is expected to grow significantly, potentially exceeding one trillion yuan by 2026, driven by advancements in humanoid robots and AI models [6]

Stock Movements
- A number of companies announced share reduction plans, including Aokang International, Tianfu Communication, and Qide New Materials, with various shareholders planning to reduce their stakes through centralized bidding or block trading [7][8]
- Chongqing Bank reported that a major shareholder plans to reduce its stake by up to 52 million shares, which would decrease its holding from 8.5% to 7% [8]
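Shanghai Composite hits another year-to-date high; more than 4,200 A-shares rise as lithium miners surge
National Business Daily (Mei Ri Jing Ji Xin Wen) · 2025-08-11 08:16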
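NBD editor | Jin Mingyu

On August 11, the market climbed in choppy trading throughout the day, led by the ChiNext Index; the Shanghai Composite and the Shenzhen Component both set new intraday highs for the year. A-share turnover for the day was 1.85 trillion yuan, up 113.668 billion yuan from the previous trading day. Market hot spots rotated in an orderly way and gainers outnumbered losers: more than 4,200 stocks rose, and over a hundred gained more than 9%. At the close, the Shanghai Composite was up 0.34%, the Shenzhen Component up 1.46%, and the ChiNext Index up 1.96%.

On the news, all lithium carbonate futures contracts hit the daily limit at the open on the morning of August 11; the most-active contract rose 8% to 81,000 yuan per tonne.

Many brokerages see the production halt as a significant positive. According to a Caitong Securities research report, several lithium mines in Jiangxi may also face shutdowns because of the mining-licence approval process, which could affect 7,000 to 8,000 tonnes of lithium carbonate equivalent per month. Reclassifying the mines from china clay to lithium ore also carries a higher tax rate that will raise costs substantially. With supply and demand tightening further in the traditional peak season from September to November, multiple factors are pushing up lithium carbonate prices. Tianfeng Securities likewise believes that the resolution of the dispute over the Ningde (CATL) lithium mine sets a precedent for similar issues in Jiangxi, that expectations of tighter lithium carbonate supply are intensifying, and that a revaluation may follow.

Computing-hardware stocks strengthened in volatile trading, with Gaoxin Development (高新发展) hitting the daily limit.

On the news front, Huawei will release a breakthrough in AI inference technology on August 12 that may reduce China's reliance on HBM (high-bandwidth memory) for AI inference, improve the inference performance of domestic large AI models, and fill in a key part of China's AI inference ecosystem.

The Shanghai Composite and the Shenzhen Component both once again ...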
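Financial AI forum approaches! Huawei to release breakthrough results; Xinchuang ETF Fund (562030), focused on self-controllable information technology, rises more than 1% intraday
Sina Fund (Xin Lang Ji Jin) · 2025-08-11 03:02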
Core Insights
- The focus on the self-controllable information technology sector is driving the performance of the Xinchuang ETF fund, with significant gains in constituent stocks like Dongfang Guoxin and Electric Science Cybersecurity [1][5]

Group 1: Market Performance
- On August 11, the Xinchuang ETF fund (562030) rose more than 1% intraday and was up 0.79% at the time of writing [1]
- Key constituent stocks such as Dongfang Guoxin rose by over 5%, while Electric Science Cybersecurity increased by more than 4% [1]

Group 2: Technological Developments
- Huawei is set to unveil breakthrough technologies in AI inference at a forum on August 12, which may reduce China's reliance on high bandwidth memory (HBM) technology and enhance domestic AI model performance [3]
- The new high-performance AI storage introduced by Huawei aims to significantly improve data loading times and increase computing cluster efficiency from 30% to 60% [3]

Group 3: Industry Growth Projections
- The Xinchuang industry is transitioning from a policy-driven model to one driven by both policy and the market, with significant growth expected in the coming years [4]
- Market growth rates are projected to reach 17.84% in 2025 and 26.82% in 2026, with the market size expected to exceed 2.6 trillion yuan by 2026 [4]

Group 4: Investment Logic
- The Xinchuang ETF fund tracks the CSI Xinchuang Index, which covers the core segments of the Xinchuang industry and offers high growth and elasticity [5]
- Key investment drivers include geopolitical tensions necessitating self-sufficiency, increased government procurement, and breakthroughs in technology by domestic manufacturers like Huawei [5][6]
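The growth projections in Group 3 imply approximate market sizes for the earlier years. The sketch below backs them out, assuming that "exceed 2.6 trillion yuan" can be read as a floor of 2.6 trillion and that both growth rates apply to the same market measure; the source confirms neither, and the helper name is hypothetical.

```python
def implied_prior_size(size_after_growth, growth_rate_percent):
    """Back out the prior-year size from a size and the growth rate that produced it."""
    return size_after_growth / (1 + growth_rate_percent / 100)


size_2026 = 2.6  # trillion yuan; treated as a floor because the text says "exceed"
size_2025 = implied_prior_size(size_2026, 26.82)
size_2024 = implied_prior_size(size_2025, 17.84)

print(f"Implied 2025 size: about {size_2025:.2f} trillion yuan")
print(f"Implied 2024 size: about {size_2024:.2f} trillion yuan")
```

Under these assumptions the projections correspond to a market of roughly 2.05 trillion yuan in 2025 and roughly 1.74 trillion yuan in 2024. These are derived figures, not numbers reported in the article.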
Growing like a rocket! Networking becomes the hidden pillar of NVIDIA's (NVDA.US) AI chip dominance
Zhitong Finance (智通财经网) · 2025-08-11 02:41
Core Viewpoint
- Investors' attention in NVIDIA's Q2 earnings report will be on its data center business, which generates most of its revenue through high-performance AI processors [1]

Group 1: Data Center Business
- NVIDIA's data center segment generated $115.1 billion in revenue last fiscal year, with the networking business contributing $12.9 billion, surpassing the gaming segment's revenue of $11.3 billion [1]
- In Q1, the networking business contributed $4.9 billion to the data center revenue of $39.1 billion, indicating strong growth potential as AI computing power expands [2]

Group 2: Network Technology
- NVIDIA's networking products, including NVLink, InfiniBand, and Ethernet solutions, are essential for connecting chips and servers within data centers, enabling efficient AI application performance [1][2]
- The three types of networks (NVLink for intra-server communication, InfiniBand for inter-server connections, and Ethernet for storage and system management) are critical for building large-scale AI systems [3]

Group 3: Importance of Network Business
- The networking business is considered one of the most undervalued parts of NVIDIA's operations, with its growth rate described as "rocket-like" despite accounting for only 11% of total revenue [2]
- Without the networking business, NVIDIA's ability to meet customer expectations for computing power would be significantly compromised [3]

Group 4: AI Model Development
- As enterprises develop larger AI models, the need for synchronized GPU performance is increasing, particularly during the inference phase, which demands higher data center system performance [4]
- The misconception that inference is simple has been challenged: inference is becoming increasingly complex and more like training, which highlights the importance of networking technologies [5]

Group 5: Competitive Landscape
- Competitors like AMD, Amazon, Google, and Microsoft are developing their own AI chips and networking technologies, posing a challenge to NVIDIA's market position [5]
- Despite the competition, NVIDIA is expected to maintain its lead as demand for its chips continues to grow among tech giants, research institutions, and enterprises [5]
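The revenue figures in Group 1 allow a quick check of the share attributed to networking in Group 3. This is a minimal sketch using only the numbers quoted above; the helper name is hypothetical, and the choice of data center revenue as the denominator is an assumption, since the article's wording says "total revenue".

```python
def share_percent(part, whole):
    """Share of `part` in `whole`, expressed as a percentage."""
    return 100 * part / whole


# Figures in billions of US dollars, as quoted in the summary above.
fy_share = share_percent(12.9, 115.1)  # last fiscal year, networking / data center
q1_share = share_percent(4.9, 39.1)    # Q1, networking / data center

print(f"Fiscal-year networking share of data center revenue: {fy_share:.1f}%")
print(f"Q1 networking share of data center revenue: {q1_share:.1f}%")
```

On these figures networking is about 11.2% of data center revenue for the fiscal year and about 12.5% in Q1, so the "11%" quoted in the text matches the data center segment as the base.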
Huawei to release AI inference "black tech"; Foxconn Industrial Internet posts record results | Tech Weathervane
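Compiled by the New Quality Productive Forces Research Institute of 21st Century Business Herald

[Tech Giants Weathervane]

Ni Guangnan, academician of the Chinese Academy of Engineering: build an "AI + robotics" ecosystem

Speaking at the 2025 World Robot Conference on August 10, Ni Guangnan, an academician of the Chinese Academy of Engineering, said that this is an era in which artificial intelligence is driving a paradigm shift in science and technology. With China implementing its "AI Plus" initiative, the robotics industry should build an "AI + robotics" ecosystem so that it can better play its role as a new quality productive force. Ni added that the key to meeting this requirement is raising the level of robot intelligence, using a system that coordinates brain, eye and action so that robots can truly see the world, understand the world and act in the world.

ChatGPT-4o back online

OpenAI announced that GPT-4o is available again for Plus and Team users. To use it across platforms, users can enable "Show legacy models" in the ChatGPT web settings to access GPT-4o. OpenAI had stopped offering GPT-4o after the release of GPT-5, a decision that drew controversy among users.

Huawei to release AI inference "black tech"

On August 12, Huawei and China UnionPay will jointly release their latest AI inference application results. The results reportedly may reduce China's reliance on HBM (high-bandwidth memory) for AI inference, improve the inference performance of domestic large AI models, and fill in a key part of China's AI inference ecosystem.

HBM is key to solving the "data movement" problem. When HBM is insufficient, ...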