UCM Inference Memory Data Manager

Huawei proposes a path for building advanced data infrastructure to accelerate development in the AI agent era
Sou Hu Cai Jing· 2025-09-08 02:20
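Across the full AI chain, storage technology is moving from behind the scenes to center stage. Zhou Yuefeng offered an analogy: "AI inference is like human thinking; it needs to recall historical memory quickly to improve efficiency." Huawei's UCM inference memory data manager uses a KV Cache-centered architecture to cache memory data across multiple tiers and, combined with information condensation and intelligent association techniques, dynamically optimizes the data access path during inference. In a financial-sector pilot, the solution cut AI service response latency by 30% and computing costs by 25%, demonstrating that storage innovation can effectively relieve the inference efficiency bottleneck.

As a vendor that has held first place in China's storage market and second place globally for many consecutive years, Huawei is promoting ecosystem co-construction through technology openness. Zhou Yuefeng revealed that the company has open-sourced core software capabilities such as its AI toolchain and inference framework, and is building a shared technology foundation with industry partners. He stressed in particular that storage technology should be elevated to strategic AI infrastructure, and that collaborative innovation among government, enterprises, academia, and research institutions should accelerate the deployment of advanced data infrastructure, building a leading advantage for China in the second half of the AI race.

As AI technology becomes deeply integrated into every area of economic and social development, the efficiency with which data is aggregated, circulated, and used is becoming key to global AI competition. According to the latest statistics from the National Data Administration, China's data output exceeded 41.06 ZB in 2024, up 25% year on year, and this explosive growth places higher demands on data infrastructure construction. At the 15th Smart City and Intelligent Economy Expo, Zhou Yuefeng, President of Huawei's Data Storage Product Line, proposed that advanced data infrastructure be built through a layered approach ...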
Building advanced data infrastructure to lay a solid foundation for the AI agent era
Sou Hu Cai Jing· 2025-09-07 21:36
Core Insights
- 2025 is widely regarded as the "Year of AI Agents." Artificial intelligence is integrating into more sectors at an accelerating pace, driving explosive growth in data volume, which reached 41.06 ZB in 2024, a 25% year-on-year increase [1]
- Building "advanced data infrastructure" is essential for efficient data aggregation, circulation, and utilization, which has become a focal point of global AI competition [1]

Group 1: Data Infrastructure Development
- A layered approach is needed to build data infrastructure at the city, industry, and enterprise levels to meet the demands of different scenarios [2]
- At the city level, the focus is on breaking down data silos to achieve comprehensive data aggregation and trustworthy circulation, supporting smart governance and public services [4]
- At the industry level, the bottleneck has shifted from "data scarcity" to a "lack of high-quality corpus," which calls for industry-level data sharing platforms that promote multi-source data integration [4]
- At the enterprise level, deploying multi-agent systems requires AI data lakes that unify management of corporate knowledge bases and improve collaboration and decision-making accuracy among intelligent agents [4]

Group 2: Storage Innovation
- The efficiency of AI inference depends critically on memory management, because AI needs to "remember" historical reasoning processes to improve response speed and accuracy [5]
- Huawei has introduced the UCM inference memory data manager, a solution centered on KV Cache with a multi-level caching architecture that dynamically calls historical data during inference, significantly improving response efficiency and economic viability in sectors such as finance [5]

Group 3: Open Ecosystem and Collaboration
- Huawei has established itself as a trusted provider of data infrastructure, ranking second globally and first domestically in storage market revenue [6]
- The company is actively opening its AI toolchain and inference frameworks through open source to empower industry partners and build a prosperous technology ecosystem [6]
- AI development cannot rely on isolated efforts; it requires industry-wide collaboration to strengthen the strategic position of storage technology within the AI framework [6]
Huawei releases innovative AI inference technology: the UCM inference memory data manager
Zhong Guo Chan Ye Jing Ji Xin Xi Wang· 2025-08-28 00:35
Core Insights
- Huawei launched the UCM inference memory data manager at the Financial AI Inference Application Forum, aiming to improve the AI inference experience and its cost-effectiveness while accelerating the positive business cycle of AI [1][2]
- The UCM technology is being piloted with China UnionPay in typical financial scenarios, showcasing its application in smart finance [1]

Technology Overview
- The UCM inference memory data manager consists of three main components: a connector for different engines and computing power, a library for multi-level KV Cache management and acceleration algorithms, and a high-performance KV Cache access adapter [1]
- By accessing KV Cache data directly and avoiding redundant computation, the technology reduces first-token latency by up to 90% [2]
- UCM expands the inference context window tenfold, addressing long-text processing needs by offloading ultra-long sequence cache to external professional storage [2]

Performance Improvements
- UCM's intelligent hierarchical caching lets memory data flow on demand among HBM, DRAM, and SSD storage media according to how hot the data is [2]
- Integrating several sparse attention algorithms deepens the collaboration between computation and storage, yielding a 2 to 22 times increase in TPS (tokens processed per second) in long-sequence scenarios and significantly lowering the inference cost per token [2]
- In a pilot with China UnionPay, the UCM technology improved large model inference speed by 125 times, enabling precise identification of customer inquiries in just 10 seconds [2]

Future Developments
- UCM is set to be open-sourced in September 2025, with a unified interface that adapts to various inference engine frameworks, computing power, and storage systems [2]
- The company aims to contribute UCM to mainstream inference engine communities, fostering the development of the AI inference ecosystem across the industry [2]
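The heat-based movement of cache data among HBM, DRAM, and SSD described above can be illustrated with a small sketch. This is not UCM code: the class name `TieredKVCache`, the per-tier capacities, and the rule "rank blocks by access count and fill the fastest tier first" are all assumptions made for illustration; only the three tier names come from the article.

```python
from dataclasses import dataclass, field

# Tier names are taken from the article; everything else here is hypothetical.
TIERS = ["HBM", "DRAM", "SSD"]


@dataclass
class TieredKVCache:
    """Toy model that places KV cache blocks in tiers by access heat."""

    capacity: dict  # max blocks per fast tier, e.g. {"HBM": 1, "DRAM": 1}
    heat: dict = field(default_factory=dict)       # block id -> access count
    placement: dict = field(default_factory=dict)  # block id -> tier name

    def access(self, block_id: str) -> str:
        """Record one access, rebalance, and return the block's tier."""
        self.heat[block_id] = self.heat.get(block_id, 0) + 1
        self._rebalance()
        return self.placement[block_id]

    def _rebalance(self) -> None:
        # Hottest blocks first; ties are broken by block id for determinism.
        ranked = sorted(self.heat, key=lambda b: (-self.heat[b], b))
        start = 0
        for tier in TIERS[:-1]:
            size = self.capacity[tier]
            for block in ranked[start:start + size]:
                self.placement[block] = tier
            start += size
        # Whatever does not fit in the fast tiers spills to the last tier.
        for block in ranked[start:]:
            self.placement[block] = TIERS[-1]


if __name__ == "__main__":
    cache = TieredKVCache(capacity={"HBM": 1, "DRAM": 1})
    for block in ["a", "a", "a", "b", "b", "c"]:
        cache.access(block)
    print(cache.placement)
```

A production system would weigh recency, block size, and transfer cost rather than a raw access count; the sketch only shows the idea of data migrating between tiers as its heat changes.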
Per-token cost cut significantly as Huawei releases UCM technology to tackle AI inference challenges
Huan Qiu Wang· 2025-08-18 07:40
Core Insights
- The forum highlighted the launch of Huawei's UCM inference memory data manager, aimed at enhancing AI inference experiences and cost-effectiveness in the financial sector [1][5]
- AI inference is entering a critical growth phase, with inference experience and cost becoming key metrics for model value [3][4]
- Huawei's UCM technology has been validated through a pilot project with China UnionPay, demonstrating a 125-fold increase in inference speed [5][6]

Group 1: AI Inference Development
- AI inference is becoming a crucial area for explosive growth, with a focus on balancing efficiency and cost [3][4]
- The transition from "model intelligence" to "data intelligence" is gaining consensus in the industry, emphasizing the importance of high-quality data [3][4]
- The UCM data manager consists of three components designed to optimize inference experience and reduce costs [4]

Group 2: UCM Technology Features
- UCM technology reduces latency for the first token by up to 90% and expands context windows for long text processing by tenfold [4]
- The intelligent caching capability of UCM allows for on-demand data flow across various storage media, significantly improving token processing speed [4]
- UCM's implementation in financial applications addresses challenges such as long sequence inputs and high computational costs [5]

Group 3: Industry Collaboration and Open Source
- Huawei announced an open-source plan for UCM, aiming to foster collaboration across the industry and enhance the AI inference ecosystem [6][7]
- The open-source initiative is expected to drive standardization and encourage more partners to join in improving inference experiences and costs [7]
- The launch of UCM technology is seen as a significant breakthrough for AI inference and a boost for smart finance development [7]
2025 Financial AI Inference Application Landing and Development Forum successfully held at the Financial Data Port
Sou Hu Cai Jing· 2025-08-15 17:35
Group 1
- The 2025 Financial AI Inference Application Landing and Development Forum was held at the Financial Data Port AI Innovation Center on August 12, with key figures from China UnionPay and Huawei in attendance [1]
- Huawei's Vice President and President of the Data Storage Product Line, Dr. Zhou Yuefeng, introduced the AI inference innovation technology called UCM Inference Memory Data Manager at the forum [3]
- China UnionPay plans to leverage the National Artificial Intelligence Application Pilot Base to collaborate with Huawei and other ecosystem partners to build "AI + Finance" demonstration applications, transitioning technology results from "laboratory validation" to "large-scale application" [5]

Group 2
- Huawei and China UnionPay jointly released the application results of the Smart Financial AI Inference Acceleration Program during the forum [3][5]
Solving the efficiency and cost problem: Huawei's UCM technology drives an upgrade in AI inference experience
Yang Guang Wang· 2025-08-13 06:13
Group 1
- The forum on the application and development of financial AI inference took place in Shanghai, featuring key figures from China UnionPay and Huawei [1]
- Huawei introduced the UCM inference memory data manager, aimed at improving the AI inference experience and its cost-effectiveness while accelerating the positive business cycle of AI [1][3]
- AI inference is entering a critical growth phase, with inference experience and cost becoming key metrics for evaluating model value [3]

Group 2
- The UCM inference memory data manager includes three main components: inference engine plugins, a function library for multi-level KV Cache management, and high-performance KV Cache access adapters [3][4]
- UCM technology can reduce first-token latency by up to 90% and expand the inference context window tenfold, addressing long-text processing needs [3][4]
- UCM's intelligent caching capabilities significantly increase processing speed, achieving a 125-fold increase in inference speed for China UnionPay's "Voice of the Customer" scenario [4]

Group 3
- Huawei announced an open-source plan for UCM, which will be available in September and can be adapted to various inference engine frameworks and storage systems [4]
- The collaboration between Huawei and China UnionPay aims to build "AI + Finance" demonstration applications, moving the technology from laboratory validation to large-scale application [4]
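To make the three-part split (engine plugin, function library, storage access adapter) concrete, here is a minimal layering sketch. It does not reflect the actual UCM interfaces, which the article does not describe: the class names `Connector`, `Accelerator`, and `InMemoryAdapter`, their methods, and the use of a SHA-256 prompt hash as the cache key are all hypothetical.

```python
import hashlib
from typing import Callable, Optional, Protocol


class KVStoreAdapter(Protocol):
    """Hypothetical storage-facing interface (the "adapter" role)."""

    def load(self, key: str) -> Optional[bytes]: ...
    def save(self, key: str, value: bytes) -> None: ...


class InMemoryAdapter:
    """Stand-in backend; a real adapter would talk to external storage."""

    def __init__(self) -> None:
        self._data: dict = {}

    def load(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def save(self, key: str, value: bytes) -> None:
        self._data[key] = value


class Accelerator:
    """Hypothetical cache-management layer sitting above the adapter."""

    def __init__(self, adapter: KVStoreAdapter) -> None:
        self.adapter = adapter
        self.hits = 0
        self.misses = 0

    def fetch(self, key: str) -> Optional[bytes]:
        value = self.adapter.load(key)
        if value is None:
            self.misses += 1
        else:
            self.hits += 1
        return value

    def store(self, key: str, value: bytes) -> None:
        self.adapter.save(key, value)


class Connector:
    """Hypothetical engine-facing plugin: consult the cache before computing."""

    def __init__(self, accelerator: Accelerator,
                 compute_kv: Callable[[str], bytes]) -> None:
        self.accelerator = accelerator
        self.compute_kv = compute_kv  # the engine's own (expensive) KV computation

    def get_kv(self, prompt: str) -> bytes:
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        cached = self.accelerator.fetch(key)
        if cached is not None:
            return cached
        kv = self.compute_kv(prompt)
        self.accelerator.store(key, kv)
        return kv


if __name__ == "__main__":
    connector = Connector(Accelerator(InMemoryAdapter()), lambda p: p.encode())
    connector.get_kv("What is my card limit?")
    connector.get_kv("What is my card limit?")
    print(connector.accelerator.hits, connector.accelerator.misses)
```

The point of the layering is that the engine-facing piece and the storage-facing piece can each be swapped independently, which matches the article's description of adapting to different engines, computing power, and storage systems.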
Open source coming soon! Huawei releases AI inference "black tech," already deployed at China UnionPay
Tai Mei Ti APP· 2025-08-13 03:44
Core Insights
- Huawei has launched the UCM inference memory data manager to enhance AI inference experiences, improve cost-effectiveness, and accelerate the commercial cycle of AI [2]
- The UCM technology has been piloted in financial scenarios in collaboration with China UnionPay, showcasing its application in smart finance [2]

Industry Trends
- The focus in the large model industry is shifting from training to inference, with current inference computing power demand exceeding training by 58.5% [2]
- The release of new models often leads to instability in service providers due to high user demand, necessitating optimizations to reduce inference costs without compromising user experience [3]

Performance Comparison
- Foreign mainstream large models achieve output speeds in the range of 200 tokens/s with a latency of 5ms, while Chinese models generally fall below 60 tokens/s with latencies of 50-100ms, indicating a maximum disparity of 10 times [4]
- Chinese models also support fewer tokens in context windows compared to their foreign counterparts, with a significant probability of missing key information during long text analysis [4]

Technical Innovations
- The UCM system consists of three main components: a connector for popular inference frameworks, an accelerator for multi-level KV cache management, and an adapter for high-performance KV cache access [6]
- By caching previously processed results and data in a high-performance external shared storage, UCM can reduce the first token delay by 90% and significantly speed up inference processes [8][9]

Financial Sector Applications
- The financial industry is rapidly adopting large models, with a focus on reducing high costs and latency associated with AI inference, which is critical for risk control and transaction security [10]
- A collaboration between China UnionPay and Huawei has led to a significant reduction in inference time for label classification from 600 seconds to under 10 seconds, achieving over a 50-fold improvement in efficiency [11]

Future Developments
- Huawei plans to open-source the UCM technology in September, aiming to create a unified interface that can adapt to various inference engine frameworks, computing power, and storage systems [11]
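The claim that caching previously processed results shortens first-token delay rests on a general technique, prefix KV cache reuse: if a new request starts with tokens whose key/value tensors were already computed, only the remaining tokens need prefill. The sketch below illustrates that idea under stated assumptions and is not Huawei's implementation; `block_keys`, `PrefixKVCache`, the block size of 4, and the chained SHA-256 keys are all hypothetical choices.

```python
import hashlib

BLOCK = 4  # tokens per cache block; an arbitrary illustrative value


def block_keys(tokens, block=BLOCK):
    """Chain-hash full blocks so each key identifies the whole prefix so far."""
    keys = []
    running = hashlib.sha256()
    full = len(tokens) - len(tokens) % block
    for i in range(0, full, block):
        running.update(repr(tokens[i:i + block]).encode("utf-8"))
        keys.append(running.hexdigest())
    return keys


class PrefixKVCache:
    """Toy prefix cache that only tracks which blocks already have KV data."""

    def __init__(self):
        self.store = {}  # prefix key -> placeholder for that block's KV tensors

    def prefill(self, tokens):
        """Return (reused_tokens, computed_tokens) for one request."""
        keys = block_keys(tokens)
        reused_blocks = 0
        for key in keys:
            if key not in self.store:
                break
            reused_blocks += 1
        # Blocks past the shared prefix must be computed and are then cached.
        for key in keys[reused_blocks:]:
            self.store[key] = "kv-block"
        reused = reused_blocks * BLOCK
        return reused, len(tokens) - reused


if __name__ == "__main__":
    cache = PrefixKVCache()
    system_prompt = list(range(16))          # shared 16-token prefix
    first = system_prompt + [100, 101]
    second = system_prompt + [200, 201, 202]
    print(cache.prefill(first))   # nothing cached yet
    print(cache.prefill(second))  # shared prefix is reused
```

How much first-token latency this saves in practice depends on the share of the prompt that is reused and on how fast cached blocks can be loaded from storage, so the fraction of tokens skipped here should not be read as a latency figure.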
Interest subsidy policies arrive, covering personal consumer loans and loans to service-sector businesses | Pre-market briefing
21 Shi Ji Jing Ji Bao Dao· 2025-08-13 00:43
Market Overview
- On August 12, the A-share market rose steadily, with all three major indices reaching new highs for the year. The Shanghai Composite Index rose by 0.5%, the Shenzhen Component Index increased by 0.53%, and the ChiNext Index gained 1.24% [2][3]
- Total trading volume in the Shanghai and Shenzhen markets was 1.88 trillion yuan, an increase of 54.5 billion yuan from the previous trading day. Despite the overall rise, over 3,100 stocks declined, indicating mixed performance among individual stocks [2]

Sector Performance
- Semiconductor stocks surged in the afternoon, and AI hardware stocks showed strength. The leading sectors included semiconductors, ports, CPO, and Xinjiang-related stocks, while PEEK materials, rare earth permanent magnets, and lithium mining declined [2]

International Market
- The U.S. stock market gained on August 12, with the Dow Jones Industrial Average rising by 483.52 points (1.10%) to close at 44,458.61 points, the S&P 500 increasing by 72.31 points (1.13%) to 6,445.76 points, and the Nasdaq Composite up by 296.50 points (1.39%) to 21,681.90 points [4][5]
- In Europe, the FTSE 100 rose by 0.20% and the CAC 40 increased by 0.71%, while the DAX index fell by 0.23% [4][5]

Commodity Prices
- International oil prices declined on August 12, with light crude oil futures for September dropping by $0.79 to $63.17 per barrel (a 1.24% decrease) and Brent crude for October falling by $0.51 to $66.12 per barrel (a 0.77% decrease) [4]

Policy Announcements
- Nine departments, including the Ministry of Finance and the People's Bank of China, issued a policy implementation plan for interest subsidies on loans to service industry entities, effective from March 16, 2025, to December 31, 2025 [6]
- A separate plan for personal consumption loan interest subsidies was announced, applicable from September 1, 2025, to August 31, 2026, covering various consumer sectors [6]

Corporate Developments
- Huawei announced the upcoming open-source release of its AI inference technology, UCM, which is set to launch in September [7]
- The Ministry of Commerce initiated an anti-dumping investigation into imported pea starch from Canada, with the investigation period set from January 1, 2024, to December 31, 2024 [8]
- Key players in the dry-process lithium battery separator industry reached consensus on several measures to promote healthy competition and industry cooperation [8]

Economic Indicators
- The U.S. Consumer Price Index (CPI) for July increased by 2.7% year-on-year, with a month-on-month rise of 0.2%. The core CPI, excluding food and energy, rose by 3.1% year-on-year [9]

Investment Insights
- Analysts from Great Wall Securities noted that the A-share market's upward momentum is supported by infrastructure and policy expectations, while the Hang Seng Technology Index has lagged behind [10]
- Zhongyin International emphasized the acceleration of AI application commercialization, suggesting a focus on revenue growth and user expansion in AI sectors [11]

Company Announcements
- Zhenlei Technology reported a 1007% year-on-year increase in net profit for the first half of the year [12]
- Baiyun Airport signed a cooperation contract for duty-free operations at T3 terminal with China Duty Free Group [12]
Huawei releases AI inference innovation UCM in Shanghai; official open-source release set for September
Sou Hu Cai Jing· 2025-08-12 11:53
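At the forum, Zhou Yuefeng said: "In the AI era, model training, inference efficiency, and experience are all measured in tokens. The token economy has arrived." To guarantee a smooth inference experience, enterprises must keep increasing their investment in computing power, but how to strike the best balance between inference efficiency and cost has become a pressing question for the whole industry.

Eastday.com reporter Cao Lei reported on August 12: Artificial intelligence has entered deep water, and AI inference is becoming the next stage of explosive growth. This afternoon, the 2025 Financial AI Inference Application Landing and Development Forum was held in Shanghai. At the forum, Dr. Zhou Yuefeng, Vice President of Huawei and President of its Data Storage Product Line, released an innovative AI inference technology, the UCM inference memory data manager.

As a KV Cache-centered inference acceleration suite, it integrates multiple types of cache acceleration algorithms and manages the KV Cache memory data generated during inference in tiers, expanding the inference context window to deliver high-throughput, low-latency inference and reduce the per-token inference cost. Huawei has also partnered with China UnionPay to pilot UCM in typical financial scenarios, and the two jointly released the application results of the smart finance AI inference acceleration solution.

To this end, Huawei has launched the UCM inference memory data manager, which includes an inference engine plugin (Connector) that interfaces with different engines and computing power, a function library (Accelerator) that supports multi-level KV Cache management and acceleration algorithms, and a high-performance KV Cac ...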
Huawei: AI inference innovation UCM to be officially open-sourced this September
Xin Lang Ke Ji· 2025-08-12 11:21
Group 1
- The 2025 forum on the application and development of financial AI inference featured speeches from executives of China UnionPay and Huawei, highlighting the importance of AI in the financial sector [2]
- Huawei introduced the UCM inference memory data manager, aimed at improving the AI inference experience and its cost-effectiveness while accelerating the positive business cycle of AI [2]
- The UCM technology was piloted with China UnionPay in typical financial scenarios, showcasing its application in smart finance AI inference acceleration [2]

Group 2
- The UCM technology demonstrated significant value in a pilot with China UnionPay, achieving a 125-fold increase in large model inference speed and allowing precise identification of customer issues in just 10 seconds [3]
- China UnionPay plans to collaborate with Huawei and other partners to build "AI + Finance" demonstration applications, moving the technology from laboratory validation to large-scale application [3]
- Huawei announced the UCM open-source plan, which will officially launch in September, aiming to contribute to mainstream inference engine communities and promote the development of the AI inference ecosystem [3]