UCM Inference Memory Data Manager
Huawei Proposes a Path for Building Advanced Data Infrastructure to Accelerate the AI Agent Era
Sou Hu Cai Jing· 2025-09-08 02:20
Core Insights
- The integration of artificial intelligence (AI) into various sectors is driving the need for efficient data aggregation, circulation, and utilization, which is becoming a key factor in global AI competition [1][3]
- Zhou Yuefeng, president of Huawei's data storage product line, emphasized the need to build advanced data infrastructure in layers to address the bottlenecks in AI development [1][4]

Data Growth and Infrastructure Needs
- According to the National Data Bureau, China's data output reached 41.06 ZB in 2024, up 25% year on year, which raises the demand for enhanced data infrastructure [1]
- A robust data foundation requires breaking down data silos at the city level, creating shared platforms at the industry level, and establishing unified AI data lakes at the enterprise level [3]

Storage Technology and AI Efficiency
- Storage technology is moving from a background role to the forefront of the AI ecosystem, with innovations such as Huawei's UCM inference memory data manager, which reduced AI service response latency by 30% and computing costs by 25% in financial sector trials [3][4]
- The analogy between AI inference and human thinking underscores the importance of quickly accessing historical memory to improve efficiency [3]

Collaborative Ecosystem and Strategic Infrastructure
- Huawei is positioned as the leading storage provider in China and the second globally, and focuses on open technology to foster ecosystem collaboration [4]
- The company advocates recognizing storage technology as strategic infrastructure for AI, and promotes collaborative innovation among government, enterprises, academia, and research institutions to accelerate the rollout of advanced data infrastructure [4]

Competitive Landscape in AI
- The current phase of AI development is characterized as a "deep water zone," where the efficient use of data elements is crucial for breakthroughs [4]
- Huawei's proposed layered construction path and storage technology innovations offer practical solutions for the industry, and its open ecosystem strategy could redefine the competitive rules of the AI era [4]
Building Advanced Data Infrastructure to Lay a Solid Foundation for the AI Agent Era
Sou Hu Cai Jing· 2025-09-07 21:36
Core Insights
- 2025 is widely regarded as the "Year of AI Agents," with artificial intelligence accelerating its integration into various sectors and driving explosive growth in data volume, which reached 41.06 ZB in 2024, a 25% year-on-year increase [1]
- Building "advanced data infrastructure" is essential for efficient data aggregation, circulation, and utilization, which has become a focal point in global AI competition [1]

Group 1: Data Infrastructure Development
- A layered approach is necessary for building data infrastructure at the city, industry, and enterprise levels to meet the demands of different scenarios [2]
- At the city level, the focus is on breaking data silos to achieve comprehensive data aggregation and trustworthy circulation, supporting smart governance and public services [4]
- At the industry level, the bottleneck has shifted from "data scarcity" to a "lack of high-quality corpus," which calls for industry-level data sharing platforms that promote multi-source data integration [4]
- At the enterprise level, deploying multi-agent systems requires AI data lakes that unify the management of corporate knowledge bases and improve collaboration and decision-making accuracy among intelligent agents [4]

Group 2: Storage Innovation
- The efficiency of AI inference depends critically on memory management, because AI needs to "remember" historical inference processes to improve response speed and accuracy [5]
- Huawei has introduced the UCM inference memory data manager, a solution centered on KV Cache whose multi-level caching architecture dynamically retrieves historical data during inference, significantly improving response efficiency and economic viability in sectors such as finance [5]

Group 3: Open Ecosystem and Collaboration
- Huawei has established itself as a trusted provider of data infrastructure, ranking second globally and first domestically in storage market revenue [6]
- The company is actively open-sourcing its AI toolchain and inference frameworks to empower industry partners and build a thriving technology ecosystem [6]
- AI development cannot rely on isolated efforts; it requires industry-wide collaboration to strengthen the strategic position of storage technology within the AI stack [6]
Huawei Releases AI Inference Innovation: the UCM Inference Memory Data Manager
Core Insights
- Huawei launched the UCM inference memory data manager at the Financial AI Inference Application Forum, aiming to improve the AI inference experience and cost-effectiveness while accelerating the positive business cycle of AI [1][2]
- The UCM technology is being piloted with China UnionPay in typical financial scenarios, showcasing its application in smart finance [1]

Technology Overview
- The UCM inference memory data manager consists of three main components: a connector for different inference engines and computing power, a library for multi-level KV Cache management and acceleration algorithms, and a high-performance KV Cache access adapter [1]
- The technology reduces first-token latency by up to 90% by directly accessing KV Cache data and avoiding redundant computation [2]
- UCM allows a tenfold expansion of the inference context window, addressing long-text processing needs by offloading ultra-long sequence cache to external professional storage [2]

Performance Improvements
- UCM's intelligent hierarchical caching lets data flow on demand among HBM, DRAM, and SSD storage media based on how "hot" the memory data is [2]
- The integration of various sparse attention algorithms improves the coordination between computation and storage, resulting in a 2x to 22x increase in TPS (tokens processed per second) in long-sequence scenarios and significantly lowering the inference cost per token [2]
- In a pilot with China UnionPay, the UCM technology improved large model inference speed by 125 times, enabling precise identification of customer inquiries in just 10 seconds [2]

Future Developments
- UCM is set to be open-sourced in September 2025, with a unified interface to adapt to various inference engine frameworks, computing power, and storage systems [2]
- The company aims to contribute UCM to mainstream inference engine communities, fostering the development of the AI inference ecosystem across the industry [2]
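The heat-based movement of KV Cache among HBM, DRAM, and SSD described above can be illustrated with a toy model. The following Python sketch is not Huawei's implementation: the class name `TieredKVCache`, the use of a simple access count as "heat", and the tier capacities are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Tier names ordered fastest to slowest, mirroring the media named in the article.
TIERS = ["HBM", "DRAM", "SSD"]


@dataclass
class TieredKVCache:
    """Toy model (hypothetical): hotter KV blocks are placed in faster tiers."""
    capacity: dict                                  # blocks per tier, e.g. {"HBM": 1, "DRAM": 1, "SSD": 10}
    heat: dict = field(default_factory=dict)        # block_id -> access count (assumed heat metric)
    placement: dict = field(default_factory=dict)   # block_id -> tier name

    def access(self, block_id: str) -> str:
        """Record one access, rebalance all blocks, and return the block's tier."""
        self.heat[block_id] = self.heat.get(block_id, 0) + 1
        self._rebalance()
        return self.placement[block_id]

    def _rebalance(self) -> None:
        # Rank blocks hottest first; ties are broken by id so results are deterministic.
        ranked = sorted(self.heat, key=lambda b: (-self.heat[b], b))
        start = 0
        for i, tier in enumerate(TIERS):
            last = i == len(TIERS) - 1
            # The slowest tier absorbs everything that does not fit above it.
            end = len(ranked) if last else start + self.capacity[tier]
            for block in ranked[start:end]:
                self.placement[block] = tier
            start = end


cache = TieredKVCache(capacity={"HBM": 1, "DRAM": 1, "SSD": 10})
for block in ["a", "a", "a", "b", "b", "c"]:
    cache.access(block)
print(cache.placement)  # "a" is hottest, then "b", then "c"
```

In this sketch a block that becomes hotter than the current HBM resident displaces it on the next access; a real system would also weigh block size, transfer cost, and recency rather than a raw count.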
Cost per Token Drops Significantly: Huawei Releases UCM Technology to Tackle AI Inference Challenges
Huan Qiu Wang· 2025-08-18 07:40
Core Insights
- The forum highlighted the launch of Huawei's UCM inference memory data manager, aimed at improving AI inference experience and cost-effectiveness in the financial sector [1][5]
- AI inference is entering a critical growth phase, with inference experience and cost becoming key metrics for model value [3][4]
- Huawei's UCM technology has been validated through a pilot project with China UnionPay, demonstrating a 125-fold increase in inference speed [5][6]

Group 1: AI Inference Development
- AI inference is becoming a crucial area for explosive growth, with a focus on balancing efficiency and cost [3][4]
- The transition from "model intelligence" to "data intelligence" is gaining consensus in the industry, emphasizing the importance of high-quality data [3][4]
- The UCM data manager consists of three components designed to optimize inference experience and reduce costs [4]

Group 2: UCM Technology Features
- UCM technology reduces first-token latency by up to 90% and expands context windows for long-text processing tenfold [4]
- UCM's intelligent caching allows on-demand data flow across various storage media, significantly improving token processing speed [4]
- UCM's implementation in financial applications addresses challenges such as long sequence inputs and high computational costs [5]

Group 3: Industry Collaboration and Open Source
- Huawei announced an open-source plan for UCM, aiming to foster collaboration across the industry and strengthen the AI inference ecosystem [6][7]
- The open-source initiative is expected to drive standardization and encourage more partners to join in improving inference experience and cost [7]
- The launch of UCM technology is seen as a significant breakthrough for AI inference and a boost for smart finance development [7]
2025 Financial AI Inference Application Landing and Development Forum Held Successfully at the Financial Data Port
Sou Hu Cai Jing· 2025-08-15 17:35
Group 1
- The 2025 Financial AI Inference Application Landing and Development Forum was held at the Financial Data Port AI Innovation Center on August 12, with key figures from China UnionPay and Huawei in attendance [1]
- Dr. Zhou Yuefeng, Huawei Vice President and President of the Data Storage Product Line, introduced the AI inference innovation UCM Inference Memory Data Manager at the forum [3]
- China UnionPay plans to use the National Artificial Intelligence Application Pilot Base to work with Huawei and other ecosystem partners on "AI + Finance" demonstration applications, moving technology results from "laboratory validation" to "large-scale application" [5]

Group 2
- Huawei and China UnionPay jointly released the application results of the Smart Financial AI Inference Acceleration Program during the forum [3][5]
Solving the Efficiency and Cost Problem: Huawei's UCM Technology Upgrades the AI Inference Experience
Yang Guang Wang· 2025-08-13 06:13
Group 1
- The forum on the application and development of financial AI inference took place in Shanghai, featuring key figures from China UnionPay and Huawei [1]
- Huawei introduced the UCM inference memory data manager, aimed at improving AI inference experience and cost-effectiveness while accelerating the positive business cycle of AI [1][3]
- AI inference is entering a critical growth phase, with inference experience and cost becoming key metrics for evaluating model value [3]

Group 2
- The UCM inference memory data manager includes three main components: inference engine plugins, a function library for multi-level KV Cache management, and high-performance KV Cache access adapters [3][4]
- UCM technology can reduce first-token latency by up to 90% and expand the inference context window tenfold, addressing long-text processing needs [3][4]
- UCM's intelligent caching capabilities significantly increase processing speed, achieving a 125-fold increase in inference speed for China UnionPay's "Voice of the Customer" scenario [4]

Group 3
- Huawei announced an open-source plan for UCM, which will be available in September and can be adapted to various inference engine frameworks and storage systems [4]
- The collaboration between Huawei and China UnionPay aims to build "AI + Finance" demonstration applications, moving technology from laboratory validation to large-scale application [4]
Open Source Coming Soon! Huawei Releases Cutting-Edge AI Inference Technology, Already Deployed at China UnionPay
Tai Mei Ti APP· 2025-08-13 03:44
Core Insights
- Huawei has launched the UCM inference memory data manager to improve the AI inference experience, raise cost-effectiveness, and accelerate the commercial cycle of AI [2]
- The UCM technology has been piloted in financial scenarios in collaboration with China UnionPay, showcasing its application in smart finance [2]

Industry Trends
- The focus in the large model industry is shifting from training to inference, with current inference computing power demand exceeding training by 58.5% [2]
- The release of new models often destabilizes service providers because of high user demand, so optimizations are needed to reduce inference costs without compromising user experience [3]

Performance Comparison
- Mainstream foreign large models achieve output speeds in the range of 200 tokens/s with a latency of 5 ms, while Chinese models generally fall below 60 tokens/s with latencies of 50-100 ms, a gap of up to 10 times [4]
- Chinese models also support fewer tokens in their context windows than their foreign counterparts, with a significant probability of missing key information during long-text analysis [4]

Technical Innovations
- The UCM system consists of three main components: a connector for popular inference frameworks, an accelerator for multi-level KV Cache management, and an adapter for high-performance KV Cache access [6]
- By caching previously processed results and data in high-performance external shared storage, UCM can reduce first-token latency by 90% and significantly speed up inference [8][9]

Financial Sector Applications
- The financial industry is rapidly adopting large models, with a focus on reducing the high cost and latency of AI inference, which is critical for risk control and transaction security [10]
- A collaboration between China UnionPay and Huawei has cut inference time for label classification from 600 seconds to under 10 seconds, an efficiency improvement of more than 50-fold [11]

Future Developments
- Huawei plans to open-source the UCM technology in September, aiming to create a unified interface that can adapt to various inference engine frameworks, computing power, and storage systems [11]
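The first-token speedup reported above comes from reusing previously computed KV Cache instead of recomputing it. A minimal Python sketch of that idea follows, assuming a hypothetical block size of four tokens and an in-memory dictionary standing in for external shared storage. None of the names below come from UCM, and the "work" is counted in tokens rather than measured in time.

```python
import hashlib

BLOCK = 4  # hypothetical block size in tokens


def block_keys(tokens):
    """Yield one key per full block; each key covers the entire prefix up to that block."""
    h = hashlib.sha256()
    full = len(tokens) - len(tokens) % BLOCK
    for i in range(0, full, BLOCK):
        h.update(" ".join(map(str, tokens[i:i + BLOCK])).encode())
        h.update(b"|")
        yield h.copy().hexdigest()


class PrefixKVStore:
    """Stand-in for shared KV storage: maps prefix keys to stored KV blocks."""

    def __init__(self):
        self.store = {}

    def prefill(self, tokens):
        """Return how many tokens must be computed before the first output token."""
        keys = list(block_keys(tokens))
        reused = 0
        for key in keys:
            if key not in self.store:
                break  # the shared prefix ends at the first missing block
            reused += 1
        for key in keys[reused:]:
            self.store[key] = "kv-block"  # placeholder for real KV tensors
        return len(tokens) - reused * BLOCK


store = PrefixKVStore()
shared_context = list(range(16))  # e.g. a long system prompt or document
cold = store.prefill(shared_context + [100, 101])
warm = store.prefill(shared_context + [200, 201, 202])
print(cold, warm)  # the second request only computes its new suffix
```

Because each key hashes the whole prefix, a block is reused only when every token before it also matches, which is the condition under which cached attention keys and values remain valid.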
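Interest Subsidy Policies Arrive, Covering Personal Consumption Loans and Loans to Service-Sector Businesses | Pre-Market Briefing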
Market Overview
- On August 12, the A-share market rose steadily, with all three major indices reaching new highs for the year. The Shanghai Composite Index rose by 0.5%, the Shenzhen Component Index increased by 0.53%, and the ChiNext Index gained 1.24% [2][3]
- Total trading volume in the Shanghai and Shenzhen markets was 1.88 trillion yuan, an increase of 54.5 billion yuan over the previous trading day. Despite the overall rise, more than 3,100 stocks declined, indicating mixed performance among individual stocks [2]

Sector Performance
- Semiconductor stocks surged in the afternoon, while AI hardware stocks showed strength. The leading sectors included semiconductors, ports, CPO, and Xinjiang-related stocks, while PEEK materials, rare earth permanent magnets, and lithium mining declined [2]

International Market
- The U.S. stock market gained on August 12, with the Dow Jones Industrial Average rising by 483.52 points (1.10%) to close at 44,458.61 points, the S&P 500 increasing by 72.31 points (1.13%) to 6,445.76 points, and the Nasdaq Composite up by 296.50 points (1.39%) to 21,681.90 points [4][5]
- In Europe, the FTSE 100 rose by 0.20% and the CAC 40 increased by 0.71%, while the DAX index fell by 0.23% [4][5]

Commodity Prices
- International oil prices declined on August 12, with light crude oil futures for September dropping by $0.79 to $63.17 per barrel (a 1.24% decrease) and Brent crude for October falling by $0.51 to $66.12 per barrel (a 0.77% decrease) [4]

Policy Announcements
- Nine departments, including the Ministry of Finance and the People's Bank of China, issued a policy implementation plan for interest subsidies on loans to service industry entities, effective from March 16, 2025, to December 31, 2025 [6]
- A separate plan for personal consumption loan interest subsidies was announced, applicable from September 1, 2025, to August 31, 2026, covering various consumer sectors [6]

Corporate Developments
- Huawei announced the upcoming open-source release of its AI inference technology, UCM, which is set to launch in September [7]
- The Ministry of Commerce initiated an anti-dumping investigation into imported pea starch from Canada, with the investigation period set from January 1, 2024, to December 31, 2024 [8]
- Key players in the dry-process lithium battery separator industry reached consensus on several measures to promote healthy competition and industry cooperation [8]

Economic Indicators
- The U.S. Consumer Price Index (CPI) for July increased by 2.7% year-on-year, with a month-on-month rise of 0.2%. The core CPI, excluding food and energy, rose by 3.1% year-on-year [9]

Investment Insights
- Analysts from Great Wall Securities noted that the A-share market's upward momentum is supported by infrastructure and policy expectations, while the Hang Seng Technology Index has lagged behind [10]
- Zhongyin International emphasized the acceleration of AI application commercialization, suggesting a focus on revenue growth and user expansion in AI sectors [11]

Company Announcements
- Zhenlei Technology reported a 1007% year-on-year increase in net profit for the first half of the year [12]
- Baiyun Airport signed a cooperation contract with China Duty Free Group for duty-free operations at the T3 terminal [12]
Huawei Releases AI Inference Innovation UCM in Shanghai; Official Open-Source Release Set for September
Sou Hu Cai Jing· 2025-08-12 11:53
Core Insights
- The article discusses advances in AI inference technology, focusing on Huawei's UCM inference memory data manager, which aims to improve AI inference efficiency and reduce costs [2][3].

Group 1: AI Technology Development
- AI inference is entering a critical growth phase, with the UCM inference memory data manager being a key innovation [2].
- UCM integrates various caching acceleration algorithms and manages KV Cache memory data to improve the inference experience [2][3].

Group 2: Performance Enhancements
- UCM technology can reduce first-token latency by up to 90% and expand the inference context window tenfold, addressing long-text processing needs [3].
- TPS (tokens per second) can increase by 2 to 22 times in long-sequence scenarios, significantly lowering the cost per token for enterprises [3].

Group 3: Industry Collaboration
- Huawei and China UnionPay have successfully validated UCM's technology, achieving a 125-fold increase in inference speed for customer service applications [4].
- Future plans include building "AI + Finance" demonstration applications with industry partners to move from experimental validation to large-scale application [4].

Group 4: Open Source Initiative
- Huawei announced an open-source plan for UCM, which will be available in September, aiming to contribute to mainstream inference engine communities [4].
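The link between higher TPS and lower cost per token can be made explicit with a little arithmetic. The sketch below assumes, purely for illustration, that the cost of running the hardware per unit of time stays the same, so cost per token is inversely proportional to throughput; the article does not state the actual cost model, and the function name is hypothetical.

```python
def relative_cost_per_token(tps_gain: float) -> float:
    """Cost per token relative to baseline, assuming unchanged hardware cost per second."""
    if tps_gain <= 0:
        raise ValueError("tps_gain must be positive")
    return 1.0 / tps_gain


# The article's reported range for long-sequence scenarios.
for gain in (2, 22):
    share = relative_cost_per_token(gain)
    print(f"{gain}x TPS -> {share:.1%} of the baseline cost per token")
```

Under that assumption, the reported 2x to 22x TPS range corresponds to a cost per token of between one half and about one twenty-second of the baseline; any added storage or networking cost would reduce the saving.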
Huawei: AI Inference Innovation UCM to Be Officially Open-Sourced This September
Xin Lang Ke Ji· 2025-08-12 11:21
Group 1
- The 2025 forum on the application and development of financial AI inference featured speeches from executives of China UnionPay and Huawei, highlighting the importance of AI in the financial sector [2]
- Huawei introduced the UCM inference memory data manager, aimed at improving the AI inference experience and cost-effectiveness while accelerating the positive business cycle of AI [2]
- The UCM technology was piloted in typical financial scenarios with China UnionPay, showcasing its application in smart financial AI inference acceleration [2]

Group 2
- The UCM technology demonstrated significant value in a pilot with China UnionPay, achieving a 125-fold increase in large model inference speed and allowing precise identification of customer issues in just 10 seconds [3]
- China UnionPay plans to collaborate with Huawei and other partners to build "AI + Finance" demonstration applications, moving technology from laboratory validation to large-scale application [3]
- Huawei announced the UCM open-source plan, which will officially launch in September, aiming to contribute to mainstream inference engine communities and promote the development of the AI inference ecosystem [3]