Cambricon(688256)
Zhipu Joins Hands with Cambricon to Launch an Integrated Model-Chip Solution
Di Yi Cai Jing· 2025-09-30 07:38
Core Insights
- The latest model GLM-4.6 from the domestic AI startup Zhipu has been released, showcasing improvements in programming, long-context handling, reasoning, information retrieval, writing, and agent applications [3]

Model Enhancements
- GLM-4.6 aligns its coding capabilities with Claude Sonnet 4 in public benchmarks and real programming tasks [3]
- The context window has been increased from 128K to 200K, supporting longer code and agent tasks [3]
- The new model strengthens reasoning and supports tool invocation during the reasoning process [3]
- Tool invocation and search capabilities have also been improved [3]

Chip Integration
- "Model-chip linkage" is a key focus of the release: GLM-4.6 achieves FP8+Int4 mixed-quantization deployment on domestic Cambricon chips, marking the industry's first production-grade FP8+Int4 model-chip integrated solution on domestic hardware [3]
- The approach maintains accuracy while reducing inference costs, exploring a feasible path for running large models locally on domestic chips [3]

Quantization Techniques
- FP8 (8-bit floating point) offers a wide dynamic range with minimal precision loss, while Int4 (4-bit integer) provides a high compression ratio and lower memory usage but more noticeable precision loss [4]
- The "FP8+Int4 mixed" mode assigns quantization formats according to the functional differences between the model's modules, optimizing memory usage [4]

Memory Efficiency
- Core parameters of the large model, which account for 60%-80% of total memory, can be compressed to 1/4 of their FP16 size through Int4 quantization, significantly reducing memory pressure on the chip (see the sketch after this summary) [5]
- Temporary dialogue data accumulated during inference can also be compressed with Int4 while keeping precision loss minimal [5]
- FP8 is reserved for numerically sensitive modules to minimize precision loss and retain fine-grained semantic information [5]

Ecosystem Development
- Cambricon and Moore Threads have both completed adaptation of GLM-4.6; the Moore Threads adaptation, based on the vLLM inference framework, demonstrates its new-generation GPUs running the model stably at native FP8 precision [5]
- The adaptation signifies that domestic GPUs can now iterate in step with cutting-edge large models, accelerating the development of a self-controlled AI technology ecosystem [5]
- The combination of GLM-4.6 and domestic chips will be offered to enterprises and the public through the Zhipu MaaS platform [5]
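The memory arithmetic in the Memory Efficiency section can be made concrete with a small sketch. The Python snippet below is a generic, hypothetical illustration (not Zhipu's or Cambricon's actual deployment code): it quantizes an FP16 weight matrix to 4-bit integers with per-channel scales and compares memory footprints, showing the roughly 4x compression of core parameters described above. In the mixed scheme, numerically sensitive modules would instead stay in FP8, which keeps a floating-point exponent and hence a much wider dynamic range.

```python
# Minimal, hypothetical sketch of Int4 weight quantization and the resulting
# memory savings; illustrative only, not the actual GLM-4.6/Cambricon pipeline.
import numpy as np

def quantize_int4_per_channel(w_fp16: np.ndarray):
    """Symmetric per-output-channel quantization of an FP16 weight matrix to 4-bit integers."""
    w = w_fp16.astype(np.float32)
    # One scale per output channel (row); the symmetric int4 range is [-8, 7].
    scale = np.max(np.abs(w), axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)  # values fit in 4 bits; two could be packed per byte
    return q, scale.astype(np.float16)

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return (q.astype(np.float32) * scale.astype(np.float32)).astype(np.float16)

rng = np.random.default_rng(0)
w = rng.normal(size=(4096, 4096)).astype(np.float16)

q, scale = quantize_int4_per_channel(w)
w_hat = dequantize(q, scale)

fp16_bytes = w.size * 2                       # 2 bytes per FP16 weight
int4_bytes = w.size // 2 + scale.size * 2     # 4 bits per weight (packed) plus FP16 scales
print(f"FP16 weights: {fp16_bytes / 1e6:.1f} MB")
print(f"Int4 weights: {int4_bytes / 1e6:.1f} MB (~{fp16_bytes / int4_bytes:.1f}x smaller)")
err = np.mean(np.abs(w.astype(np.float32) - w_hat.astype(np.float32)))
print(f"Mean absolute quantization error: {err:.4f}")
```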
Cambricon and Moore Threads Complete Adaptation of Zhipu's GLM-4.6
Xin Lang Cai Jing· 2025-09-30 07:33
Core Insights
- The article highlights the official release and open-sourcing of the new-generation large model GLM-4.6 by Zhipu on September 30, showcasing significant improvements in core capabilities such as Agentic Coding and code generation, aligning with Claude Sonnet 4 [1]

Group 1: Product Development
- GLM-4.6 has achieved substantial enhancements in its core functionalities, particularly in code generation [1]
- The model has been deployed on domestic AI chips from Cambricon using an FP8+Int4 mixed-quantization inference solution, marking the first instance of such a model-chip integration on domestic chips [1]

Group 2: Technological Adaptation
- Moore Threads has adapted GLM-4.6 using the vLLM inference framework, enabling its new-generation GPU to operate stably at native FP8 precision [1]
Zhipu Announces the Release of GLM-4.6; Cambricon and Moore Threads Have Completed Adaptation
Xin Lang Ke Ji· 2025-09-30 07:25
Core Insights
- The domestic large-model company Zhipu has released and open-sourced its next-generation large model GLM-4.6, achieving significant advances in core capabilities such as Agentic Coding [1]
- GLM-4.6's code generation has fully caught up with Claude Sonnet 4, making it the strongest coding model in China, while also surpassing DeepSeek-V3.2-Exp in areas including long-context processing and reasoning [1]
- The model has been deployed on domestic AI chips with an FP8+Int4 mixed-precision inference solution, the first model-chip integration of its kind on domestic chips, significantly reducing inference costs while maintaining model accuracy [1]

Industry Developments
- Moore Threads has adapted GLM-4.6 on the vLLM inference framework, demonstrating the advantages of the MUSA architecture and full-featured GPUs in ecosystem compatibility and rapid adaptation (a generic serving sketch follows this summary) [2]
- The combination of GLM-4.6 with domestic chips will be offered through the Zhipu MaaS platform, aiming to release broader social and industrial value [2]
- Deep collaboration between the domestically developed GLM-series models and domestic chips is expected to keep improving the performance and efficiency of model training and inference, contributing to a more open, controllable, and efficient AI infrastructure [2]
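For readers unfamiliar with what "adaptation based on the vLLM inference framework" looks like from the user side, here is a minimal sketch using vLLM's standard offline-inference API. The model identifier, quantization setting, and parallelism degree are illustrative assumptions, not the configuration published by Zhipu, Cambricon, or Moore Threads.

```python
# Minimal sketch of serving a model with vLLM's offline-inference API.
# The model name, quantization mode, and tensor-parallel size below are
# illustrative assumptions; the actual GLM-4.6 deployment settings on
# Cambricon or Moore Threads hardware are not described in these articles.
from vllm import LLM, SamplingParams

llm = LLM(
    model="zai-org/GLM-4.6",   # hypothetical model identifier
    quantization="fp8",        # assumes an FP8 weight/activation path is available for this model
    tensor_parallel_size=8,    # illustrative multi-GPU setting
)

params = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=256)
outputs = llm.generate(
    ["Write a Python function that reverses a singly linked list."],
    params,
)
for out in outputs:
    print(out.outputs[0].text)
```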
Zhipu's Flagship Model GLM-4.6 Goes Live; Cambricon and Moore Threads Have Completed Adaptation
Hua Er Jie Jian Wen· 2025-09-30 07:13
Core Insights
- The latest GLM-4.6 model has been launched, delivering a 27% improvement in coding capability over its predecessor GLM-4.5 and excelling in real programming, long-context handling, and reasoning [1]
- GLM-4.6 achieved the top domestic results in public benchmark tests and surpassed other domestic models across 74 real programming tasks [1]
- The model has been deployed on leading domestic AI chips from Cambricon using FP8+Int4 mixed-precision quantization, the first production-grade FP8+Int4 model-chip integrated solution on domestic chips [1]
- In addition, Moore Threads has adapted GLM-4.6 to the vLLM inference framework, enabling its new generation of GPUs to run the model stably at native FP8 precision [1]
Zhipu Releases GLM-4.6, China's Strongest Coding Model; Cambricon and Moore Threads Complete Its Adaptation
IPO早知道· 2025-09-30 07:13
Core Viewpoint
- The article discusses significant advances in domestic AI models and chips, focusing on Zhipu's release of GLM-4.6, which showcases enhanced capabilities in coding and other AI applications and marks a new phase of collaboration between domestic large models and domestic chips [2][5]

Group 1: Model Advancements
- Zhipu officially released and open-sourced the new-generation large model GLM-4.6 on September 30, achieving substantial improvements in core capabilities such as Agentic Coding [2]
- In public benchmark tests and real programming tasks, GLM-4.6's code generation has fully caught up with Claude Sonnet 4, making it the strongest coding model in China [5]
- The model has been comprehensively upgraded in long-context processing, reasoning, information retrieval, text generation, and intelligent-agent applications, surpassing DeepSeek-V3.2-Exp [5]

Group 2: Chip Integration
- GLM-4.6 has been deployed on leading domestic AI chips from Cambricon using FP8+Int4 mixed-precision inference, the first production-grade FP8+Int4 model-chip integrated solution on domestic chips [7]
- The solution significantly reduces inference costs while maintaining model accuracy, providing a feasible path for running large models locally on domestic chips [7]
- The adaptation of GLM-4.6 by Cambricon and Moore Threads indicates that domestic GPUs can now iterate in sync with cutting-edge large models, accelerating the construction of a self-controlled AI technology ecosystem [7]

Group 3: Future Implications
- The combination of the domestically developed GLM-series models and domestic chips is expected to continuously drive joint optimization of performance and efficiency in model training and inference [7]
- The collaboration aims to build a more open, controllable, and efficient artificial intelligence infrastructure, releasing broader social and industrial value through the Zhipu MaaS platform [7]
DeepSeek's New Model Goes Live; Ascend, Cambricon, Haiguang, and Others Announce Adaptation
Guan Cha Zhe Wang· 2025-09-30 06:16
Core Insights
- The release and open-sourcing of the DeepSeek-V3.2-Exp model on September 29 marks a significant advance, featuring a sparse attention architecture that reduces computational resource consumption and improves inference efficiency [1]
- The model's API pricing has been cut by more than 50%, making it more accessible to developers and users [1]
- Major players including Huawei, Cambricon, and Haiguang have announced successful adaptations of DeepSeek-V3.2-Exp, demonstrating its compatibility and performance across platforms [1]

Group 1
- DeepSeek-V3.2-Exp introduces a sparse attention mechanism that significantly lowers training and inference costs in long-sequence scenarios (a toy illustration follows this summary) [1]
- Huawei's Ascend quickly adapted the model for deployment, providing open-source inference code and operator implementations for developers [1]
- Cambricon has also completed adaptation of DeepSeek-V3.2-Exp, leveraging its computing efficiency to improve performance [1]

Group 2
- Following the announcement, the Sci-Tech 50 Index performed strongly, with notable gains in AI chip stocks and Huawei Ascend-related names [2]
- The parallel advance of hardware and software signals a shift in the domestic AI ecosystem from "usable" to "user-friendly," closing the loop from foundational computing power to upper-layer applications [2]
- Analysts suggest that integrating large models and generative AI into consumer devices will raise product value and user engagement, positioning companies that build "AI + hardware" applications competitively in the race for the next computing platform [2]
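As a rough illustration of why a sparse attention pattern cuts long-sequence costs, the toy sketch below compares dense attention, whose score matrix grows quadratically with sequence length, against a simple top-k sparse variant. This is a generic illustration under assumed shapes, not DeepSeek-V3.2-Exp's actual sparse attention design.

```python
# Toy comparison of dense vs. top-k sparse attention; a generic illustration only,
# not DeepSeek-V3.2-Exp's actual sparse attention mechanism.
import numpy as np

def dense_attention(q, k, v):
    scores = q @ k.T / np.sqrt(q.shape[-1])        # (n, n): quadratic in sequence length n
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

def topk_sparse_attention(q, k, v, top_k=64):
    # For clarity this toy still materializes the full score matrix and then masks it;
    # a real sparse kernel would select and compute only the top_k keys per query.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    drop = np.argpartition(scores, -top_k, axis=-1)[:, :-top_k]   # indices of the lowest-scoring keys
    np.put_along_axis(scores, drop, -np.inf, axis=-1)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

n, d, top_k = 2048, 64, 64
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(n, d)) for _ in range(3))

_ = dense_attention(q, k, v)
_ = topk_sparse_attention(q, k, v, top_k)
print(f"dense score entries per head:  {n * n:,}")       # 4,194,304
print(f"sparse entries kept per head:  {n * top_k:,}")   # 131,072 -- the source of the cost savings
```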
Cambricon-U's trading turnover reaches 10 billion yuan; shares now up 0.2%
Xin Lang Cai Jing· 2025-09-30 06:12
Cambricon and Huawei Ascend Adapt DeepSeek's Latest Model; the Sci-Tech Semiconductor ETF (588170) Records Net Inflows for Nine Consecutive Days!
Mei Ri Jing Ji Xin Wen· 2025-09-30 05:49
Group 1
- The core viewpoint highlights the strong performance of the semiconductor sector, particularly the rise of the Sci-Tech Innovation Board Semiconductor Materials and Equipment Index and related ETFs, indicating strong investor interest and market momentum [1][3]
- The Sci-Tech Semiconductor ETF (588170) has gained 15.52% over the past week, reaching new highs in both scale, at 2.604 billion yuan, and shares outstanding, at 1.725 billion [1]
- Continuous net inflows into the ETF over the past nine days, with a peak single-day inflow of 632 million yuan and a total of 1.721 billion yuan, demonstrate robust investor confidence [1]

Group 2
- The adaptation of the DeepSeek V3.2 model by Cambricon and Huawei Ascend signals progress in domestic computing power, improving the efficiency of domestic chips and reducing training costs in long-sequence scenarios [2]
- Analysts at Shengan Securities expect ongoing investment in computing infrastructure to deliver sustained breakthroughs in domestic computing power, potentially outpacing growth overseas [2]
- The semiconductor equipment and materials segment is identified as a key area for domestic substitution, benefiting from low current localization rates and a high ceiling for domestic alternatives, driven by the AI revolution and technological advances [3]
DeepSeek and Cambricon Release Major News Back to Back! AI Chips Keep Strengthening; the Sci-Tech 50 ETF Leader, Sci-Tech 100 ETF GF, and Sci-Tech Growth ETF Package the Representative Tech-Theme Companies in One Click
Xin Lang Cai Jing· 2025-09-30 05:11
Related ETFs:

Sci-Tech 50 ETF Leader (588060) closely tracks the SSE STAR Market 50 Index, which is composed of the 50 securities on the Shanghai Stock Exchange's STAR Market with the largest market capitalization and best liquidity, reflecting the overall performance of the most market-representative sci-tech innovation companies. It is positioned as the benchmark for the STAR Market's "hard-tech leaders." Off-exchange feeder funds (Class A: 013810; Class C: 013811; Class F: 021768).

Boosted by this news, AI chip-related ETFs performed strongly across the board. As of 10:22 on September 30, 2025, the Sci-Tech 50 ETF Leader (588060) was up 2.24%. The semiconductor industry accounts for 65.99% of its underlying index (Shenwan second-level classification); constituents Biwin Storage rose 14.76%, Montage Technology rose 8.18%, and SmartSens rose 6.06%, with Western Superconducting, MGI Tech, and others following. The fund's latest scale reached 6.815 billion yuan, a new high for the past six months.

Sci-Tech 100 ETF GF (588980) rose 1.91%. The semiconductor industry accounts for 32.87% of its underlying index (Shenwan second-level); constituents Pylontech rose 17.39%, Huaqin Technology rose 7.37%, and ASR Microelectronics rose 5.62%, with Intellifusion, GoodWe, and others following. Over a longer horizon, the fund has gained more than 26% since its establishment in August.

Sci-Tech Growth ETF (588110) rose 1.95%, with its scale growing by 177 million yuan over the past three months, a significant increase. The semiconductor industry of its underlying index ...
The Domestic AI Industry Chain Is Collaborating Deeply; the AI ETF (159819) Enables One-Click Positioning in AI Industry-Chain Leaders
Mei Ri Jing Ji Xin Wen· 2025-09-30 04:29
Group 1
- The artificial intelligence sector rallied quickly in early trading, with memory and ASIC chip concepts leading the gains. As of 9:48, the CSI Artificial Intelligence Theme Index was up 1.8%, and the AI ETF (159819) had a real-time turnover exceeding 200 million yuan [1]
- DeepSeek announced the official release of the DeepSeek-V3.2-Exp model, which introduces a sparse attention mechanism to optimize training and inference efficiency on long texts. Within five minutes of the release, Cambricon announced its adaptation to the latest DeepSeek model [1]
- According to Zhongyin International, the domestic computing-power industry chain reported rapid growth in mid-year results, indicating a phase of high industry prosperity. The optimization of DeepSeek's model performance and its distillation technology are expected to significantly benefit AI edge applications, with potential gains across the AI industry chain [1]

Group 2
- The CSI Artificial Intelligence Theme Index consists of 50 stocks that provide foundational resources, technology, and application support for artificial intelligence, covering leading companies across segments of the AI industry chain [1]
- The AI ETF (159819) tracks this index, with a latest scale of 24.7 billion yuan, ranking first among its peers, and a management fee of only 0.15% per year, facilitating low-cost investment in leading companies in the AI industry chain [1]