Cambricon (688256)
Zhipu Releases GLM-4.6; Cambricon and Moore Threads Complete Adaptation
Mei Ri Jing Ji Xin Wen· 2025-09-30 07:47
Core Insights
- The domestic large-model company Zhipu has officially released and open-sourced its next-generation large model GLM-4.6, achieving significant advances in core capabilities such as Agentic Coding [1]
- The release follows the major launches of DeepSeek-V3.2-Exp and Claude Sonnet 4.5, marking another significant industry development before the National Day holiday [1]
- Zhipu announced that GLM-4.6 has been deployed on leading domestic AI chips from Cambricon using FP8+Int4 mixed-precision quantized inference, the first production deployment of an FP8+Int4 model-chip integrated solution on domestic chips [1]
- Moore Threads has completed adaptation of GLM-4.6 based on the vLLM inference framework, allowing its new generation of GPUs to run the model stably at native FP8 precision [1]
Zhipu Officially Releases and Open-Sources Next-Generation Large Model GLM-4.6; Cambricon and Moore Threads Complete Adaptation
Mei Ri Jing Ji Xin Wen· 2025-09-30 07:42
Core Insights
- The domestic large-model company Zhipu has officially released and open-sourced its next-generation large model GLM-4.6, achieving significant advances in core capabilities such as Agentic Coding [1]

Group 1: Model Development
- GLM-4.6 has been deployed on Cambricon AI chips using FP8+Int4 mixed-precision computing, marking the first production deployment of an FP8+Int4 model on domestic chips [1]
- The mixed-precision solution significantly reduces inference costs while maintaining model accuracy, providing a feasible path for running large models locally on domestic chips [1]

Group 2: Ecosystem Compatibility
- Moore Threads has adapted GLM-4.6 based on the vLLM inference framework, demonstrating that its new generation of GPUs can run the model stably at native FP8 precision [1]
- The adaptation validates the advantages of the MUSA (Meta-computing Unified System Architecture) and full-function GPUs in ecosystem compatibility and rapid adaptation [1]

Group 3: Industry Implications
- The adaptations by Cambricon and Moore Threads signify that domestic GPUs can now iterate in tandem with cutting-edge large models, accelerating the construction of a self-controlled AI technology ecosystem [1]
- The combination of GLM-4.6 and domestic chips will initially be offered to enterprises and the public through the Zhipu MaaS platform [1]
Sci-Tech Innovation AI ETF (588730) Rises 3.14% as DeepSeek and Cambricon Announce Key Developments
Ge Long Hui· 2025-09-30 07:39
Core Insights
- The semiconductor and AI sectors are seeing significant gains, with the Sci-Tech Innovation AI ETF rising 3.14% to a record net asset value, driven by strong performances from key holdings such as Cambricon and Lattice Technology [1]

Group 1: Market Performance
- On the last trading day before the holiday, the chip and AI sectors led the market, with Lattice Technology up over 7% [1]
- The Sci-Tech Innovation AI ETF, which tracks the Shanghai Stock Exchange Sci-Tech Innovation Board AI Index, has a semiconductor weighting of 54.1%; its top three holdings are Cambricon (16.62%), Lattice Technology (10%), and Chip Original [1]

Group 2: Fund Inflows
- The ETF has seen significant inflows, with a net inflow of 114 million yuan over the past five days, bringing its total size to 1.747 billion yuan [1]

Group 3: Industry Developments
- DeepSeek announced updates to its official app and services, cutting API costs by over 50%, which is expected to boost developer engagement [1]
- Several domestic chip makers have completed adaptation of DeepSeek-V3.2-Exp, with Cambricon announcing same-day adaptation of the latest model and the open-sourcing of its large-model inference engine [2]
- Tencent has launched and open-sourced its native multimodal image-generation model HunyuanImage 3.0, with a parameter scale of 80 billion, a significant advance for the industry [2]
- Huaxin Securities is optimistic about the domestic AI chip industry, highlighting the full integration of the AI industry chain from advanced processes to model acceleration by major companies such as ByteDance, Alibaba, and Tencent [2]
Zhipu Partners with Cambricon to Launch an Integrated Model-Chip Solution
Di Yi Cai Jing· 2025-09-30 07:38
Core Insights
- The domestic AI startup Zhipu has released its latest model GLM-4.6, with improvements in programming, long-context handling, reasoning, information retrieval, writing, and agent applications [3]

Model Enhancements
- GLM-4.6's coding capability aligns with Claude Sonnet 4 in public benchmarks and real programming tasks [3]
- The context window has been expanded from 128K to 200K tokens, supporting longer code and agent tasks [3]
- The new model strengthens reasoning and supports tool invocation during the reasoning process [3]
- The model's tool-invocation and search capabilities have also improved [3]

Chip Integration
- "Model-chip linkage" is a key focus of the new release: GLM-4.6 achieves FP8+Int4 mixed-quantization deployment on domestic Cambricon chips, the industry's first production FP8+Int4 model-chip solution on domestic hardware [3]
- The approach maintains accuracy while reducing inference costs, exploring a feasible path for running large models locally on domestic chips [3]

Quantization Techniques
- FP8 (8-bit floating point) offers a wide dynamic range with minimal precision loss, while Int4 (4-bit integer) provides a high compression ratio and lower memory usage at the cost of more noticeable precision loss [4]
- The "FP8+Int4 mixed" mode assigns quantization formats according to the functional differences of the model's modules, optimizing memory usage [4]

Memory Efficiency
- Core model parameters, which account for 60%-80% of total memory, can be compressed to 1/4 of their FP16 size through Int4 quantization, significantly reducing memory pressure on the chip [5]
- Temporary dialogue data accumulated during inference can also be compressed with Int4 while keeping precision loss minimal [5]
- FP8 is reserved for numerically sensitive modules to minimize precision loss and retain fine-grained semantic information [5]

Ecosystem Development
- Moore Threads has successfully adapted GLM-4.6 based on the vLLM inference framework, demonstrating that its new generation of GPUs can run the model stably at native FP8 precision [5]
- The adaptations signify that domestic GPUs can now collaborate and iterate with cutting-edge large models, accelerating the development of a self-controlled AI technology ecosystem [5]
- The combination of GLM-4.6 and domestic chips will be offered to enterprises and the public through the Zhipu MaaS platform [5]
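The Int4 side of the scheme above can be illustrated with a minimal NumPy sketch of group-wise symmetric weight quantization. This is a generic illustration of the technique, not Zhipu's or Cambricon's actual kernels; the group size and function names are assumptions. It shows the roughly 4x memory reduction versus FP16 (4 bits per weight instead of 16, plus small per-group scales) and the rounding error that mixed precision tries to confine to less sensitive modules:

```python
import numpy as np

def quantize_int4(w, group_size=64):
    """Symmetric per-group Int4 quantization: map each group of
    weights to integers in [-7, 7] with one float scale per group."""
    groups = w.reshape(-1, group_size)
    scale = np.abs(groups).max(axis=1, keepdims=True) / 7.0
    scale[scale == 0] = 1.0                        # avoid divide-by-zero
    q = np.clip(np.round(groups / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize_int4(q, scale, shape):
    """Recover approximate float weights from Int4 codes and scales."""
    return (q.astype(np.float32) * scale).reshape(shape)

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)

q, scale = quantize_int4(w)
w_hat = dequantize_int4(q, scale, w.shape)

# Int4 stores 4 bits/weight vs 16 bits for FP16: the dominant weight
# memory shrinks ~4x (minus the overhead of the per-group scales),
# at the cost of some rounding error per weight.
bits_fp16 = w.size * 16
bits_int4 = q.size * 4 + scale.size * 16           # payload + scales
print(f"compression ~{bits_fp16 / bits_int4:.2f}x")
print(f"max abs error {np.abs(w - w_hat).max():.4f}")
```

In a mixed FP8+Int4 deployment, a scheme like this would be applied only to the bulk weight matrices and cached dialogue state, while numerically sensitive modules stay in the wider FP8 format.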
Cambricon and Moore Threads Complete Adaptation of Zhipu's GLM-4.6
Xin Lang Cai Jing· 2025-09-30 07:33
Core Insights
- Zhipu officially released and open-sourced its new-generation large model GLM-4.6 on September 30, with significant improvements in core capabilities such as Agentic Coding; its code generation aligns with Claude Sonnet 4 [1]

Group 1: Product Development
- GLM-4.6 achieves substantial enhancements in core functionality, particularly code generation [1]
- The model has been deployed on domestic AI chips from Cambricon using an FP8+Int4 mixed-quantization inference solution, the first such model-chip integration on domestic chips [1]

Group 2: Technological Adaptation
- Moore Threads has adapted GLM-4.6 using the vLLM inference framework, enabling its new-generation GPUs to run stably at native FP8 precision [1]
Zhipu Announces GLM-4.6 Release; Cambricon and Moore Threads Have Completed Adaptation
Xin Lang Ke Ji· 2025-09-30 07:25
Core Insights
- The domestic large-model company Zhipu has released and open-sourced its next-generation large model GLM-4.6, achieving significant advances in core capabilities such as Agentic Coding [1]
- GLM-4.6's code generation fully aligns with Claude Sonnet 4, making it the strongest coding model in China, while also surpassing DeepSeek-V3.2-Exp in areas including long-context processing and reasoning [1]
- The model has been deployed on domestic AI chips with an FP8+Int4 mixed-precision inference solution, the first such model-chip integration on domestic chips, significantly reducing inference costs while maintaining accuracy [1]

Industry Developments
- Moore Threads has adapted GLM-4.6 based on the vLLM inference framework, demonstrating the advantages of the MUSA architecture and full-function GPUs in ecosystem compatibility and rapid adaptation [2]
- The combination of GLM-4.6 and domestic chips will be offered through the Zhipu MaaS platform, aiming to release broader social and industrial value [2]
- Deep collaboration between the domestically developed GLM-series models and domestic chips is expected to continuously improve performance and efficiency in model training and inference, contributing to a more open, controllable, and efficient AI infrastructure [2]
Zhipu's Flagship Model GLM-4.6 Goes Live; Cambricon and Moore Threads Complete Adaptation
Hua Er Jie Jian Wen· 2025-09-30 07:13
Core Insights
- The latest GLM-4.6 model has launched, with a 27% improvement in coding capability over its predecessor GLM-4.5, excelling in real programming, long-context handling, and reasoning [1]
- GLM-4.6 achieved the highest domestic score in public benchmark tests and surpassed other domestic models across 74 real programming tasks [1]
- The model has been deployed on leading domestic AI chips from Cambricon using FP8+Int4 mixed-precision quantization, the first production deployment of an FP8+Int4 model-chip integrated solution on domestic chips [1]
- Moore Threads has adapted GLM-4.6 to the vLLM inference framework, enabling its new generation of GPUs to run the model stably at native FP8 precision [1]
Zhipu Releases China's Strongest Coding Model GLM-4.6; Cambricon and Moore Threads Complete Adaptation
IPO早知道· 2025-09-30 07:13
Core Viewpoint
- The article discusses significant advances in domestic AI models and chips, focusing on Zhipu's release of GLM-4.6, which showcases enhanced coding and other AI capabilities and marks a new phase of collaboration between domestic large models and domestic chips [2][5]

Group 1: Model Advancements
- Zhipu officially released and open-sourced the new-generation large model GLM-4.6 on September 30, achieving substantial improvements in core capabilities such as Agentic Coding [2]
- In public benchmarks and real programming tasks, GLM-4.6's code generation fully aligns with Claude Sonnet 4, making it the strongest coding model in China [5]
- The model has been comprehensively upgraded in long-context processing, reasoning, information retrieval, text generation, and agent applications, surpassing DeepSeek-V3.2-Exp [5]

Group 2: Chip Integration
- GLM-4.6 has been deployed on leading domestic AI chips from Cambricon using FP8+Int4 mixed-precision inference, the first production deployment of an FP8+Int4 model-chip integrated solution on domestic chips [7]
- The solution significantly reduces inference costs while maintaining model accuracy, providing a feasible path for running large models locally on domestic chips [7]
- The adaptation of GLM-4.6 by Cambricon and Moore Threads indicates that domestic GPUs can now iterate in sync with cutting-edge large models, accelerating the construction of a self-controlled AI technology ecosystem [7]

Group 3: Future Implications
- The combination of the domestically developed GLM-series models and domestic chips is expected to continuously drive dual optimization of performance and efficiency in model training and inference [7]
- The collaboration aims to build a more open, controllable, and efficient AI infrastructure, releasing broader social and industrial value through the Zhipu MaaS platform [7]
DeepSeek's New Model Goes Live; Ascend, Cambricon, Haiguang and Others Announce Adaptation
Guan Cha Zhe Wang· 2025-09-30 06:16
Core Insights
- The release and open-sourcing of the DeepSeek-V3.2-Exp model on September 29 marks a significant advance in AI technology, featuring a sparse attention architecture that reduces computational resource consumption and improves inference efficiency [1]
- The model's API pricing has been cut by over 50%, making it more accessible to developers and users [1]
- Major companies including Huawei, Cambricon, and Haiguang have announced successful adaptations of DeepSeek-V3.2-Exp, demonstrating its compatibility and performance across platforms [1]

Group 1
- DeepSeek-V3.2-Exp introduces a sparse attention mechanism that significantly lowers training and inference costs in long-sequence scenarios [1]
- Huawei's Ascend quickly adapted the model for deployment, providing open-source inference code and operator implementations for developers [1]
- Cambricon has also completed adaptation of DeepSeek-V3.2-Exp, leveraging its computing efficiency to improve performance [1]

Group 2
- Following the announcement, the Sci-Tech 50 Index performed strongly, with notable gains in AI chip stocks and Huawei Ascend concept shares [2]
- The dual advance of hardware and software signals a shift in the domestic AI ecosystem from "usable" to "user-friendly," forming a closed loop from foundational computing power to upper-layer applications [2]
- Analysts suggest that integrating large models and generative AI into consumer devices will enhance product value and user engagement, positioning companies building "AI + hardware" applications competitively in the race for the next computing platform [2]
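The cost-saving idea behind sparse attention can be sketched with a minimal top-k variant, where each query attends only to its k highest-scoring keys instead of all n. This is a generic illustration, not DeepSeek's actual sparse-attention design; the function name and `top_k` parameter are assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax; -inf entries get zero weight."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def topk_sparse_attention(q, k, v, top_k=4):
    """Each query attends only to its top_k highest-scoring keys,
    so per-query work scales with top_k rather than sequence length."""
    scores = q @ k.T / np.sqrt(q.shape[-1])            # (n, n) logits
    # Mask everything outside each row's top_k entries with -inf.
    drop = np.argpartition(scores, -top_k, axis=-1)[:, :-top_k]
    np.put_along_axis(scores, drop, -np.inf, axis=-1)
    return softmax(scores, axis=-1) @ v

rng = np.random.default_rng(0)
n, d = 16, 8
q, k, v = (rng.normal(size=(n, d)) for _ in range(3))
out = topk_sparse_attention(q, k, v, top_k=4)
print(out.shape)
```

In production systems the key selection itself must also be cheap (this sketch still computes the full score matrix before masking), which is where architecture-specific sparse-attention designs and operator implementations, such as those released for Ascend and Cambricon, come in.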
Cambricon-U Turnover Reaches 10 Billion Yuan; Shares Now Up 0.2%
Xin Lang Cai Jing· 2025-09-30 06:12
Cambricon-U's trading turnover has reached 10 billion yuan, with shares currently up 0.2%. ...