Cambricon Domestic AI Chips
Zhipu Officially Releases and Open-Sources Its Next-Generation Large Model GLM-4.6; Cambricon and Moore Threads Complete Adaptation of GLM-4.6
Zheng Quan Shi Bao Wang· 2025-09-30 07:58
Core Insights
- The release of GLM-4.6 by Zhipu marks a significant advancement in large-model capabilities, particularly in Agentic Coding and other core functionalities [1]
- GLM-4.6 has reached comprehensive parity with Claude Sonnet 4 in code generation, establishing itself as the strongest coding model in China [1]
- The model has been extensively upgraded in long-context processing, reasoning, information retrieval, text generation, and agent applications, surpassing the performance of DeepSeek-V3.2-Exp [1]
- As an open-source model, GLM-4.6 ranks among the strongest general-purpose large models globally, strengthening the position of domestic large models in the global competitive landscape [1]

Technological Developments
- Zhipu has deployed GLM-4.6 with FP8+Int4 mixed-precision quantized inference on Cambricon's leading domestic AI chips, the first production-grade deployment of an integrated FP8+Int4 model-chip solution on domestic chips [1]
- This solution significantly reduces inference costs while preserving model accuracy, providing a feasible path for running large models locally on domestic chips [1]
- Moore Threads has adapted GLM-4.6 on the vLLM inference framework, demonstrating the advantages of the MUSA architecture and its full-featured GPUs in ecosystem compatibility and rapid adaptation [2]

Industry Implications
- The work with Cambricon and Moore Threads signals that domestic GPUs can now iterate in step with cutting-edge large models, accelerating the construction of a self-controlled AI technology ecosystem [2]
- The combination of GLM-4.6 and domestic chips will first be offered to enterprises and the public through the Zhipu MaaS platform, unlocking broader social and industrial value [2]
- Deep collaboration between the domestically developed GLM-series models and domestic chips will continue to drive joint optimization of performance and efficiency in model training and inference, fostering a more open, controllable, and efficient AI infrastructure [2]
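For readers unfamiliar with the technique mentioned above: FP8+Int4 mixed-precision inference stores model weights as 4-bit integers (cutting weight memory and bandwidth roughly in half versus 8-bit formats) while keeping activations in 8-bit floating point. The PyTorch snippet below is a minimal illustrative sketch of that idea only; it is not Zhipu's or Cambricon's actual kernel, every function name in it is hypothetical, and it requires PyTorch 2.1+ for the float8_e4m3fn dtype:

```python
# Minimal sketch of FP8 + Int4 mixed-precision inference: int4 weights with
# per-channel scales, FP8-precision activations, float32 accumulation.
# NOT the production kernel described in the article; illustration only.
import torch

def quantize_weights_int4(w: torch.Tensor):
    """Symmetric per-output-channel int4 quantization (integer range [-8, 7])."""
    scale = (w.abs().amax(dim=1, keepdim=True) / 7.0).clamp_min(1e-8)
    q = torch.clamp(torch.round(w / scale), -8, 7).to(torch.int8)
    return q, scale

def dequantize_int4(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.to(torch.float32) * scale

def fake_fp8(x: torch.Tensor) -> torch.Tensor:
    """Round-trip through float8_e4m3 to mimic FP8 activation precision."""
    return x.to(torch.float8_e4m3fn).to(torch.float32)

w = torch.randn(16, 64)           # [out_features, in_features]
x = torch.randn(4, 64)            # a small batch of activations
q, s = quantize_weights_int4(w)
y = fake_fp8(x) @ dequantize_int4(q, s).T
print((y - x @ w.T).abs().max())  # per-channel scales keep the error small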
Zhipu Releases GLM-4.6; Cambricon and Moore Threads Have Completed Adaptation
Mei Ri Jing Ji Xin Wen· 2025-09-30 07:47
Core Insights
- Zhipu, a key domestic large-model company, has officially released and open-sourced its next-generation large model GLM-4.6, achieving significant advances in core capabilities such as Agentic Coding [1]
- The release follows the major launches of DeepSeek-V3.2-Exp and Claude Sonnet 4.5, marking another significant industry development ahead of the National Day holiday [1]
- Zhipu announced that GLM-4.6 has been deployed on Cambricon's leading domestic AI chips using FP8+Int4 mixed-precision quantized inference, the first production-grade deployment of an integrated FP8+Int4 model-chip solution on domestic chips [1]
- Additionally, Moore Threads has completed the adaptation of GLM-4.6 on the vLLM inference framework, allowing its new generation of GPUs to run the model stably at native FP8 precision [1]
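As context for what a vLLM adaptation means in practice, the sketch below shows the standard way a model is loaded for FP8 inference through vLLM's offline Python API. The checkpoint identifier, parallelism degree, and FP8 hardware support are assumptions for illustration, not details from the articles:

```python
# Minimal sketch of serving a model with vLLM at FP8 precision.
# ASSUMPTIONS: the model id "zai-org/GLM-4.6" and tensor_parallel_size=8 are
# illustrative; substitute values that match your checkpoint and hardware.
from vllm import LLM, SamplingParams

llm = LLM(
    model="zai-org/GLM-4.6",    # assumed checkpoint id; replace as needed
    quantization="fp8",         # request FP8 (e4m3) weight quantization
    tensor_parallel_size=8,     # large MoE models need multi-GPU parallelism
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Write a quicksort function in Python."], params)
print(outputs[0].outputs[0].text)
```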
Cambricon and Moore Threads Complete Adaptation of Zhipu's GLM-4.6
Xin Lang Cai Jing· 2025-09-30 07:33
Core Insights
- The article highlights Zhipu's official release and open-sourcing of its new-generation large model GLM-4.6 on September 30, showcasing significant improvements in core capabilities such as Agentic Coding, with code generation on par with Claude Sonnet 4 [1]

Group 1: Product Development
- GLM-4.6 delivers substantial enhancements to its core functionality, particularly in code-generation capability [1]
- The model has been deployed on Cambricon's domestic AI chips using an FP8+Int4 mixed-precision quantized inference solution, the first such model-chip integration on domestic chips [1]

Group 2: Technological Adaptation
- Moore Threads has adapted GLM-4.6 using the vLLM inference framework, enabling its new-generation GPUs to operate stably at native FP8 precision [1]
Zhipu Announces the Release of GLM-4.6; Cambricon and Moore Threads Have Completed Adaptation
Xin Lang Ke Ji· 2025-09-30 07:25
Core Insights
- Zhipu, a domestic large-model company, has released and open-sourced its next-generation large model GLM-4.6, achieving significant advances in core capabilities such as Agentic Coding [1]
- GLM-4.6's code-generation ability is now fully on par with Claude Sonnet 4, making it the strongest coding model in China, and it surpasses DeepSeek-V3.2-Exp in areas including long-context processing and reasoning [1]
- The model has been deployed on domestic AI chips with an FP8+Int4 mixed-precision inference solution, the first such model-chip integration on domestic chips, significantly reducing inference costs while maintaining model accuracy [1]

Industry Developments
- Moore Threads has adapted GLM-4.6 on the vLLM inference framework, demonstrating the advantages of the MUSA architecture and its full-featured GPUs in ecosystem compatibility and rapid adaptation [2]
- The combination of GLM-4.6 with domestic chips will be offered through the Zhipu MaaS platform, aiming to unlock broader social and industrial value [2]
- Deep collaboration between the domestically developed GLM-series models and domestic chips is expected to continuously improve performance and efficiency in model training and inference, contributing to a more open, controllable, and efficient AI infrastructure [2]
Zhipu's Flagship Model GLM-4.6 Goes Live; Cambricon and Moore Threads Have Completed Adaptation
Hua Er Jie Jian Wen· 2025-09-30 07:13
Core Insights
- The newly launched GLM-4.6 shows a 27% improvement in coding capability over its predecessor GLM-4.5, excelling in real-world programming, long-context handling, and reasoning [1]
- GLM-4.6 set the highest domestic standard in public benchmark tests and surpassed other domestic models across 74 real-world programming tasks [1]
- The model has been deployed on Cambricon's leading domestic AI chips using FP8+Int4 mixed-precision quantization, the first production-grade deployment of an integrated FP8+Int4 model-chip solution on domestic chips [1]
- Additionally, Moore Threads has adapted GLM-4.6 to the vLLM inference framework, enabling its new generation of GPUs to run the model stably at native FP8 precision [1]