Core Insights
- The domestic large-model company Zhipu has officially released and open-sourced its next-generation large model GLM-4.6, which delivers significant gains in core capabilities such as agentic coding [1]

Group 1: Model Development
- GLM-4.6 has been deployed on Cambricon AI chips using FP8+Int4 mixed-precision computing, marking the first production deployment of an FP8+Int4 model on domestic chips [1]
- The mixed-precision scheme significantly reduces inference cost while preserving model accuracy, offering a practical path for running large models locally on domestic chips (a conceptual sketch of the idea follows this summary) [1]

Group 2: Ecosystem Compatibility
- Moore Threads has adapted GLM-4.6 on the vLLM inference framework, demonstrating that its new generation of GPUs can run the model stably at native FP8 precision (see the vLLM serving sketch at the end of this note) [1]
- The adaptation validates the advantages of MUSA (Meta-computing Unified System Architecture) and full-function GPUs in ecosystem compatibility and speed of adaptation [1]

Group 3: Industry Implications
- The collaboration between Cambricon and Moore Threads around GLM-4.6 signals that domestic GPUs can now iterate in step with cutting-edge large models, accelerating the construction of a self-controlled AI technology ecosystem [1]
- The combination of GLM-4.6 and domestic chips will initially be offered to enterprises and the public through the Zhipu MaaS platform [1]
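For readers unfamiliar with the FP8+Int4 scheme mentioned in Group 1, the snippet below is a conceptual PyTorch sketch of the general technique, not Cambricon's actual kernels: weights are stored as 4-bit integer codes with per-channel scales, activations are round-tripped through an FP8 dtype, and the matmul is simulated in fp32 after dequantization. The dtype torch.float8_e4m3fn, the symmetric [-7, 7] quantization range, and the function names are assumptions made purely for illustration.

```python
# Conceptual sketch of an FP8 + Int4 mixed-precision linear layer (illustration only,
# not Cambricon's implementation). Requires PyTorch >= 2.1 for torch.float8_e4m3fn.
import torch

def quantize_int4(w: torch.Tensor):
    """Symmetric per-output-channel Int4 quantization: returns codes (held in int8) and scales."""
    scale = w.abs().amax(dim=1, keepdim=True) / 7.0            # symmetric int4 range [-7, 7]
    codes = torch.clamp(torch.round(w / scale), -7, 7).to(torch.int8)
    return codes, scale

def mixed_precision_linear(x: torch.Tensor, codes: torch.Tensor, scale: torch.Tensor):
    """Simulate FP8 activations multiplied by Int4 weights, accumulating in fp32."""
    x_fp8 = x.to(torch.float8_e4m3fn).to(torch.float32)        # quantize-dequantize activations via FP8
    w_deq = codes.to(torch.float32) * scale                    # dequantize the Int4 weight codes
    return x_fp8 @ w_deq.t()

w = torch.randn(1024, 4096)                                    # [out_features, in_features]
x = torch.randn(8, 4096)                                       # a batch of activations
codes, scale = quantize_int4(w)
print(mixed_precision_linear(x, codes, scale).shape)           # torch.Size([8, 1024])
```

The cost saving comes from storing weights at 4 bits and moving activations at 8 bits, which is why such schemes cut memory and bandwidth while keeping accuracy close to the full-precision model.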
Zhipu officially releases and open-sources its next-generation large model GLM-4.6; Cambricon and Moore Threads complete adaptation
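The vLLM adaptation described in Group 2 can be pictured with vLLM's standard offline-inference API. This is a minimal sketch under stated assumptions: that the open-source checkpoint is published under a Hugging Face id such as "zai-org/GLM-4.6" and ships FP8 weights loadable with quantization="fp8"; the Moore Threads MUSA backend itself is not shown, only the generic vLLM serving path.

```python
# Minimal vLLM offline-inference sketch for an FP8 checkpoint (assumed model id).
from vllm import LLM, SamplingParams

llm = LLM(
    model="zai-org/GLM-4.6",      # assumed checkpoint id; check the actual release
    quantization="fp8",           # ask vLLM to run the model with FP8 weights
    tensor_parallel_size=8,       # shard across GPUs; size depends on available memory
    trust_remote_code=True,       # GLM checkpoints typically ship custom modeling code
)

sampling = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=256)
outputs = llm.generate(["Write a Python function that reverses a linked list."], sampling)
print(outputs[0].outputs[0].text)
```

In a production deployment the same model would more likely be exposed through vLLM's OpenAI-compatible server (the vllm serve command) rather than the offline API shown here.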