Core Insights

- The newly launched GLM-4.6 model delivers a 27% improvement in coding capability over its predecessor GLM-4.5, excelling at real-world programming, long-context handling, and reasoning [1]
- GLM-4.6 achieved the best results among domestic models on public benchmarks and outperformed other domestic models across 74 real-world programming tasks [1]
- The model has been deployed on Cambricon's leading domestic AI chips using FP8+Int4 mixed-precision quantization, the first production-grade FP8+Int4 integrated model-chip solution on domestic chips [1]
- Moore Threads has adapted GLM-4.6 to the vLLM inference framework, enabling its new-generation GPUs to run the model stably at native FP8 precision [1]
Zhipu's flagship model GLM-4.6 launches; Cambricon and Moore Threads have completed adaptation
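Mixed-precision schemes like the FP8+Int4 deployment above typically keep activations in a higher-precision format while storing weights as 4-bit integers plus a floating-point scale. As a minimal illustrative sketch only (the function names and the symmetric per-tensor scaling here are assumptions for exposition, not Zhipu's or Cambricon's actual scheme), int4 weight quantization can be written as:

```python
import numpy as np

def quantize_int4(w: np.ndarray):
    """Hypothetical symmetric per-tensor int4 quantization: map floats to [-8, 7]."""
    scale = np.abs(w).max() / 7.0  # 7 = largest positive int4 value
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)  # int4 values stored in int8
    return q, scale

def dequantize_int4(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from int4 values and the shared scale."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
q, s = quantize_int4(w)
w_hat = dequantize_int4(q, s)
# Rounding error is bounded by half the quantization step (scale / 2)
print(f"max abs error: {np.abs(w - w_hat).max():.4f}, scale: {s:.4f}")
```

The appeal for inference hardware is that weights shrink 4x-8x versus FP16/FP32 while matrix multiplies can still accumulate in higher precision, trading a bounded per-weight rounding error for memory bandwidth.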