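Zhipu's New-Generation Flagship Model Again Reaches Open-Source SOTA: Continuing to Explore the Upper Limits of AGI and Directly Challenging OpenAI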

Core Viewpoint
- The article covers the launch of GLM-4.5, an open-source state-of-the-art model built for reasoning, coding, and agent applications, which it presents as a significant technological breakthrough for the company [2].

Group 1: Model Overview
- GLM-4.5 integrates reasoning, coding, and agent capabilities into a single model and achieves state-of-the-art performance among open-source models [3].
- The model uses a Mixture-of-Experts (MoE) architecture. GLM-4.5 has 355 billion total parameters, of which 32 billion are active; GLM-4.5-Air has 106 billion total parameters, of which 12 billion are active [3].
- GLM-4.5 offers two modes: a thinking mode for complex reasoning and tool use, and a non-thinking mode for immediate responses [3].

Group 2: Performance Metrics
- Evaluated on 12 representative benchmarks, the model ranked third globally, first among Chinese domestic models, and first among open-source models [4].
- GLM-4.5 has half the parameter count of DeepSeek-R1 and one-third that of Kimi-K2, yet outperforms both on several benchmarks, which the article attributes to higher parameter efficiency [6].

Group 3: Cost and Efficiency
- GLM-4.5 brings significant cost and efficiency improvements, with API prices as low as 0.8 yuan per million input tokens and 2 yuan per million output tokens [8].
- The model was tested against Claude Code, Claude-4-Sonnet, Kimi-K2, and Qwen3-Coder across 52 programming tasks and showed competitive advantages, particularly in tool invocation reliability and task completion rate [10].

Group 4: Broader Implications
- In July, the company also released GLM-4.1V-Thinking, a new-generation multimodal model with strong reasoning capabilities that achieved top scores in 23 of 28 authoritative evaluations [11].
- The company is rapidly expanding its overseas business, providing infrastructure solutions to countries in Southeast Asia, the Middle East, and Africa [13].
- The company aims to establish a "verifiable, responsible, and standardized" technological image in emerging markets ahead of its Western competitors [14].
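
To make the parameter and pricing figures above concrete, the following is a minimal Python sketch of the arithmetic. It uses only the numbers quoted in this summary (355B/32B and 106B/12B parameters; 0.8 and 2 yuan per million tokens). The names `MoEModel`, `active_fraction`, and `api_cost_yuan` are hypothetical helpers introduced for illustration, not part of any GLM SDK, and the token counts in the usage example are an assumed workload. Because the quoted prices are "as low as" figures, the result should be read as a lower bound rather than an actual bill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoEModel:
    """Hypothetical container for the parameter counts quoted in the summary."""
    name: str
    total_params_b: float   # total parameters, in billions
    active_params_b: float  # parameters active per token, in billions

    @property
    def active_fraction(self) -> float:
        # Share of the model's parameters that is active for each token.
        return self.active_params_b / self.total_params_b


# Figures quoted in the summary [3].
GLM_45 = MoEModel("GLM-4.5", 355, 32)
GLM_45_AIR = MoEModel("GLM-4.5-Air", 106, 12)

# Lowest quoted API prices, in yuan per million tokens [8].
INPUT_PRICE_YUAN = 0.8
OUTPUT_PRICE_YUAN = 2.0


def api_cost_yuan(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost at the lowest quoted prices (illustrative lower bound)."""
    total = input_tokens * INPUT_PRICE_YUAN + output_tokens * OUTPUT_PRICE_YUAN
    return total / 1_000_000


if __name__ == "__main__":
    for model in (GLM_45, GLM_45_AIR):
        print(f"{model.name}: {model.active_fraction:.1%} of parameters active per token")
    # Assumed workload: 3M input tokens and 0.5M output tokens.
    print(f"Estimated cost: {api_cost_yuan(3_000_000, 500_000):.2f} yuan")
```

Under these figures, roughly 9% of GLM-4.5's parameters and about 11% of GLM-4.5-Air's are active per token, and the assumed workload costs 3.40 yuan at the lowest quoted prices.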