GLM-5 Architecture Revealed, Zhipu Up 60% in Two Days: Adopting the Same Sparse Attention as DeepSeek

Core Insights
- The next-generation flagship model GLM-5 is set to be released with a total parameter count of 745 billion, roughly double that of its predecessor GLM-4.7 [2][4]
- The architecture has been confirmed through GitHub code submissions, which reveal that GLM-5 adopts the DeepSeek-V3/V3.2 architecture, incorporating sparse attention and multi-token prediction [2][4]

Model Architecture
- GLM-5 employs DeepSeek Sparse Attention (DSA) to improve long-context efficiency while preserving output quality, using a two-phase process that first scores each token's relevance and then attends only to the most relevant ones (a toy sketch appears at the end of this digest) [5]
- The model has 78 hidden layers and follows a mixture-of-experts (MoE) design with 256 experts, 8 of which are activated per token, yielding approximately 44 billion active parameters (also sketched at the end) [5][6]
- The context window supports up to 202,000 tokens, a significant increase in capacity over previous models [5]

Market Impact
- The appearance of an anonymous model named "Pony Alpha" on the OpenRouter platform has fueled speculation that it is a test version of GLM-5, and Zhipu AI's stock price surged 60% over two days [2][4][11]
- The timing of Pony Alpha's release aligns with predictions from Zhipu's chief scientist about GLM-5's launch window, expected around mid-February 2026 [11][15]

Community Response
- The developer community has shown significant interest in Pony Alpha, noting its strong programming capabilities and performance on complex reasoning tasks, prompting discussion about its origins [9][11]
- Concerns have been raised that GLM-5 may lack multimodal capabilities, since the DeepSeek-V3 architecture is primarily text-focused [7]
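To make the two-phase DSA description above concrete, here is a minimal NumPy sketch of the general pattern: a cheap indexer first scores every cached token against the current query, and exact attention then runs only over the top-k survivors. Everything here is illustrative: the low-rank indexer projection `Wi`, all dimensions, and `k` are assumptions for the toy, not GLM-5's actual configuration.

```python
import numpy as np

def dsa_sketch(q, K, V, Wi, k=4):
    """Toy two-phase sparse attention for a single query vector.

    Phase 1: a cheap low-rank "indexer" scores every cached token.
    Phase 2: exact softmax attention runs only over the top-k tokens,
    so cost scales with k instead of the full sequence length.
    """
    # Phase 1: lightweight relevance scores via a small projection.
    idx_scores = (K @ Wi) @ (Wi.T @ q)       # shape (T,)
    top = np.argsort(idx_scores)[-k:]        # keep the k most relevant tokens

    # Phase 2: standard scaled-dot-product attention on the survivors.
    d = q.shape[-1]
    logits = K[top] @ q / np.sqrt(d)         # shape (k,)
    w = np.exp(logits - logits.max())
    w /= w.sum()                             # softmax over selected tokens only
    return w @ V[top]                        # shape (d,)

rng = np.random.default_rng(0)
T, d, d_idx = 16, 8, 2                       # toy sizes, not real model dims
q, K, V = rng.normal(size=d), rng.normal(size=(T, d)), rng.normal(size=(T, d))
Wi = rng.normal(size=(d, d_idx))             # hypothetical indexer projection
print(dsa_sketch(q, K, V, Wi, k=4))
```

The point of the two phases is that phase 1 is far cheaper than full attention, so most of a long sequence is filtered out before the expensive computation ever runs, which is what lets output quality be preserved while long-text processing gets faster.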
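The MoE figures also admit a short worked illustration: with top-8 routing over 256 experts, only a fraction of the weights run per token, which is how a 745-billion-parameter model ends up with roughly 44 billion active parameters [5][6]. The sketch below shows generic top-k gating with toy shapes; the router, expert sizes, and single-token framing are assumptions, not GLM-5's real layer.

```python
import numpy as np

def moe_sketch(x, W_router, experts, top_k=8):
    """Toy top-k mixture-of-experts layer for a single token.

    Only top_k of len(experts) experts execute, so the parameters
    touched per token are a small fraction of the total (GLM-5 is
    reported to route 8 of 256 experts, ~44B active of 745B [5][6]).
    """
    logits = W_router @ x                        # (n_experts,) routing scores
    top = np.argsort(logits)[-top_k:]            # indices of selected experts
    gate = np.exp(logits[top] - logits[top].max())
    gate /= gate.sum()                           # renormalized gate weights
    # Weighted combination of the selected experts' outputs only.
    return sum(g * (experts[i] @ x) for g, i in zip(gate, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 256                            # toy hidden size, real expert count
x = rng.normal(size=d)
W_router = rng.normal(size=(n_experts, d))
experts = rng.normal(size=(n_experts, d, d))     # each expert: a toy d x d linear map
print(moe_sketch(x, W_router, experts).shape)    # -> (8,)
```

The arithmetic is broadly consistent: 8/256 routed experts is about 3% of the expert weights, and shared components such as attention and embeddings presumably account for the rest of the reported ~44B active parameters (about 6% of the 745B total).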