HK Stock Movers | Zhipu Opens Over 7% Higher After Open-Sourcing with Huawei the First Multimodal SOTA Model Trained on Domestic Chips
Ge Long Hui· 2026-01-14 17:31
Core Viewpoint
- Zhipu (2513.HK) opened 7.1% higher at HKD 194.7, following the announcement of a collaboration with Huawei to launch the new-generation image generation model GLM-Image, the first SOTA multimodal model fully trained on domestic chips [1]

Group 1: Product Development
- GLM-Image was built on Ascend Atlas 800T A2 devices and the MindSpore AI framework, covering the entire pipeline from data to training [1]
- The model employs an innovative "autoregressive + diffusion decoder" hybrid architecture, combining image generation with language modeling [1]

Group 2: Technological Significance
- The release marks an important step for Zhipu toward the new-generation "cognitive generation" technology paradigm exemplified by Nano Banana Pro [1]
The Day After NVIDIA's H200 Export Ban Was Lifted, Zhipu and Huawei Release a Fully Domestic Open-Source Multimodal Model!
Guan Cha Zhe Wang· 2026-01-14 09:34
Core Viewpoint
- The launch of the GLM-Image model by Zhipu in collaboration with Huawei marks a significant advancement in the domestic AI landscape, demonstrating that top-tier model training no longer needs to rely on imported high-end computing power [1][16]

Group 1: Model Development and Performance
- GLM-Image is the first state-of-the-art (SOTA) multimodal model trained entirely on domestic chips, demonstrating the feasibility of training cutting-edge models on a fully domestic computing stack [1][12]
- The model employs a hybrid "autoregressive + diffusion decoder" architecture, combining image generation with language modeling [1][13]
- In performance benchmarks, GLM-Image outperforms competitors such as Qwen-Image and Z-Image, achieving top scores across metrics, including a Word Accuracy of 0.9116 and an NED of 0.9557 [6][7][8]

Group 2: Economic Impact and Market Response
- Following the announcement, Zhipu's stock surged by 18%, nearly doubling from its IPO price of HKD 116.2, with a market capitalization exceeding HKD 100 billion [5]
- At a cost of only 0.1 yuan per commercial-grade image, the model demonstrates the economic viability of domestic computing power against international standards [15]

Group 3: Technological Innovation and Training Process
- The training process was optimized through a custom-built training suite running on Huawei's Ascend Atlas 800T A2 devices and the MindSpore AI framework, ensuring end-to-end optimization from data preprocessing to large-scale pre-training [10][12]
- The architecture generates images at flexible sizes without post-processing, accommodating formats such as social media covers and movie posters [13]

Group 4: Industry Context and Future Implications
- The GLM-Image launch coincides with the U.S. lifting export restrictions on NVIDIA's H200, indicating a shift in the competitive landscape where domestic solutions are now viable alternatives [16]
- The development signals a potential turning point for China's AI industry, moving from imitation to innovation, as domestic models begin to lead in complex Chinese-language and visual generation tasks [17]
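The Word Accuracy and NED figures cited above (0.9116 and 0.9557) quantify how closely the text rendered inside a generated image matches the requested text. Exact benchmark definitions vary; the following is a minimal sketch of one common formulation, in which NED is reported as a similarity score (1.0 means an exact match):

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def ned(pred: str, ref: str) -> float:
    """Normalized edit-distance similarity: 1 - dist / max(len)."""
    if not pred and not ref:
        return 1.0
    return 1.0 - levenshtein(pred, ref) / max(len(pred), len(ref))


def word_accuracy(preds: list[str], refs: list[str]) -> float:
    """Fraction of predicted words that exactly match their reference word."""
    assert len(preds) == len(refs)
    return sum(p == r for p, r in zip(preds, refs)) / len(refs)
```

Under this formulation a score of 0.9557 means that, averaged over the benchmark, rendered strings differ from the target by under 5% of their characters.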
Zhipu Rises Over 16% After Open-Sourcing a New Model with Huawei
Xin Lang Cai Jing· 2026-01-14 02:21
Core Viewpoint
- The article highlights the launch of GLM-Image, a new open-source image generation model developed by Zhipu in collaboration with Huawei and the first SOTA multimodal model fully trained on domestic chips [1][9]

Group 1: Model Features and Innovations
- GLM-Image was built on Ascend Atlas 800T A2 devices and the MindSpore AI framework, covering the entire training pipeline from data to model [1][9]
- The model employs a hybrid architecture combining a 9-billion-parameter autoregressive model with a 7-billion-parameter DiT diffusion decoder, strengthening its ability to understand complex instructions and render text accurately [2][12]
- GLM-Image adaptively handles various resolutions, natively supporting image generation from 1024x1024 to 2048x2048 without retraining [3][12]

Group 2: Performance and Applications
- The model has reached open-source SOTA levels in text rendering, generating complex illustrations and diagrams with logical flows and textual explanations [6][15]
- In e-commerce images and multi-panel comics, GLM-Image maintains consistency of style and subject while keeping text generation highly accurate [8][17]
- Generating an image via the GLM-Image API costs only 0.1 yuan [17]

Group 3: Technological Significance
- GLM-Image represents a significant exploration of the "cognitive generation" technology paradigm, marking a shift from traditional image generation toward models that integrate world knowledge and reasoning [2][11]
- Its development validates the feasibility of training high-performance multimodal generation models on a fully domestic computing stack [17]
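The digest repeatedly cites a 0.1-yuan-per-image API price and native support for output sizes from 1024x1024 to 2048x2048. A minimal sketch of how a client might assemble such a request and budget for a batch follows; the payload field names and model identifier are illustrative assumptions, not Zhipu's documented API:

```python
def build_request(prompt: str, width: int = 1024, height: int = 1024) -> dict:
    """Assemble a JSON payload for a hypothetical GLM-Image text-to-image call.

    The field names ("model", "prompt", "width", "height") are assumptions
    for illustration; consult the real API reference before use.
    """
    # The articles state that 1024x1024 through 2048x2048 is supported natively.
    if not (1024 <= width <= 2048 and 1024 <= height <= 2048):
        raise ValueError("natively supported sizes are 1024x1024 to 2048x2048")
    return {"model": "glm-image", "prompt": prompt,
            "width": width, "height": height}


def estimated_cost_yuan(n_images: int, price_per_image: float = 0.1) -> float:
    """Budget estimate at the quoted price of 0.1 yuan per generated image."""
    return n_images * price_per_image
```

At the quoted price, a batch of 1,000 e-commerce images would cost roughly 100 yuan, which is the economic argument the coverage keeps returning to.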
Zhipu Rises Over 16% After Open-Sourcing a New Model with Huawei
Core Viewpoint
- The collaboration between Zhipu and Huawei has produced GLM-Image, a new-generation open-source image generation model and the first SOTA multimodal model trained entirely on domestic chips [2][9]

Group 1: Model Development and Features
- GLM-Image was built on Ascend Atlas 800T A2 devices and the MindSpore AI framework, covering the entire training pipeline from data to model [2][8]
- The model combines a 9B autoregressive model with a 7B DiT diffusion decoder, strengthening its ability to understand complex instructions and render text accurately [5][8]
- GLM-Image natively supports image generation at resolutions from 1024x1024 to 2048x2048 without retraining [5][8]

Group 2: Performance and Comparisons
- In authoritative text-rendering rankings, GLM-Image has reached open-source SOTA levels, outperforming models such as Seedream and Nano Banana Pro [6]
- The model excels at illustrations involving complex logical flows and textual explanations, maintaining stylistic consistency and text accuracy across formats [7]

Group 3: Economic Aspects
- Generating an image via the GLM-Image API costs only 0.1 yuan, underscoring its affordability and accessibility [8]
- The successful training of GLM-Image on domestic chips validates the feasibility of developing high-performance multimodal generation models on a fully domestic computing stack [9]
Zhipu and Huawei Open-Source Image Generation Model GLM-Image: The First SOTA Model Fully Trained on Domestic Chips
IPO早知道· 2026-01-14 01:57
Core Viewpoint
- The article discusses the development and open-sourcing of GLM-Image by Zhipu and Huawei, a significant advance in training state-of-the-art (SOTA) models on domestic chips, specifically Ascend Atlas 800T A2 devices and the MindSpore AI framework [4][10]

Group 1: Model Development and Architecture
- GLM-Image is the first SOTA model trained entirely on domestic chips, using a hybrid "autoregressive + diffusion decoder" architecture to strengthen both global instruction understanding and local detail depiction [4][5]
- The model tackles knowledge-intensive scenes such as posters and PPTs, a step toward a new generation of "knowledge + reasoning" cognitive generative models [5][10]
- The architecture integrates a 9B autoregressive model with a 7B DiT diffusion decoder, enabling GLM-Image to understand complex instructions and render text accurately [7]

Group 2: Performance Metrics
- GLM-Image achieved top performance on the CVTG-2K benchmark with a Word Accuracy of 0.9116 and a Normalized Edit Distance (NED) of 0.9557, indicating high accuracy when generating text within images [6][8]
- On the LongText-Bench benchmark, GLM-Image scored 0.952 in English and 0.979 in Chinese, leading open-source models in rendering long and multi-line text [8]

Group 3: Technical Innovations
- The model adaptively handles various resolutions, generating images from 1024x1024 to 2048x2048 without retraining [7]
- Training was optimized through techniques such as dynamic-graph multi-level pipeline dispatch and high-performance fused operators, improving both training stability and performance [10]

Group 4: Market Implications
- GLM-Image represents a deep exploration of the domestic computing ecosystem, showcasing the potential of domestic computing power for training high-performance multimodal generative models [10]
- Its open-source release aims to share technological pathways and practical insights with the community, potentially driving further advances in the field [5][10]
Zhipu and Huawei Open-Source Image Generation Model GLM-Image
Group 1
- GLM-Image is a new-generation image generation model co-developed by Zhipu and Huawei, applicable to scientific illustrations, multi-panel comics, social media graphics, commercial posters, and realistic photography [2]
- It is the first state-of-the-art (SOTA) multimodal model trained entirely on domestic chips, demonstrating the feasibility of training cutting-edge models on a fully domestic computing stack [2]
- Zhipu's model training suite optimizes the end-to-end process of data preprocessing, pre-training, SFT, and post-training using features such as dynamic-graph multi-level pipeline dispatch and high-performance fused operators [2]

Group 2
- The recent trend in image generation models, represented by Nano Banana Pro, is a deep integration of image generation with large language models, evolving from pure image generation toward cognitive generation with world knowledge and reasoning capabilities [3]
- GLM-Image employs an innovative "autoregressive + diffusion decoder" hybrid architecture, combining image generation with language modeling, at an API cost of only 0.1 yuan per image [3]
- The autoregressive stage handles semantic understanding of instructions and overall image composition, while the diffusion decoder restores high-frequency details and text strokes, addressing the common failure mode of "forgetting the words while drawing" [3]
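The division of labor described above, autoregressive planning for global composition plus a diffusion decoder for fine detail, can be caricatured numerically. The following is a toy illustration of the two-stage idea only, not the actual GLM-Image implementation (which reportedly pairs a 9B autoregressive model with a 7B DiT decoder); both "stages" here are trivial stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)


def ar_plan(n_tokens: int = 16) -> np.ndarray:
    """Stage 1 stand-in: emit a coarse 'layout' sequentially, token by token,
    each step conditioning on what was emitted so far (autoregression)."""
    tokens: list[float] = []
    for _ in range(n_tokens):
        prev = tokens[-1] if tokens else 0.0
        tokens.append(0.9 * prev + 0.1 * rng.standard_normal())
    return np.array(tokens)


def diffusion_decode(layout: np.ndarray, steps: int = 50) -> np.ndarray:
    """Stage 2 stand-in: start from pure noise and iteratively denoise
    toward the conditioning layout, the way a diffusion decoder refines
    detail under guidance from the planner's output."""
    x = rng.standard_normal(layout.shape)
    for _ in range(steps):
        x = x + 0.1 * (layout - x)  # shrink the residual a little each step
    return x


layout = ar_plan()
image_latent = diffusion_decode(layout)
```

The design point the coverage emphasizes survives even in this caricature: the sequential planner fixes *what* goes where, and the iterative refiner is responsible for *how sharply* it is rendered.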