Core Insights
- The article covers the release of the Zidong Taichu 4.0 multimodal reasoning model, developed by the Institute of Automation, Chinese Academy of Sciences, and the Wuhan Artificial Intelligence Research Institute, marking the fourth iteration since its initial launch in 2021 [1]

Group 1: Model Development
- Zidong Taichu has evolved from "pure text thinking" and "simple operations with image-based thinking" to "fine-grained multimodal semantic thinking," a significant advance toward deep multimodal reasoning [1]
- The model is designed to think actively and deeply like a human, dynamically adapting to increasingly complex tasks while providing clear, interpretable reasoning at the visual semantic level [1]

Group 2: Practical Applications
- In audio understanding, the model can execute tasks such as booking a respiratory specialist appointment through an app based on a user's described symptoms [1]
- In video understanding, it can accurately locate segments in, and summarize content from, long videos, such as those lasting 180 minutes [1]
- The model is also capable of "hands-on operations" in real-world scenarios, working with vehicles and robots [1]

Group 3: Industry Impact
- Zidong Taichu has been deployed across industries including embodied intelligence, the low-altitude economy, and smart healthcare, providing customized solutions for urban infrastructure and industry needs [1]
Domestic large model Zidong Taichu (紫东太初) 4.0 released
Xinhuanet·2025-10-14 02:40