Multimodal Deep Reasoning
Chinese-developed large model Zidong Taichu (紫东太初) 4.0 released!
Huan Qiu Wang Zi Xun· 2025-10-05 04:16
Core Insights
- The release of ZDTC 4.0 marks a significant upgrade in the deep reasoning capabilities of domestic large models, transitioning from basic text processing to advanced multimodal reasoning [1]

Group 1: Model Development
- ZDTC has undergone four iterations since its initial launch in 2021, evolving from "pure text thinking" to "fine-grained multimodal semantic thinking" [1]
- The latest version enables the model to perform complex tasks dynamically and exhibit clear, interpretable reasoning processes at the visual semantic level [1]

Group 2: Practical Applications
- The model can understand audio commands, such as scheduling a medical appointment, and can operate applications automatically based on user symptoms [1]
- In video comprehension, it can accurately locate segments and summarize content from lengthy videos, demonstrating its advanced processing capabilities [1]
- ZDTC has been implemented in various industries, including embodied intelligence, low-altitude economy, and smart healthcare, providing customized solutions for urban infrastructure and industry needs [1]
Zidong Taichu 4.0 released: deep reasoning capabilities of domestic large models upgraded again
Xin Hua She· 2025-10-05 02:27
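Zidong Taichu has reportedly been deployed across multiple industries, including embodied intelligence, the low-altitude economy, and smart healthcare, providing customized solutions for urban infrastructure and industry needs.

The Zidong Taichu 4.0 multimodal reasoning large model, developed by the Institute of Automation of the Chinese Academy of Sciences together with the Wuhan Artificial Intelligence Research Institute, was released recently. Since its first launch in 2021, Zidong Taichu has gone through four iterations, progressing from "pure text thinking" and "thinking with images through simple operations" to "fine-grained multimodal semantic thinking", and entering a new stage of multimodal deep reasoning.

Wang Jinqiao, a researcher at the Institute of Automation of the Chinese Academy of Sciences and president of the Wuhan Artificial Intelligence Research Institute, explained that "fine-grained multimodal semantic thinking" means the large model can actively think in depth as a person does: it can not only adapt dynamically to and handle more complex tasks, but also show a clear and interpretable reasoning process at the visual-semantic level, achieving "seeing, recognizing, and thinking at the same time".

"For example, in audio understanding, when a user tells Zidong Taichu 'I want to book an appointment with the respiratory department', it can automatically operate the app and select a clinic based on the symptoms; in video understanding, it can precisely locate segments in a 180-minute video and summarize its content," Wang Jinqiao said. In addition, it can "act hands-on" in real-world settings through cars, robots, and other devices.

(Source: Xinhua News Agency) ...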
Without relying on a price war, the Doubao large model breaks through on technology
Jing Ji Guan Cha Wang· 2025-06-12 13:51
Core Insights
- ByteDance's subsidiary Volcano Engine launched new AI models, including Doubao 1.6 and Seedance 1.0 pro, at the Force Original Power Conference, marking a significant step towards the Agentic AI era [1][2]
- The Doubao model has achieved a daily token usage of over 16.4 trillion, a 137-fold increase since its initial release, and holds a 46.4% market share in China's public cloud model market [1][2]
- The company emphasizes long-term investment in technology innovation to enhance industrial applications and maintain a competitive edge in the AI landscape [2][13]

Product Development
- Doubao 1.6 supports multi-modal understanding and graphical interface operations, allowing it to perform tasks such as booking hotels and organizing receipts into Excel [3][5]
- Seedance 1.0 pro can generate high-quality 1080P videos with seamless transitions, ranking first globally in video generation tasks [3][5]
- The introduction of a pricing model based on input length significantly reduces costs, making advanced AI capabilities more accessible to enterprises [5][8]

Market Positioning
- Doubao models are utilized by 9 out of the top 10 global smartphone manufacturers, 80% of mainstream automotive brands, and 70% of systemically important banks in China [2][6]
- The rapid growth in token consumption across various applications indicates a deepening integration of AI models in multiple industries, including finance, automotive, and education [4][6]

Strategic Vision
- The company aims to redefine the role of AI in business processes, transitioning from traditional software to Agent-based systems that enhance productivity [13][16]
- ByteDance's commitment to technology innovation and cost reduction reflects a balanced approach to achieving commercial success while addressing social responsibilities [14][15]

Industry Impact
- The rise of Agentic AI is seen as a pivotal moment for digital transformation across industries, with the potential to reshape business processes and industry dynamics [16]
- ByteDance's advancements in AI technology are expected to drive significant changes in how enterprises operate, enhancing efficiency and fostering innovation [16]