Zhipu's Tang Jie: 2025 May Be an Adaptation Year for Multimodal Models

Core Viewpoint
- The year 2025 may be a disappointing year for multimodal models, as many of them have not garnered significant attention and are still focused on pushing the limits of text intelligence [1]

Group 1: Multimodal Models
- Many multimodal models currently receive little attention and are primarily working on improving text intelligence [1]
- The challenge for large models lies in collecting and unifying multimodal information, which remains a shortcoming [1]

Group 2: Human Sensory Integration
- The concept of native multimodal models is compared to human sensory integration, which gathers visual, auditory, and tactile information [1]
- The next capability for models to advance is sensory integration, much as humans themselves sometimes experience sensory coordination issues [1]