Meituan Open-Sources Native Multimodal Large Model LongCat-Next
Core Viewpoint
- Meituan has launched and fully open-sourced its native multimodal model LongCat-Next, which breaks with the traditional language-centric architecture of large models by unifying images, speech, and text into a common discrete token format [1]

Group 1
- The LongCat-Next model uses a pure "next token prediction" paradigm, making visual and auditory inputs the "native language" of the AI [1]
- This release marks a significant step by the Meituan LongCat team toward AI that interacts with the physical world [1]
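The unified-token idea described above can be sketched in a few lines: each modality is quantized into IDs from one shared vocabulary, the IDs are concatenated into a single flat sequence, and the training objective is plain next-token prediction over that sequence. This is a minimal illustrative sketch only; the token ranges, tokenizer functions, and vocabulary layout below are invented for demonstration and are not LongCat-Next's actual design.

```python
# Hypothetical sketch of a unified discrete-token stream for a
# native multimodal model. All IDs and ranges are invented.

TEXT_BASE, IMAGE_BASE, AUDIO_BASE = 0, 10_000, 20_000

def tokenize_text(words):
    # Toy text tokenizer: map each word deterministically into the
    # text ID range (a real model would use a learned BPE vocabulary).
    return [TEXT_BASE + (sum(ord(c) for c in w) % 10_000) for w in words]

def tokenize_image(patches):
    # Toy "visual tokenizer": each image patch is quantized to a
    # codebook ID in the image range (stand-in for a VQ encoder).
    return [IMAGE_BASE + (p % 10_000) for p in patches]

def tokenize_audio(frames):
    # Toy "audio tokenizer": each speech frame becomes a discrete ID.
    return [AUDIO_BASE + (f % 10_000) for f in frames]

def build_sequence(words, patches, frames):
    # All modalities collapse into one flat token sequence; the model
    # sees no architectural distinction between image, audio, and text.
    return tokenize_image(patches) + tokenize_audio(frames) + tokenize_text(words)

seq = build_sequence(["describe", "the", "scene"], [3, 17, 42], [5, 9])

# Next-token prediction objective: for every position t, the model is
# trained to predict seq[t+1] given the prefix seq[:t+1] — regardless
# of which modality either token came from.
pairs = [(seq[:t + 1], seq[t + 1]) for t in range(len(seq) - 1)]
```

Because every modality lives in the same vocabulary, a single autoregressive transformer can consume and emit any mix of them, which is what lets visual and auditory inputs act as a "native language" rather than bolt-on encoders.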