Meituan Open-Sources Its Self-Developed 560B Large Model, With Performance Rivaling DeepSeek
Guan Cha Zhe Wang·2025-09-08 02:46

Core Insights
- Meituan has launched LongCat-Flash-Chat, a 560-billion-parameter mixture-of-experts (MoE) model, signaling its aggressive push into the AI sector [1]
- The model was trained on 20 trillion tokens in 30 days, and its inference cost of roughly $0.7 per million output tokens makes it competitive with top industry models [1][4]

Group 1: Architectural Innovations
- The LongCat model uses a "Zero-Computation Experts" mechanism to allocate compute dynamically: low-information tokens are routed to experts that pass them through without expensive FFN computation (see the first sketch below) [2]
- Although the model has 560 billion parameters in total, it activates only 18.6 billion to 31.3 billion per token, roughly 3% to 6% of the total, striking a balance between cost and capability [2]
- The "Shortcut-connected MoE" design lets computation and communication run in parallel, significantly raising throughput during both training and inference (see the second sketch below) [3]

Group 2: Performance Metrics
- The model reaches a single-card inference speed of over 100 tokens per second and supports contexts of up to 128k tokens, combining high performance with low operating cost (a quick cost estimate follows the sketches below) [4]
- LongCat's inference speed surpasses that of other mainstream models such as DeepSeek, Kimi, and Qwen3, and it demonstrates strong agent capabilities on complex tasks [4]
- The model matches or exceeds leading models on a range of tasks, including generating professional-grade code and producing technical and legal risk assessments [4]
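
To make the zero-computation idea concrete, here is a minimal PyTorch sketch of an MoE layer in which some experts are plain identity functions, so tokens routed to them skip the FFN entirely. All names (ZeroComputeMoE, n_zero_experts, and so on) are hypothetical, and the top-1 routing is a simplification; this is a sketch of the general technique under stated assumptions, not Meituan's implementation.

```python
import torch
import torch.nn as nn

class ZeroComputeMoE(nn.Module):
    """Toy MoE layer in which some experts are 'zero-computation'
    identity functions: tokens routed to them are returned unchanged,
    so they consume no FFN compute."""

    def __init__(self, d_model, n_ffn_experts, n_zero_experts, d_hidden):
        super().__init__()
        self.n_ffn = n_ffn_experts
        # The router scores every token against FFN experts *and* identity experts.
        self.router = nn.Linear(d_model, n_ffn_experts + n_zero_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(n_ffn_experts)
        )

    def forward(self, x):                     # x: (n_tokens, d_model)
        top1 = self.router(x).argmax(dim=-1)  # top-1 routing for simplicity
        out = x.clone()                       # identity experts: pass-through
        for e in range(self.n_ffn):           # only real experts do work
            mask = top1 == e
            if mask.any():
                out[mask] = self.experts[e](x[mask])
        return out

layer = ZeroComputeMoE(d_model=16, n_ffn_experts=4, n_zero_experts=2, d_hidden=64)
print(layer(torch.randn(8, 16)).shape)        # torch.Size([8, 16])
```

Because the identity experts sit in the same routing space as the real experts, the router itself decides, per token, how much computation to spend, which is what lets the activated parameter count vary from token to token.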
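The benefit of the shortcut connection comes from overlapping the expert all-to-all communication with dense-path computation instead of running the two steps back-to-back. The second sketch below illustrates this scheduling idea with Python threads standing in for CUDA streams and collective communication; every name is hypothetical and the sleeps merely simulate latency, so this shows the scheduling pattern only, not a real distributed MoE.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def all_to_all_dispatch(tokens):
    """Stand-in for the expert-parallel communication step."""
    time.sleep(0.05)                 # simulate network latency
    return tokens

def dense_ffn(tokens):
    """Stand-in for the dense-path computation of the block."""
    time.sleep(0.05)                 # simulate device compute
    return [t * 2 for t in tokens]

def shortcut_moe_block(tokens):
    # Overlap: launch the expert dispatch (communication) while the
    # dense path computes, then join, instead of running sequentially.
    with ThreadPoolExecutor(max_workers=1) as pool:
        dispatch = pool.submit(all_to_all_dispatch, tokens)  # comm
        dense_out = dense_ffn(tokens)                        # compute
        routed = dispatch.result()                           # join
    return dense_out, routed

start = time.perf_counter()
shortcut_moe_block(list(range(4)))
print(f"elapsed: {time.perf_counter() - start:.3f}s")  # ~0.05s, not ~0.10s
```

The printed elapsed time is roughly one step's latency rather than two, which is the same effect the shortcut connection aims for at training and inference scale.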
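Finally, taking the reported figures at face value (about 100 tokens per second on a single card and $0.7 per million output tokens), a back-of-the-envelope estimate of generation time and cost looks like this; the function name is hypothetical and the numbers come straight from the article:

```python
TOKENS_PER_SEC = 100        # reported single-card decode speed
COST_PER_M_TOKENS = 0.70    # reported $ per million output tokens

def generation_estimate(n_tokens):
    """Rough wall-clock time (s) and cost ($) to generate n_tokens."""
    seconds = n_tokens / TOKENS_PER_SEC
    cost = n_tokens / 1_000_000 * COST_PER_M_TOKENS
    return seconds, cost

# e.g. an output the size of the full 128k-token context window:
secs, usd = generation_estimate(128_000)
print(f"{secs:.0f} s, ${usd:.4f}")   # 1280 s, $0.0896
```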