News Flash | Meta Releases Llama 4, Its First Models Built on a Mixture-of-Experts Architecture, but Not True Reasoning Models

Core Insights
- Meta has released a new series of AI models called Llama 4, comprising Llama 4 Scout, Llama 4 Maverick, and Llama 4 Behemoth. All three were trained on large amounts of unlabelled text, image, and video data to improve their visual understanding [1][3]
- Development of the Llama models accelerated after the success of open-source models from China's DeepSeek, which prompted Meta to set up a war room to analyze how DeepSeek reduced the cost of running and deploying models [1][2]
- Llama 4 marks a new era for the Llama ecosystem: these models use a mixture-of-experts (MoE) architecture for improved computational efficiency [3]

Model Performance and Capabilities
- According to internal testing, Maverick excels in general assistant and chat scenarios, outperforming OpenAI's GPT-4o and Google's Gemini 2.0 on various benchmarks, although it still lags behind more advanced models such as Google’s Gemini 2.5 Pro and Anthropic’s Claude 3.7 Sonnet [4]
- Scout is particularly strong at document summarization and reasoning over large codebases. Its context window of 10 million tokens lets it handle extremely long documents [4]
- Behemoth, which is still in training, has 288 billion active parameters and is expected to require more powerful hardware. It surpasses GPT-4.5 and Claude 3.7 Sonnet in STEM skill evaluations [5]

Licensing and Regulatory Considerations
- Developers may raise concerns about the Llama 4 license: users and companies registered in the EU are prohibited from using or distributing these models, likely because of the region's AI and data privacy laws [2]
- Companies with more than 700 million monthly active users must apply to Meta for special permission to use the models, and Meta has sole discretion over whether to grant it [2]
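
To illustrate why the mixture-of-experts design noted under Core Insights can save compute, here is a minimal sketch of generic top-k expert routing. It is not Meta's implementation: the source gives no routing details for Llama 4, so the function names (`moe_forward`, `make_expert`), the toy experts, and the choice of `top_k=2` are all assumptions made for illustration. The idea shown is only that a router scores every expert but runs just a few of them per input.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, router_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts (hypothetical sketch).

    router_weights: one weight row per expert; score = dot(row, x).
    experts: list of callables mapping a vector to a vector of the same size.
    Returns the gate-weighted sum of the selected experts' outputs and
    the indices of the experts that were actually evaluated.
    """
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    # Only the top_k experts are evaluated; the rest cost no compute.
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    gates = softmax([scores[i] for i in top])
    out = [0.0] * len(x)
    for gate, idx in zip(gates, top):
        y = experts[idx](x)
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, top


def make_expert(scale):
    """Toy expert (assumption): scales its input by a constant."""
    return lambda x: [scale * xi for xi in x]


random.seed(0)
dim, num_experts = 4, 8
router_weights = [[random.uniform(-1, 1) for _ in range(dim)]
                  for _ in range(num_experts)]
experts = [make_expert(float(i + 1)) for i in range(num_experts)]

x = [0.5, -1.0, 2.0, 0.1]
output, chosen = moe_forward(x, router_weights, experts, top_k=2)
print(f"experts evaluated: {len(chosen)} of {num_experts}")
```

In this toy setup only 2 of the 8 experts run for a given input, which is the sense in which a model's active parameter count can be much smaller than its total parameter count.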