Core Viewpoint
- Mistral AI has launched the Mistral 3 series of open models, positioned as high-performance, cost-effective alternatives in the AI model landscape, particularly in response to competition from DeepSeek [2][4][28].

Model Details
- The series includes three small dense models, Ministral 3 (14B, 8B, 3B), each released in base, instruction-tuned, and reasoning versions [5][19].
- Mistral Large 3, the flagship open model, is a sparse mixture-of-experts model with 675 billion total parameters, of which 41 billion are active, trained on 3,000 NVIDIA H200 GPUs [7][5].

Performance and Benchmarking
- Mistral Large 3 ranks second in the open-source non-reasoning model category on the LMArena leaderboard, placing it among the best-performing open models available [14].
- The model performs strongly on general prompts and excels at image understanding and multilingual dialogue [7][14].

Collaboration and Optimization
- Mistral has partnered with vLLM and Red Hat to make Mistral Large 3 more accessible and efficient for developers, using optimized checkpoints for better performance [17][18].
- The collaboration with NVIDIA focuses on advanced optimization techniques, so that Mistral models can exploit high-bandwidth memory for demanding workloads [17][18].

Cost-Effectiveness
- Mistral claims that its models offer the best cost-performance ratio among open-source models, with instruction models matching or outperforming competitors while generating significantly fewer tokens [22][28].

Availability and Customization
- Mistral 3 models are available on various platforms, including Mistral AI Studio, Amazon Bedrock, and Azure Foundry, among others [25].
- The company also offers custom model training services to organizations seeking AI solutions tailored to specific tasks or environments [27].
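
To put the Mistral Large 3 parameter counts in perspective, the short sketch below works out what share of the 675 billion parameters is active per token, and roughly how much memory the weights alone would occupy at a few storage precisions. The function names and the choice of 16, 8, and 4 bits per parameter are illustrative assumptions, not figures from Mistral. The estimate covers weights only and ignores the KV cache, activations, and runtime overhead.

```python
# Figures quoted in the summary above for Mistral Large 3.
TOTAL_PARAMS = 675e9    # total parameters
ACTIVE_PARAMS = 41e9    # parameters active per token


def active_fraction(total: float, active: float) -> float:
    """Share of the model's parameters that is active for each token."""
    return active / total


def weight_memory_gb(params: float, bits_per_param: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    share = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"Active share per token: {share:.1%}")
    # Bit widths below are assumed for illustration only.
    for bits in (16, 8, 4):
        gb = weight_memory_gb(TOTAL_PARAMS, bits)
        print(f"{bits}-bit weights: about {gb:,.1f} GB")
```

Only about 6% of the parameters take part in computing each token, but all 675 billion still have to be stored, which is why per-token compute and memory footprint scale so differently for a model of this kind.
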
Just In: "Europe's DeepSeek" Releases the Mistral 3 Model Series, with the Entire Lineup Returning to Apache 2.0
机器之心 (Synced) · 2025-12-03 00:06