Two Months On, Mistral AI Finally Releases Medium 3, with "One More Thing" Coming Soon
机器之心 (Jiqizhixin) · 2025-05-08 05:51

Core Insights
- Mistral AI has launched Mistral Medium 3, a new language model aimed at better efficiency and usability, which reportedly outperforms GPT-4o and Claude 3.7 Sonnet on key benchmarks [2][4].

Model Performance
- Mistral Medium 3 sits between lightweight and large-scale models. It reaches over 90% of Claude 3.7 Sonnet's performance at about one-eighth of the price: $0.4 per million input tokens and $2 per million output tokens [2].
- In coding benchmarks the model performs strongly, scoring 92.1% on HumanEval, comparable to Claude 3.7 Sonnet and GPT-4o [5][6].
- Across programming tasks, it outperforms Llama 4 Maverick in 82% of scenarios and Command-A in nearly 70% of cases [7].

Multimodal and Language Capabilities
- Mistral Medium 3 is competitive across multiple languages, with reported win rates against Llama 4 Maverick of 67% in English, 71% in French, 73% in Spanish, and 65% in Arabic [8].
- On multimodal tasks the model scores 0.953 on DocVQA, 0.937 on AI2D, and 0.826 on ChartQA [8].

Enterprise Solutions
- Mistral AI has introduced Le Chat Enterprise, a chatbot service for businesses that delivers AI capabilities in a privacy-focused environment [10][12].
- Le Chat Enterprise supports deep customization and integration with third-party services such as Gmail, Google Drive, and SharePoint, and will soon adopt the Model Context Protocol (MCP) standard for connecting AI assistants to data systems [13].
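
To make the pricing quoted under Model Performance concrete, here is a minimal sketch that turns the two per-token rates ($0.4 per million input tokens, $2 per million output tokens) into a cost estimate for a workload. The function name `estimate_cost_usd` and the sample token counts are hypothetical and chosen for illustration only. This is not part of any Mistral SDK, and actual billing may differ.

```python
from decimal import Decimal

# Prices quoted in the article, in USD per million tokens.
INPUT_USD_PER_MILLION = Decimal("0.4")
OUTPUT_USD_PER_MILLION = Decimal("2")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of a workload (hypothetical helper, not an SDK call)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Each side is billed at its own rate, so compute them separately.
    input_cost = INPUT_USD_PER_MILLION * input_tokens / TOKENS_PER_MILLION
    output_cost = OUTPUT_USD_PER_MILLION * output_tokens / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 3M input tokens and 0.5M output tokens.
    print(estimate_cost_usd(3_000_000, 500_000))
```

Output tokens cost five times as much as input tokens at these rates, so workloads that generate long responses will see their bill dominated by the output side. `Decimal` is used here instead of floats to avoid rounding noise in small currency amounts.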