Core Insights
- Alibaba has officially launched its flagship reasoning model, Qwen3-Max-Thinking. With over 1 trillion parameters and 36 trillion tokens of pre-training data, it is the largest and most capable model in the Qwen series [1][2]
- The model set new state-of-the-art (SOTA) records on 19 recognized benchmarks, with performance comparable to leading models such as GPT-5.2-Thinking, Claude Opus 4.5, and Gemini 3 Pro [1][4]

Model Capabilities
- Qwen3-Max-Thinking introduces adaptive tool invocation: the model autonomously selects and uses built-in tools such as search, memory, and a code interpreter, without the user having to choose them manually [3]
- The model employs a test-time scaling technique that cuts the compute wasted on repeated reasoning by focusing on unresolved uncertainties, which improves context utilization efficiency [3]

Performance Metrics
- On C-Eval, an authoritative Chinese-language evaluation, Qwen3-Max-Thinking scored 93.7, ranking first globally and outperforming foreign models at understanding complex Chinese contexts [6]
- The model scored 90.2 on the Arena-Hard v2 adversarial interaction test, well ahead of GPT-5.2's 85.3 and Gemini 3 Pro's 81.7, showing its ability to pick up on user subtleties and give human-like responses [6]
- On the agentic tool-use and search test HLE (w/tools), Qwen3-Max-Thinking scored 49.8, surpassing GPT-5.2-Thinking and demonstrating its ability to solve problems autonomously [6]

Application Integration
- The Qwen app is set to integrate the new model so that all users can try it, with over 400 AI service functions being launched [7]
- The app is already connected to Alibaba's ecosystem, enabling functions such as food delivery, shopping, and flight booking [7]
- The "Task Assistant" feature, which includes multi-step planning across various applications, has entered testing, with broader user access planned after the test period [8]

Future Developments
- Alibaba plans to expand its AI capabilities globally through an overseas version, with significant investments in AI infrastructure anticipated by 2025 [9]
- The company is mobilizing over a hundred developers to support the project, reflecting its commitment to both service development and the underlying technology infrastructure [9]
Alibaba releases its most powerful Qwen model, ranking first globally on multiple benchmarks