Core Insights
- Alibaba has launched its flagship reasoning model, Qwen3-Max-Thinking, the largest and most powerful model in the Qwen series, with over 1 trillion parameters and 36 trillion tokens of pre-training data [1][6].

Model Performance
- The new model delivers significant performance gains and sets global records on key benchmarks for scientific knowledge, mathematical reasoning, and coding [3].
- Qwen3-Max-Thinking uses a new test-time scaling mechanism that improves reasoning performance at lower cost than traditional approaches, which often produce redundant reasoning [4].

Reasoning Efficiency
- The reasoning technique performs "experience extraction" on the results of earlier rounds, so the model can iterate on its own output over multiple rounds within the same context more efficiently and reach better conclusions [4].
- On the Humanity's Last Exam (HLE) benchmark, Qwen3-Max-Thinking scored 58.3, ahead of GPT-5.2-Thinking at 45.5 and Gemini 3 Pro at 45.8, the highest score among all models [4].

Tool Utilization
- Qwen3-Max-Thinking has significantly stronger native agent capabilities: it calls tools autonomously and combines them as needed for different tasks [5].
- The model was trained with joint reinforcement learning that combines rule-based rewards and model-based rewards, which improves its ability to reason while using tools [5].

Accessibility
- Developers can try Qwen3-Max-Thinking for free on QwenChat, and enterprises can access the new model's API services through Alibaba Cloud [7].
- The Qwen app is set to integrate the new model, giving all users free access to the strongest model in the series [7].
Qwen's Most Powerful Model Is Here! Multiple Performance Records Broken Worldwide