Perfect Score on AIME'25! Qwen Ships Seven Releases in One Wave, Updating Its Entire Model Family

Core Viewpoint
- The new flagship model Qwen3-Max has achieved a perfect score of 100 on the AIME25 and HMMT mathematics benchmarks, a significant milestone for Chinese domestic large models [1][5].

Group 1: Model Performance
- Qwen3-Max keeps a parameter scale exceeding one trillion, with improvements in both emotional and cognitive intelligence [3][4].
- The instruction version scored 69.6 on the SWE-Bench evaluation, placing it in the global top tier [6].
- On the Tau2 Bench test, Qwen3-Max scored 74.8, surpassing Claude Opus 4 and DeepSeek V3.1 [7].

Group 2: Visual Understanding Model
- The visual understanding model Qwen3-VL has been open-sourced and performs strongly on mainstream visual perception evaluations, even exceeding Gemini 2.5 Pro [12][16].
- Qwen3-VL supports tasks such as generating HTML and CSS from sketches and identifying objects in images [20][23].

Group 3: Technical Innovations
- Qwen3-VL employs a new MRoPE-Interleave design that distributes temporal information better, enhancing long-video understanding while maintaining image comprehension [31].
- The model integrates DeepStack to improve visual detail capture and text-image alignment, significantly boosting performance across a range of visual understanding tasks [32].

Group 4: Multi-Modal Capabilities
- Qwen3-Omni, described as the first end-to-end multi-modal AI model, has been introduced and achieves state-of-the-art performance across 22 audio-visual benchmarks [33].
- The Qwen3-LiveTranslate model offers real-time translation in 18 languages, demonstrating its versatility in audio-visual tasks [36][37].

Group 5: Future Directions
- The company aims to develop artificial superintelligence (ASI) through a four-stage process, and expects large models to become the next generation of operating systems [62][63].
- The newly released Qwen3-Next model architecture has approximately 80 billion parameters and significantly improves computational efficiency while reducing training cost [68][69].