Native Multimodal Training
Qwen3.5 Open-Sourced on Lunar New Year's Eve! Try It Free in the Qwen App
Xin Lang Cai Jing· 2026-02-16 13:00
Core Insights
- Alibaba has launched its new-generation large model Qwen3.5-Plus, which reportedly rivals Gemini 3 Pro in performance. The model has 397 billion total parameters, of which only 17 billion are activated, and it delivers stronger performance with lower memory usage and significantly higher inference efficiency [1][3]

Group 1: Model Performance and Features
- Qwen3.5-Plus is a major step up in performance over the previous Qwen3-Max model, with inference throughput improved by up to 19 times [1][3]
- The model has moved from a pure text model to a native multimodal model, pre-trained on mixed visual and text tokens, which strengthens its reasoning and world knowledge [1][2]
- In a range of authoritative evaluations, Qwen3.5 leads in multimodal reasoning, visual question answering, and video understanding, outperforming earlier specialized models [2]

Group 2: Technical Innovations
- The performance gains of Qwen3.5 are attributed to significant changes to the classic Transformer architecture, including the integration of gating technology and a hybrid architecture that combines linear attention with a sparse mixture of experts (MoE) [3]
- The model's training efficiency has been enhanced through advanced techniques, with throughput increasing 8.6 times at common context lengths and up to 19 times at ultra-long context lengths [3][5]

Group 3: Application and Market Impact
- Qwen3.5's multimodal training ran efficiently on Alibaba Cloud's AI infrastructure, significantly lowering the barrier to native multimodal training [5]
- The model has been integrated into the Qwen App and the PC client, where it can carry out tasks autonomously on mobile and desktop platforms, improving operational efficiency [6]
- The Qwen App executed 120 million orders in just six days during the Spring Festival, a significant milestone in real-world task execution and commercialization [6]

Group 4: Future Developments
- Alibaba plans to keep releasing Qwen3.5 series models in various sizes and functionalities, with a more powerful flagship model, Qwen3.5-Max, to be launched soon [7]
- More than 400 models in the Qwen family have been open-sourced since 2023, with global downloads exceeding 1 billion, indicating strong developer interest and engagement [6][7]
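To make the sparse-activation figures above concrete, here is a minimal Python sketch. The 397 billion and 17 billion parameter counts come from the summary; everything else (the function names, the softmax top-k router, the expert count, and the value of k) is a hypothetical illustration of how a generic sparse MoE layer selects experts, not a description of Qwen3.5's actual routing, which the article does not specify.

```python
import math

TOTAL_PARAMS_B = 397   # total parameters in billions (from the summary)
ACTIVE_PARAMS_B = 17   # activated parameters in billions (from the summary)


def active_fraction(total_b, active_b):
    """Share of the model's parameters that are activated."""
    return active_b / total_b


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def route_top_k(router_scores, k):
    """Hypothetical toy router: keep the k highest-probability experts
    and renormalise their weights so they sum to 1."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    kept = sum(probs[i] for i in top)
    return {i: probs[i] / kept for i in top}


def moe_output(token, experts, router_scores, k):
    """Weighted sum of the selected experts' outputs; unselected experts
    are never evaluated, which is where the compute saving comes from."""
    weights = route_top_k(router_scores, k)
    return sum(w * experts[i](token) for i, w in weights.items())


# Four toy "experts" that just scale their input (illustrative only).
toy_experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
toy_scores = [0.1, 2.0, -1.0, 1.5]

print(f"activated share: {active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B):.1%}")
print("selected experts:", sorted(route_top_k(toy_scores, k=2)))
print("layer output:", round(moe_output(1.0, toy_experts, toy_scores, k=2), 3))
```

With the reported figures, roughly 4.3% of the parameters are activated at a time. That is consistent with the summary's claim of higher inference efficiency, although the sketch says nothing about how the reported throughput numbers were measured.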
Qwen3.5 Open-Sourced on Lunar New Year's Eve!
Shang Hai Zheng Quan Bao· 2026-02-16 11:08
Core Insights
- Alibaba has launched its new-generation model Qwen3.5-Plus, which performs comparably to Gemini 3 Pro, and plans to release Qwen3.5 series models in various sizes and functionalities soon [2][6]
- Qwen3.5 is a significant leap over previous versions, moving from a pure text model to a native multimodal model and strengthening its reasoning and knowledge acquisition [4][8]

Performance Metrics
- Qwen3.5 scored 87.8 on the MMLU-Pro knowledge reasoning evaluation, surpassing GPT-5.2, and 88.4 on the GPQA assessment, exceeding Claude 4.5 [4]
- On the IFBench instruction-following evaluation, Qwen3.5 set a record score of 76.5, outperforming all other models [4]
- On other benchmarks, including BFCL-V4 and Browsecomp, the model also exceeded Gemini 3 Pro and GPT-5.2 [4]

Technical Innovations
- Qwen3.5 has 397 billion total parameters, of which only 17 billion are activated, achieving high efficiency while reducing deployment memory usage by 60% [6][8]
- Innovations in the Transformer architecture, including self-developed gating technology and a hybrid architecture combining linear attention with a sparse mixture of experts (MoE), contribute to the model's efficiency [8][10]

Multimodal Capabilities
- Qwen3.5 has made significant advances in visual capabilities, excelling in evaluations such as MathVision, RealWorldQA, and CC_OCR [6]
- The model accepts direct input of videos up to 2 hours long, improving its ability to analyze and summarize long video content [6]

Market Impact
- The Qwen3.5-Plus API is priced at 0.8 yuan per million tokens, only 1/18 of the cost of Gemini 3 Pro [6]
- Since first open-sourcing Qwen, Alibaba has released over 400 Qwen models with more than 1 billion downloads globally, and monthly download volume surpasses that of the next seven competitors combined [12]
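The pricing claim above can be turned into a quick back-of-the-envelope comparison. The sketch below uses only the two figures the summary gives (0.8 yuan per million tokens and the 1/18 ratio). The 50-million-token workload is a hypothetical example, and the summary does not say whether the quoted price applies to input tokens, output tokens, or a blend, so the result should be read as a rough illustration rather than a billing estimate.

```python
QWEN_PRICE_YUAN_PER_M = 0.8   # yuan per million tokens (from the summary)
PRICE_RATIO = 18              # Qwen price is stated as 1/18 of Gemini 3 Pro


def implied_reference_price(price_per_million, ratio):
    """Price per million tokens implied for the comparison model."""
    return price_per_million * ratio


def cost_yuan(tokens, price_per_million):
    """Cost in yuan for a given number of tokens at a flat per-million rate."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical workload size, chosen only for illustration.
workload_tokens = 50_000_000

reference_price = implied_reference_price(QWEN_PRICE_YUAN_PER_M, PRICE_RATIO)
qwen_cost = cost_yuan(workload_tokens, QWEN_PRICE_YUAN_PER_M)
reference_cost = cost_yuan(workload_tokens, reference_price)

print(f"implied reference price: {reference_price:.1f} yuan per million tokens")
print(f"Qwen3.5-Plus cost: {qwen_cost:.1f} yuan")
print(f"reference cost: {reference_cost:.1f} yuan")
```

Under these assumptions the implied comparison price is about 14.4 yuan per million tokens, and the hypothetical workload costs about 40 yuan versus about 720 yuan.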