Core Insights
- The article covers the latest upgrade to Qwen's deep research capabilities, which adds auditory and visual outputs: research results can now be rendered as web pages and audio content [1][2].

Group 1: New Features and Functionalities
- The deep research tool can now convert lengthy text reports into audio podcasts, making the information easier to consume in spare moments [3].
- Unlike the previously popular NotebookLM, the deep research tool does not require users to supply source content to the AI themselves, streamlining the input process [4].
- The latest visual language model, Qwen3-VL, can even recognize hard-to-read handwriting such as doctors' notes, showing a marked improvement in model capability [7].

Group 2: User Interaction and Experience
- When the deep research feature is activated, the system defaults to the most powerful model, Qwen3-Max, and first confirms the user's specific intent before proceeding [9][10].
- A full run takes roughly six minutes and produces both a conventional AI text response and a downloadable PDF file [12][15].

Group 3: Performance Metrics and Comparisons
- The Qwen3-VL series has been updated with versions ranging from 2 billion to 32 billion parameters; the team indicates this is the final update for the series [28][29].
- Evaluations show the 32-billion-parameter version surpasses the previous Qwen2.5-VL 72-billion-parameter model and competes favorably with closed-source offerings from OpenAI and Anthropic [30].

Group 4: Deployment and Accessibility
- Users can generate a clean, visually appealing web page with dynamic effects, including a day/night mode, to enhance the presentation of AI-generated research results [19][20].
- A deployment feature lets users publish the generated web content either publicly or privately, offering flexibility in how information is shared [22].
Qwen Deep Research Upgraded Overnight: It Can Generate Web Pages and Audio Podcasts, and the New Model Can Read Doctors' Handwriting
量子位 (QbitAI) · 2025-10-22 05:48