Qwen2.5
Don't just train directly: give the main model a "mistake notebook" and a 6B model easily beats an 8B one
36Kr· 2025-12-25 07:05
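[Introduction] Conventional training looks only at whether a model's output is right or wrong. A recent study adds a "mistake notebook" to large-model training: it records the model's internal reasoning state at the moment it errs, including the question, the reasoning process, and the position of the error, which is closer to how humans learn through reflection. An auxiliary model that learns from these notebooks can correct the main model's predictions in real time and improve performance. Looking back on their own schooling, many people find that their ability truly leapt forward not when they worked through the most practice problems, but when they began keeping a systematic mistake notebook. The point is not to copy down the wrong answers but to keep asking: why did I think that at the time? Which step of my judgment went wrong? Is this error a one-off, or a recurring pattern of thought? Through this kind of reflective learning, people gradually learn to recognize their own "error patterns" and become more robust when facing complex and uncertain problems. That raises a question: do large language models have a mistake notebook of their own? In the current mainstream training paradigm, a large model's learning process is reduced to a simple loop. In essence, that loop emphasizes "how to better fit the correct answer." The model only needs to know whether the result is right; it does not really care about the internal reasoning path that led it to the wrong conclusion. This reveals a key gap: today's large models lack neither data nor compute. What they lack is a human-like capacity for deep reflection, that is, structured review centered on the errors themselves. University of Illinois Urbana-Champaign, Princeton ...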
HKU-led DrivePI: a spatially intelligent 4D MLLM that unifies understanding, perception, prediction, and planning for autonomous driving
自动驾驶之心· 2025-12-22 09:20
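Editor | 自动驾驶之心 Paper authors | Zhe Liu et al. Although multimodal large language models (MLLMs) have shown strong capabilities across many domains, their use for generating fine-grained 3D perception and prediction outputs in autonomous driving remains underexplored. This paper proposes DrivePI, a new spatially aware 4D MLLM that serves as a unified vision-language-action (VLA) framework while remaining compatible with vision-action (VA) models. Through end-to-end optimization, our method performs spatial understanding, 3D perception (e.g., 3D occupancy voxels), prediction (e.g., occupancy flow), and planning (e.g., action outputs) in parallel. To capture both precise geometric information and rich visual appearance, it integrates point clouds, multi-view images, and language instructions in a single MLLM architecture. We also developed a data engine that generates text-occupancy and text-flow question-answer pairs for 4D spatial understanding. Notably, using only a 0.5B-parameter Qwen2.5 model as the MLLM backbone, DrivePI, as a single unified model, already matches or surpasses existing VLA models and specialized VA models. Specifically, compared with VLA models, DrivePI's average accuracy on nuScenes-QA is ...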
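Puyuan Information: the company's products and Alibaba Cloud's private cloud products have passed product ecosystem integration certification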
Zheng Quan Ri Bao Wang· 2025-11-26 13:41
Core Insights
- The company, Puyuan Information, confirmed on November 26 that its products have achieved integration certification with Alibaba Cloud's private cloud products [1]
- Currently, the company's products are connected to open-source models such as Qwen2.5, Qwen3.0, and QwQ-32B [1]
Puyuan Information: the company's products have so far been connected to open-source models including Qwen2.5, Qwen3.0, and QwQ-32B
Ge Long Hui· 2025-11-26 09:41
Core Viewpoint
- The company has integrated its products with Alibaba Cloud's private cloud products through product ecosystem integration certification [1]

Group 1
- The company's products are now connected to open-source models Qwen2.5, Qwen3.0, and QwQ-32B [1]
Puyuan Information: the company's products have been connected to open-source models including Qwen2.5, Qwen3.0, and QwQ-32B
Mei Ri Jing Ji Xin Wen· 2025-11-26 09:41
Group 1
- The core viewpoint of the article highlights the collaboration between Puyuan Information and Alibaba's ecosystem, specifically regarding product integration and certification [2].
- Puyuan Information confirmed that its products have achieved integration certification with Alibaba Cloud's proprietary cloud products [2].
- As of now, Puyuan Information's products have been integrated with open-source models such as Qwen2.5, Qwen3.0, and QwQ-32B [2].
Puyuan Information (688118.SH): the company's products have so far been connected to open-source models including Qwen2.5, Qwen3.0, and QwQ-32B
Ge Long Hui· 2025-11-26 09:40
Core Viewpoint
- The company has integrated its products with Alibaba Cloud's private cloud products through product ecosystem integration certification [1]

Group 1
- The company's products are now connected to open-source models Qwen2.5, Qwen3.0, and QwQ-32B [1]
Taobao is finally overhauling search
虎嗅APP· 2025-11-11 23:53
Core Viewpoint
- The article discusses the rapid evolution of AI in the e-commerce sector, particularly focusing on Alibaba's Taobao platform, highlighting the integration of AI tools to enhance user experience and operational efficiency during the 2024 Double Eleven shopping festival [4][11][30].

Group 1: AI Integration in E-commerce
- Taobao's AI tools, such as "Xiao Wan Assistant," have significantly improved sales performance, with some brands reporting over 35% increase in orders after implementing AI-driven strategies [4][11].
- The establishment of the "Search and Promotion Intelligent Product Division" under the leadership of Zhang Kaifu marks a strategic shift towards AI-driven enhancements in Taobao's search and recommendation systems [7][12].
- The urgency for improving user experience in search functionalities was driven by customer feedback on social media, indicating a need for immediate action to address dissatisfaction [8][10].

Group 2: Challenges and Strategic Focus
- The search experience issues stem from 22 years of accumulated complexities in Taobao's search engine, necessitating a comprehensive upgrade involving business, technology, and supply chain collaboration [9][19].
- The team identified three key focus areas for AI evolution by 2025: upgrading search and promotion systems, enhancing efficiency for merchants, and launching new AI-driven shopping products for consumers [16][21].
- The transition to AI-driven systems requires a complete overhaul of the existing product database to ensure compatibility with AI technologies, which is a significant undertaking [20][21].

Group 3: Organizational Changes and Talent Development
- The internal structure has shifted to support AI initiatives, with a focus on creating flexible project teams that can innovate without being constrained by traditional metrics [24][25].
- A significant recruitment drive has targeted young talent from local universities, emphasizing the importance of nurturing creativity and technical skills in the AI domain [26][27].
- The company has implemented a systematic training approach for new hires, ensuring they are equipped with the necessary skills to contribute effectively to AI projects [27].

Group 4: Performance Metrics and Future Outlook
- As of November 8, 2024, the AI-driven search and promotion capabilities have led to a 12% increase in advertising ROI and a 20% improvement in search relevance under complex queries [29][30].
- Despite initial successes, challenges remain in educating traditional merchants about AI tools, indicating a need for ongoing support and training [31].
- The company views AI as a long-term strategic focus, with plans for increased investment and further development of AI capabilities in the coming years [32][33].
New work from Tsinghua's Tang Jie: can large models play GuanDan?
量子位· 2025-09-10 10:01
Core Viewpoint
- The research indicates that large models can effectively play various card games, demonstrating their capabilities in complex decision-making scenarios [2][4][52].

Group 1: Model Performance
- Different models exhibit varying performance across different card games, with fine-tuned models showing superior results compared to API-based and base models [3][40].
- Among the API-based models, GPT-4o performs the best overall, while GLM-4 demonstrates strong capabilities in games like DouDizhu and GuanDan [39][40].
- Fine-tuned models, particularly GLM4-9B-Chat-mix, excel in multiple games, including DouDizhu, GuanDan, and Uno, indicating their versatility [42][40].

Group 2: Game Selection and Learning Methodology
- The research team selected eight popular card games based on their complexity and the availability of high-quality models and data [8].
- The learning process involved generating high-quality interaction data through teacher models and opponents, allowing the large language models to learn effectively [14][16].
- The complexity of the games influenced the number of training instances collected, with more complex games like DouDizhu and GuanDan requiring larger datasets [20][21].

Group 3: Inter-Game Influence
- The study found that models trained on similar games can enhance each other's performance, while those trained on games with significant rule differences may experience performance conflicts [52][49].
- For instance, models trained on GuanDan showed good performance in DouDizhu, suggesting a positive transfer of skills between these games [45].

Group 4: Generalization and Capability
- The research indicates that while training on card games, the general capabilities of the models may decline, but this can be mitigated by incorporating general data into the training process [56][54].
- The mixed training approach allowed for some recovery of general capabilities, demonstrating the balance between specialized game skills and broader knowledge [56].
Self-Search Reinforcement Learning (SSRL): the Sim2Real moment for Agentic RL
机器之心· 2025-09-02 01:27
Core Insights
- The article discusses the development and effectiveness of SSRL (Self-Search Reinforcement Learning) in enhancing the training efficiency and stability of Search Agents using large language models (LLMs) [6][28]
- SSRL demonstrates superior performance over traditional methods that rely on external search engines, achieving effective transfer from simulation to real-world applications (Sim2Real) [6][28]

Group 1
- SSRL utilizes structured prompts and format rewards to effectively extract world knowledge from models, leading to improved performance across various benchmarks and reduced hallucination [2][6]
- The research highlights the high costs and inefficiencies associated with current RL training methods for Search Agents, which include full-real and semi-real search approaches [7][13]
- The introduction of SSRL allows for a significant increase in training efficiency, estimated at approximately 5.6 times, while maintaining a continuous increase in training rewards without collapse [31][32]

Group 2
- Experiments show that models trained with SSRL outperform those relying on external engines, particularly in real-world search scenarios, indicating the importance of integrating real-world knowledge [28][31]
- The article presents findings that suggest the combination of self-generated knowledge and real-world knowledge can enhance model performance, particularly through entropy-guided search strategies [34]
- The integration of SSRL with TTRL (Test-Time Reinforcement Learning) has been shown to improve generalization and effectiveness, achieving up to a 67% performance increase in certain tasks [38][39]
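
To make the idea of a format reward concrete, here is a minimal sketch. The summary does not describe the prompt template, so everything specific here is an assumption made for illustration: the tag names (`think`, `search`, `information`, `answer`), the function name `format_reward`, and the binary 0/1 scoring. It is not the authors' implementation.

```python
import re

# Hypothetical tag set: the summary says "structured prompts" but names no tags.
STEP_TAGS = ("think", "search", "information")
FINAL_TAG = "answer"

# Matches <tag>...</tag> where the closing tag repeats the opening name.
_BLOCK = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def format_reward(rollout: str) -> float:
    """Return 1.0 if the rollout is a well-formed tagged trajectory, else 0.0."""
    blocks = _BLOCK.findall(rollout)
    if not blocks:
        return 0.0
    # Only whitespace may appear outside the tagged blocks.
    if _BLOCK.sub("", rollout).strip():
        return 0.0
    names = [name for name, _ in blocks]
    # Exactly one answer block, and it must be the last block.
    if names.count(FINAL_TAG) != 1 or names[-1] != FINAL_TAG:
        return 0.0
    # Every earlier block must use a known step tag and be non-empty.
    for name, body in blocks[:-1]:
        if name not in STEP_TAGS or not body.strip():
            return 0.0
    # The answer itself must be non-empty.
    if not blocks[-1][1].strip():
        return 0.0
    return 1.0
```

In an RL setup, a score like this would typically be combined with an outcome reward on the final answer, so that the policy is pushed toward well-structured trajectories without being rewarded for structure alone. How the two are weighted is a design choice that the summary does not cover.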
Wu Wei: China's technological rise sounds the clarion call for equal access to AI
Huan Qiu Wang Zi Xun· 2025-09-01 22:53
Group 1
- The 2025 Global AI Influence List by Time magazine features several Chinese entrepreneurs and scholars, indicating a significant increase in representation and diversity compared to previous years [1]
- The rise of Chinese figures on the list reflects the rapid development of China's AI industry and its increasing presence on the international stage, as well as the global trend of "de-geographicalization" in technology [1]
- The open-source technology path taken by DeepSeek contributes to a more inclusive global technology landscape, enhancing the openness and participation of the AI industry [1]

Group 2
- Southeast Asia is actively seizing opportunities from the "de-geographicalization" wave in AI, with the region's digital economy projected to reach $2 trillion by 2030, and the AI market expected to exceed $580 billion [2]
- Countries like Singapore, Malaysia, and Indonesia are implementing national AI strategies and attracting significant investments from major tech companies, indicating a shift towards technological self-sufficiency [2]
- The rise of local innovation in developing countries is seen as a way to dismantle external technological monopolies and empower these nations as creators of AI technology [2]

Group 3
- Despite the concentration of top AI talent in the U.S., Chinese talent now accounts for 38% of the talent at top AI research institutions in the U.S., surpassing the 37% share of local talent [3]
- The increase in homegrown talent and the return of overseas scholars signal a promising future for China's talent strategy focused on local cultivation and talent repatriation [3]
- China's AI industry is characterized by a systematic innovation paradigm driven by top-level policies, autonomous innovation, and a commitment to long-termism [3]

Group 4
- The performance gap between Chinese and U.S. large models has dramatically decreased from 17.5% in 2023 to just 0.3% [4]
- China's unique advantages in open-source ecosystem development and vertical application innovation have contributed to this rapid advancement [4]
- The success of China's AI rise is attributed to the establishment of an open, symbiotic ecosystem that fosters talent and continuous innovation, providing a valuable model for global AI development [4]