Wu Wei: China's Tech Rise Sounds the Clarion Call for AI Equality
Huan Qiu Wang Zi Xun · 2025-09-01 22:53
Group 1
- The 2025 Global AI Influence List by Time magazine features several Chinese entrepreneurs and scholars, indicating a significant increase in representation and diversity compared to previous years [1]
- The rise of Chinese figures on the list reflects the rapid development of China's AI industry and its increasing presence on the international stage, as well as the global trend of "de-geographicalization" in technology [1]
- The open-source technology path taken by DeepSeek contributes to a more inclusive global technology landscape, enhancing the openness and participation of the AI industry [1]

Group 2
- Southeast Asia is actively seizing opportunities from the "de-geographicalization" wave in AI, with the region's digital economy projected to reach $2 trillion by 2030 and its AI market expected to exceed $580 billion [2]
- Countries like Singapore, Malaysia, and Indonesia are implementing national AI strategies and attracting significant investments from major tech companies, indicating a shift towards technological self-sufficiency [2]
- The rise of local innovation in developing countries is seen as a way to dismantle external technological monopolies and empower these nations as creators of AI technology [2]

Group 3
- Despite the concentration of top AI talent in the U.S., Chinese researchers now account for 38% of the talent at top U.S. AI research institutions, surpassing the 37% share held by U.S.-origin researchers [3]
- The increase in homegrown talent and the return of overseas scholars signal a promising future for China's talent strategy focused on local cultivation and talent repatriation [3]
- China's AI industry is characterized by a systematic innovation paradigm driven by top-level policies, autonomous innovation, and a commitment to long-termism [3]

Group 4
- The performance gap between Chinese and U.S. large models has narrowed dramatically, from 17.5% in 2023 to just 0.3% [4]
- China's unique advantages in open-source ecosystem development and vertical application innovation have contributed to this rapid advancement [4]
- The success of China's AI rise is attributed to the establishment of an open, symbiotic ecosystem that fosters talent and continuous innovation, providing a valuable model for global AI development [4]
ARPO: Agentic Reinforced Policy Optimization, Letting the Agent Explore One Step Further at Critical Moments
Jiqizhixin (机器之心) · 2025-08-09 06:02
Core Viewpoint
- The article introduces a novel method called Agentic Reinforced Policy Optimization (ARPO), designed to enhance the performance of large language models (LLMs) in multi-round interactions by addressing the challenges of uncertainty and exploration during tool usage [3][41]

Group 1: Research Motivation and Background
- The emergence of Agentic Reinforcement Learning (RL) is driven by the need for LLMs to engage in dynamic multi-round interactions with external tools, moving from static problem-solving to a more interactive agent-environment reasoning paradigm [8]
- Existing Agentic RL methods often underestimate the value of multi-round interactions due to sparse rewards and overuse of tools, leading to a lack of fine-grained exploration of tool usage [8][41]
- The study identifies a significant increase in entropy (uncertainty) after tool calls, indicating an opportunity for exploration that current methods do not fully leverage [14][16]

Group 2: ARPO Methodology
- ARPO introduces an entropy-driven adaptive rollout strategy that enhances exploration during high-entropy tool-usage phases, allowing for more diverse reasoning paths [11][20]
- The method includes four key steps: initialization of a global rollout, monitoring entropy changes, adaptive branching based on entropy, and defining termination conditions for the rollout process [24][27] (a minimal sketch follows this summary)
- ARPO incorporates advantage attribution estimation to help the model better internalize the value differences in tool usage at each step [28][30]

Group 3: Experimental Results
- ARPO outperforms existing sample-level RL methods, achieving better performance with only half the tool-call budget across 13 challenging benchmarks, demonstrating its efficiency in training multi-round reasoning agents [21][41]
- The method shows consistent improvements in performance metrics such as Pass@3 and Pass@5, particularly in dynamic, multi-round tasks [37][39]
- In comparative tests, ARPO achieves higher accuracy than GRPO and DAPO in various tasks, including deep search and knowledge-intensive reasoning [41][42]

Group 4: Future Directions
- Future research may explore the application of ARPO in multi-modal tasks, expanding its capabilities beyond text-based reasoning to include images and videos [42]
- There is potential for integrating a broader range of external tools to enhance complex task performance through optimized tool-usage strategies [42]
- The scalability and real-time deployment of ARPO in larger models and dynamic environments could further improve its practical value and cost-effectiveness [42]
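To make the four rollout steps in Group 2 concrete, here is a minimal Python sketch of an entropy-driven adaptive rollout in the spirit of ARPO. The helper interfaces `step_fn` and `tool_fn`, the threshold `entropy_delta`, and the branch budget are illustrative assumptions, not the paper's actual API or hyperparameters.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List

# Assumed interfaces (not the paper's API):
#   step_fn(tokens)       -> (new_tokens, tool_call_or_None, next_token_probs)
#   tool_fn(call, tokens) -> (tool_result_tokens, next_token_probs)

@dataclass
class Trajectory:
    tokens: List[str] = field(default_factory=list)
    done: bool = False

def entropy(probs: List[float]) -> float:
    """Shannon entropy of a next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def adaptive_rollout(
    prompt: List[str],
    step_fn: Callable,
    tool_fn: Callable,
    entropy_delta: float = 0.4,   # branch when entropy rises by more than this (assumed value)
    branch_budget: int = 8,       # cap on extra partial rollouts (assumed value)
    max_steps: int = 16,
) -> List[Trajectory]:
    # Step 1: initialize a single global rollout from the prompt.
    frontier = [Trajectory(tokens=list(prompt))]
    finished: List[Trajectory] = []
    branches_used = 0

    for _ in range(max_steps):
        next_frontier: List[Trajectory] = []
        for traj in frontier:
            segment, tool_call, probs = step_fn(traj.tokens)
            traj.tokens += segment
            if tool_call is None:            # Step 4: terminate when the model answers
                traj.done = True
                finished.append(traj)
                continue

            pre_h = entropy(probs)           # Step 2: entropy before the tool result
            result_tokens, post_probs = tool_fn(tool_call, traj.tokens)
            traj.tokens += result_tokens
            post_h = entropy(post_probs)     #         entropy after the tool result

            next_frontier.append(traj)
            # Step 3: adaptive branching, forking an extra partial rollout only
            # where uncertainty spikes after a tool call.
            if post_h - pre_h > entropy_delta and branches_used < branch_budget:
                branches_used += 1
                next_frontier.append(Trajectory(tokens=list(traj.tokens)))

        frontier = [t for t in next_frontier if not t.done]
        if not frontier:                     # Step 4: terminate when nothing is left to expand
            break

    return finished + frontier
```

Once the branched rollouts are scored, the advantage attribution step distributes credit between the shared prefix and the branch-specific segments. The sketch below shows one plausible, GRPO-style way to do this, where shared-prefix tokens receive the mean group advantage and branch tokens receive their own branch's advantage; this attribution rule is an assumption for illustration, not necessarily the paper's exact formulation.

```python
import statistics
from typing import List

def group_advantages(rewards: List[float]) -> List[float]:
    """GRPO-style group-normalized advantages: (r - mean) / std over the group."""
    mu = statistics.mean(rewards)
    sd = statistics.pstdev(rewards) or 1.0   # avoid division by zero
    return [(r - mu) / sd for r in rewards]

def attribute_advantages(
    rewards: List[float],        # one scalar reward per branched trajectory
    shared_prefix_len: int,      # number of tokens shared by all branches
    branch_lengths: List[int],   # branch-specific token counts
) -> List[List[float]]:
    """Assign the mean group advantage to shared-prefix tokens and each branch's
    own advantage to its branch-specific tokens (illustrative attribution rule)."""
    adv = group_advantages(rewards)
    shared = sum(adv) / len(adv)
    return [[shared] * shared_prefix_len + [a] * n for a, n in zip(adv, branch_lengths)]
```

In training, these token-level advantages would feed a standard policy-gradient update; the sketch stops at attribution and omits the RL objective itself.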