AI's Second Half
Yao Shunyu's first Tencent paper: pointing AI's second half toward "context learning"
Sohu Finance · 2026-02-04 10:20
Core Insights
- The research aligns with Yao Shunyu's view that AI has entered its "second half," in which evaluation will matter more than training and models must be tested on real-world tasks rather than simply scaled up [2].

Group 1: Model Performance and Evaluation
- CL-bench results show that the current leading model, GPT-5.1 (High), solves only 23.7% of tasks, meaning it fails more than three-quarters of them even when given all the necessary information [4][19].
- Ten advanced language models were assessed, with an average task-solving rate of only 17.2%, highlighting a significant gap in their ability to learn from complex contexts [19][27].
- The models struggle to learn from context: GPT-5.1 (High) ignores context in 55.3% of cases and misuses it in 1.5% of cases, relying on static knowledge instead of adapting to new information [24].

Group 2: Context Learning Challenges
- CL-bench includes 500 complex contexts and 1,899 tasks designed to require models to learn new knowledge from context, which current models fail to do effectively [6][8].
- The knowledge required spans new domain knowledge, unfamiliar rule systems, and complex workflows, which are often not represented in the training data of leading models [8][14].
- Models perform poorly on tasks that require inductive reasoning from experimental data, with success rates typically below 10%, indicating a need for better contextual learning [25][29].

Group 3: Future Directions and Implications
- The research stresses that models must genuinely learn from context; simply supplying the context is not enough for task success [27].
- The collaboration between Tencent Hunyuan and Fudan University aims to advance the understanding of context learning in AI, with the clear goal of making it usable in real-world scenarios [27].
- The findings suggest that stronger reasoning alone is not enough; models must also absorb and organize contextual information effectively to improve performance [29].
Yao Shunyu's first Tencent paper: pointing AI's second half toward "context learning"
QbitAI · 2026-02-04 01:01
Core Insights
- The article discusses the launch of CL-bench, a benchmark designed to evaluate the ability of large models to learn from context, led by Yao Shunyu, Tencent's Chief AI Scientist [1][2][4]
- The research emphasizes that the focus should shift from merely increasing model size to ensuring models can effectively learn and apply knowledge in real-world tasks [5][10]
- Current leading models, including GPT-5.1, show disappointing performance, with a task-solving rate of only 23.7%, indicating a significant gap in their contextual learning capabilities [7][29]

Summary by Sections

Context Learning Importance
- The research highlights that while advanced models excel in standardized tests, they struggle in real-world applications where contextual learning is crucial [9][10]
- Human learning relies on real-time context rather than static knowledge, which current models fail to replicate [11][14]

CL-bench Design and Objectives
- CL-bench consists of 500 complex contexts, 1,899 tasks, and 31,607 validation criteria, designed to require models to learn new knowledge from context [15][19]
- The benchmark aims to assess models' abilities to apply knowledge from unfamiliar domains, rule systems, and procedural tasks [18][22]

Model Performance Evaluation
- Ten leading models were evaluated on CL-bench, with an average task-solving rate of only 17.2%, underscoring their inability to learn from complex contexts [28][29]
- The best-performing model, GPT-5.1, reached only 23.7%, revealing a widespread weakness in contextual learning across models [30]

Error Analysis
- The analysis identified ignoring or misusing context as a primary reason for model failures, with many errors stemming from the models' reliance on pre-trained static knowledge [31][32]
- Models performed poorly in tasks requiring inductive reasoning from experimental data, often achieving less than 10% success [32]

Future Directions
- The research team aims to advance contextual learning in AI, moving beyond merely providing context to ensuring models can genuinely learn from it [36][40]
- The collaboration between Tencent and Fudan University reflects a commitment to enhancing AI's practical applications in real-world scenarios [39]
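To make the reported figures concrete, here is a minimal sketch of how a task-solving rate could be computed from per-task validation criteria. The summary does not say how CL-bench aggregates criteria into a solved/unsolved verdict, so the all-criteria-must-pass rule, the `TaskResult` class, and the toy data below are assumptions for illustration only. The last lines derive averages from the benchmark size quoted above.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    """Outcome of grading one task against its validation criteria (hypothetical type)."""
    task_id: str
    criteria_passed: list[bool]  # one entry per validation criterion

    @property
    def solved(self) -> bool:
        # Assumption: a task counts as solved only if every criterion passes.
        return len(self.criteria_passed) > 0 and all(self.criteria_passed)


def task_solving_rate(results: list[TaskResult]) -> float:
    """Fraction of tasks solved, in [0, 1]; 0.0 for an empty list."""
    if not results:
        return 0.0
    return sum(r.solved for r in results) / len(results)


# Hypothetical toy data, not CL-bench results.
toy_results = [
    TaskResult("t1", [True, True, True]),
    TaskResult("t2", [True, False, True]),
    TaskResult("t3", [False, False]),
    TaskResult("t4", [True, True]),
]
toy_rate = task_solving_rate(toy_results)  # 2 of 4 tasks solved

# Averages implied by the benchmark size reported in the summary.
CONTEXTS, TASKS, CRITERIA = 500, 1899, 31607
tasks_per_context = TASKS / CONTEXTS   # about 3.8 tasks per context
criteria_per_task = CRITERIA / TASKS   # about 16.6 criteria per task
```

Under a strict rule like this, a task with many criteria fails if it misses even one, which is one plausible reason aggregate solve rates can sit far below per-criterion pass rates.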
Will Qianwen be Alibaba's Doubao moment?
36Kr · 2026-01-15 11:32
Core Viewpoint
- Alibaba's "Qianwen" product launch emphasizes the shift towards AI-driven services that fulfill user needs directly, without requiring users to break tasks down or switch applications, marking a significant change in how users interact with technology [1][2].

Group 1: Product Features and Capabilities
- The Qianwen App has launched over 400 new features, focused on three main scenarios: super lifestyle assistant, super work partner, and super tutor, indicating a comprehensive approach to integrating AI into daily tasks [2][3].
- Qianwen lets users complete tasks within the app without switching to other platforms, as shown in a live demonstration where 40 cups of tea were ordered directly through voice commands [3][4].
- The app supports multi-step tasks by linking services together; in travel planning, for example, it can generate an integrated plan covering flights, hotels, navigation, and dining from user input [6][9].

Group 2: Integration with Alibaba Ecosystem
- Qianwen integrates with Alibaba's ecosystem, including Taobao, Alipay, and Fliggy, connecting these services into a unified operational framework [3][15].
- The app has also incorporated 50 common public service tasks, such as visa applications and health insurance inquiries, expanding its utility beyond consumer transactions to essential life services [9][10].

Group 3: Market Position and Competitive Landscape
- The launch of Qianwen positions Alibaba as a leader in the AI application space, in contrast with competitors like Tencent, which focuses on embedding AI capabilities rather than creating a comprehensive user interface [15][18].
- Within two months of its launch, Qianwen achieved over 100 million monthly active users, showing rapid adoption of its integrated approach [15].
At 27, steering Tencent's large models: an atypical genius defines AI's second half
Sohu Finance · 2025-12-23 17:06
Core Insights
- Yao Shunyu, a prominent figure in AI, has made significant contributions to intelligent agents and large language models, moving from academic excellence to industry leadership [1][11].

Group 1: Academic Background and Early Career
- Yao Shunyu entered Tsinghua University with a strong academic record and later pursued advanced studies at Princeton University, focusing on natural language processing and reinforcement learning [1][3].
- He was recognized as a young innovator and included in MIT Technology Review's list of 35 Innovators Under 35 in China [3].

Group 2: Research Focus and Contributions
- Yao's research centers on intelligent agents, systems capable of autonomous decision-making and interaction with their environment [7].
- He shifted his focus from computer vision to language processing, believing that language holds greater potential for achieving general intelligence [4][5].
- Yao's work on the ReAct method, which combines reasoning and action, has become a mainstream approach to building language agents, improving their controllability and applicability across fields [9][10].

Group 3: Industry Impact and Future Directions
- In 2024, Yao joined OpenAI, where he played a key role in developing the company's first intelligent agent products and participated in deep research projects [10][11].
- In his upcoming role as Tencent's Chief AI Scientist, he will lead the AI Infra department, focusing on large model training and inference capabilities in line with Tencent's strategic emphasis on AI [11][12].
- Yao believes the next phase of AI will prioritize defining problems over merely solving them, a shift towards creating practical applications of AI technology [12][13].
Big news for Tencent AI!
Securities Times · 2025-12-18 04:54
Group 1
- Tencent has upgraded its large model research architecture by establishing three new departments, AI Infra, AI Data, and Data Computing Platform, to strengthen its core capabilities in large model development [2][8]
- Vinces Yao (Yao Shunyu), a prominent AI talent and former OpenAI researcher, has been appointed Chief AI Scientist and will oversee the AI Infra and Large Language Model departments [2][4]
- The AI Infra department will focus on building technical capabilities for large model training and inference, while the AI Data and Data Computing Platform departments will handle data and evaluation system construction [8][9]

Group 2
- Yao Shunyu, at 27 years old, is recognized as a leading talent in the AI field; he graduated from Tsinghua University and Princeton University and was a core member at OpenAI [4][5]
- Yao's "AI Second Half" theory holds that the focus of AI development is shifting from model training to defining and solving real-world problems, with evaluation becoming more important than training [5][6]
- Tencent's recent restructuring and talent acquisition aim to support intensive technical work in AI, with its large model already deployed in over 900 internal applications [10][9]
Tencent AI: big news!
Securities Times · 2025-12-18 04:50
Core Viewpoint
- Tencent is accelerating its AI strategy by upgrading its large model research framework and establishing new departments to strengthen its core capabilities in AI model development [1][5].

Group 1: Tencent's AI Strategy
- Tencent has established the AI Infra Department, AI Data Department, and Data Computing Platform Department to strengthen its large model research system and core capabilities [1].
- Vinces Yao (Yao Shunyu), a prominent AI talent and former OpenAI researcher, has been appointed Chief AI Scientist, overseeing multiple departments and reporting directly to Tencent's president [1][3].
- The AI Infra Department will focus on building technical capabilities for large model training and inference, while the AI Data Department will handle data and evaluation system construction [6].

Group 2: Vinces Yao's Contributions
- Vinces Yao, at 27 years old, is recognized as a leading talent in the AI field; he graduated from Tsinghua University and Princeton University and was a core member at OpenAI [3].
- Yao's "AI Second Half" theory emphasizes the shift from model training to defining real-world problems and optimizing user interaction, with evaluation becoming more important than training [4].

Group 3: Recent Developments in AI Models
- Over the past year, Tencent has released more than 30 new models; the latest version, Hunyuan 2.0, shows significant improvements in pre-training data and reinforcement learning strategies [7].
- Tencent's AI capabilities have been integrated into popular products like WeChat and QQ, enhancing the user experience without altering existing habits [7].
- Internally, Tencent has deployed AI across over 900 applications, and more than 90% of its engineers use AI tools, indicating a strong push towards AI-driven efficiency [7].
Led by Yao Shunyu of Tsinghua's "Yao Class," Tencent upgrades its large-model R&D architecture
Southern Metropolis Daily · 2025-12-17 12:09
Core Insights
- Tencent is enhancing its AI model development framework by establishing new departments, including AI Infra, AI Data, and Data Computing Platform, to strengthen its core capabilities in AI model research [2][6]
- Renowned OpenAI researcher Yao Shunyu has joined Tencent as Chief AI Scientist and will lead the AI Infra and Large Language Model departments, a significant talent acquisition for Tencent's AI initiatives [3][4]

Group 1: Organizational Changes
- Tencent has appointed Yao Shunyu as Chief AI Scientist; he will report directly to Tencent's President Liu Chiping and will also oversee the AI Infra and Large Language Model departments [2][3]
- The newly formed AI Infra department will focus on building technical capabilities for large model training and inference platforms, while the AI Data and Data Computing Platform departments will handle data and evaluation systems [6]

Group 2: Talent Acquisition and Strategy
- Yao Shunyu's recruitment is seen as a signal of Tencent's commitment to strengthening its AI capabilities, as he is recognized as a top talent in the AI field [4][5]
- Tencent's strategy includes a focus on young talent, with plans to rapidly promote young professionals within the AI sector, emphasizing the need for sufficient talent to create valuable innovations [4][7]

Group 3: AI Model Development
- Tencent's core AI research team, the Hunyuan team, has released over 30 new models in the past year, with the recent Hunyuan 2.0 showing significant improvements in pre-training data and reinforcement learning strategies [4][5]
- The Hunyuan 3D model has achieved a leading position globally, with over 3 million downloads from the open-source community, reflecting the team's strong technical capabilities [5][6]

Group 4: Internal AI Integration
- Tencent is undergoing a comprehensive AI-driven efficiency transformation, with the Hunyuan model deployed in over 900 internal applications, including Tencent Meeting, WeChat, advertising, and gaming [7]
- More than 90% of Tencent engineers use the Tencent Cloud code assistant CodeBuddy, with AI generating 50% of new code and participating in 94% of code reviews [7]
Tencent restructures its large-model organization: Yao Shunyu joins, reporting to President Liu Chiping
QbitAI · 2025-12-17 10:00
Core Viewpoint
- Tencent has announced a significant organizational restructuring in its AI division, with the notable addition of Yao Shunyu, a prominent figure in the AI research community, as Chief AI Scientist [1][4][11].

Group 1: Yao Shunyu's Background and Role
- Yao Shunyu, a former OpenAI researcher and a distinguished academic, has joined Tencent as Chief AI Scientist in the CEO's office, reporting directly to Tencent's president, Liu Chiping [2][4].
- At only 28 years old, Yao has made substantial contributions to AI, particularly in large models and agent research, with notable works including Tree of Thoughts and ReAct [3][19].
- His recent departure from OpenAI and subsequent move to Tencent have drawn significant attention, highlighting his status as a leading talent in the AI sector [3][11].

Group 2: Organizational Changes at Tencent
- Tencent has restructured its AI organization, establishing new departments such as AI Infra, AI Data, and Data Computing Platform to enhance its large model development capabilities [6][8].
- The AI Infra department, led by Yao, will focus on building the technical capabilities for large model training and inference, aiming to create a competitive edge in AI infrastructure [8][10].
- The restructuring aims to strengthen Tencent's engineering advantages and improve the efficiency of large model research, in line with the company's strategic goals in AI [8][12].

Group 3: Tencent's AI Product Development
- Over the past year, Tencent has launched more than 30 new models under its Hunyuan series, with Hunyuan 2.0 showing significant improvements in pre-training data and reinforcement learning strategies [9].
- Tencent's AI product, Yuanbao, has rapidly gained user acceptance, becoming one of the top AI applications in China, and is integrated into major platforms like WeChat and QQ [10].
- The company is undergoing a comprehensive AI-driven efficiency transformation, with over 900 applications using its Hunyuan models across various internal services [10][12].

Group 4: Strategic Importance of AI for Tencent
- Tencent's advances in AI are closely tied to its extensive resources, including rich scenarios, vast data, and a strategic approach, which position the company favorably in the AI landscape [14][15].
- The recruitment of top talent like Yao Shunyu signals Tencent's commitment to accelerating its AI initiatives and strengthening its position in the competitive AI market [11][12].
Why is Alibaba's Wu Yongming stepping up to coin a term now?
Huxiu · 2025-09-24 23:25
Core Viewpoint
- Alibaba's CEO, Wu Yongming, emphasizes that achieving Artificial General Intelligence (AGI) is just the beginning, with the ultimate goal being Artificial Superintelligence (ASI) that can self-iterate and surpass human capabilities [2]

Group 1: Market Reaction
- Following Wu's announcement, Alibaba's stock price surged by 9% on September 24, reaching a four-year high [5]
- The market's positive response indicates strong investor confidence in Alibaba's prospects in the AI sector [5]

Group 2: Business Strategy
- Wu highlights that the AI business in China has entered a new phase, characterized by emerging commercial opportunities [6]
- The focus is on turning intelligence into useful products, potentially creating multi-billion dollar companies [6]
- Alibaba Cloud aims to win as many of these emerging companies as possible as clients [6]

Group 3: Financial Performance
- Alibaba Cloud reported revenue of 33.398 billion yuan for Q2 2025, a 26% year-on-year increase and the highest growth rate in three years [8]
- AI revenue now makes up over 20% of Alibaba Cloud's external commercialization income [8]

Group 4: Product Development
- Wu identifies two key products:
  1. Large models as the next-generation operating system, with Tongyi Qianwen open-sourcing over 300 models [11]
  2. AI cloud as the next-generation computer [12]
- The strategy is to use the free large models to establish market presence and developer ecosystems, then monetize through cloud services [13]

Group 5: Investment Plans
- Alibaba plans to invest 380 billion yuan over the next three years in AI and cloud computing infrastructure, averaging over 10 billion yuan per month [13]
- This significant investment underscores the company's commitment to building a robust AI ecosystem [13]

Group 6: Competitive Advantage
- The company's competitive edge may also stem from Jack Ma's determination and the resulting market confidence [14]
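The "over 10 billion yuan per month" figure and the growth rate can be checked with simple arithmetic. The sketch below uses only the numbers quoted in the summary; the variable names are illustrative, and the implied prior-year revenue is approximate because the 26% growth rate is presumably rounded.

```python
# Figures quoted in the summary, in billions of yuan.
TOTAL_INVESTMENT = 380.0      # planned spend over three years
MONTHS = 3 * 12               # three years expressed in months
QUARTER_REVENUE = 33.398      # Alibaba Cloud revenue, Q2 2025
YOY_GROWTH = 0.26             # reported year-on-year increase

# Average monthly investment: 380 / 36, consistent with "over 10 billion".
monthly_average = TOTAL_INVESTMENT / MONTHS

# Revenue in the same quarter a year earlier implied by the growth rate.
implied_prior_revenue = QUARTER_REVENUE / (1 + YOY_GROWTH)

print(f"Average monthly investment: {monthly_average:.2f} billion yuan")
print(f"Implied prior-year quarter: {implied_prior_revenue:.2f} billion yuan")
```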
High-Order Programs: the last mile for AI, from technically feasible to commercially trustworthy
Jiqizhixin · 2025-09-16 11:57
Core Viewpoint
- The article discusses the transition to the "second half" of AI, emphasizing the need for reliability and engineering frameworks to ensure AI applications are trustworthy and effective [1][4][57].

Group 1: Importance of Data and Reliability
- Data is crucial for AI application capabilities, but it does not automatically create value without a reliable processing engine [3][4].
- Reliability covers several metrics, including accuracy, speed, and the ability to avoid "hallucinations," the misleading outputs generated by AI models [4][8].

Group 2: Transition from Model Competition to Engineering Competition
- The shift in focus from "what AI can do" to "how to make AI do it correctly" marks a significant change in the industry [4][5].
- Frameworks such as LangChain and DSPy are emerging to address these challenges, but they often lack robust reliability guarantees [4][9].

Group 3: High-Order Programs (HOP)
- HOP is introduced as a new paradigm that brings engineering principles into AI applications, aiming to mitigate hallucinations and improve reliability [6][20].
- HOP is not a new programming language but a framework that combines symbolic logic with neural networks to create a reliable control system for AI [22][25].

Group 4: Mechanisms of HOP
- HOP expresses business logic in a programming language, which keeps it explicit and reduces ambiguity [23].
- The HopLogic execution framework within HOP breaks complex tasks down into verifiable steps, raising reliability to over 99% in professional applications [28][37].

Group 5: Practical Applications and Industry Impact
- HOP has demonstrated its potential in sectors like finance and healthcare, significantly improving reliability and reducing development time [39][43].
- The framework allows agile iteration without extensive retraining of models, making it a cost-effective option for businesses [52][53].

Group 6: Future of AI Engineering
- The article concludes that the future of AI will depend on high-quality data and reliable engineering frameworks, with HOP serving as a key driver for scalable professional productivity [54][64].
- A reliable framework combined with high-quality data will enable AI to evolve from a supporting role into a core driver of industry transformation [64][65].
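The idea of breaking a task into verifiable steps can be illustrated with a small generic sketch. This is not HOP or HopLogic, whose interfaces the summary does not describe: `Step`, `run_pipeline`, `StepFailed`, and the invoice example are hypothetical names, and the "model call" is a plain function standing in for a neural component. The pattern shown is only that each step's output must pass a deterministic check before the next step may use it.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Step:
    """One unit of work plus a deterministic check on its output (hypothetical type)."""
    name: str
    run: Callable[[dict], Any]       # produces a candidate value, e.g. from a model
    verify: Callable[[Any], bool]    # symbolic check the candidate must pass


class StepFailed(Exception):
    """Raised when a step cannot produce a verified result."""


def run_pipeline(steps: list[Step], state: dict, max_attempts: int = 2) -> dict:
    """Run steps in order; a result enters `state` only after verification."""
    for step in steps:
        for _ in range(max_attempts):
            candidate = step.run(state)
            if step.verify(candidate):
                state[step.name] = candidate
                break
        else:
            # No attempt passed verification: stop instead of propagating a bad value.
            raise StepFailed(step.name)
    return state


def extract_amount(state: dict) -> Any:
    # Stand-in for a model call that pulls a number out of free text.
    digits = "".join(ch for ch in state["text"] if ch.isdigit())
    return int(digits) if digits else None


steps = [
    Step("amount", extract_amount, lambda v: isinstance(v, int) and v > 0),
    Step("with_tax", lambda s: round(s["amount"] * 1.1, 2),
         lambda v: isinstance(v, float) and v > 0),
]
result = run_pipeline(steps, {"text": "Invoice total: 120 USD"})
```

Failing loudly at the first unverified step is the design choice that matters here: a wrong intermediate value never reaches later steps, which is one way a system can trade some coverage for higher reliability.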