In-Context Learning
Tencent's Overt AI Strategy: What Lies Behind the 1 Billion Yuan Red Envelopes and Free NBA Streaming
硬AI· 2026-02-17 03:59
Core Viewpoint
- Tencent's recent initiatives, including the distribution of 1 billion yuan in red envelopes and the NBA All-Star Game live stream, are not merely promotional tactics but strategic moves to dominate the "context" landscape in AI [5][24].

Group 1: Product Overview
- The "Yuanbao Pai" product is more than just a chat room; it serves as a "super container" for context, integrating various media and social interactions into a single AI-driven platform [8][11].
- Unlike traditional applications that operate in silos, Yuanbao Pai combines resources from Tencent's video, music, and sports platforms, allowing AI to understand user interactions across different contexts [11][21].

Group 2: AI Context Learning
- Current AI models struggle with complex contexts, achieving only a 23.7% accuracy rate on new and intricate scenarios, as highlighted in the "CL-bench" paper by Tencent's AI team [15][16].
- The challenge lies in AI's reliance on past knowledge, which often leads to misunderstandings in real-time interactions, necessitating a shift toward context learning rather than mere memorization [18][20].

Group 3: Strategic Implications
- The integration of AI into high-frequency usage scenarios, such as social interactions and content consumption, is seen as a sustainable competitive advantage for Tencent [19][21].
- By leveraging user interactions within Yuanbao Pai, Tencent aims to gather valuable reinforcement-learning data to improve AI's contextual understanding, potentially lifting accuracy well above the current 23.7% [25][26].
Yao Shunyu's Latest Work Is the Real Key to Tencent's AI Showdown After Handing Out 1 Billion Yuan in Red Envelopes
36Kr· 2026-02-07 08:46
Core Insights
- Tencent has recruited Yao Shunyu, a former OpenAI researcher, to lead its AI initiatives, indicating a long-term strategic vision for AI development [2].
- A recent study by Tencent's Hunyuan team and Fudan University highlights the critical importance of context in AI performance, revealing that even the most advanced AI models score poorly in real-time context learning [3][10].

Group 1: AI Model Performance
- The study found that the highest-performing model, GPT-5.1, achieved only a 23.7% accuracy rate in context learning, while other models such as Claude Opus 4.5 scored around 21.1% [8][9].
- The research indicates that AI struggles to adapt to new contexts, often defaulting to pre-trained knowledge, which can lead to inaccurate responses [10][11].

Group 2: Context Learning Challenges
- AI models face significant difficulty processing complex and lengthy contexts, resulting in a sharp decline in performance on logical reasoning tasks [11].
- The inability to manage context effectively can lead to "hallucinations," where AI generates incorrect information from its pre-existing knowledge rather than from the new context provided [10].

Group 3: Implications for Tencent
- Tencent's focus on context learning aligns with its core business in social and content-driven applications, where understanding nuanced conversations is crucial [12][14].
- Effective context handling is particularly relevant in Tencent's gaming and enterprise services, where real-time responses grounded in specific scenarios are essential for user satisfaction [16].
Yao Shunyu's First Paper at Tencent: Pointing AI's "Second Half" Toward Context Learning
Sou Hu Cai Jing· 2026-02-04 10:20
Core Insights
- The research aligns with Yao Shunyu's view that AI has entered its "second half," where evaluation matters more than training, emphasizing the need for models to be tested on real-world tasks rather than simply scaled up [2].

Group 1: Model Performance and Evaluation
- The evaluation results from CL-bench reveal that the current leading model, GPT-5.1 (High), has a task-solving rate of only 23.7%, meaning it fails in over three-quarters of tasks even when provided with all necessary information [4][19].
- A total of ten advanced language models were assessed, with an average task-solving rate of only 17.2%, highlighting a significant gap in their ability to learn from complex contexts [19][27].
- The models struggle to learn from context: GPT-5.1 (High) ignored the context in 55.3% of cases and misused it in a further 1.5%, demonstrating a reliance on static knowledge rather than adaptation to new information [24].

Group 2: Context Learning Challenges
- The CL-bench framework includes 500 complex contexts and 1,899 tasks designed to require models to learn new knowledge from context, which current models fail to do effectively [6][8].
- The knowledge required for the tasks spans various domains, including new field knowledge, unfamiliar rule systems, and complex workflows, which are often absent from the training data of leading models [8][14].
- Models perform poorly on tasks requiring inductive reasoning from experimental data, with success rates typically below 10%, indicating a need for improved contextual learning capabilities [25][29].

Group 3: Future Directions and Implications
- The research emphasizes that models must genuinely learn from context rather than merely receive it; simply supplying context is insufficient for task success [27].
- The collaboration between Tencent Hunyuan and Fudan University aims to advance the understanding of context learning in AI, with the explicit goal of making contextual learning work in real-world scenarios [27].
- The findings suggest that enhancing reasoning capabilities alone is not enough; models must also effectively absorb and organize contextual information to improve performance [29].
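The task-solving rate reported above can be pictured as an all-or-nothing check of a model's answer against every validation criterion attached to a task. A minimal sketch in Python, where the `Task` structure, the toy task, and the two toy models are all invented for illustration (CL-bench's real criteria are far richer):

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    context: str          # novel knowledge the model must learn from
    prompt: str           # the question posed about that context
    criteria: list = field(default_factory=list)  # one boolean check per criterion

def solve_rate(tasks, model_answer):
    """Fraction of tasks whose answer passes *every* validation criterion
    (a task counts as solved only if all of its checks pass)."""
    solved = 0
    for task in tasks:
        answer = model_answer(task.context, task.prompt)
        if all(check(answer) for check in task.criteria):
            solved += 1
    return solved / len(tasks)

# Toy task: the context defines a made-up rule no model could know from pretraining.
task = Task(
    context="In this fictional unit system, 1 zork equals 7 blips.",
    prompt="How many blips are in 3 zorks?",
    criteria=[lambda a: "21" in a],
)
good_model = lambda ctx, q: "3 zorks = 21 blips"
bad_model = lambda ctx, q: "3 zorks = 3 blips"   # ignores the rule in the context
print(solve_rate([task], good_model), solve_rate([task], bad_model))  # → 1.0 0.0
```

The headline figures (23.7% for GPT-5.1, 17.2% on average) are exactly this ratio computed over the benchmark's 1,899 tasks.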
Yao Shunyu's First Paper at Tencent: Pointing AI's "Second Half" Toward Context Learning
量子位· 2026-02-04 01:01
Core Insights
- The article discusses the launch of CL-bench, a benchmark designed to evaluate the ability of large models to learn from context, led by Yao Shunyu, Tencent's Chief AI Scientist [1][2][4].
- The research emphasizes that the focus should shift from merely increasing model size to ensuring models can effectively learn and apply knowledge in real-world tasks [5][10].
- Current leading models, including GPT-5.1, show disappointing performance, with a task-solving rate of only 23.7%, indicating a significant gap in contextual learning capability [7][29].

Summary by Sections

Context Learning Importance
- The research highlights that while advanced models excel at standardized tests, they struggle in real-world applications where contextual learning is crucial [9][10].
- Human learning relies on real-time context rather than static knowledge, which current models fail to replicate [11][14].

CL-bench Design and Objectives
- CL-bench consists of 500 complex contexts, 1,899 tasks, and 31,607 validation criteria, all designed to require models to learn new knowledge from context [15][19].
- The benchmark assesses models' ability to apply knowledge from unfamiliar domains, rule systems, and procedural tasks [18][22].

Model Performance Evaluation
- Ten leading models were evaluated on CL-bench, with an average task-solving rate of only 17.2%, underscoring their inability to learn from complex contexts [28][29].
- The best-performing model, GPT-5.1, reached a maximum of 23.7%, revealing a problem shared across models [30].

Error Analysis
- The analysis identified ignoring or misusing context as a primary cause of model failures, with many errors stemming from reliance on pre-trained static knowledge [31][32].
- Models performed poorly on tasks requiring inductive reasoning from experimental data, often achieving less than 10% success [32].

Future Directions
- The research team aims to advance contextual learning in AI, moving beyond merely providing context to ensuring models can genuinely learn from it [36][40].
- The collaboration between Tencent and Fudan University reflects a commitment to enhancing AI's practical applications in real-world scenarios [39].
After Taking Over as Tencent's Chief AI Scientist, Yao Shunyu's Team Unveils Its First Research Result
Nan Fang Du Shi Bao· 2026-02-03 15:35
Core Insights
- Tencent's first research outcome under Chief AI Scientist Yao Shunyu has been revealed, focusing on the challenge of learning from context in AI models [1][6].
- The competitive landscape is shifting from improving model training to providing rich and relevant context for tasks [1][7].

Group 1: Research Findings
- The joint research by Tencent's Hunyuan team and Fudan University shows that enabling large models to learn from context is harder than previously thought [6][7].
- A benchmark called CL-bench was created to assess language models' ability to learn new knowledge from context, consisting of 500 complex contexts, 1,899 tasks, and 31,607 validation standards [7].
- The top ten language models achieved an average task-solving rate of only 17.2% on CL-bench, indicating significant shortcomings in utilizing context effectively [7].

Group 2: Future Directions
- The research suggests that enhancing models' ability to learn from context could be a key direction for future iterations of large language models [7].
- As models improve their contextual learning capabilities, the role of humans in AI systems may evolve from primary data providers to context providers [7].
- Memory mechanisms are expected to become a core theme in large-model development by 2026, potentially enabling autonomous learning [7].
Just In: The First Paper Bearing Tencent's Yao Shunyu's Name Is Out; the "Second Half" Starts with Context Learning
机器之心· 2026-02-03 10:35
Core Insights
- The core argument of the article is that the key bottleneck preventing models from reaching high-value applications lies in their ability to effectively utilize context [1][5][7].

Group 1: Context Learning Challenges
- Recent research indicates that even when context is provided, models may still fail to solve tasks, highlighting a significant shortfall in their learning capabilities [5][32].
- The article compares the difference in learning ability among models to individuals of varying talent learning from the same material [5].
- Current models rely primarily on "parameterized knowledge," which is static and does not adapt to new information from the context [12][34].

Group 2: CL-bench Benchmark
- The CL-bench benchmark was developed to assess how well language models can learn new knowledge from context and apply it correctly [16][26].
- It includes 500 complex contexts, 1,899 tasks, and 31,607 validation standards, all designed to require models to learn from the provided context [16][27].
- The benchmark covers four main real-world context learning scenarios: domain knowledge reasoning, rule system application, procedural task execution, and empirical discovery [28][29].

Group 3: Model Performance Evaluation
- Evaluation results show that even the best-performing model, GPT-5.1 (High), solved only 23.7% of tasks, indicating a significant gap in context learning capability [31][32].
- The majority of errors stem from models ignoring or misusing context, rather than from a lack of information [34][35].
- Models struggle particularly with tasks requiring inductive reasoning from experimental data, often achieving less than 10% success [39].

Group 4: Future Directions
- The article suggests that improved context learning could shift the role of humans in AI systems from data providers to context providers [43].
- It raises the challenge of making knowledge learned from context persistent, since current models lose it once the context window is cleared [43][46].
- The prospect of models achieving autonomous learning through effective context learning and memory consolidation is highlighted as an exciting direction [47][48].
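The error analysis above boils down to tallying labeled failure cases by category. A minimal sketch, where the label names and counts are hypothetical, chosen only to mirror the kind of ignored-vs-misused breakdown these articles report:

```python
from collections import Counter

# Hypothetical per-case failure labels from a manual error analysis
labels = (["ignored_context"] * 553
          + ["misused_context"] * 15
          + ["other"] * 432)

def error_breakdown(labels):
    """Return each failure category's share of all failures, in percent."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cat: round(100 * n / total, 1) for cat, n in counts.items()}

print(error_breakdown(labels))
# → {'ignored_context': 55.3, 'misused_context': 1.5, 'other': 43.2}
```

The takeaway the articles draw from such a breakdown is that most failures are not missing information but the model falling back on its static, pre-trained knowledge.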
Notre Dame Team Builds a New Tool for Molecular Design: Letting AI Create Molecules the Way It Writes Text
仪器信息网· 2025-11-19 09:08
Core Insights
- The article discusses DemoDiff, a breakthrough AI system developed by a team at the University of Notre Dame that can design new molecular structures by learning from a few examples, significantly accelerating drug and material development [7][8][10].

Group 1: AI Understanding of Molecular Design
- DemoDiff mimics human learning by analyzing a few successful molecular examples to understand design patterns, allowing it to generate new candidates quickly [10][11].
- The system can even learn from negative examples, generating high-quality molecules based on poorly performing ones, showcasing advanced reasoning capability [21][22].

Group 2: Innovative Molecular Representation
- The team introduced a new method called "node-pair encoding," which simplifies complex molecular structures and improves efficiency by 5.5 times [9][12].
- This method lets a molecule be described with far fewer units than its individual atoms, enhancing the AI's ability to process more examples in context [12][13].

Group 3: Comprehensive Molecular Database
- DemoDiff was trained on an extensive database containing over 1 million molecular structures and 155,000 different molecular properties, providing a rich resource for learning [14][15].
- The database draws on sources such as ChEMBL, which records millions of drug molecules and their biological activities [14][15].

Group 4: Diffusion Model for Molecular Generation
- The core technology of DemoDiff is a diffusion model, which generates molecular structures through progressive refinement, ensuring chemical validity [16][17].
- The model incorporates context learning, allowing the AI to adapt its output to different sets of example molecules [18].

Group 5: Performance Testing and Validation
- DemoDiff underwent rigorous testing across 33 different molecular design tasks, demonstrating performance comparable to much larger AI models [19][20].
- The system excels at generating diverse molecular structures, giving researchers multiple options for further exploration [20].

Group 6: Negative Learning Capability
- The AI's ability to learn from negative examples allows it to infer what makes a molecule successful, enhancing its design capability [21][22].
- This feature is particularly valuable in early drug development, where researchers often have more negative examples than positive ones [21][22].

Group 7: Technical Innovations
- The system employs a graph attention mechanism to focus on multiple important parts of a molecule simultaneously, ensuring a holistic understanding during generation [23].
- A multi-layer validation mechanism checks generated molecules against fundamental chemical rules, ensuring their feasibility [23][24].

Group 8: Implications for Molecular Design
- DemoDiff represents a paradigm shift in molecular design, potentially reducing the time and cost of drug development significantly [25][26].
- The technology may democratize molecular design, allowing a broader range of researchers to participate in innovation [26].

Group 9: Future Considerations
- While DemoDiff shows impressive capability, the authors acknowledge the need for further improvement, particularly on certain specific design tasks [27].
- Future work may include scaling up the model and improving data quality to tackle more complex challenges [27][28].
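The article does not detail the algorithm, but "node-pair encoding" as described (merging parts of a structure so that fewer units describe a molecule) resembles byte-pair encoding applied to graphs. A rough sketch of the pair-merging idea on a linear token sequence; this is an illustrative analogy, not DemoDiff's actual method:

```python
from collections import Counter

def pair_merge(seq, merges):
    """Greedy pair merging: repeatedly fuse the most common adjacent pair
    into a single token, shrinking the sequence (cf. byte-pair encoding)."""
    seq = list(seq)
    for _ in range(merges):
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        (a, b), _count = pairs.most_common(1)[0]
        merged, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
                merged.append(a + b)   # fuse the pair into one token
                i += 2
            else:
                merged.append(seq[i])
                i += 1
        seq = merged
    return seq

# A repetitive "backbone" compresses well: 6 atom tokens become 2 merged tokens.
print(pair_merge(["C", "C", "O", "C", "C", "O"], merges=2))  # → ['CCO', 'CCO']
```

The reported 5.5x efficiency gain refers to the real graph-based version of this idea, where compact representations let the model fit more demonstration molecules into its context.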
NeurIPS 2025 | In-Context Meta-Learning Enables Cross-Subject Brain Activity Prediction Without Fine-Tuning
机器之心· 2025-11-19 04:07
Core Insights
- The article presents BraInCoRL, a novel brain encoding model that uses meta-learning and in-context learning to predict brain responses to visual stimuli with minimal data requirements [3][32].
- The model addresses the limitations of traditional visual encoding models, which require extensive data collection for each individual, making them costly and difficult to deploy in clinical settings [6][32].

Background and Innovation
- The research highlights significant functional differences in the human higher visual cortex across individuals, necessitating brain encoding models that can represent these differences effectively [2][6].
- BraInCoRL predicts brain responses using only a small number of example images and their corresponding brain activity, eliminating the need for model fine-tuning [3][32].

Methodology
- The BraInCoRL framework treats each voxel as an independent function mapping visual stimuli to neural responses, leveraging meta-learning and in-context learning for data efficiency and generalization [7][10].
- During training, the model learns shared structure of visual cortex responses across multiple subjects; at test time, it can generate a subject-specific voxel encoder from just a few image-response pairs [11][20].

Experimental Results
- BraInCoRL is highly data efficient, matching the variance explained by models trained on thousands of images while using only 100 context images [20][22].
- The model performs robustly across different datasets and scanning protocols, confirming cross-device and cross-protocol generalization [22][23].
- Semantic clustering visualizations reveal clear functional organization within the visual cortex, with distinct areas for faces, scenes, and other categories [26][27].

Conclusion
- BraInCoRL introduces in-context learning to computational neuroscience, creating a data-efficient, interpretable, and language-interactive framework for visual cortex encoding [32].
- This innovation significantly lowers the barrier to building individualized brain encoding models, paving the way for applications in clinical neuroscience and other data-limited settings [32].
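BraInCoRL itself generates voxel encoders with a transformer, but the core idea of predicting from a handful of (image, response) context pairs without any weight updates can be sketched with similarity-weighted averaging over the context set. Everything below (feature dimensions, the synthetic data, the kernel choice) is a stand-in illustration, not the paper's method:

```python
import numpy as np

def in_context_predict(ctx_feats, ctx_resps, query_feat, temp=1.0):
    """Predict a voxel response for a query stimulus from a few
    (feature, response) context pairs, with no gradient updates:
    a softmax over feature similarities weights the context responses."""
    sims = ctx_feats @ query_feat           # dot-product similarity to each example
    weights = np.exp((sims - sims.max()) / temp)
    weights /= weights.sum()
    return float(weights @ ctx_resps)

rng = np.random.default_rng(0)
ctx_feats = rng.standard_normal((100, 16))   # 100 context images, 16-d features
true_w = rng.standard_normal(16)
ctx_resps = ctx_feats @ true_w               # synthetic voxel responses
query = rng.standard_normal(16)              # a held-out stimulus
print(in_context_predict(ctx_feats, ctx_resps, query))
```

The appeal mirrors the article's point: everything the "model" needs about a new subject arrives in the context pairs at inference time, so no per-subject training run is required.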
In Depth | Andrej Karpathy: The Industry Is Overly Optimistic About Agents; an Agent That Can Truly Work for You Is Still a Decade Away
Z Potentials· 2025-11-05 02:57
Core Insights
- The article discusses the evolution of AI, focusing on the development of agent systems and the challenges they face in achieving true intelligence [4][5][6][7][8][9][10].

Group 1: Future of AI Agents
- Andrej Karpathy argues that the next decade will be crucial for the development of AI agents, and that current systems are not yet mature enough for full practical use [5][6][7].
- He introduces the concept of a "cognitive core": a stripped-down version of knowledge that retains intelligent algorithms and problem-solving strategies, highlighting the need for better data quality in training models [5][16].
- Karpathy worries that society may lose understanding of and control over AI systems as they become more integrated into daily life, creating a disconnect between users and the mechanisms underlying these systems [5][6].

Group 2: Historical Context and Learning Mechanisms
- The article outlines significant milestones in AI development, such as the introduction of AlexNet and the Atari reinforcement learning era, which shaped the current research landscape [8][9][10].
- Karpathy argues that human learning differs fundamentally from reinforcement learning: humans build rich world models through experience rather than relying solely on reward signals [40].
- The discussion covers the limitations of current AI models in continuous learning and the need for a more sophisticated treatment of context and memory [22][23].

Group 3: AI's Current Limitations
- Karpathy critiques the current state of AI, noting that much generated code is of mediocre quality and that the industry is in a phase of over-optimism about AI capability [5][6][37].
- The article highlights the challenges AI faces in understanding complex code structures and the limitations of code generation models in producing original, contextually appropriate code [30][31][36].
- A more nuanced approach to AI development is needed, with improvements across multiple dimensions: algorithms, data, and computational power [24][25][27].
Meta Defuses the Biggest Bomb on AI's Road to Continual Learning, Putting "Fine-Tuning" Back in the Fight
36Kr· 2025-10-27 05:13
Core Insights
- The article discusses recent advances that give large language models (LLMs) a path toward continual learning and self-evolution, answering criticism that they lack genuine learning ability [1][2].

Group 1: Paths to Continual Learning
- An LLM's capacity to learn continuously is fundamentally tied to its memory depth and plasticity; the article identifies three main paths for enhancing it [2].
- The first path modifies the model's "context," or working memory, through in-context learning (ICL): new information is supplied in the prompt so the model can learn to solve a specific problem [4][6].
- The second path adds an external memory bank via retrieval-augmented generation (RAG), letting the model maintain and query an external database, exemplified by Google DeepMind's ReasoningBank [7].
- The third path is parameter-level continual learning, which has long been hampered by the complexity and instability of methods such as reinforcement learning (RL) and Low-Rank Adaptation (LoRA) [10][11].

Group 2: Sparse Memory Fine-Tuning
- Meta AI's recent paper introduces sparse memory fine-tuning as an alternative to traditional supervised fine-tuning (SFT), specifically targeting catastrophic forgetting [11][28].
- The method proceeds in three steps: modify the architecture to include a memory layer, use TF-IDF to identify which parameters to update, and apply sparse updates to only the most relevant parameters [12][22][23].
- The approach shows significant improvements: models lose only 11% of performance on original tasks after learning new facts, compared with drops of 71% for LoRA and 89% for full fine-tuning [23][25].
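Step two of the recipe above, using TF-IDF to pick which parameters to touch, can be sketched as scoring memory slots by how specific they are to the new-fact batch: slots accessed often for this batch (high term frequency) but rarely in general (high inverse document frequency) get updated, and everything else stays frozen. The slot indices, counts, and function names below are hypothetical, a toy sketch of the idea rather than Meta's implementation:

```python
import math
from collections import Counter

def tfidf_slot_scores(batch_slot_accesses, background_freq, total_batches):
    """Score memory slots for a new-fact batch: TF = accesses within this
    batch, IDF = rarity of the slot across prior batches. High-scoring
    slots are batch-specific and safe to update sparsely."""
    tf = Counter(batch_slot_accesses)
    scores = {}
    for slot, count in tf.items():
        idf = math.log(total_batches / (1 + background_freq.get(slot, 0)))
        scores[slot] = count * idf
    return scores

def slots_to_update(scores, k):
    """Keep only the top-k slots; all other parameters remain frozen."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Slot 7 fires mostly for the new fact; slot 1 fires for nearly everything.
batch = [7, 7, 7, 1, 1, 2]
background = {1: 900, 2: 40, 7: 3}   # how many prior batches touched each slot
scores = tfidf_slot_scores(batch, background, total_batches=1000)
print(slots_to_update(scores, k=1))  # → [7]
```

Freezing the broadly shared slots (like slot 1) is what limits interference with old knowledge, which is how the method keeps the performance drop on original tasks to the reported 11%.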
Group 3: Implications for the Future of LLMs
- The advances in sparse memory fine-tuning suggest a shift in how models can be updated safely and effectively, from static tools toward dynamic agents capable of continuous learning [31][32].
- Successful deployment of these methods could mark the start of a new era of self-evolving models, in line with the vision of models that grow and adapt through experience [31][32].