Meta-Learning
Farewell to the KV Cache Shackles: Compressing Long Context into Weights. Is There Hope for Continually Learning Large Models?
机器之心· 2026-01-02 01:55
Humanity has set out on the road to creating AGI (artificial general intelligence), and one key aspect of that journey is continual learning: the ability of an AI to keep acquiring new knowledge and skills by interacting with its environment. Think back to the first machine learning lecture of your life: you probably cannot recall the first word the professor said, yet the intuition and logic that lecture left behind are quietly helping you understand this complex paper right now. The essence of that ability is compression.

Recently, TTT-E2E (End-to-End Test-Time Training), proposed by a joint team from the Astera Institute, NVIDIA, Stanford University, UC Berkeley, and UC San Diego, takes an important step along this unavoidable road to AGI. It breaks the limitation that traditional models stay static at inference time, turning long-context modeling from a question of architecture design into a learning problem. The research community has already explored several routes to this end, such as recurrent neural networks (RNNs) that update their state on the fly, or very large caches meant to hold massive histories. Yet a true AGI should perhaps not merely "store" information passively; like a human, it should "evolve" as it reads. TTT-E2E keeps learning at test time through next-token prediction on the given context, compressing the information it reads into the model's weight parameters; a rough sketch of this idea appears after this summary.

Editor | Panda. Paper title: End-to-End Test-Time Training ...
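Conceptually, the recipe is simple to sketch: as the model streams through a long input, it periodically takes a gradient step on the next-token-prediction loss over the chunk it just read, so the context is absorbed into a small set of trainable weights instead of an ever-growing KV cache. Below is a minimal sketch of that loop, assuming a PyTorch-style `model` whose forward pass uses the trainable `fast_params`; the function names, chunk size, and learning rate are illustrative assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn.functional as F

def test_time_update(model, fast_params, token_chunk, lr=1e-3):
    """One hypothetical TTT step: fit the fast weights to next-token
    prediction within `token_chunk`; the chunk itself can then be dropped."""
    inputs, targets = token_chunk[:, :-1], token_chunk[:, 1:]
    logits = model(inputs)                          # forward pass must depend on fast_params
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           targets.reshape(-1))
    grads = torch.autograd.grad(loss, fast_params)
    with torch.no_grad():
        for p, g in zip(fast_params, grads):
            p -= lr * g                             # context gets absorbed into the weights
    return loss.item()

def stream_long_context(model, fast_params, tokens, chunk_size=512):
    """Read an arbitrarily long token sequence with memory bounded by the
    chunk size: no KV cache grows with the sequence length."""
    for start in range(0, tokens.size(1) - 1, chunk_size):
        chunk = tokens[:, start:start + chunk_size + 1]
        test_time_update(model, fast_params, chunk)
```

The point of this design is that the cost of reading stays bounded by the chunk size, because the history lives in weights rather than in a cache that scales with sequence length.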
NeurIPS 2025 | In-Context Meta-Learning Enables Cross-Subject Brain Activity Prediction Without Fine-Tuning
机器之心· 2025-11-19 04:07
Core Insights
- The article discusses the development of BraInCoRL, a novel brain encoding model that uses meta-learning and in-context learning to predict brain responses to visual stimuli with minimal data requirements [3][32].
- The model addresses the limitations of traditional visual encoding models, which require extensive data collection for each individual, making them costly and difficult to deploy in clinical settings [6][32].

Background and Innovation
- The research highlights significant functional differences in the human higher visual cortex across individuals, necessitating brain encoding models that can effectively represent these differences [2][6].
- BraInCoRL predicts brain responses using only a small number of example images and their corresponding brain activity, eliminating the need for model fine-tuning [3][32].

Methodology
- The BraInCoRL framework treats each voxel as an independent function mapping visual stimuli to neural responses, leveraging meta-learning and in-context learning to improve data efficiency and generalization [7][10].
- During training, the model learns shared structure in visual cortex responses across multiple subjects; at test time, it can instantiate a subject-specific voxel encoder from just a few image-response pairs (see the sketch after this summary) [11][20].

Experimental Results
- BraInCoRL is highly data-efficient, matching the variance explained by models trained on thousands of images while using only 100 context images [20][22].
- The model performs robustly across datasets and scanning protocols, confirming cross-device and cross-protocol generalization [22][23].
- Semantic clustering visualizations reveal clear functional organization within the visual cortex, with distinct areas for faces, scenes, and other categories [26][27].

Conclusion
- BraInCoRL brings in-context learning into computational neuroscience, yielding a data-efficient, interpretable, and language-interactive framework for visual cortex encoding [32].
- The work significantly lowers the barrier to building individualized brain encoding models, paving the way for applications in clinical neuroscience and other data-limited scenarios [32].
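To make the "no fine-tuning" mechanism concrete, here is a minimal sketch of an in-context voxel encoder in the spirit described above: a transformer reads a few (image-feature, voxel-response) pairs as context plus one query image and directly outputs the predicted response for that voxel, with all adaptation happening inside the forward pass. The dimensions, module layout, and class name are illustrative assumptions, not the released BraInCoRL code.

```python
import torch
import torch.nn as nn

class InContextVoxelEncoder(nn.Module):
    def __init__(self, feat_dim=512, d_model=256, n_layers=4, n_heads=8):
        super().__init__()
        self.embed_pair = nn.Linear(feat_dim + 1, d_model)   # (image feature, response) context token
        self.embed_query = nn.Linear(feat_dim, d_model)      # query image feature only
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layers)
        self.readout = nn.Linear(d_model, 1)                 # predicted voxel response

    def forward(self, ctx_feats, ctx_resps, query_feats):
        # ctx_feats: (B, K, feat_dim), ctx_resps: (B, K), query_feats: (B, feat_dim)
        ctx = self.embed_pair(torch.cat([ctx_feats, ctx_resps.unsqueeze(-1)], dim=-1))
        qry = self.embed_query(query_feats).unsqueeze(1)
        tokens = torch.cat([ctx, qry], dim=1)                # context pairs followed by the query token
        h = self.backbone(tokens)
        return self.readout(h[:, -1]).squeeze(-1)            # response of this voxel to the query image

# Toy usage: K = 100 context image/response pairs specialize the encoder to a
# new subject entirely in the forward pass, with no gradient updates.
encoder = InContextVoxelEncoder()
pred = encoder(torch.randn(2, 100, 512), torch.randn(2, 100), torch.randn(2, 512))
```

During meta-training such an encoder would see many subjects' context/query splits, so that at test time roughly 100 context pairs are enough to adapt it to an unseen subject.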
The Father of AlphaGo Finds a New Way to Create Reinforcement Learning Algorithms: Let AI Design Them Itself
机器之心· 2025-10-28 04:31
Core Insights
- The article discusses a significant advance in reinforcement learning (RL): Google DeepMind has demonstrated that machines can autonomously discover state-of-the-art RL algorithms that outperform human-designed rules [1][5].

Methodology
- The research applies meta-learning to the experience of many agents acting in complex environments in order to discover RL rules [4][7].
- The team combines two levels of optimization, agent optimization and meta-optimization: the agent updates its parameters to minimize the distance between its predictions and the targets produced by a meta-network, while the meta-network is in turn optimized so that agents trained with its targets perform better (see the sketch after this summary) [7][19][22].

Experimental Results
- The discovered RL rule, named DiscoRL, was evaluated on the Atari benchmark, achieving a normalized score of 13.86 and surpassing all existing RL methods [26][29].
- Disco57, a variant of DiscoRL, delivered superior performance on previously unseen benchmarks, including ProcGen, indicating strong generalization [33][34].

Generalization and Robustness
- Disco57 proved robust across a variety of agent-specific settings and environments, achieving competitive results without using domain-specific knowledge [36][35].
- The research highlights the importance of diverse and complex environments for the discovery process, which leads to stronger and more generalizable rules [39][40].

Efficiency and Scalability
- The discovery process was efficient, requiring significantly fewer experiments than traditional hand-design, saving time and resources [40].
- The performance of the discovered rules improved with the number and diversity of environments used for discovery, indicating a scalable approach [40].

Qualitative and Information Analysis
- Qualitative analysis showed that the discovered predictions can identify significant events before they occur, aiding the learning process [45].
- Information analysis indicated that the discovered predictions carry unique information about upcoming rewards and policies that is not captured by traditional quantities [46].

Emergence of a Bootstrapping Mechanism
- The authors found evidence of a bootstrapping mechanism in which future predictions influence current targets, mirroring the interconnected structure of classic RL updates [47].
- Performance degraded significantly when these predictions were not used for policy updates, underscoring their importance in the learning framework [47].

Conclusion
- The work marks a pivotal step toward machine-designed RL algorithms that can match or exceed human-designed algorithms in challenging environments [48].
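The two levels of optimization can be illustrated with a toy meta-gradient loop: the inner step moves the agent toward the meta-network's targets, and the outer step backpropagates the updated agent's score on a held-out objective into the meta-network. This is only a sketch under simplifying assumptions (a single inner step, linear stand-in networks, and a crude performance proxy), not DeepMind's DiscoRL implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call

obs_dim, pred_dim = 8, 4
agent = nn.Linear(obs_dim, pred_dim)       # stand-in for the agent network
meta_net = nn.Linear(obs_dim, pred_dim)    # stand-in for the discovered update rule
meta_opt = torch.optim.Adam(meta_net.parameters(), lr=1e-3)

def inner_step(params, obs, inner_lr=0.1):
    """Agent optimization: move the agent's parameters toward the targets
    produced by the meta-network, keeping the graph so meta-gradients flow."""
    preds = functional_call(agent, params, (obs,))
    targets = meta_net(obs)                              # the candidate rule emits targets
    loss = F.mse_loss(preds, targets)
    grads = torch.autograd.grad(loss, list(params.values()), create_graph=True)
    return {k: p - inner_lr * g for (k, p), g in zip(params.items(), grads)}

def meta_step(obs, eval_obs, eval_returns):
    """Meta-optimization: after one inner step the updated agent should score
    better on a held-out objective; backpropagate that error into meta_net."""
    params = dict(agent.named_parameters())
    new_params = inner_step(params, obs)
    eval_preds = functional_call(agent, new_params, (eval_obs,))
    outer_loss = F.mse_loss(eval_preds[:, 0], eval_returns)  # crude performance proxy
    meta_opt.zero_grad()
    outer_loss.backward()
    meta_opt.step()

meta_step(torch.randn(32, obs_dim), torch.randn(32, obs_dim), torch.randn(32))
```

In the reported system this structure is scaled across many agents and environments, which is what allows the discovered rule to generalize beyond the environments it was meta-trained on.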
Meta Defuses the Biggest Bomb on AI's Road to Continual Learning: Fine-Tuning Is Back in the Fight
36Kr · 2025-10-27 05:13
Core Insights
- The article discusses recent advances that give large language models (LLMs) the ability to learn continually and self-evolve, addressing the criticism that they lack genuine learning capabilities [1][2].

Group 1: Paths to Continual Learning
- An LLM's ability to keep learning is fundamentally tied to the depth and plasticity of its memory, and three main paths have been identified for extending it [2].
- The first path modifies the model's "context" or "working memory" through In-Context Learning (ICL): new information is placed in the prompt so the model can learn to solve a specific problem on the fly [4][6].
- The second path adds an "external memory bank" (RAG), letting the model query and maintain an external database for comparison and retrieval, as exemplified by Google DeepMind's "ReasoningBank" [7].
- The third path is parameter-level continual learning, which has long been hampered by the complexity and instability of methods such as reinforcement learning (RL) and Low-Rank Adaptation (LoRA) [10][11].

Group 2: Sparse Memory Fine-Tuning
- Meta AI's recent paper proposes sparse memory fine-tuning as an answer to the central failure mode of traditional supervised fine-tuning (SFT): catastrophic forgetting [11][28].
- The method follows a three-step recipe: modify the architecture to include a memory layer, use a TF-IDF-style score to decide which memory parameters to update, and apply sparse updates to only the most relevant slots (see the sketch after this summary) [12][22][23].
- The approach yields significant improvements: after learning new facts, models lose only about 11% of performance on their original tasks, compared with drops of 71% for LoRA and 89% for full fine-tuning [23][25].

Group 3: Implications for the Future of LLMs
- These advances suggest a shift toward updating models safely and effectively, moving them from static tools toward dynamic agents capable of continual learning [31][32].
- If implemented successfully at scale, such methods could mark the start of a new era of self-evolving models that grow and adapt through experience [31][32].
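The three-step recipe lends itself to a compact illustration: a memory layer with a large table of slots, a TF-IDF-style score that ranks how specific each slot's usage is to the new data relative to the pretraining distribution, and a gradient mask that restricts updates to the top-ranked slots so the rest of the model is untouched. The slot counts, scoring formula, and masking strategy below are illustrative assumptions, not Meta's exact method.

```python
import torch
import torch.nn as nn

class MemoryLayer(nn.Module):
    """A sparse key-value memory: each input attends to its top-k slots."""
    def __init__(self, d_model=256, n_slots=4096, top_k=32):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(n_slots, d_model) * 0.02)
        self.values = nn.Parameter(torch.zeros(n_slots, d_model))
        self.top_k = top_k

    def forward(self, x):                        # x: (batch, d_model)
        scores = x @ self.keys.T                 # which memory slots does x access?
        top = scores.topk(self.top_k, dim=-1)
        weights = top.values.softmax(dim=-1)
        out = torch.einsum("bk,bkd->bd", weights, self.values[top.indices])
        return out, top.indices                  # indices feed the slot-usage statistics

def tfidf_slot_scores(new_counts, pretrain_freqs, eps=1e-8):
    """TF-IDF-style ranking: slots used often on the new data (TF) but rarely
    on the pretraining corpus (IDF) are the safest ones to overwrite."""
    tf = new_counts / (new_counts.sum() + eps)
    idf = torch.log(1.0 / (pretrain_freqs + eps))
    return tf * idf

def sparse_update_mask(memory, new_counts, pretrain_freqs, top_t=500):
    """Zero the gradients of all but the top-t ranked slots before the
    optimizer step, so only those slots absorb the new facts."""
    ranked = tfidf_slot_scores(new_counts, pretrain_freqs).topk(top_t).indices
    mask = torch.zeros(memory.values.size(0), 1)
    mask[ranked] = 1.0
    if memory.values.grad is not None:
        memory.values.grad *= mask
    if memory.keys.grad is not None:
        memory.keys.grad *= mask
```

Masking gradients before the optimizer step is what keeps forgetting localized: only the handful of slots most specific to the new facts ever change, which is consistent with the small performance drop reported above.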
Bund Conference Dispatch (1): Sutton Proposes a New Paradigm for AI Development, with Reinforcement Learning and Multi-Agent Collaboration as the Keys
Investment Rating
- The report does not explicitly provide an investment rating for the industry or specific companies within it.

Core Insights
- Richard Sutton proposes that we are entering an "Era of Experience" characterized by autonomous interaction and environmental feedback, emphasizing the need for systems that create new knowledge through direct interaction with their environments [1][8]
- Sutton argues that public fears regarding AI, such as bias and unemployment, are overstated, and that multi-agent cooperation can lead to win-win outcomes [9]
- The report highlights continual learning and meta-learning as the key areas for unlocking the potential of reinforcement learning [3][13]

Summary by Sections
Event
- Sutton's presentation at the 2025 INCLUSION Conference outlines a shift from static knowledge transfer to dynamic agent-environment interaction, marking the transition to an "Era of Experience" [1][8]
- He identifies reinforcement learning as crucial for this transition, but notes that its full potential is contingent on advances in continual learning and meta-learning [1][8]

Commentary
- The report discusses the shift from "data as experience" to "capability as interaction," suggesting that firms need systems that actively engage with their environments to generate new knowledge [2][11]
- It emphasizes that the real bottleneck in reinforcement learning is not model parameters but the ability to handle time and task sequences, highlighting the need for continual-learning and meta-learning capabilities [3][13]

Technical Bottlenecks
- The report identifies two main constraints in reinforcement learning: the need for continual learning to avoid catastrophic forgetting, and the need for meta-learning to enable rapid adaptation across tasks [3][13]
- It suggests that R&D should focus on long-horizon evaluation and the integration of memory mechanisms and planning architectures [3][13]

Decentralized Collaboration
- The report argues that decentralized collaboration is not only a technical choice but also a governance question, requiring clear incentives and transparent protocols to function effectively [4][12]
- It outlines three foundational institutional requirements for effective decentralized collaboration: open interfaces, cooperation-competition testbeds, and auditability [4][12]

Replacement Dynamics
- Sutton's view of "replacement" is that it will occur at the task level rather than across entire job roles, so organizations should proactively decompose tasks and redesign processes for human-AI collaboration [5][15]
- The report recommends establishing a human-AI division of labor and reforming performance metrics to focus on collaborative efficiency [5][15]
The Bund Conference Again Confirms Ant's True Colors: A Fintech Company
Mei Ri Shang Bao· 2025-09-11 23:04
Group 1: Conference Overview
- The 2025 Inclusion·Bund Conference opened in Shanghai under the theme "Reshaping Innovative Growth," featuring 550 guests from 16 countries and regions, including notable figures such as Richard Sutton and Yuval Noah Harari [1]
- The conference focused on five main topics: "Financial Technology," "Artificial Intelligence and Industry," "Innovation and Investment Ecology," "Global Dialogue and Cooperation," and "Responsible Innovation and Inclusive Future," comprising one main forum and 44 insight forums [1]
- The event is recognized as one of Asia's three major fintech conferences, attracting global attention for its openness, diversity, and forward-looking agenda [1]

Group 2: Insights from Richard Sutton
- Richard Sutton, the 2024 Turing Award winner, emphasized that artificial intelligence is entering an "era of experience," in which AI's potential exceeds anything that came before [2]
- He noted that current machine learning methods are reaching the limits of human data, and that new data sources must be generated through direct interaction between intelligent agents and the world [2]
- Sutton defined "experience" as the interplay of observation, action, and reward, which is essential to learning and intelligence [2][3]

Group 3: Insights from Wang Xingxing
- Wang Xingxing, CEO of Unitree Robotics (Yushu Technology), expressed regret at not having pursued AI earlier, highlighting how rapidly developing large models now allow AI to be integrated with robotics [4]
- He discussed the emergence of a new embodied-intelligence industry in which robots with AGI-level capabilities can perceive, plan, and act autonomously [4]
- Wang is optimistic about the future of innovation and entrepreneurship, stating that barriers to entry have fallen significantly, creating a favorable environment for young innovators [4]

Group 4: Ant Group's Technological Advancements
- Ant Group is recognized as a leading fintech company, with significant investments in AI and other sectors [5][6]
- The conference showcased Ant Group's new AI assistant "Xiao Zheng," which integrates multiple large models to streamline government services [6]
- Ant Group's CTO announced the launch of the "Agentic Contract," which will be natively deployed on Jovay, the company's new Layer2 blockchain [6]
Fears of AI Are Exaggerated. "Father of Reinforcement Learning" Sutton at the Bund Conference: Four Principles to Predict AI's Future
36Kr · 2025-09-11 08:34
Group 1
- The core idea is that the human data dividend is nearing its limit, and artificial intelligence is entering an "era of experience" centered on continual learning, whose potential exceeds anything that came before [1][9][44]
- AI's current training methods focus on transferring existing human knowledge into static models without autonomous learning capability, and the limits of this approach are increasingly recognized [10][14]
- The future of AI depends on two technologies that are still immature, continual learning and meta-learning, which are essential for unlocking the full potential of experience-based learning [16][14]

Group 2
- AI has become a highly politicized issue, with public fears about bias, unemployment, and even human extinction being exaggerated and fueled by certain organizations and individuals [16][18][25]
- The call to regulate and control AI reflects a broader societal tendency to fear the unknown, which can hinder the collaboration needed for progress [24][28]
- Decentralized collaboration is emphasized as a superior alternative to centralized control, allowing diverse intelligent agents with different goals to coexist [20][26][21]

Group 3
- Four principles are proposed to predict the future of AI: there is no unified global opinion on how the world should operate; humans will eventually understand and create intelligence; superintelligent entities will inevitably surpass current human intelligence; and power and resources will flow toward the most intelligent agents [35][36][37]
- The eventual replacement of human roles by AI is acknowledged and framed as a natural step in the evolution of intelligence [38][44]
- Humans are cast as catalysts and pioneers of the "design era," uniquely able to push design to its limits through AI [42][43]
Turing Award Winner Richard Sutton: Humanity Will Open the "Fourth Great Era of the Universe"
Core Insights
- Richard Sutton, the 2024 Turing Award winner, emphasizes the inevitability of AI taking over human roles as part of humanity's development [1][2]
- Sutton introduces four realistic "predictive principles" about the future of AI, highlighting the need for decentralized collaboration and the central role of experience in learning [2][3]

Group 1: AI and Learning
- Sutton argues that current machine learning mainly transfers existing human knowledge to static AI, which lacks autonomous learning capability [1][2]
- He identifies the need for a new data source generated through direct interaction between intelligent agents and the world, marking the transition into an "era of experience" [1][2]
- The core of intelligence lies in the ability to predict and control input signals on the basis of experience, which is essential to the development of AI [2]

Group 2: Future of AI
- Sutton's four predictive principles are: there is no consensus on how the world should operate; humans will come to understand and create intelligence through technology; superintelligent AI will likely surpass human intelligence; and power and resources will concentrate among the most intelligent agents [2][3]
- He posits that humanity is currently in the "replicator era" and on the verge of entering the "design era," in which AI will play a crucial role [3][4]
- Sutton encourages embracing AI as a necessary step in the evolution of the universe, calling for courage and a spirit of adventure in facing its challenges [4]
Turing Award Winner Richard Sutton: AI Enters the "Era of Experience," with Greater Potential Than Ever Before
Bei Ke Cai Jing· 2025-09-11 04:47
Core Insights
- Richard Sutton, the 2024 Turing Award winner, emphasized that the human data dividend is nearing its limit and that artificial intelligence is entering an "era of experience" centered on continual learning, whose potential exceeds anything that came before [1][2]

Group 1: AI and Learning
- Sutton stated that most current machine learning aims to transfer existing human knowledge into static AI that lacks autonomous learning capability. He believes we are reaching the limits of human data, and since existing methods cannot generate new knowledge, continual learning is essential for intelligence [2]
- He defined "experience" as the interaction of observation, action, and reward, which underlies an intelligent agent's ability to predict and control its input signals. Experience is the core of all intelligence [2]

Group 2: Collaboration and Future Predictions
- Addressing fears that AI will cause bias, unemployment, or even human extinction, Sutton argued that such fears are exaggerated and often fueled by those who profit from them. He noted that economic systems work best when individuals have different goals and abilities, and likewise decentralized collaboration among intelligent agents can lead to win-win outcomes [3]
- Sutton proposed four predictive principles for the future of AI:
  1. There is no consensus on how the world should operate, and no single view can dominate [3]
  2. Humanity will truly understand intelligence and create it through technology [3]
  3. Current human intelligence will soon be surpassed by superintelligent AI or enhanced humans [3]
  4. Power and resources will flow to the most intelligent agents [3]

Group 3: Historical Context and Future Outlook
- Sutton divided the history of the universe into four eras: the particle era, the star era, the replicator era, and the design era. He believes humanity's uniqueness lies in pushing design to its limits, which is the goal now pursued through AI [4]
- He described AI as the inevitable next step in the evolution of the universe, urging society to embrace it with courage, pride, and a spirit of adventure [4]

Group 4: Event Overview
- The 2025 Inclusion·Bund Conference, themed "Reshaping Innovative Growth," took place in Shanghai from September 10 to 13, featuring a main forum, more than 40 open insight forums, global theme days, innovation stages, a technology exhibition, and various networking opportunities [4]
Turing Award Winner Richard Sutton's Speech at the 2025 Bund Conference: Experience Is the Core and Foundation of All Intelligence
Yang Guang Wang· 2025-09-11 04:06
Core Insights
- The 2025 Inclusion·Bund Conference opened in Shanghai, featuring a keynote by Richard Sutton, the 2024 Turing Award winner and a pioneer of reinforcement learning [1]

Group 1: Machine Learning and AI
- Sutton emphasized that current machine learning primarily transfers existing human knowledge to static, non-autonomous AI, and is reaching the limits of human data [2]
- He introduced the concept of the "era of experience," advocating for new data sources generated through direct interaction between intelligent agents and the world [2]
- Sutton defined "experience" as the interplay of observation, action, and reward, asserting that knowledge derives from experience, which is fundamental to intelligence [2]

Group 2: Future of AI
- Sutton proposed four predictive principles regarding the future of AI:
  1. There is no consensus on how the world operates, and no single perspective can dominate [3]
  2. Humanity will truly understand intelligence and create it through technology [3]
  3. Current human intelligence will soon be surpassed by superintelligent AI or enhanced humans [3]
  4. Power and resources will gravitate toward the most intelligent agents [3]
- He divided the history of the universe into four eras: particle, star, replicator, and design, asserting that humanity's unique ability to push design to its limits is central to the pursuit of AI [3]

Group 3: Embracing AI
- Sutton stated that artificial intelligence is the inevitable next step in the evolution of the universe and should be embraced with courage, pride, and a spirit of adventure [4]