Meta-Learning
Ping An Bank Obtains Patent for Meta-Learning-Based Adaptive Text-to-Speech Method
Sohu Finance · 2026-02-03 07:33
Group 1
- The core point of the article is that Ping An Bank has obtained a patent for a "meta-learning-based adaptive text-to-speech method and related devices," granted under announcement number CN114999442B with an application date of May 2022 [1]

Group 2
- Ping An Bank, established in 1987 and headquartered in Shenzhen, primarily engages in monetary financial services [1]
- The registered capital of Ping An Bank is approximately 11.42 billion RMB [1]
- According to data analysis, Ping An Bank has invested in 11 companies, participated in 1,146 bidding projects, holds 497 trademark records and 4,652 patent records, and has 71 administrative licenses [1]
Google Just Flipped the Table on Model Memory, and Now NVIDIA Is Revolutionizing Attention
36Kr · 2026-01-20 01:12
Core Insights
- Google's Nested Learning has sparked a significant shift in the understanding of model memory, allowing models to change parameters during inference rather than remaining static after training [1][5]
- NVIDIA's research takes a more radical approach in the paper "End-to-End Test-Time Training for Long Context," arguing that memory is essentially learning and that "remembering" equates to "continuing to train" [1][10]

Group 1: Nested Learning and Test-Time Training (TTT)
- Nested Learning allows models to incorporate new information into their internal memory during inference, rather than just storing it temporarily [1][5]
- TTT, with roots dating back to 2013, enables models to adapt their parameters during inference, improving performance based on the current context [5][9]
- TTT-E2E proposes a method that eliminates the need for traditional attention mechanisms over long context, allowing for constant latency regardless of context length [7][9]

Group 2: Memory Redefined
- Memory is redefined as a continuous learning process rather than a static storage structure, emphasizing how past information influences future predictions [10][34]
- The TTT-E2E method aligns the model's test-time learning objective directly with its ultimate goal of next-token prediction, enhancing its ability to learn from context [10][16]

Group 3: Engineering Stability and Efficiency
- The implementation of TTT-E2E incorporates meta-learning to stabilize the model's learning process during inference, addressing catastrophic forgetting and parameter drift [20][22]
- Safety measures, such as mini-batch processing and sliding window attention, ensure the model retains short-term memory while updating parameters [24][25]

Group 4: Performance Metrics
- TTT-E2E demonstrates superior loss reduction across varying context lengths, maintaining efficiency even as context grows [27][29]
- The model's ability to learn continuously from context without relying on traditional attention mechanisms yields significant improvements in prediction accuracy [31][34]

Group 5: Future Implications
- The advances in TTT-E2E suggest a shift toward a more sustainable approach to continuous learning, potentially becoming a leading industry solution for long-context scenarios [34][35]
- This approach aligns with the growing demand for models that can learn and adapt without the high computational costs of traditional attention mechanisms [33][34]
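The "remembering equates to continuing to train" idea can be made concrete with a minimal test-time-training sketch. This is not the TTT-E2E architecture itself, just a toy scalar predictor whose weight absorbs a pattern from context via gradient steps; all names and values here are illustrative:

```python
import numpy as np

def ttt_step(w, tokens, lr=0.05):
    """One test-time training step: treat the tokens just read as a
    self-supervised batch (predict each next token from the current one)
    and nudge the weight. The 'model' is a toy scalar linear predictor."""
    x, y = tokens[:-1], tokens[1:]
    grad = np.mean(2 * (w * x - y) * x)  # d/dw of mean squared error
    return w - lr * grad                 # gradient step = "remembering"

def absorb_context(w, context, chunk=3, passes=30):
    """Fold the context into the weight chunk by chunk: memory lives in
    a parameter instead of a growing KV cache."""
    for _ in range(passes):
        for i in range(len(context) - chunk):
            w = ttt_step(w, context[i:i + chunk + 1])
    return w

# A doubling sequence: the rule "next = 2 * current" gets compressed
# into the single weight, which converges toward 2.
ctx = np.array([0.25, 0.5, 1.0, 2.0, 4.0, 8.0])
w = absorb_context(0.0, ctx)
```

The point of the sketch is the cost profile: each prediction touches a fixed-size parameter rather than an ever-growing cache, which is the constant-latency property claimed for TTT-E2E.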
Which AGI Narrative Are the Chinese and American AI Giants Telling?
Tencent Research Institute · 2026-01-14 08:33
Core Insights
- The article discusses the evolution of artificial intelligence (AI) in 2025, highlighting a shift from merely increasing model parameters to enhancing model intelligence through foundational research in four key areas: fluid reasoning, long-term memory, spatial intelligence, and meta-learning [6][10]

Group 1: Key Areas of Technological Advancement
- In 2025, technological progress focused on fluid reasoning, long-term memory, spatial intelligence, and meta-learning, driven by diminishing returns from merely scaling model parameters [6]
- The current bottleneck is that models need to be knowledgeable, capable of reasoning, and able to retain information, addressing the previous imbalance in AI capabilities [6][10]
- Advances in reasoning capabilities were driven by test-time compute, allowing AI to engage in deeper reasoning processes [11][12]

Group 2: Memory and Learning Enhancements
- The introduction of the Titans architecture and Nested Learning significantly improved memory capabilities, enabling models to update parameters in real time during inference [28][30]
- The Titans architecture performs dynamic memory updates gated by a surprise metric, enhancing the model's ability to retain important information [29][30]
- Nested Learning introduced a hierarchical structure that enables continuous learning and memory retention, addressing the issue of catastrophic forgetting [33][34]

Group 3: Reinforcement Learning Innovations
- The rise of Reinforcement Learning with Verifiable Rewards (RLVR) and sparse outcome reward models (ORM) has led to significant capability gains, particularly in structured domains like mathematics and coding [16][17]
- The GRPO algorithm emerged as a cost-effective alternative to traditional reinforcement learning methods, reducing memory usage while maintaining performance [19][20]
- Exploration of RL's limitations revealed that while it can sharpen existing capabilities, it cannot indefinitely increase model intelligence without further foundational innovations [23]

Group 4: Spatial Intelligence and World Models
- Progress in spatial intelligence was marked by advances in video generation models such as Genie 3, which demonstrated improved understanding of physical laws through self-supervised learning [46][49]
- The World Labs initiative aims to create large-scale world models that generate interactive 3D environments, enhancing the stability and controllability of generated content [53][55]
- V-JEPA 2 emphasizes the role of prediction in learning physical rules, marking a shift toward models that can understand and predict environmental interactions [57][59]

Group 5: Meta-learning and Continuous Learning
- Meta-learning gained traction, emphasizing the need for models to learn how to learn and to adapt to new tasks from minimal examples [62][63]
- Recent research has explored implicit meta-learning through context-based frameworks, allowing models to reflect on past experiences to form new strategies [66][69]
- Integrating reinforcement learning with meta-learning principles has shown promise in enhancing models' ability to explore and learn from their environments effectively [70][72]
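The surprise-gated memory update attributed to Titans can be illustrated with a toy linear associative memory; the real architecture uses a deep neural memory with momentum and decay, none of which appears here. The write strength is simply proportional to the prediction error, so familiar inputs produce progressively weaker writes:

```python
import numpy as np

def surprise_write(memory, key, value, lr=0.5):
    """Surprise-gated write to a toy linear associative memory: the
    write strength is proportional to the prediction error ('surprise')
    on the incoming (key, value) pair."""
    surprise = value - memory @ key                  # prediction error
    memory = memory + lr * np.outer(surprise, key)   # gradient-style write
    return memory, float(np.linalg.norm(surprise))

rng = np.random.default_rng(0)
key = rng.normal(size=4)
key /= np.linalg.norm(key)        # unit-norm key keeps the toy update stable
value = rng.normal(size=4)

memory = np.zeros((4, 4))
surprises = []
for _ in range(3):
    memory, s = surprise_write(memory, key, value)
    surprises.append(s)
# Repeated items become less surprising, so successive writes get weaker.
```

This is the mechanism behind "retaining important information": novel (high-surprise) inputs dominate the memory update, while already-stored associations barely move it.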
Farewell to the KV Cache Shackles: With Long Context Compressed into Weights, Is There Hope for Continually Learning Large Models?
Jiqizhixin · 2026-01-02 01:55
Core Viewpoint
- The article discusses the development of AGI (artificial general intelligence) and emphasizes the importance of continuous learning, where AI learns new knowledge and skills through interaction with the environment [1]

Group 1: TTT-E2E Development
- A collaborative team from Astera, NVIDIA, Stanford University, UC Berkeley, and UC San Diego has proposed TTT-E2E (End-to-End Test-Time Training), a significant step toward AGI that turns long-context modeling from an architectural design problem into a learning problem [2]
- TTT-E2E aims to overcome the limitation of traditional models that remain static during inference, allowing dynamic learning during the testing phase [9][10]

Group 2: Challenges in Long Context Modeling
- The article highlights the dilemma in long-context modeling: the full attention mechanism of Transformers performs well on long texts but incurs inference costs that grow sharply with length [5]
- Alternatives like RNNs and state space models (SSMs) have constant per-token computation costs but often suffer performance declines on very long texts [5][6]

Group 3: TTT-E2E Mechanism
- TTT-E2E defines the model's behavior during testing as an online optimization process: the model performs self-supervised learning on already-read tokens before predicting the next token [11]
- The approach incorporates meta-learning to optimize the model's initialization parameters, enabling the model to learn how to learn effectively [13]
- A hybrid architecture combines a sliding-window attention mechanism for short-term memory with a dynamically updated MLP layer for long-term memory, mimicking biological memory systems [13][14]

Group 4: Experimental Results
- Experiments show that TTT-E2E scales comparably to full-attention Transformers, maintaining a consistent loss even as context length increases from 8K to 128K [21]
- In inference efficiency, TTT-E2E shows a significant advantage: at 128K context it processes tokens 2.7 times faster than a full-attention Transformer [22]

Group 5: Future Implications
- TTT-E2E signifies a shift from static models to dynamic individuals, where processing a long document becomes a micro self-evolution [27]
- This "compute-for-storage" approach envisions models that continuously adjust themselves while processing vast amounts of information, potentially encapsulating the history of human civilization within their parameters without hardware limitations [29]
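The short-term half of the hybrid design, sliding-window attention, is easy to sketch: attention is restricted to the most recent positions, so per-token cost is constant in total context length, and everything older than the window must be carried by the slowly updated weights instead. A minimal single-query version (shapes and names are illustrative, not the paper's code):

```python
import numpy as np

def sliding_window_attention(q, K, V, window=4):
    """Softmax attention restricted to the last `window` positions:
    per-token cost stays constant no matter how long the full context
    is, which is why older information must live in weight memory."""
    K_w, V_w = K[-window:], V[-window:]
    scores = K_w @ q / np.sqrt(q.shape[0])  # scaled dot-product scores
    p = np.exp(scores - scores.max())       # numerically stable softmax
    p /= p.sum()
    return p @ V_w                          # weighted sum of recent values

rng = np.random.default_rng(1)
q = rng.normal(size=8)
K = rng.normal(size=(100, 8))   # a 100-token "context"
V = rng.normal(size=(100, 8))
out = sliding_window_attention(q, K, V)
```

By construction, the output is identical whether you pass the full 100-token history or only its last 4 entries, which is the property the hybrid architecture exploits.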
NeurIPS 2025 | In-Context Meta-Learning Enables Cross-Subject Brain Activity Prediction Without Fine-Tuning
Jiqizhixin · 2025-11-19 04:07
Core Insights
- The article discusses the development of BraInCoRL, a novel brain encoding model that uses meta-learning and in-context learning to predict brain responses to visual stimuli with minimal data requirements [3][32]
- The model addresses the limitations of traditional visual encoding models, which require extensive data collection for each individual, making them costly and difficult to deploy in clinical settings [6][32]

Background and Innovation
- The research highlights significant functional differences in the human higher visual cortex across individuals, necessitating brain encoding models that can effectively represent these differences [2][6]
- BraInCoRL predicts brain responses using only a small number of example images and their corresponding brain activity data, eliminating the need for model fine-tuning [3][32]

Methodology
- The BraInCoRL framework treats each voxel as an independent function mapping visual stimuli to neural responses, leveraging meta-learning and in-context learning to enhance data efficiency and generalization [7][10]
- During training, the model learns shared structure of visual cortex responses across multiple subjects; at test time, it can generate a subject-specific voxel encoder from just a few image-response pairs [11][20]

Experimental Results
- BraInCoRL demonstrates high data efficiency, matching the variance explained by models trained on thousands of images while using only 100 context images [20][22]
- The model performs robustly across different datasets and scanning protocols, confirming its cross-device and cross-protocol generalization [22][23]
- Semantic clustering visualizations reveal clear functional organization within the visual cortex, with distinct areas for faces, scenes, and other categories [26][27]

Conclusion
- BraInCoRL introduces in-context learning to computational neuroscience, creating a data-efficient, interpretable, and language-interactive framework for visual cortex encoding [32]
- This innovation significantly lowers the barrier to constructing individualized brain encoding models, paving the way for applications in clinical neuroscience and other data-limited scenarios [32]
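The "voxel as a function, fit from a few context pairs, no fine-tuning" setup can be illustrated with a closed-form stand-in. BraInCoRL uses a transformer to map context pairs to a voxel encoder; here a per-voxel ridge regression plays that role, which is only an analogy for the data flow, not the paper's method:

```python
import numpy as np

def in_context_voxel_predict(ctx_feats, ctx_resps, query_feat, alpha=0.01):
    """Predict one voxel's response to a new stimulus from a handful of
    (image-feature, response) context pairs via closed-form ridge
    regression: no gradient-based fine-tuning on the new subject."""
    X = np.asarray(ctx_feats)
    y = np.asarray(ctx_resps)
    # Ridge solution: w = (X^T X + alpha I)^{-1} X^T y
    w = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)
    return float(query_feat @ w)

# Simulate a voxel whose response is a fixed linear readout of features.
rng = np.random.default_rng(2)
w_true = rng.normal(size=5)
X_ctx = rng.normal(size=(50, 5))   # 50 context images (features)
y_ctx = X_ctx @ w_true             # their measured responses
x_new = rng.normal(size=5)
pred = in_context_voxel_predict(X_ctx, y_ctx, x_new)
```

The design point this mirrors: all subject-specific adaptation happens in one forward computation over the context set, so a new subject costs a few stimulus-response pairs rather than a training run.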
The Father of AlphaGo Finds a New Way to Create Reinforcement Learning Algorithms: Let AI Design Them Itself
Jiqizhixin · 2025-10-28 04:31
Core Insights
- The article discusses a significant advance in reinforcement learning (RL): Google DeepMind's team has demonstrated that machines can autonomously discover state-of-the-art RL algorithms that outperform human-designed rules [1][5]

Methodology
- The research employs meta-learning over the experiences of numerous agents in complex environments to discover RL rules [4][7]
- The team utilized two levels of optimization, agent optimization and meta-optimization: the agent updates its parameters to minimize the distance between its predictions and the targets set by a meta-network [7][19][22]

Experimental Results
- The discovered RL rule, named DiscoRL, was evaluated on the Atari benchmark, achieving a normalized score of 13.86 and surpassing all existing RL methods [26][29]
- Disco57, a variant of DiscoRL, demonstrated superior performance on previously unseen benchmarks, including ProcGen, indicating strong generalization capabilities [33][34]

Generalization and Robustness
- Disco57 showed robustness across various agent-specific settings and environments, achieving competitive results without using domain-specific knowledge [36][35]
- The research highlights the importance of diverse and complex environments for the discovery process, which leads to stronger and more generalizable rules [39][40]

Efficiency and Scalability
- The discovery process was efficient, requiring significantly fewer experiments than traditional methods, saving time and resources [40]
- The performance of the discovered rules improved with the number and diversity of environments used for discovery, indicating a scalable approach [40]

Qualitative and Information Analysis
- Qualitative analysis revealed that the discovered predictions could identify significant events before they occurred, enhancing the learning process [45]
- Information analysis indicated that the discovered predictions contained unique information about upcoming rewards and strategies not captured by traditional methods [46]

Emergence of a Bootstrapping Mechanism
- Evidence of a bootstrapping mechanism was found, in which future predictions influence current targets, demonstrating the interconnectedness of the learning process [47]
- The performance of the discovered rules depended significantly on these predictions being used for strategy updates, emphasizing their importance in the learning framework [47]

Conclusion
- This research marks a pivotal step toward machine-designed RL algorithms that can match or exceed human-designed algorithms in challenging environments [48]
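The two-level structure, an agent chasing targets emitted by a meta-network while the meta-level is driven by the agent's true performance, can be caricatured on a one-dimensional toy problem. This is only a sketch of the bi-level data flow; DiscoRL's actual meta-network, predictions, and gradients are far richer, and every number below is illustrative:

```python
def disco_style_bilevel(steps=200, inner_lr=0.1, meta_lr=0.05):
    """Two nested optimizations: the agent parameter chases a target
    emitted by a meta-parameter (agent optimization), while the
    meta-parameter is moved by the agent's true performance signal
    (meta-optimization)."""
    optimum = 3.0                  # hidden value the agent should reach
    agent, meta = 0.0, 0.0
    for _ in range(steps):
        agent += inner_lr * (meta - agent)    # inner: minimize distance to target
        meta += meta_lr * (optimum - agent)   # outer: reshape the target so the
    return agent                              # agent's true performance improves

agent = disco_style_bilevel()
```

The agent never sees the true optimum directly; it only matches the meta-level's targets, yet it converges to the optimum because the meta-level is graded on the agent's real performance. That indirection is the essence of discovering an update rule rather than hand-coding one.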
Meta Removes the Biggest Bomb on the Road to AI Continual Learning, Giving "Fine-Tuning" a Fighting Chance Again
36Kr · 2025-10-27 05:13
Core Insights
- The article discusses recent advances in giving large language models (LLMs) the ability to achieve continual learning and self-evolution, addressing criticism that they lack genuine learning capabilities [1][2]

Group 1: Paths to Continual Learning
- An LLM's capacity for continual learning is fundamentally linked to its memory depth and plasticity, with three main paths identified for enhancing this capability [2]
- The first path modifies the model's "context" or "working memory" through in-context learning (ICL), where new information is provided in prompts to help the model learn to solve specific problems [4][6]
- The second path introduces an "external memory bank" (RAG), allowing models to access and maintain an external database for comparison and retrieval, exemplified by Google DeepMind's ReasoningBank [7]
- The third path focuses on parameter-level continual learning, which has faced challenges due to the complexities and instabilities of methods like reinforcement learning (RL) and Low-Rank Adaptation (LoRA) [10][11]

Group 2: Sparse Memory Fine-Tuning
- Meta AI's recent paper introduces sparse memory fine-tuning as a solution to the problems of traditional supervised fine-tuning (SFT), particularly the issue of catastrophic forgetting [11][28]
- The proposed method follows three steps: modify the architecture to include a memory layer, use TF-IDF to identify which parameters to update, and perform sparse updates on only the most relevant parameters [12][22][23]
- The new approach shows significant improvements: models drop only 11% on original tasks after learning new facts, compared to drops of 71% with LoRA and 89% with full fine-tuning [23][25]

Group 3: Implications for the Future of LLMs
- These advances suggest a potential shift in how models can be updated safely and effectively, moving from static tools toward dynamic agents capable of continuous learning [31][32]
- Successful implementation of these methods could mark the beginning of a new era of self-evolving models that grow and adapt through experience [31][32]
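The TF-IDF selection step can be sketched concretely: score memory slots by how often they are hit while encoding the new fact (term frequency) discounted by how common they are across background usage (inverse document frequency), then update only the top-k. Slot names, counts, and the smoothing constants below are illustrative, not Meta's implementation:

```python
import math
from collections import Counter

def select_slots_tfidf(batch_slots, background_docs, k=2):
    """Score memory slots TF-IDF style: slots hit often while encoding
    the new fact (high TF) but rare across background usage (high IDF)
    are the ones worth updating; the rest stay frozen, which is what
    limits catastrophic forgetting."""
    tf = Counter(batch_slots)
    n = len(background_docs)
    scores = {
        slot: count * math.log(
            (1 + n) / (1 + sum(slot in doc for doc in background_docs))
        )
        for slot, count in tf.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Slots accessed while encoding the new fact, and background usage sets.
batch = ["s1", "s1", "s2", "s3"]
background = [{"s2", "s3"}, {"s2"}, {"s3"}]
top = select_slots_tfidf(batch, background)
```

Here "s1" wins: it is hit twice for the new fact and never appears in background usage, so updating it encodes the fact while leaving the heavily shared slots "s2" and "s3" mostly untouched.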
Bund Summit Dispatch (1): Sutton Proposes a New Paradigm for AI Development, with Reinforcement Learning and Multi-Agent Collaboration as the Keys
Haitong Securities International · 2025-09-12 02:47
Investment Rating
- The report does not explicitly provide an investment rating for the industry or for specific companies within it

Core Insights
- Richard Sutton proposes that we are entering an "Era of Experience" characterized by autonomous interaction and environmental feedback, emphasizing the need for systems that can create new knowledge through direct interaction with their environments [1][8]
- Sutton argues that public fears regarding AI, such as bias and unemployment, are overstated, and that multi-agent cooperation can lead to win-win outcomes [9]
- The report highlights continual learning and meta-learning as key areas for unlocking the potential of reinforcement learning [3][13]

Summary by Sections

Event
- Sutton's presentation at the 2025 INCLUSION Conference outlines a shift from static knowledge transfer to dynamic agent-environment interaction, marking a transition to an "Era of Experience" [1][8]
- He identifies reinforcement learning as crucial for this transition, but notes that its full potential is contingent on advances in continual learning and meta-learning [1][8]

Commentary
- The report discusses the shift from "data as experience" to "capability as interaction," suggesting that firms need to develop systems that actively engage with their environments to generate new knowledge [2][11]
- It emphasizes that the real bottleneck in reinforcement learning is not model parameters but the ability to handle time and task sequences, underscoring the need for continual-learning and meta-learning capabilities [3][13]

Technical Bottlenecks
- The report identifies two main constraints in reinforcement learning: the need for continual learning to avoid catastrophic forgetting, and the need for meta-learning to enable rapid adaptation across tasks [3][13]
- It suggests that R&D should focus on long-horizon evaluation and the integration of memory mechanisms and planning architectures [3][13]

Decentralized Collaboration
- The report posits that decentralized collaboration is not only a technical choice but also a governance issue, requiring clear incentives and transparent protocols to function effectively [4][12]
- It outlines three foundational institutional requirements for effective decentralized collaboration: open interfaces, cooperation-competition testbeds, and auditability [4][12]

Replacement Dynamics
- Sutton's view on "replacement" suggests that it will occur at the task level rather than across entire job roles, urging organizations to proactively decompose tasks and redesign processes for human-AI collaboration [5][15]
- The report recommends establishing a human-AI division of labor and reforming performance metrics to focus on collaborative efficiency [5][15]
The Bund Summit Again Proves Ant Group's True Colors as a Fintech Company
Mei Ri Shang Bao · 2025-09-11 23:04
Group 1: Conference Overview
- The 2025 Inclusion·Bund Conference opened in Shanghai with the theme "Reshaping Innovative Growth," featuring 550 guests from 16 countries and regions, including notable figures like Richard Sutton and Yuval Noah Harari [1]
- The conference focused on five main topics: "Financial Technology," "Artificial Intelligence and Industry," "Innovation and Investment Ecology," "Global Dialogue and Cooperation," and "Responsible Innovation and Inclusive Future," comprising one main forum and 44 insight forums [1]
- The event is recognized as one of Asia's three major financial technology conferences, attracting global attention for its openness, diversity, and forward-looking agenda [1]

Group 2: Insights from Richard Sutton
- Richard Sutton, the 2024 Turing Award winner, emphasized that artificial intelligence is entering an "era of experience," in which AI's potential exceeds previous capabilities [2]
- He noted that current machine learning methods are reaching the limits of human data, creating a need for new data sources generated through direct interaction between intelligent agents and the world [2]
- Sutton defined "experience" as the interaction of observation, action, and reward, which is essential for learning and intelligence [2][3]

Group 3: Insights from Wang Xingxing
- Wang Xingxing, CEO of Unitree Robotics, expressed regret at not pursuing AI earlier, highlighting how the rapid development of large models now allows AI to be integrated with robotics [4]
- He discussed the emergence of a new embodied-intelligence industry, in which robots can possess AGI capabilities, enabling them to perceive, plan, and act autonomously [4]
- Wang is optimistic about the future of innovation and entrepreneurship, stating that barriers to entry have fallen significantly, creating a favorable environment for young innovators [4]

Group 4: Ant Group's Technological Advancements
- Ant Group is recognized as a leading technology-driven financial company, with significant investments in AI and various sectors [5][6]
- The conference showcased Ant Group's new AI assistant "Xiao Zheng," which integrates multiple large models to streamline government services [6]
- Ant Group's CTO announced the launch of the "Agentic Contract," to be natively deployed on the company's new Layer2 blockchain, Jovay [6]
Fears of AI Are Exaggerated: "Father of Reinforcement Learning" Sutton's Bund Summit Speech Offers Four Principles Predicting AI's Future
36Kr · 2025-09-11 08:34
Group 1
- The core idea presented is that the human data dividend is nearing its limit, and artificial intelligence (AI) is entering an "era of experience" centered on continuous learning, with the potential to exceed previous capabilities [1][9][44]
- AI's current training methods focus primarily on transferring existing human knowledge into static models without autonomous learning capabilities, and the limitations of this approach are increasingly recognized [10][14]
- The future of AI relies on two currently immature technologies, continual learning and meta-learning, which are essential for unlocking the full potential of experience-based learning [16][14]

Group 2
- AI has become a highly politicized issue, with public fears about bias, unemployment, and even human extinction being exaggerated and fueled by certain organizations and individuals [16][18][25]
- The call for regulation and control of AI reflects a broader societal tendency to fear the unknown, which can hinder the collaborative efforts necessary for progress [24][28]
- Decentralized collaboration is emphasized as a superior alternative to centralized control, allowing coexistence among diverse intelligent agents with different goals [20][26][21]

Group 3
- Four principles are proposed to predict the future of AI: there is no unified global opinion on how the world should operate; humans will eventually understand and create intelligence; superintelligent entities will inevitably surpass current human intelligence; and power and resources will flow toward the most intelligent agents [35][36][37]
- The inevitability of AI replacing some human roles is acknowledged and framed as a natural progression in the evolution of intelligence [38][44]
- Humans are cast as catalysts and pioneers of the "design era," with a unique ability to push design to its limits through AI [42][43]