Is the Transformer Dead? DeepMind Is Betting on a Different Route to AGI
36Kr · 2026-01-09 02:42
Core Insights
- The article discusses the breakthrough of Nested Learning by Google's DeepMind, which may address the long-standing issue of "catastrophic forgetting" in AI, potentially leading to advancements toward Artificial General Intelligence (AGI) [1][52]
- Nested Learning is positioned as a successor to the Transformer architecture, suggesting a shift from passive training to active evolution in AI systems [1][2]

Group 1: Nested Learning and AGI
- Nested Learning is highlighted as a significant research focus for DeepMind, with predictions that it could lead to minimal AGI by 2028 at a 50% confidence level [7][9]
- Nested Learning is described as a framework that allows AI to build associative memory, enabling continuous learning without the need for retraining [1][19]
- Shane Legg, co-founder of DeepMind, emphasizes that there are no current blockers to achieving continual learning, indicating progress in this area [5][7]

Group 2: Technical Aspects of Nested Learning
- The HOPE architecture is introduced as a mechanism for implementing Nested Learning, combining a fast self-updating system with slow, multi-timescale memory [6][8]
- The article outlines the importance of memory architecture, attentional bias, retention mechanisms, and learning rules in designing effective AI models [20][21]
- The Nested Learning framework is said to unify various existing attention mechanisms and optimizers, allowing a more dynamic understanding of memory in AI [21][24]

Group 3: Performance and Implications
- The HOPE architecture has shown superior performance on tasks requiring long context and continual learning compared to existing models, indicating its potential effectiveness [33][47]
- The article raises concerns about AI systems that can learn continuously: they may develop preferences based on past experiences, which could raise ethical considerations [52]
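The "fast self-updating system plus slow, multi-timescale memory" combination described above can be illustrated with a toy two-timescale update loop. This is a minimal sketch of the general idea only, not the paper's actual HOPE update rules; all names (`fast_state`, `slow_memory`, `slow_period`) are hypothetical.

```python
# Toy sketch of two components learning on different timescales:
# a fast state updated every step, and a slow memory that
# consolidates only every `slow_period` steps. Illustrative only;
# not taken from the Nested Learning paper.

def run(steps, inputs, slow_period=4, fast_lr=0.5, slow_lr=0.1):
    fast_state = 0.0   # updated every step ("fast self-updating system")
    slow_memory = 0.0  # updated on a longer timescale ("slow memory")
    trace = []
    for t in range(steps):
        x = inputs[t]
        # The fast component tracks the current input closely.
        fast_state += fast_lr * (x - fast_state)
        # The slow component consolidates only occasionally.
        if t % slow_period == 0:
            slow_memory += slow_lr * (fast_state - slow_memory)
        trace.append((fast_state, slow_memory))
    return trace

trace = run(8, [1.0] * 8)
fast, slow = trace[-1]
print(fast, slow)  # fast ≈ 0.996 (nearly converged), slow ≈ 0.142 (lags behind)
```

After eight identical inputs the fast state has almost converged while the slow memory has barely moved, which is the point: rapid adaptation in one component, stability in the other.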
Why This Google Paper Is Being Called "Attention Is All You Need" V2
量子位· 2025-12-21 05:45
Core Insights
- The article discusses a groundbreaking Google research paper titled "Nested Learning: The Illusion of Deep Learning Architectures," which is being referred to as "Attention Is All You Need" V2, emphasizing a new perspective on AI's learning capabilities [1][5]

Group 1: AI Limitations
- Current large language models (LLMs) suffer from a condition termed "digital amnesia," forgetting recently learned information shortly after it is taught [2][3]
- The industry has focused on making models deeper and larger, believing that increasing scale would produce emergent memory capabilities, but this approach has significant limitations [3][4]

Group 2: Nested Learning Paradigm
- The research introduces "nested learning," which posits that effective intelligent learning requires two orthogonal dimensions: depth (model layers and capacity) and frequency (the rhythm and speed at which internal components update) [9][10]
- The paper argues that mainstream optimizers, traditionally viewed as mere training engines, actually function as associative memory systems that continuously record gradient changes [6]

Group 3: HOPE Architecture
- The proposed architecture, named HOPE, features a continuous memory system with multiple MLP modules arranged like a spectrum, each updating at a different frequency [14]
- This architecture mimics the human brain's memory processes, allowing new knowledge to be integrated without causing systemic collapse or forgetting [16][17]

Group 4: Future Implications
- The value of "nested learning" lies not in immediately replacing existing models like Transformers but in providing a new design logic and framework for AI development [18]
- The exploration of memory and learning processes is still in its early stages, suggesting that future AI advancements may require systems capable of learning and evolving rather than remaining static repositories of knowledge [18]
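The claim that optimizers already act as associative memories of gradients can be seen in ordinary SGD with momentum: the velocity buffer is an exponentially decaying record of every gradient seen so far. The sketch below is standard momentum, shown only to illustrate that reading; nothing here is drawn from the Nested Learning paper itself.

```python
# Standard SGD-with-momentum step, viewed as "optimizer state is memory":
# the velocity buffer v is a decaying weighted sum of all past gradients,
# and each parameter update reads from that memory rather than from the
# latest gradient alone.

def momentum_step(w, v, grad, lr=0.1, beta=0.9):
    v = beta * v + grad  # memory write: decay old gradients, add the new one
    w = w - lr * v       # memory read: update uses the accumulated record
    return w, v

w, v = 1.0, 0.0
for g in [0.5, 0.5, -0.2]:
    w, v = momentum_step(w, v, g)

# v = 0.9*(0.9*0.5 + 0.5) + (-0.2) = 0.655: a weighted history of
# gradients, not just the most recent value (-0.2).
print(round(v, 3))  # 0.655
```

In the paper's framing (as summarized above), this buffer is a small associative memory updating at its own frequency, one level of the nested hierarchy.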
Communications Industry Weekly: Google's Nested Learning Architecture Innovation; Claude Opus 4.5 Offers Strong Value for Money - 20251202
Changjiang Securities· 2025-12-02 09:42
Investment Rating
- The report maintains a "Positive" investment rating for the communication industry [9]

Core Insights
- The communication sector rose 8.71% in the 48th week of 2025, ranking first among major industries in Changjiang's classification. Year-to-date, the sector has risen 64.42%, also ranking first [2][4]
- Google's introduction of the Nested Learning theory and HOPE architecture significantly enhances long-term memory and reasoning efficiency, addressing traditional Transformers' memory bottlenecks on long sequences and greatly reducing training and inference costs [5][7]
- Anthropic's Claude Opus 4.5 has achieved state-of-the-art performance in software engineering, with aggressive pricing that lowers input and output costs by 67% while integrating deeply with existing office workflows [6][7]

Summary by Sections

Market Performance
- Notable individual stock performances in the 48th week of 2025 included Guangku Technology (+39.2%), Tongyu Communication (+39.1%), and Taicheng Light (+22.3%) [4]

Technological Advancements
- Google's Nested Learning paradigm optimizes memory processes by breaking large models into nested sub-optimization problems, enhancing memory management and reducing inference costs [5]
- The HOPE architecture, built on this theory, separates high-frequency and low-frequency memory tasks, improving efficiency in long-sequence processing [5]

Product Launches
- Anthropic's Claude Opus 4.5 supports a context window of approximately 200k tokens and has outperformed competitors on software engineering tasks, significantly reducing usage costs for enterprises [6]
Investment Recommendations
- The report recommends several companies across various segments:
  - Telecom Operators: China Mobile, China Telecom, China Unicom
  - Optical Modules: Zhongji Xuchuang, Xinyi Sheng, Tianfu Communication
  - AI Applications: Boshi Jie, Heertai, Tuobang Co., Yiyuan Communication
  - Satellite Applications: Huace Navigation, Haige Communication, Canqin Technology [7]
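The separation of high-frequency and low-frequency memory that the report attributes to HOPE can be sketched as a bank of memory modules, each refreshed on its own period. The module periods and the stored "signal" below are illustrative assumptions of mine, not values from Google's paper.

```python
# Toy "spectrum" of memory modules with different update frequencies:
# the period-1 module always holds the latest signal, while longer-period
# modules retain progressively older state. Illustrative sketch only.

def simulate(n_steps, periods=(1, 4, 16)):
    memories = {p: 0.0 for p in periods}  # one module per update period
    updates = {p: 0 for p in periods}     # how often each module refreshed
    for t in range(n_steps):
        signal = float(t)  # stand-in for whatever the model would store
        for p in periods:
            if t % p == 0:
                memories[p] = signal  # high-frequency modules stay current;
                updates[p] += 1       # low-frequency ones hold older state
    return memories, updates

mem, upd = simulate(20)
print(upd)  # {1: 20, 4: 5, 16: 2}
print(mem)  # {1: 19.0, 4: 16.0, 16: 16.0}
```

Routing fast-changing context to the frequent modules and stable knowledge to the rare ones is, in the report's telling, where the long-sequence efficiency gain comes from.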
In the Context of LLMs, Is "Continual Learning" the Optimal Solution to the "Memory" Problem?
机器之心· 2025-11-16 01:30
Group 1
- The article discusses the concept of "Nested Learning" proposed by Google, which aims to address memory-management issues in LLMs (Large Language Models) and the challenge of catastrophic forgetting [5][6][8]
- Nested Learning is presented as a multi-layered optimization problem, where models are seen as a series of interconnected sub-problems, allowing new skills to be learned while avoiding the loss of previously acquired knowledge [6][7]
- The research introduces the "Continuous Memory System" (CMS), which treats memory as a system of multiple modules that update at different frequencies, enhancing the model's ability to manage memory effectively [6][7]

Group 2
- The article highlights the importance of improving LLMs' memory capabilities to enable continual learning, allowing AI to retain contextual experiences, semantic knowledge, and procedural skills [8]
- A proposed three-layer memory architecture includes Model Weights for general knowledge, the KV Cache for intermediate results, and the Context for relevant background information, facilitating appropriate responses from the model [8]
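The three-layer memory split described above (weights, KV cache, context) can be sketched as a simple tiered lookup. The tier names follow the article; the retrieval policy, class name, and sample entries are my illustrative assumptions, not a real LLM runtime.

```python
# Sketch of the article's three memory tiers: slow-changing model weights
# (general knowledge), a per-session KV cache (intermediate results), and
# the context window (background for the current request). The
# most-specific-first lookup order is an assumption for illustration.

class ThreeTierMemory:
    def __init__(self):
        self.weights = {"capital_of_france": "Paris"}  # general knowledge
        self.kv_cache = {}                             # per-session reuse
        self.context = {}                              # current request

    def recall(self, key):
        # Check the most specific tier first, fall back to the most general.
        for tier in (self.context, self.kv_cache, self.weights):
            if key in tier:
                return tier[key]
        return None

m = ThreeTierMemory()
m.context["user_name"] = "Ada"
print(m.recall("user_name"))          # "Ada" -- found in context
print(m.recall("capital_of_france"))  # "Paris" -- falls back to weights
```

Continual learning, in this framing, is the question of how and when information migrates from the fast tiers down into the weights without overwriting what is already there.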