Google's New Architecture Breaks the Transformer's Ultra-Long-Context Bottleneck! Hinton's Soul-Searching Question: Any Regrets About Going Open?
量子位·2025-12-05 09:33

Core Insights
- Google has unveiled new research that directly targets a long-standing weakness of the Transformer architecture: processing very long contexts [5][7][32]
- Two new designs, Titans and MIRAS, aim to combine the inference speed of RNNs with the modeling quality of Transformers, scaling context windows to more than 2 million tokens at inference time [2][11][14]

Group 1: New Architectures
- Titans adds a neural long-term memory module whose weights are updated dynamically during inference, letting the model keep memorizing new information as it reads (a minimal sketch of this test-time update appears after this outline) [14][15]
- MIRAS is the theoretical framework behind Titans: a recipe for integrating new information with old memories efficiently, without overwriting critical concepts [22][28]

Group 2: Memory Mechanisms
- Titans' "Memory as Context" (MAC) variant feeds retrieved long-term memory into the attention window as extra context, improving the model's ability to summarize and reason over very large inputs (see the MAC wiring sketch below) [16][18]
- The long-term memory is updated selectively, driven by a "surprise metric": inputs the model finds surprising are prioritized for memorization, which keeps updates cheap while retaining what matters (see the surprise-gating sketch below) [19][20][21]

Group 3: Performance Comparison
- In the reported experiments, models built on Titans and MIRAS outperform state-of-the-art linear recurrent models and comparable Transformer baselines, even with fewer parameters [27][32]
- The ability to handle extremely long contexts positions the new architecture as a serious rival to large models such as GPT-4 [32]

Group 4: Future of AI Models
- The search for architectures beyond the Transformer continues, but the Transformer remains the foundational design of the large-model era [33]
- Google's decision to publish its Transformer research openly has had a profoundly positive impact on the AI community, as industry leaders including Geoffrey Hinton have noted [34]
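For readers who want the mechanics behind Group 1, here is a minimal PyTorch sketch of a Titans-style neural long-term memory: a small MLP whose weights are updated by gradient steps at inference time. The MLP shape, the associative key-value loss, and the hyperparameters lr, eta (momentum over past surprise), and alpha (forgetting rate) are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class NeuralLongTermMemory(nn.Module):
    """Sketch of a Titans-style long-term memory: an MLP whose weights are
    updated by gradient descent while the model runs (test-time memorization).
    Shapes and hyperparameters are illustrative, not from the paper."""

    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.memory = nn.Sequential(
            nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim)
        )
        self.to_query = nn.Linear(dim, dim, bias=False)  # q_t = x_t W_Q
        self.to_key = nn.Linear(dim, dim, bias=False)    # k_t = x_t W_K
        self.to_value = nn.Linear(dim, dim, bias=False)  # v_t = x_t W_V
        # One momentum buffer per memory weight: the "past surprise".
        self._momentum = [torch.zeros_like(p) for p in self.memory.parameters()]

    @torch.enable_grad()
    def memorize(self, x: torch.Tensor, lr=0.1, eta=0.9, alpha=0.01):
        """Write a chunk x of shape (seq, dim) into memory.

        Surprise is the gradient of the associative recall loss; eta carries
        momentum over past surprise, alpha slowly forgets old content."""
        k, v = self.to_key(x), self.to_value(x)
        loss = (self.memory(k) - v).pow(2).mean()  # can memory map k -> v?
        grads = torch.autograd.grad(loss, list(self.memory.parameters()))
        with torch.no_grad():
            for p, g, s in zip(self.memory.parameters(), grads, self._momentum):
                s.mul_(eta).add_(g, alpha=-lr)  # S_t = eta*S_{t-1} - lr*grad
                p.mul_(1.0 - alpha).add_(s)     # forget a little, then write
        return loss.item()  # scalar "surprise" for this chunk

    def recall(self, x: torch.Tensor) -> torch.Tensor:
        """Read from memory without modifying it."""
        with torch.no_grad():
            return self.memory(self.to_query(x))
```

The key contrast with a vanilla Transformer: here "memory" lives in weights that keep changing at inference time, so the cost of remembering does not grow quadratically with context length.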
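The Memory-as-Context (MAC) wiring from Group 2 can then be sketched as: recall from long-term memory, prepend the recalled tokens (plus a few persistent task tokens) to the current segment, run ordinary attention over the enlarged window, and only afterwards commit the segment to memory. Function and argument names here are assumptions for illustration; the actual Titans layer is more involved.

```python
import torch
import torch.nn as nn

def memory_as_context(segment, persistent, memory, attn):
    """MAC sketch: attention sees [persistent | recalled memory | segment].
    `segment` is (seq, dim); `memory` is the NeuralLongTermMemory above;
    `attn` is an unbatched nn.MultiheadAttention."""
    recalled = memory.recall(segment)                  # read history, no write yet
    ctx = torch.cat([persistent, recalled, segment], dim=0)
    out, _ = attn(ctx, ctx, ctx)                       # attention over the window
    memory.memorize(segment)                           # now commit the segment
    return out[-segment.shape[0]:]                     # outputs for current tokens

# Usage on a stream of 128-token segments (all sizes illustrative):
dim = 64
mem = NeuralLongTermMemory(dim)
attn = nn.MultiheadAttention(embed_dim=dim, num_heads=4)
persistent = torch.randn(4, dim)  # learned task tokens in the real model
for seg in torch.randn(10, 128, dim).unbind(0):
    y = memory_as_context(seg, persistent, mem, attn)  # y: (128, dim)
```

Attention stays cheap because it only ever sees one segment plus a bounded memory readout, while the unbounded history is compressed into the memory weights.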
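Finally, the selective-update idea: probe the memory's recall error on a chunk first, and only spend a weight update on chunks the memory finds surprising. The explicit threshold gate below is my own simplification layered on the module above; in the paper, surprise is folded directly into the update rule (the momentum term in `memorize` plays the "past surprise" role).

```python
import torch

def surprise_gated_write(memory, chunk, threshold=0.05):
    """Write `chunk` into memory only if its recall error ("surprise") is high.
    `threshold` is an illustrative knob, not a value from the paper."""
    with torch.no_grad():
        k, v = memory.to_key(chunk), memory.to_value(chunk)
        surprise = (memory.memory(k) - v).pow(2).mean().item()
    if surprise > threshold:
        memory.memorize(chunk)  # significant new information: commit it
    return surprise
```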