大世界视角

Search documents
AI已迷失方向?强化学习教父Sutton最新发布OaK架构,挑战当前AI范式,提出超级智能新构想
AI科技大本营· 2025-08-22 08:05
Core Concept - The OaK architecture is a systematic response to the need for intelligent agents that can continuously learn, model the world, and plan effectively, aiming to achieve superintelligence through experiential learning [3][5][7]. Group 1: OaK Architecture Overview - OaK architecture is a model-based reinforcement learning framework characterized by continuous learning components, specialized learning rates for each weight, and a five-step evolution path called FC-STOMP [3][26]. - The architecture emphasizes the importance of runtime learning over design-time learning, advocating for online learning where agents learn from real-world interactions [13][14][21]. Group 2: Key Features of OaK - The architecture is designed to be domain-general, empirical, and capable of open-ended complexity, allowing agents to form necessary concepts based on their computational resources [16][19]. - The "Big World" hypothesis posits that the world is far more complex than any intelligent agent can fully comprehend, leading to the conclusion that agents must operate with approximate models and strategies [19][20]. Group 3: Learning Mechanisms - OaK architecture introduces the concept of subproblems, where agents autonomously generate subproblems based on curiosity and intrinsic motivation, facilitating a cycle of problem-solving and feature generation [28][31]. - The architecture's core process involves eight steps that include learning main strategies, generating new state features, creating subproblems, and using learned models for planning [27][29]. Group 4: Challenges and Future Directions - Two significant challenges remain: ensuring reliable continual deep learning and generating new state features, which are critical for the architecture's success [37][38]. - The OaK framework aims to provide a comprehensive solution to fundamental AI problems, offering a mechanism for how learned models can be used for planning, which is currently lacking in AI [40].