Working memory

AI Day Livestream | MemoryVLA: Powering Long-Horizon Robotic Manipulation Tasks
自动驾驶之心· 2025-09-03 03:19
Core Viewpoint
- The article discusses MemoryVLA, a cognitive-memory-action framework inspired by human memory systems and aimed at improving the performance of Vision-Language-Action (VLA) models on long-horizon robotic manipulation tasks [3][7].

Group 1: VLA Challenges and Solutions
- Existing VLA models rely primarily on the current observation, which leads to poor performance on long-horizon, temporally dependent tasks [7].
- Cognitive science indicates that humans manage such tasks through a memory system combining transient neural activity with hippocampal long-term memory; this serves as the inspiration for MemoryVLA [7].

Group 2: MemoryVLA Framework
- MemoryVLA uses a pre-trained Vision-Language Model (VLM) to encode observations into perceptual and cognitive tokens, which form the working memory [3].
- A Perceptual-Cognitive Memory Bank consolidates and stores low-level perceptual details alongside high-level semantics, and adaptively retrieves the entries most relevant to the current decision [3].

Group 3: Implications for Robotics
- The framework aims to enhance robots' ability to perform tasks that require temporal awareness and memory, addressing the inherently temporal nature of robotic manipulation [3][7].
- The article also touches on the roles of memory and reasoning within VLA models, suggesting that both deserve further exploration [7].
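The consolidate-then-retrieve mechanism described for the Perceptual-Cognitive Memory Bank can be illustrated with a minimal sketch. This is not the paper's implementation: the class name, the merge-by-cosine-similarity consolidation rule, the fixed capacity, and the top-k retrieval are all simplifying assumptions made here for illustration.

```python
import numpy as np

class PerceptualCognitiveMemoryBank:
    """Hypothetical sketch: stores (perceptual, cognitive) feature pairs,
    merges redundant entries on write, retrieves the best matches on read."""

    def __init__(self, capacity=8, merge_threshold=0.95):
        self.capacity = capacity              # max stored entries (assumed)
        self.merge_threshold = merge_threshold
        self.entries = []                     # list of (perceptual, cognitive)

    @staticmethod
    def _cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

    def consolidate(self, perceptual, cognitive):
        """Write path: merge into the most similar entry when redundant,
        otherwise append; evict the oldest entry when over capacity."""
        if self.entries:
            sims = [self._cos(cognitive, c) for _, c in self.entries]
            best = int(np.argmax(sims))
            if sims[best] > self.merge_threshold:
                p, c = self.entries[best]
                self.entries[best] = ((p + perceptual) / 2, (c + cognitive) / 2)
                return
        self.entries.append((perceptual, cognitive))
        if len(self.entries) > self.capacity:
            self.entries.pop(0)

    def retrieve(self, query_cognitive, k=4):
        """Read path: return the k entries whose cognitive tokens
        best match the current step's query."""
        ranked = sorted(self.entries,
                        key=lambda e: self._cos(query_cognitive, e[1]),
                        reverse=True)
        return ranked[:k]

# Demo with random 16-d feature vectors standing in for VLM tokens.
rng = np.random.default_rng(0)
bank = PerceptualCognitiveMemoryBank(capacity=8)
for _ in range(20):
    bank.consolidate(rng.normal(size=16), rng.normal(size=16))
hits = bank.retrieve(rng.normal(size=16), k=4)
```

In the actual framework, retrieval would feed the selected entries to the action decoder alongside the current working-memory tokens; the sketch only shows the bank's bookkeeping.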