Goal Navigation

Harbin Institute of Technology Proposes UAV-ON: An Open-World Object Goal Navigation Benchmark for Aerial Agents
具身智能之心· 2025-08-05 00:03
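Authors: Jianqiang Xiao et al. Editor: 具身智能之心

As an alternative, object goal navigation (ObjectNav) requires an agent to locate a target from semantic cues, without a dense sequence of instructions. Existing ObjectNav research, however, concentrates on ground-level indoor scenes, and the task remains underexplored in large-scale, unstructured outdoor aerial environments. The UAV-ON benchmark was proposed to close this gap and to advance research on autonomous UAV navigation from semantic goal descriptions in complex real-world environments.

UAV-ON Benchmark Overview
UAV-ON is the first large-scale benchmark for instance-level object goal navigation by UAVs in the open world. Its core features include:
- Diverse environments: 14 high-fidelity outdoor environments built in Unreal Engine, covering urban, forest, mountain, water, and other scenes. Spatial extents range from 350×250 to 1400×1250 units, with a total horizontal area of about 9 million square units, reflecting the semantic richness and visual complexity of the real world.
- Semantic goal design: 1,270 annotated targets are defined, each corresponding to an instance-level ...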
In Plain Terms: What Is the Difference Between Visual Language Navigation and Goal Navigation in Embodied AI?
具身智能之心· 2025-08-01 10:30
Core Viewpoint
- The article discusses the evolution of robot navigation technology from traditional mapping and localization to large model-based navigation, which includes visual language navigation (VLN) and goal navigation. VLN focuses on following instructions, while goal navigation emphasizes autonomous exploration and pathfinding based on environmental understanding [1][5].

Group 1: Visual Language Navigation (VLN)
- VLN is fundamentally a task of following instructions, which involves understanding language commands, perceiving the environment, and planning movement strategies. The VLN robot system consists of a visual language encoder, historical environmental representation, and action strategy modules [2][4].
- The learning process for the strategy network has shifted from extracting patterns from labeled datasets to leveraging large language models (LLMs) for effective planning information extraction [4].
- The architecture of VLN robots requires them to accumulate visual observations and execute actions in a loop, making it crucial to determine the current task stage for informed decision-making [4].

Group 2: Goal Navigation
- Goal navigation extends VLN by enabling agents to autonomously explore and plan paths in unfamiliar 3D environments based solely on target descriptions, such as coordinates or images [5][7].
- Unlike traditional VLN, goal-driven navigation systems must transition from understanding commands to independently interpreting the environment and making decisions, integrating computer vision, reinforcement learning, and 3D semantic understanding [7].

Group 3: Commercial Applications and Demand
- Goal-driven navigation technology has been successfully implemented in various verticals, such as terminal delivery, where it combines with social navigation algorithms to handle dynamic environments and human interactions [9].
- Companies like Meituan and Starship Technologies have deployed delivery robots in complex urban settings, while others like Aethon have developed service robots for medical and hospitality sectors, enhancing service efficiency [9][10].
- The growth of humanoid robots has led to an increased focus on adapting navigation technology for applications in home services, healthcare, and industrial logistics, creating significant job demand in the navigation sector [10].

Group 4: Learning and Knowledge Challenges
- Both VLN and goal navigation require knowledge across multiple domains, including natural language processing, computer vision, reinforcement learning, and graph neural networks, making it challenging for newcomers to gain comprehensive expertise [11].
- The fragmented nature of knowledge in these fields can lead to difficulties in learning, often causing individuals to abandon their studies before achieving a solid understanding [11].
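The observe-and-act loop summarized above (an encoder, an accumulated history, and an action policy that tracks the current task stage) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the names `encode`, `History`, `policy`, and `run_episode` are hypothetical, observations are plain strings, and the "encoder" simply splits an instruction on the word "then". It is not the architecture of any system discussed in the article.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class History:
    """Toy history module: stores every observation and the current task stage."""
    observations: List[str] = field(default_factory=list)
    stage: int = 0  # index of the sub-goal currently being pursued


def encode(instruction: str) -> List[str]:
    """Toy language encoder (assumption): split an instruction into ordered landmarks."""
    return [part.strip() for part in instruction.split(" then ") if part.strip()]


def policy(subgoals: List[str], history: History, observation: str) -> str:
    """Toy action policy: advance the stage when the current landmark is observed."""
    history.observations.append(observation)
    if history.stage >= len(subgoals):
        return "stop"
    if subgoals[history.stage] in observation:
        history.stage += 1
        if history.stage == len(subgoals):
            return "stop"  # last landmark reached, the instruction is complete
    return "forward"


def run_episode(instruction: str, observations: List[str]) -> List[str]:
    """Run the observe -> update history -> act loop until the policy stops."""
    subgoals = encode(instruction)
    history = History()
    actions = []
    for obs in observations:
        action = policy(subgoals, history, obs)
        actions.append(action)
        if action == "stop":
            break
    return actions


if __name__ == "__main__":
    print(run_episode("door then table", ["wall", "door", "hall", "table", "window"]))
```

The stage counter is the point of the sketch: the same observation ("table") means "keep going" while the agent is still looking for the door and "stop" once the door has been passed, which is why knowing the current task stage matters for the decision.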
How Does Embodied Goal Navigation Find the Target and Navigate to It?
具身智能之心· 2025-07-13 04:13
Core Viewpoint
- The article discusses the evolution of robot navigation technology from traditional mapping and localization to large model-based navigation, which includes visual language navigation (VLN) and goal navigation. VLN focuses on following instructions, while goal navigation emphasizes understanding the environment to find paths independently [1][4].

Group 1: Visual Language Navigation (VLN)
- VLN is fundamentally a task of following instructions, which involves understanding language commands, perceiving the environment, and planning movement strategies. The VLN robot system consists of a visual language encoder, environmental history representation, and action strategy modules [2].
- The key challenge in VLN is how to compress the information in visual and language inputs effectively. Current work favors large-scale pre-trained visual language models, together with LLMs for instruction breakdown and task segmentation [2][3].
- The learning of strategy networks has shifted from extracting patterns from labeled datasets to distilling effective planning information from LLMs, which has become a major research focus [3].

Group 2: Goal Navigation
- Goal navigation extends VLN by requiring agents to autonomously explore and plan paths in unfamiliar 3D environments based solely on target descriptions, such as coordinates or images [4].
- Unlike traditional VLN, goal-driven navigation systems must move from "understanding instructions" to "finding paths on their own" by autonomously parsing semantics, modeling environments, and making dynamic decisions [6].

Group 3: Commercial Applications and Demand
- Goal-driven navigation technology has been commercialized in several verticals, such as terminal delivery, where it is combined with social navigation algorithms to handle dynamic environments. Examples include Meituan's delivery robots and Starship Technologies' campus delivery robots [8].
- In sectors like healthcare, hospitality, and food service, companies like 嘉楠科技, 云迹科技, and Aethon have deployed service robots for autonomous delivery, enhancing service efficiency [8].
- The development of humanoid robots has led to an increased focus on adapting navigation technology for home services, care, and industrial logistics, creating significant job demand in the navigation sector [9].

Group 4: Learning and Knowledge Challenges
- Both VLN and goal navigation require knowledge across multiple domains, including natural language processing, computer vision, reinforcement learning, and graph neural networks, making the learning path challenging for newcomers [10].
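To make "explore an unfamiliar environment given only a target description" concrete, here is a minimal sketch that swaps in plain breadth-first search on a toy grid for the learned exploration and planning the summary refers to. The grid encoding (`#` for obstacles, a single character as the target label) and the function name `find_target` are assumptions for illustration only; real goal navigation systems work from camera input and semantic maps, not a character grid.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]


def find_target(grid: List[str], start: Cell, target: str) -> Optional[List[Cell]]:
    """Expand outward from `start` until a cell labelled `target` is observed.

    Returns the path from start to the target cell, or None if the target
    cannot be reached. Cells marked '#' are obstacles.
    """
    rows, cols = len(grid), len(grid[0])
    parent: Dict[Cell, Optional[Cell]] = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        r, c = cell
        if grid[r][c] == target:
            # Target observed: walk the parent links back to the start.
            path: List[Cell] = []
            cur: Optional[Cell] = cell
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#" and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None


if __name__ == "__main__":
    world = [
        ".#T",
        ".#.",
        "...",
    ]
    print(find_target(world, (0, 0), "T"))
```

The agent is never told where the target is; it only knows the label to look for, and the path is a by-product of the exploration order. That is the contrast with VLN, where the route is spelled out by the instruction.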
Two Modules of Robot Navigation: What Is the Difference Between Visual Language Navigation and Goal Navigation?
具身智能之心· 2025-07-02 10:18
Core Viewpoint
- The article discusses the evolution of robot navigation technology from traditional mapping and localization to large model-based navigation, which includes visual language navigation (VLN) and goal navigation. VLN focuses on following instructions, while goal navigation emphasizes understanding the environment to find paths independently [1][4].

Summary by Sections

Visual Language Navigation (VLN)
- VLN is fundamentally a task of following instructions, which involves understanding language commands, perceiving the environment, and planning movement strategies. The VLN robot system consists of three main modules: visual language encoder, environmental history representation, and action strategy [2].
- The robot receives language commands and visual observations, and the visual language encoder must compress both effectively. Key design questions include which encoder to use and whether to project visual and language representations into a common space [2].
- The learning of the strategy network has shifted from extracting patterns from labeled datasets to distilling effective planning information from large language models (LLMs) [3].

Goal Navigation
- Goal navigation extends VLN by enabling agents to explore unfamiliar 3D environments and plan paths based solely on target descriptions, such as coordinates or images [4].
- Unlike traditional VLN, goal-driven navigation requires the agent to move from "understanding instructions" to "finding paths on its own", which involves semantic parsing, environmental modeling, and dynamic decision-making [6].

Commercial Application and Demand
- Goal-driven navigation technology has been implemented in various verticals, such as terminal delivery, where it is combined with social navigation algorithms to handle dynamic environments. Examples include Meituan's delivery robots and Starship Technologies' campus delivery robots [8].
- In sectors like healthcare, hospitality, and food service, companies like 嘉楠科技, 云迹科技, and Aethon have deployed service robots for autonomous delivery, enhancing service efficiency [8].
- The development of humanoid robots has led to an increased focus on adapting navigation technology, with companies like Unitree and Tesla showcasing advanced capabilities [9].
- The growth in this sector has created significant job demand, particularly in navigation roles, and navigation is recognized as one of the first technology subfields to achieve practical application [9].

Knowledge and Learning Challenges
- Both VLN and goal navigation encompass a wide range of knowledge areas, including natural language processing, computer vision, reinforcement learning, and graph neural networks. This breadth is a challenge for learners who need to build interdisciplinary skills [10].
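The question of projecting visual and language representations into a common space can be illustrated with a small sketch: two feature vectors of different sizes are each multiplied by their own projection matrix so that they land in the same space, where cosine similarity compares them. The matrices and feature values below are hand-written placeholders (an assumption); in a real encoder they would be learned, and `project`, `cosine`, and `match_score` are hypothetical helper names.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def project(matrix: Matrix, vec: Vector) -> Vector:
    """Matrix-vector product: maps a feature vector into the common space."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors of equal length (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_score(visual: Vector, text: Vector, w_visual: Matrix, w_text: Matrix) -> float:
    """Project both modalities into the shared space, then compare them."""
    return cosine(project(w_visual, visual), project(w_text, text))


if __name__ == "__main__":
    # Placeholder projections: 3-d visual and 2-d text features into a 2-d common space.
    w_visual = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    w_text = [[1.0, 0.0], [0.0, 1.0]]
    print(match_score([0.5, 0.5, 9.0], [1.0, 1.0], w_visual, w_text))
    print(match_score([0.5, 0.5, 9.0], [1.0, -1.0], w_visual, w_text))
```

Without the projection step the two vectors cannot be compared at all, since they have different dimensions. A shared space is one way to let the policy ask how well the current view matches the instruction.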