A first in China! A multimodal tactile sensor fused with a language model moves robotic touch toward human level
机器之心 · 2026-01-25 04:01
Core Viewpoint
- The article describes SuperTac, a biomimetic multimodal tactile sensor whose design is inspired by the complex sensory systems of pigeons. By integrating several sensing technologies in one device, it improves robotic perception and marks a significant step toward human-like tactile perception in robots [2][4].

Group 1: Biomimetic Logic
- SuperTac's hardware design is inspired by the biological features of pigeons, which possess one of the most complex sensory systems in nature [7].
- The sensor integrates a miniaturized multispectral imaging module that covers an ultra-wide spectral range from ultraviolet (390 nm) to mid-infrared (5.5–14.0 μm), enabling robots to analyze thermal radiation, fluorescence, and other physical information in a single interaction [10][11].

Group 2: Core Mechanism
- The core advantage of SuperTac is its 1 mm thick, multi-layer sensing skin with light-field modulation. The skin includes a conductive layer made of transparent PEDOT:PSS that provides uniform electrical signals for high-precision material classification [14].
- The design allows the sensor to capture micro-textures and deformations while also obtaining RGB color information, so each interaction yields comprehensive data [16].

Group 3: Tactile Language Model
- The DOVE model, with 8.5 billion parameters, uses a hierarchical architecture to align cross-modal physical signals with natural language representations, strengthening the system's language understanding and logical reasoning [19].
- DOVE is trained with a three-stage strategy that transforms heterogeneous sensor signals into a unified image representation and aligns tactile features with the language model's representation space for complex reasoning [20].

Group 4: Application Scenarios
- SuperTac, combined with the DOVE model, enables robots to move from basic physical perception to high-level semantic cognition, allowing for human-like embodied interactions [22].
- In practical applications, DOVE can translate sensory impressions into human-understandable language, for example by describing an object as a "yellow, room temperature, metal material with patterned protrusions" [24].
- The model's capabilities are validated in challenging tasks such as waste sorting, where it deduces the characteristics of objects and makes logical decisions based on tactile feedback [26].

Group 5: Future Directions
- The research outlines future directions for robotic tactile sensing, including sensor miniaturization, low-power chips, and high-integration packaging to improve operational flexibility and thermal stability [28].
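
The idea under Group 3 of turning heterogeneous sensor signals into a unified image representation can be illustrated with a minimal sketch. This is not DOVE's actual pipeline, which the article does not detail. The modality names (`thermal`, `electrical`, `texture`), the helper functions, the 8×8 grid, and the random input data are all hypothetical and chosen only for illustration. The sketch rescales each reading to [0, 1], turns 1-D traces into 2-D strips, resamples everything to a common grid, and stacks the results as image channels.

```python
import numpy as np

def normalize(x):
    """Min-max scale an array to [0, 1]; constant arrays map to zeros."""
    x = np.asarray(x, dtype=np.float64)
    span = x.max() - x.min()
    if span == 0:
        return np.zeros_like(x)
    return (x - x.min()) / span

def resize_nearest(img, size):
    """Nearest-neighbour resize of a 2-D array to (size, size)."""
    rows = np.arange(size) * img.shape[0] // size
    cols = np.arange(size) * img.shape[1] // size
    return img[np.ix_(rows, cols)]

def to_unified_image(signals, size=8):
    """Stack heterogeneous readings as channels of one (size, size, C) image.

    `signals` maps a hypothetical modality name to a 1-D or 2-D array.
    Channels are ordered by sorted modality name.
    """
    channels = []
    for name in sorted(signals):
        x = normalize(signals[name])
        if x.ndim == 1:
            # A 1-D trace (e.g. an electrical signal) becomes a 2-D strip image.
            x = np.tile(x, (x.shape[0], 1))
        channels.append(resize_nearest(x, size))
    return np.stack(channels, axis=-1)

# Synthetic stand-ins for sensor readings (assumed shapes and value ranges).
example = {
    "thermal": np.random.default_rng(0).uniform(20.0, 35.0, size=(16, 16)),
    "electrical": np.sin(np.linspace(0.0, 6.28, 32)),
    "texture": np.random.default_rng(1).uniform(0.0, 255.0, size=(24, 24)),
}
unified = to_unified_image(example, size=8)
print(unified.shape)
```

A single multi-channel array like `unified` is the kind of input an image encoder could consume before its features are aligned with a language model. How DOVE actually encodes each modality is not specified in the article.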