Published in Information Fusion: Touch100k Unlocks a New Dimension of Tactile Perception with Language
机器人大讲堂 (Robot Lecture Hall) · 2025-06-08 08:47

Core Insights
- The article discusses the significance of touch in enhancing the perception and interaction capabilities of robots, highlighting the development of the Touch100k dataset and the TLV-Link pre-training method [1][11].

Group 1: Touch100k Dataset
- Touch100k is the first large-scale dataset that integrates tactile, multi-granularity language, and visual modalities, aiming to expand tactile perception from "seeing" and "touching" to "expressing" through language [2][11].
- Each sample consists of a tactile image, a visual image, and multi-granularity language descriptions; the tactile and visual images are sourced from publicly available datasets, while the language descriptions are generated through human-machine collaboration [2][11] (a minimal record sketch follows this summary).

Group 2: TLV-Link Method
- TLV-Link is a multimodal pre-training method designed for tactile representation learning on the Touch100k dataset, consisting of two phases: curriculum representation and modality alignment [6][11] (sketches of both phases follow this summary).
- The curriculum representation phase employs a "teacher-student" paradigm in which a well-trained visual encoder transfers knowledge to a tactile encoder, with the teacher model's influence gradually reduced as the student model improves [6][11].

Group 3: Experiments and Analysis
- Experiments evaluate TLV-Link from the perspectives of tactile representation and zero-shot tactile understanding, demonstrating its effectiveness on material property recognition and robot grasping prediction tasks [8][11] (a zero-shot scoring sketch follows this summary).
- Results indicate that the Touch100k dataset is useful in practice, and that TLV-Link shows significant advantages over other models in both linear probing and zero-shot evaluations [9][11].

Group 4: Summary
- The research establishes a foundational dataset and method for tactile representation learning, enhancing the modeling of tactile information and paving the way for applications in robotic perception and human-robot interaction [11].
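
To make the triplet structure concrete, here is a minimal sketch of what one Touch100k sample might look like. The field names (touch_image_path, phrase_description, etc.) are illustrative assumptions, not the dataset's actual schema; the article states only that each sample pairs a tactile image, a visual image, and language descriptions at multiple granularities.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Touch100kSample:
    """One hypothetical Touch100k triplet: a tactile image, a paired
    visual image, and language descriptions at two granularities."""
    touch_image_path: str          # tactile sensor image (assumed path field)
    vision_image_path: str         # paired visual image of the contact scene
    sentence_description: str      # fine-grained, sentence-level description
    phrase_description: List[str]  # coarse, phrase-level attribute terms

# Example instance (contents invented for illustration only)
sample = Touch100kSample(
    touch_image_path="touch/0001.png",
    vision_image_path="vision/0001.png",
    sentence_description="The surface feels smooth and slightly cold, "
                         "like polished metal.",
    phrase_description=["smooth", "hard", "cold"],
)
```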
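The curriculum representation phase can be pictured as a distillation objective whose teacher term is annealed away. The sketch below assumes a linear decay schedule and an MSE distillation term; the article specifies only that a frozen, well-trained visual encoder guides the tactile encoder and that the teacher's influence is reduced as the student improves.

```python
import torch.nn.functional as F

def curriculum_alpha(step: int, total_steps: int) -> float:
    """Teacher weight decaying linearly from 1 to 0 over training,
    so the student gradually takes over (assumed schedule)."""
    return max(0.0, 1.0 - step / total_steps)

def curriculum_loss(touch_emb, vision_emb_teacher, student_loss,
                    step: int, total_steps: int):
    """Blend a distillation term (match the frozen vision teacher's
    embedding) with the student's own objective, down-weighting the
    teacher over time."""
    alpha = curriculum_alpha(step, total_steps)
    # Teacher embeddings are detached: the visual encoder stays frozen.
    distill = F.mse_loss(touch_emb, vision_emb_teacher.detach())
    return alpha * distill + (1.0 - alpha) * student_loss
```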
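The modality alignment phase is not detailed in this summary; a common choice for aligning a new modality with language and vision is a CLIP-style symmetric contrastive (InfoNCE) loss, sketched here under that assumption.

```python
import torch
import torch.nn.functional as F

def info_nce(touch_emb: torch.Tensor, text_emb: torch.Tensor,
             temperature: float = 0.07) -> torch.Tensor:
    """Symmetric contrastive loss pulling each touch embedding toward
    its paired text (or vision) embedding and away from other pairs
    in the batch."""
    touch = F.normalize(touch_emb, dim=-1)           # (B, D)
    text = F.normalize(text_emb, dim=-1)             # (B, D)
    logits = touch @ text.t() / temperature          # (B, B) similarities
    labels = torch.arange(logits.size(0), device=logits.device)
    # Average the touch-to-text and text-to-touch directions.
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))
```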
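Zero-shot tactile understanding is typically evaluated by scoring a tactile embedding against text embeddings of candidate labels. The sketch below assumes hypothetical text_encoder and tokenizer callables and a simple prompt template; it illustrates the evaluation protocol named in the article (e.g., material property recognition), not its exact implementation.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def zero_shot_classify(touch_emb, class_names, text_encoder, tokenizer):
    """Return the class whose text prompt embeds closest to the touch
    embedding; no task-specific training is involved."""
    prompts = [f"This object feels {c}." for c in class_names]
    text_emb = text_encoder(tokenizer(prompts))      # (C, D), hypothetical API
    touch = F.normalize(touch_emb, dim=-1)           # (D,)
    text = F.normalize(text_emb, dim=-1)             # (C, D)
    sims = text @ touch                              # (C,) cosine similarities
    return class_names[sims.argmax().item()]
```

Linear probing, the other evaluation named above, differs only in that a single linear classifier is trained on the frozen tactile embeddings instead of comparing them to text prompts.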