Face Robot Makes the Cover of Science Robotics: Using AI to Teach a Biomimetic Face Robot to "Speak"
机器之心·2026-01-15 04:31

Core Viewpoint
- The article discusses a groundbreaking research achievement in humanoid robotics, focusing on the development of a robot capable of realistic lip movements synchronized with speech and music, marking a significant advancement in human-robot interaction [2][24].

Group 1: Research Background
- Hu Yuhang, the founder of Shaping Technology and a PhD graduate from Columbia University, has dedicated his research to enabling robots to develop self-modeling capabilities, allowing them to understand their physical structure and adapt to various tasks [1].
- The research was published in Science Robotics and features a humanoid robot with a biomimetic facial structure that can perform lip movements in sync with human speech and songs [2][3].

Group 2: Importance of Lip Movement
- Nearly half of human attention during face-to-face communication is focused on lip movements, making natural facial expressions crucial for effective interaction [5].
- Traditional humanoid robots have struggled with realistic lip movements, often appearing puppet-like when speaking, but this new research addresses that gap [6][22].

Group 3: Technical Innovations
- The robot features a highly biomimetic face with over 20 miniature motors hidden beneath a flexible silicone skin, enabling rapid and coordinated lip movements [8][10].
- The robot learns to control its facial expressions through a self-supervised learning process, observing its own facial changes and building a model called Facial Action Transformer (FAT) [12].

Group 4: Learning Mechanism
- The robot utilizes a combination of sound-driven lip movements and visual feedback from synthetic videos to learn the correspondence between audio signals and lip movements, allowing it to perform lip-syncing without understanding semantics [14][16].
- The research demonstrated the robot's ability to synchronize lip movements across multiple languages and even sing songs, showcasing its robust cross-linguistic generalization capabilities [18][21].

Group 5: Future Implications
- The ability to express natural lip movements is seen as a missing link in humanoid robots, which are increasingly entering fields that require emotional communication, such as entertainment, education, and healthcare [22][24].
- Economists predict that over one billion humanoid robots may be manufactured in the next decade, emphasizing the necessity for these robots to possess realistic facial expressions to engage effectively with humans [24].
