Can Drones Play Volleyball Too? A Tsinghua Team Explores the Idea with Reinforcement Learning

Core Insights
- The article discusses "multi-drone volleyball," a new embodied AI task proposed by Tsinghua University that aims to strengthen drone capabilities in three-dimensional space through teamwork and strategy [1][2].

Group 1: Task Overview
- The task requires drones to combine high maneuverability and precise control with team collaboration: they must hit a ball over a net and compete against an opposing team [2].
- The Tsinghua team built the VolleyBots testing platform, which mirrors how humans learn volleyball by including a range of tasks for single and multiple drones [2][6].

Group 2: Algorithm Development
- The Hierarchical Co-Self-Play (HCSP) algorithm combines hierarchical strategy learning with self-play so that drones learn cooperation, division of roles, and transitions between offense and defense [2][12].
- The research evaluated various reinforcement learning and game-theoretic algorithms; HCSP achieved an average win rate of 82.9% against multiple baseline algorithms [15].

Group 3: Training Phases
- Training proceeds in three phases: low-level skill learning, high-level strategy game playing, and collaborative self-play. Together they let the drones evolve both skills and strategies in a competitive environment [14].
- During matches the drones settled into clear roles such as defense, passing, and offense, and during training they even developed new tactics such as the "setter's lob" [15].

Group 4: Real-World Application
- The JuggleRL system enables a drone to juggle a ball continuously in the real world, reaching a record of 462 consecutive juggles without any fine-tuning on real-world data [16][18].
- This result marks a significant step for embodied reinforcement learning, moving from virtual environments to real physical interaction [18][19].
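
The three training phases listed under Group 3 can be illustrated with a toy sketch. This is not the HCSP implementation: the skill names, the `strength` match model, and the random hill-climbing update are all assumptions made for illustration, standing in for real simulated matches and reinforcement learning updates. It only shows the ordering of the phases and which parts are frozen or trained in each.

```python
import copy
import random

SKILLS = ["defend", "pass", "attack"]  # hypothetical skill names


def new_team():
    # "skill" stands in for low-level controllers, "mix" for the high-level strategy.
    return {"skill": {s: 0.1 for s in SKILLS},
            "mix": {s: 1.0 / len(SKILLS) for s in SKILLS}}


def strength(team):
    # Toy team quality: skill levels weighted by how often each skill is chosen.
    return sum(team["skill"][s] * team["mix"][s] for s in SKILLS)


def win_rate(team, opponents, rng, games=200):
    """Toy match model: win probability is proportional to relative strength."""
    wins = 0
    for _ in range(games):
        opp = rng.choice(opponents)
        p = strength(team) / (strength(team) + strength(opp))
        wins += rng.random() < p
    return wins / games


def perturb(team, keys, rng, scale=0.1):
    """Copy the team and randomly nudge the selected parts ('skill', 'mix')."""
    cand = copy.deepcopy(team)
    for key in keys:
        for s in SKILLS:
            cand[key][s] = max(0.01, cand[key][s] + rng.uniform(-scale, scale))
    for s in SKILLS:
        cand["skill"][s] = min(1.0, cand["skill"][s])
    total = sum(cand["mix"].values())
    cand["mix"] = {s: v / total for s, v in cand["mix"].items()}
    return cand


def self_play(team, keys, rng, rounds=30):
    """Hill-climb against a growing pool of frozen past snapshots."""
    pool = [copy.deepcopy(team)]
    for _ in range(rounds):
        cand = perturb(team, keys, rng)
        if win_rate(cand, pool, rng) > win_rate(team, pool, rng):
            team = cand
            pool.append(copy.deepcopy(team))
    return team


def train(seed=0):
    rng = random.Random(seed)
    team = new_team()
    # Phase 1: low-level skill learning (placeholder values, no real training).
    team["skill"] = {"defend": 0.4, "pass": 0.5, "attack": 0.6}
    # Phase 2: high-level strategy self-play with the low-level skills frozen.
    team = self_play(team, ["mix"], rng)
    # Phase 3: co-self-play, updating skills and strategy together.
    team = self_play(team, ["skill", "mix"], rng)
    return team


if __name__ == "__main__":
    print(train(seed=0))
```

Because the win-rate estimates are noisy, this hill-climbing loop can occasionally accept a weaker candidate; a real system would replace it with proper policy optimization in the simulator.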