8,300 Hours of Annotated Data Open-Sourced: Next-Generation Real-Time General Game AI Pixel2Play Released
机器之心 (Jiqizhixin) · 2026-01-17 03:24

Core Insights
- The article covers advances in AI models for gaming, focusing on Pixel2Play (P2P), a model developed by researchers at Player2 to improve AI performance in real-time gaming environments [2][5].

Group 1: Model Development
- P2P takes game visuals and text instructions as input and generates the corresponding keyboard and mouse operation signals, running end-to-end inference at over 20 Hz on a consumer-grade RTX 5090 graphics card [2].
- P2P was trained on more than 8,300 hours of gameplay data from over 40 games and can play multiple games on Roblox and Steam in a zero-shot manner [2].
- The model uses a lightweight framework and is built from scratch, combining a decoder Transformer with a lightweight action decoder to speed up inference fivefold [10].

Group 2: Training Data and Open Source
- High-quality "visual-action" data is scarce online, so the Open-P2P project open-sourced all of its training datasets to fill this gap [5][3].
- The training data includes game images, text instructions, and precise keyboard and mouse operation annotations, all of which are crucial for training effective game AI models [8][5].

Group 3: Model Evaluation
- P2P was evaluated at four model sizes, from 150M to 1.2B parameters; the 150M model runs inference at 80 Hz and the 1.2B model at 40 Hz [12].
- In human evaluations, the 1.2B model was preferred over smaller models at rates of 80%, 83%, and 75% across different games, indicating superior performance [13].
- Following text instructions significantly improved the model's task success rate, demonstrating strong understanding and execution capabilities [15].
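To put the inference rates quoted above in perspective, a rate in Hz can be converted into the time available for each inference step. The short Python sketch below does only that arithmetic; the function name `frame_budget_ms` and the dictionary labels are illustrative assumptions, not part of the Open-P2P code.

```python
def frame_budget_ms(rate_hz: float) -> float:
    """Return the time available per inference step, in milliseconds."""
    if rate_hz <= 0:
        raise ValueError("rate_hz must be positive")
    return 1000.0 / rate_hz


# Rates quoted in the article; the labels are for display only (assumed names).
quoted_rates_hz = {
    "end-to-end on RTX 5090 (lower bound)": 20.0,
    "150M model": 80.0,
    "1.2B model": 40.0,
}

# Convert each quoted rate into a per-step time budget.
budgets = {label: frame_budget_ms(hz) for label, hz in quoted_rates_hz.items()}

for label, ms in budgets.items():
    print(f"{label}: {ms:.1f} ms per step")
```

At 20 Hz, the whole loop of reading a frame, running the model, and emitting an action has to fit in 50 ms. At 80 Hz and 40 Hz the budgets are 12.5 ms and 25 ms respectively.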
Group 4: Causal Reasoning
- The article highlights causal confusion as a challenge in behavior cloning, particularly in high-frequency interaction environments, and notes that larger models and more training data can improve the model's understanding of causal relationships [17].
- As training data and model parameters increase, P2P's performance on causal inference assessments trends upward [19].
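For readers unfamiliar with behavior cloning, it is ordinary supervised learning: the policy is trained to assign high probability to the action the human demonstrator actually took. The Python sketch below shows that objective for a single step. The action set `ACTIONS` and the logit values are hypothetical, since the article does not describe how P2P encodes actions or computes its loss.

```python
import math

# Hypothetical discrete action set; P2P's real action encoding is not given here.
ACTIONS = ["W", "A", "S", "D", "MOUSE_LEFT", "NOOP"]


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def behavior_cloning_loss(logits, recorded_action):
    """Negative log-likelihood of the recorded human action under the policy."""
    probs = softmax(logits)
    return -math.log(probs[ACTIONS.index(recorded_action)])


# A policy with no preference pays log(6) when the human pressed "W".
uniform_loss = behavior_cloning_loss([0.0] * len(ACTIONS), "W")

# A policy that strongly favors "W" pays much less.
confident_loss = behavior_cloning_loss([5.0, 0.0, 0.0, 0.0, 0.0, 0.0], "W")

print(f"uniform: {uniform_loss:.3f}, confident: {confident_loss:.3f}")
```

This objective only rewards matching the recorded action, so any input feature that predicts that action lowers the loss, whether or not it is the true cause. That is one common explanation of how causal confusion arises in behavior cloning.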