Reinforcement Learning
Even Falling Is Competitive for Disney's Robots: Fall Softly, and Fall in Style! New AI Research Turns a Bug into a Signature Move
机器人大讲堂 · 2025-12-22 11:26
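Falling is a major problem for robots, especially "top-heavy" ones, where a single misstep can cause costly damage. To prevent falls, engineers have so far either limited the robot's performance, making it overly cautious, or simply let it "crash-land". Both approaches treat the symptom rather than the cause. But what if we change the approach? Rather than trying everything to avoid falling, why not turn falling itself into a skill that can be learned and controlled? A recent study from Disney Research upends how we think about robot falls. The researchers propose a new method called "Robot Crash Course: Learning Soft and Stylized Falling". The core idea is to let the robot fall "softly", minimizing impact and damage, and also fall "in style", ending on the ground in a user-specified, artistic pose. Imagine a robot that makes a mistake on stage: instead of toppling stiffly, it rolls with the fall and finishes in a stylish reclining pose, showing off rather than embarrassing itself. That really is turning a bug into a signature move! The results could let robots shine in entertainment and film, and could also offer a new approach to robot safety and fast recovery. A robot that can control its falling pose ...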
The Era of RL-Powered 3D Generation Is Here! AR3D-R1, the First "R1-Style" Text-to-3D Reasoning Model, Debuts
机器之心· 2025-12-22 08:17
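After its success in large language models and 2D image generation, reinforcement learning (RL) has for the first time been systematically extended to text-to-3D generation. Facing the twin challenges of 3D objects' higher spatial complexity and global geometric consistency on one side and fine local texture on the other, the researchers present the first systematic study of RL in 3D autoregressive generation.
Challenges of applying reinforcement learning to 3D generation
Researchers from Shanghai AI Laboratory, Northwestern Polytechnical University, The Chinese University of Hong Kong, Peking University, The Hong Kong University of Science and Technology, and other institutions propose AR3D-R1, the first RL-enhanced text-to-3D autoregressive model. The work systematically studies reward design, RL algorithms, and evaluation benchmarks, and proposes Hi-GRPO, a hierarchical RL paradigm that optimizes 3D generation by separating global structural reasoning from local texture refinement. It also introduces a new benchmark, MME-3DR, for evaluating the implicit reasoning ability of 3D generation models. Experiments show that AR3D-R1 improves significantly on both Kernel Distance and CLIP Score, reaching 0.156 and 29.3 respectively.
Paper title: Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation
Code link: https://github. ...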
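The name Hi-GRPO suggests a GRPO-style update, in which each sampled output is scored relative to the other samples drawn for the same prompt. The following is a minimal sketch of that generic group-relative advantage step only, not the paper's hierarchical method; the function name, the reward values, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of samples for the same prompt.

    Generic GRPO-style step (illustrative; not the Hi-GRPO implementation).
    """
    mu = mean(rewards)
    # Assumption: population standard deviation; some codebases use the sample std.
    sigma = pstdev(rewards)
    # eps keeps the division finite when every sample gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four 3D candidates generated from one text prompt.
rewards = [0.2, 0.5, 0.9, 0.4]
advantages = group_relative_advantages(rewards)
print([round(a, 3) for a in advantages])
```

Candidates scoring above the group mean receive a positive advantage and those below it a negative one, so no separate value network is needed to form a baseline.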
2025 Articles and Podcasts Roundup | 42章经
42章经· 2025-12-21 13:32
Core Insights
- The company has been actively engaging in AI discussions, releasing a total of 22 podcasts and 18 articles in the current year, with podcast subscriptions rising significantly to nearly 110,000 [2][37]
- The importance of organizational capability in the AI era has been highlighted, suggesting it is a critical barrier for AI companies [3]
- The company remains optimistic about the AI market despite fluctuations, indicating that early entrants and optimistic investors are likely to reap rewards [8]
Summary by Sections
- **2023 and 2024 Activities**: In 2023, the company published 20 pieces of content, while in 2024 it increased to 34 pieces despite a market downturn [2]
- **Key Podcast Episodes**:
  - The episode discussing organizational capability as a barrier for AI companies was particularly impactful [3]
  - A conversation with Zhang Jinjian provided insights into structural changes in a rapidly differentiating world [4]
  - The episode on AI infrastructure clarified its role beyond cost reduction, emphasizing its importance for AI companies' success [6]
- **Market Outlook**: The company expressed a positive outlook for 2025, identifying hidden opportunities amidst market pessimism in 2024 [8]
- **Emerging Trends**: Discussions on Agent development and its implications for the AI landscape were prevalent, indicating a growing interest in this area [9][14]
- **Globalization Challenges**: Insights from PingCAP's CTO highlighted the challenges and lessons learned in globalizing AI ventures [30]
The State of Robot Learning! A Physical Intelligence Insider's Take (from Data Collection to VLA to RL)
具身智能之心· 2025-12-20 16:03
Core Insights
- The article discusses the state of robot learning as of December 2025, emphasizing that most systems rely on behavior cloning (BC) and examining the challenges that come with it [5][40][39]
- It highlights the importance of data collection from human demonstrations and the limitations of existing methods in achieving robust performance in real-world applications [6][10][12]
Group 1: Behavior Cloning and Its Challenges
- As of December 2025, robot learning systems rely primarily on behavior cloning, in which models are trained on human demonstrations to mimic the demonstrated actions [5]
- The data for behavior cloning comes from human demonstrations and various other sources, but the need for extensive data collection poses significant challenges [7][10]
- Behavior cloning generalizes poorly to out-of-distribution (OOD) states, which degrades performance in real-world scenarios [16][23][40]
Group 2: Data Collection Methods
- Data collection methods include using human operators with smart demo gloves and video platforms to gather diverse task execution data [11][13]
- Challenges include ensuring that the data is representative of the tasks and that operators receive the extensive training needed to provide usable data [9][10]
- The article emphasizes the importance of high-quality data for training models and the difficulty of obtaining it at scale [10][19]
Group 3: Future Directions in Robot Learning
- The article predicts that within two years video model backbones will replace current VLA methods, and that within ten years world models will effectively simulate general open-world interactions [73]
- It suggests that traditional simulation and game engines will serve as data generators for world models, and that expert demonstration data will remain important [73]
- Robust Q/V functions that operate effectively in OOD states are highlighted as a critical area for future research [72]
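The out-of-distribution weakness of behavior cloning described above can be seen in a toy setting. The sketch below is a hypothetical one-dimensional example, not any system from the article: the `expert` controller, the demonstration range, and the nearest-neighbor policy standing in for a learned model are all assumptions.

```python
def expert(state):
    """Hypothetical expert: push back toward 0 in proportion to the offset."""
    return -0.5 * state


def fit_nearest_neighbor_policy(demos):
    """Clone the expert by copying the action of the closest demonstrated state.

    demos is a list of (state, action) pairs. A nearest-neighbor lookup is used
    here as a simple stand-in for a trained neural policy.
    """
    def policy(state):
        _, nearest_action = min(demos, key=lambda pair: abs(pair[0] - state))
        return nearest_action
    return policy


# Demonstrations only cover states in [-1, 1].
demos = [(s / 10, expert(s / 10)) for s in range(-10, 11)]
policy = fit_nearest_neighbor_policy(demos)

# Error against the expert inside and far outside the demonstrated range.
in_dist_error = abs(policy(0.33) - expert(0.33))
ood_error = abs(policy(5.0) - expert(5.0))
print(f"in-distribution error: {in_dist_error:.3f}")
print(f"out-of-distribution error: {ood_error:.3f}")
```

Inside the demonstrated range the cloned policy stays close to the expert, while at a state far from any demonstration it simply repeats the nearest known action and the error grows.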
A Conversation with Pony.ai's Wang Haojun: Robotaxi Is Entering the 1-to-1,000 Stage
Hua Er Jie Jian Wen· 2025-12-20 05:31
Core Insights
- The global autonomous driving industry is undergoing a paradigm shift, moving from the experimental phase to tangible financial performance, with companies like Baidu and Pony.ai achieving operational profitability [2][4][11]
- Competition in Robotaxi now centers on profitability and operational efficiency as hardware costs decrease and AI reshapes operational rules [3][11]
Commercialization Progress
- Pony.ai's Robotaxi achieved unit-economics (UE) profitability in Guangzhou, indicating a successful transition from R&D to commercial viability [4][5]
- The average daily revenue for Pony.ai's seventh-generation Robotaxi is approximately 299 RMB, with a target of 24 rides per day to ensure positive cash flow [4][5]
- The company aims to scale its fleet to 1,000 vehicles by 2025, 3,000 by 2026, and 100,000 by 2030, integrating Robotaxi into daily life [2][11]
Cost Management and Operational Efficiency
- Significant cost reductions have been achieved, with the BOM cost of the seventh-generation vehicle dropping by 70% compared to the sixth generation [5][6]
- Mass-produced components and optimized algorithms have improved operational efficiency, delivering better performance at lower cost [5][6]
- Insurance costs for Robotaxi are 50% lower than for traditional taxis, reflecting the safety record of AI drivers [6]
Industry Competition
- The Robotaxi market is becoming increasingly competitive, with major players like Waymo and Tesla entering the fray, each adopting a different strategy [8][10]
- Waymo's recent funding round has pushed its valuation to nearly $100 billion, while Tesla is focusing on a low-cost, vision-based approach [8][10]
- New entrants like XPeng and Hello are also planning to launch their own Robotaxi services, intensifying competition [9][10]
Market Potential and Future Outlook
- The Robotaxi market could reach $80 billion in major Chinese cities by 2030, with the potential global market size reaching $3.94 trillion when overseas markets are included [12]
- As hardware costs decline, operational expenses will become a larger portion of the cost structure, making operational efficiency more important [12]
- The industry is shifting from a focus on technology to one centered on operational capabilities and market presence [11][12]
Strategic Shifts
- Pony.ai is transitioning to a "light asset" model, partnering with vehicle manufacturers and service platforms to reduce capital expenditure [7][14]
- The company is focusing on creating a value chain in which it provides AI technology while others handle vehicle production and service distribution [7][14]
- The emphasis is on building partnerships and leveraging local resources in international markets, particularly in the Middle East [6][18]
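"One Brain, Many Forms" Roundtable: What Concrete Progress Have World Models and Spatial Intelligence Made in Embodied Intelligence? | GAIR 2025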
雷峰网· 2025-12-20 04:07
Core Viewpoint
- The article discusses the current state and future potential of embodied intelligence, focusing on the challenges and opportunities presented by world models and spatial intelligence in the field of robotics and AI [2][4][10].
Group 1: Development of Embodied Intelligence
- The technology route for embodied intelligence is still in an exploratory phase, with no convergence yet, which is seen as a positive sign for innovation [4][3].
- There is a consensus among experts that the core issues of embodied intelligence, such as interaction and human-machine collaboration, should be addressed by academic institutions, while industries focus on practical applications [4][5].
- The integration of AI with physical entities is expected to lead to significant advancements in intelligence, but the field must avoid reverting to industrial automation without achieving generalized intelligence [4][5][30].
Group 2: World Models in Autonomous Driving
- World models are currently being utilized by leading companies like Tesla to enhance data generation and improve decision-making processes through closed-loop testing [11][12].
- The concept of world models has gained traction in autonomous driving because scenarios are simpler to generate there than in robotics, with advancements in generative AI enabling the creation of realistic training samples [12][13].
- There is ongoing debate regarding the definition and application of world models in both autonomous driving and robotics, with differing opinions on the necessity of pixel-level reconstruction versus latent state representation [12][13][14].
Group 3: Spatial Intelligence in Robotics
- Spatial intelligence is a critical aspect of robotics, with a focus on perception and understanding spatial relationships, which has evolved from traditional SLAM techniques to more learning-based approaches [20][21].
- The current challenges in spatial intelligence include the need for better data representation and understanding of complex spatial relationships, which are still underdeveloped in robotic systems [22][23].
- The integration of visual and semantic information is essential for enhancing robots' spatial capabilities, but the field is still in its early stages [22][23][24].
Group 4: Commercialization and Future Applications
- The future of drone applications is expected to expand significantly, with potential uses in various sectors, but the timeline for widespread adoption remains uncertain [26][27].
- The gap between technological capabilities and market needs poses challenges for entrepreneurs, as there is often a mismatch between innovative ideas and practical industrial requirements [30][31].
- The shift towards learning-based control paradigms is anticipated to increase the applicability of drones and robots in real-world scenarios, moving beyond traditional automation [28][29].
We've Recently Received Many Inquiries from Students About Choosing a Direction in Autonomous Driving...
自动驾驶之心· 2025-12-19 09:25
Core Insights
- The article discusses various advanced directions in autonomous driving research, emphasizing the importance of deep learning and traditional methods for different academic backgrounds [2][3].
Group 1: Research Directions
- Key areas of focus include VLA, end-to-end learning, reinforcement learning, 3DGS, and world models, which are recommended for students in computer science and automation [2].
- For mechanical and vehicle engineering students, traditional methods like PnC and 3DGS are suggested due to their lower computational requirements and ease of entry [2].
Group 2: Paper Guidance Services
- The article announces the launch of a paper guidance service that covers various topics such as end-to-end learning, multi-sensor fusion, and trajectory prediction [3][6].
- The service includes support for topic selection, full process guidance, and experimental assistance [6].
Group 3: Publication Success
- The guidance service has a high acceptance rate for papers submitted to top conferences and journals, including CVPR, AAAI, and ICLR [7].
- The article highlights the range of publication venues, including CCF-A, CCF-B, and various SCI categories [10].
The First RL Paradigm for Text-to-3D Generation Arrives, Tackling Geometric and Physical Plausibility
量子位· 2025-12-19 07:20
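Contributed by the 3DGenR1 team. 量子位 | Official account: QbitAI
In large language models and text-to-image generation, reinforcement learning (RL) has become a key method for improving a model's chain-of-thought reasoning and generation quality. But does this approach still work when we turn to the more complex task of text-to-3D generation? A recent study, a collaboration among Northwestern Polytechnical University, Peking University, The Chinese University of Hong Kong, Shanghai AI Laboratory, and The Hong Kong University of Science and Technology, systematically explores this question.
Paper link: https://arxiv.org/pdf/2512.10949
Code link: https://github.com/Ivan-Tang-3D/3DGen-R1
Can reinforcement learning be applied to text-to-3D generation to strengthen the step-by-step reasoning and generation process of 3D autoregressive models? In LLM reasoning and 2D text-to-image generation, RL has been shown to significantly improve CoT reasoning and generation quality. But 3D objects are longer, denser, and more geometrically constrained. Research in this direction therefore often faces the following questions:
Progressive Investigation: breaking down Text-to-3D + RL in four levels
1. Reward design level
1. How can a reward simultaneously capture semantic alignment, geometric consistency, and visual quality?
2. Are existing RL algorithms suited to autoregressive 3D generation?
3. A lack of dedicated examination of "3D reasoning ability ...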
Amazon's AGI Chief Departs; Reinforcement Learning Luminary Pieter Abbeel Takes Over
机器之心· 2025-12-19 00:21
Core Viewpoint
- Rohit Prasad, the Senior Vice President and Chief Scientist of Amazon's AGI team, has announced his departure, marking a significant leadership change in Amazon's AI initiatives [1][3][4].
Group 1: Leadership Changes
- Rohit Prasad joined Amazon in 2013 and played a crucial role in developing Alexa and leading the Nova foundational model project [3][4].
- Following Prasad's exit, Amazon will centralize AI research under the cloud computing division (AWS), with Peter DeSantis appointed to lead a new organization that will report directly to CEO Andy Jassy [5][6].
Group 2: AI Development Focus
- Amazon aims to enhance its AI product development to compete with OpenAI, Google, and Anthropic, having launched its own foundational model series, Nova, and developed custom AI chips, Trainium, to rival Nvidia [5].
- The new department led by Peter DeSantis will oversee the development of core models, support for self-developed chip initiatives, and exploration of quantum computing technologies [10][12].
Group 3: New Appointments
- Pieter Abbeel, a leading AI researcher and co-founder of Covariant, will take over the leadership of the foundational model research team, focusing on advancing Amazon's AI research [12][17].
- Abbeel's extensive background in AI and robotics positions him well to drive innovation and collaboration within Amazon's AI initiatives [12][15].
Group 4: Employment Perspectives
- AWS CEO Matt Garman expressed confidence that AI will create more jobs than it displaces, emphasizing the importance of nurturing new talent to fill high-value roles in the future [19][20].
- Garman highlighted that junior developers, who are more adept at using AI tools, will play a crucial role in the evolving tech landscape, countering the notion that AI will replace entry-level positions [20].
Seven Projects to Reference When Deploying End-to-End Systems
自动驾驶之心· 2025-12-19 00:05
Core Viewpoint
- The article emphasizes the importance of end-to-end production in autonomous driving technology, highlighting the need for practical experience in various algorithms and applications to address real-world challenges in the industry [2][7].
Course Overview
- The course is designed to provide in-depth knowledge on end-to-end production techniques, focusing on key algorithms such as one-stage and two-stage frameworks, reinforcement learning, and trajectory optimization [2][4].
- It includes practical projects that cover the entire process from theory to application, ensuring participants gain hands-on experience [2][12].
Instructor Background
- The instructor, Wang Lu, is a top-tier algorithm expert with a strong academic background and extensive experience in developing and implementing advanced algorithms for autonomous driving [3].
Course Structure
- The course consists of eight chapters, each focusing on different aspects of end-to-end algorithms, including:
  1. Overview of end-to-end tasks and integration of perception and control systems [7].
  2. Two-stage end-to-end algorithm frameworks and their advantages [8].
  3. One-stage end-to-end algorithms with a focus on performance [9].
  4. Application of navigation information in autonomous driving [10].
  5. Introduction to reinforcement learning algorithms and training strategies [11].
  6. Optimization of trajectory outputs using various algorithms [12].
  7. Post-processing strategies for ensuring reliable outputs [13].
  8. Sharing of production experiences and strategies for real-world applications [14].
Target Audience
- The course is aimed at advanced learners with a foundational understanding of autonomous driving algorithms, including familiarity with reinforcement learning and diffusion models [15][17].