具身智能之心
Embodied AI's "ImageNet moment": Fei-Fei Li's team officially announces a top global embodied intelligence challenge
具身智能之心· 2025-09-25 00:04
Editor: 机器之心. In the history of computer vision, the ImageNet challenge is often hailed as a watershed for AI, igniting the deep learning wave. Will embodied intelligence and robotics see a similar inflection point? The answer is becoming clearer: Fei-Fei Li's team and the Stanford AI Lab have officially announced that the first BEHAVIOR Challenge will be held at NeurIPS 2025. It is a "super benchmark" tailored to embodied intelligence, covering the 1,000 most important everyday tasks in realistic household scenes (cooking, cleaning, tidying, ...) and, for the first time, using 50 complete long-horizon tasks as the core challenge, testing whether robots can carry out genuinely human-like activities in a photorealistic virtual environment. Why does BEHAVIOR matter? Unlike earlier fragmented benchmarks, BEHAVIOR posits for the first time that a true household robot must simultaneously master cross-room navigation, bimanual fine manipulation, long-horizon planning, and dynamic adaptation. The task scale is unprecedented: 1,000 household activities and 50 complete long-horizon challenges, with a single task requiring on average ...
Some recent work in the embodied space: community, hardware, and job hunting...
具身智能之心· 2025-09-25 00:04
Core Viewpoint
- The article emphasizes the ongoing efforts in the field of embodied intelligence, focusing on community building, hardware development, and job opportunities in the sector [3][4][12].

Group 1: Community Development
- The community aims to create a comprehensive platform for technical exchange related to embodied intelligence, facilitating academic and engineering discussions [14].
- The community has established a closed loop across industry, academia, job seeking, and Q&A exchanges, providing solutions to problems members encounter [6].
- The community is actively improving its structure and content to better serve its members, with enhanced offerings planned after the holiday [3].

Group 2: Hardware and Product Development
- Ongoing efforts address complaints about the high cost and poor usability of hardware, with better solutions to be introduced soon [3].
- The community is testing and developing embodied products, aiming to provide users with effective platforms for their needs [3].

Group 3: Job Opportunities and Academic Guidance
- The community has received numerous inquiries from universities regarding recruitment in embodied intelligence, particularly for research assistants, PhD candidates, and postdoctoral positions [3].
- Members are encouraged to prepare for upcoming job opportunities and academic advancement, with the community accepting resume submissions for internal referrals [3][20].

Group 4: Educational Resources
- The community has compiled over 30 technical roadmaps for newcomers, significantly reducing the time needed for research and learning [6][8].
- Resources including open-source projects, datasets, and learning paths are available to support members' studies and career development [14][31][37].

Group 5: Industry Insights and Networking
- The community has established connections with many leading companies in the field, facilitating job referrals and industry insights [20][22].
- Members can engage with industry leaders through forums and live discussions, gaining knowledge about the latest developments in embodied intelligence [6][20].
具身智能之心 National Day & Mid-Autumn Festival deals are here! Courses, community, hardware, paper tutoring, and more
具身智能之心· 2025-09-24 06:32
Group 1
- The article promotes a series of discounts and offers on embodied intelligence courses and services, available from September 24 to October 12 [1][4][6].
- New members joining the knowledge community enjoy a 30% discount, while existing members can renew at a 50% discount [1][4].
- Various courses, including VLA, VLN, and reinforcement learning, are offered at a 20% discount [2][4].

Group 2
- The Super Discount Card provides a 30% discount on all courses for one year [4][7].
- One-on-one paper tutoring offers a maximum discount of 5,000 yuan for a 1,000 yuan fee, while group tutoring (1v6) offers a 1,000 yuan reduction [4][7].
- The article highlights several research hardware options, including reinforcement learning platforms and robotic arms [4][7].
We're planning to start a casual embodied-AI gossip-and-chat group!
具身智能之心· 2025-09-24 06:32
When adding me on WeChat, remember to include a note: nickname + institution/company + join group.

After thinking it over, it seemed well worth setting up a fun group, so one has been created right away (since there is only enough bandwidth to maintain a single group, it is capped at 500 members and will close to new joiners once full). This group will not repost any 具身智能之心 articles or livestream content; it is purely for industry exchange, product discussion, and academic discussion, and chatting about work, job hunting, and startups is of course welcome too.

Recently 峰哥 received feedback from 具身智能之心 followers asking whether the community could have a less formal group (one that doesn't push articles and academic content every day) for daily chat about the industry, gossip, job hunting, and similar topics.

If you're interested, add me on WeChat (oooops-life) and I'll invite you in. We hope you are currently working in the embodied-AI industry or actively engaged in related research.

Indeed, our existing groups are all rather academic, probably because of our education-and-technology IP. ...
Today's Talk is here! New infrastructure for embodied intelligence: from large models to the real world
具身智能之心· 2025-09-24 02:30
Core Viewpoint
- The article previews an upcoming event hosted by the Beijing Academy of Artificial Intelligence (BAAI, 智源), focused on embodied intelligence and its new infrastructure, and highlights the importance of this field for the AI industry [1].

Event Details
- The event, titled "AI 智源 Talk", will take place on September 24, 2025, from 14:00 to 17:30 at BAAI in Beijing [2].
- The event is organized by BAAI and supported by various organizations, including Baidu PaddlePaddle and the China Internet Association's AI Committee [2].

Agenda Overview
- Introduction by the Vice President of BAAI [3].
- A presentation on the innovation foundation of embodied intelligence by the head of the embodied data team [3].
- A session on the operational framework and construction of the embodied brain by the head of the large model team [5].
- An upgrade of the Zhiyuan evaluation system, including the release of the autumn 2025 rankings [5].
- A discussion of FlagScale's technical practices and value verification in embodied intelligence scenarios [5].

Participation Information
- Attendees can register for the event via a QR code and join the WeChat group for further discussion of embodied intelligence [6].
[CEAIS 2025] Full agenda announced; early-bird registration is in full swing!
具身智能之心· 2025-09-24 00:04
Core Viewpoint
- The article highlights the establishment and development of the Artificial Intelligence Institute at Xi'an Jiaotong University, emphasizing its role in advancing research on embodied intelligence and robotics, and announces the upcoming second China Embodied Intelligence and Systems Conference (CEAIS 2025) to be held in Xi'an [2][4].

Group 1: Conference Overview
- CEAIS 2025 will take place on November 1, 2025, at the Xi'an Jianguo Hotel, focusing on cutting-edge research in embodied intelligence [4].
- The conference will feature more than ten academicians and nearly a hundred senior experts, with four keynote speeches and fifteen technical sub-forums covering topics such as foundation models, intelligent robotics, and emotional embodied intelligence [4][8].

Group 2: Conference Schedule
- The conference opens with registration on October 31, 2025, followed by a dinner and a meeting of the Embodied Intelligence Committee [7].
- The main conference day on November 1 includes an opening ceremony, keynote reports, and multiple technical forums addressing topics such as embodied intelligence computing architecture, intelligent driving, and robot sensors [8][10].

Group 3: Technical Forums and Topics
- Key forum topics include the emergence capabilities of large models, advances in embodied intelligence computing chips, and the development of humanoid and bionic robots [9][12].
- Specific reports will cover multi-modal flexible tactile perception technology, autonomous learning in robotic systems, and the integration of AI with traditional medicine [10][13].

Group 4: Registration and Participation
- Early-bird registration is 1,800 RMB for non-members and 1,200 RMB for members, with discounts for students [41].
- Participants are encouraged to register online and will receive a conference manual and invoice upon registration [42][44].
Westlake University releases the world model WorldForge, instantly turning ordinary video models into "world engines"
具身智能之心· 2025-09-24 00:04
Core Viewpoint
- The article discusses advances in AI video generation, focusing on the WorldForge framework developed by the AGI Lab at Westlake University, which enables precise control over video generation without sacrificing quality or retraining models [2][3][32].

Summary by Sections

Introduction to AI Video Generation
- Since the introduction of Sora, the realism of AI-generated videos has improved significantly, but controllability remains a challenge [2].
- Existing methods either require expensive fine-tuning or suffer quality degradation from noise and artifacts in the guiding signals [2].

WorldForge Framework
- WorldForge is a new framework that enables precise control during video generation without modifying model weights, effectively adding a "director's brain" to video diffusion models [3][32].
- The framework can generate 360° videos from a single image and reframe videos with complex camera movements [6][21].

Method Overview
- The framework follows a training-free guidance principle, injecting "spatiotemporal geometry" during inference [12].
- It employs a series of guiding modules to keep the model spatially and temporally consistent while preserving creative freedom [13].

Key Innovations
1. **Intra-step Recursive Refinement (IRR)**: ensures that generated motion strictly follows predefined camera trajectories by incrementally correcting predictions with real content [15].
2. **Flow-Gated Latent Fusion (FLF)**: separates motion and appearance channels in the latent space so that control signals are sent only to motion channels, preserving detail in appearance channels [16].
3. **Dual-Path Self-Correction Guidance (DSG)**: balances trajectory accuracy and image quality by dynamically adjusting the guiding signals based on the differences between guided and non-guided paths [17].

Performance Highlights
- WorldForge excels at generating 360° panoramic views from a single image, overcoming limitations of traditional panorama methods [21].
- It enables cinematic-level video reframing, letting users specify complex camera movements while maintaining stability and reducing artifacts [23].
- The framework supports video editing capabilities such as stabilizing footage, removing unwanted objects, and seamlessly integrating new elements [29].

Advantages of WorldForge
- Its training-free nature significantly lowers the barrier to creating high-quality 3D/4D visual content, making it accessible for applications in film, gaming, and digital-twin technologies [32][34].
- Its flexibility allows it to be plugged into various mainstream video models without targeted retraining, demonstrating strong cross-domain generalization [34].
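To make the training-free guidance idea concrete, the sketch below shows one plausible way a flow-gated latent fusion step could be wired into a diffusion sampling loop. It is a minimal illustration under assumed interfaces (`model`, `scheduler`, `ref_latents`, and `flow_conf` are hypothetical names following diffusers-style conventions), not WorldForge's actual implementation.

```python
# Minimal, hypothetical sketch of training-free, flow-gated latent guidance
# in a video-diffusion sampling loop. Interfaces are assumptions and do not
# reflect the actual WorldForge code.
import torch

@torch.no_grad()
def guided_sampling(model, scheduler, latents, ref_latents, flow_conf, guide_strength=0.6):
    """
    latents:     (B, C, T, H, W) noisy video latents being denoised
    ref_latents: (B, C, T, H, W) clean latents of reference content warped onto
                 the target camera trajectory (the injected spatiotemporal cue)
    flow_conf:   (B, 1, T, H, W) mask in [0, 1]; high where the warped
                 reference is reliable (trajectory/motion regions)
    """
    timesteps = scheduler.timesteps
    for i, t in enumerate(timesteps):
        # 1) Ordinary denoising step; model weights are never modified.
        noise_pred = model(latents, t)
        latents = scheduler.step(noise_pred, t, latents).prev_sample

        # 2) Bring the warped reference to the same noise level as `latents`
        #    so both live in the same latent distribution.
        if i + 1 < len(timesteps):
            t_next = timesteps[i + 1]
            noisy_ref = scheduler.add_noise(
                ref_latents, torch.randn_like(ref_latents), t_next
            )
        else:
            noisy_ref = ref_latents  # final step: reference is already clean

        # 3) Flow-gated fusion: pull trajectory-relevant regions toward the
        #    reference, leave appearance-dominated regions untouched.
        gate = guide_strength * flow_conf
        latents = gate * noisy_ref + (1.0 - gate) * latents

    return latents
```

The soft gate keeps guidance localized to motion regions, which is the spirit of FLF, while the per-step re-injection loosely mirrors IRR's step-wise correction.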
Bridging the gap between simulation and real data: an overview of key Real2Sim2Real work!
具身智能之心· 2025-09-24 00:04
An overview of Real2Sim2Real work from the past three years

Paper: Incremental Few-Shot Adaptation for Non-Prehensile Object Manipulation using Parallelizable Physics Simulators
Link: https://arxiv.org/pdf/2409.13228?
Venue: ICRA 2025
Affiliation: Max Planck Institute for Intelligent Systems

Paper: RL-GSBridge: 3D Gaussian Splatting Based Real2Sim2Real Method for Robotic Manipulation Learning

Because real-world data collection is costly, quite a few embodied-AI teams at home and abroad are working on real2sim and Real2Sim2Real. Unlike some embodied companies firmly committed to collecting data on real robots, these teams believe ...
Whenever someone asks how to get started in embodied AI, I always recommend this complete tutorial
具身智能之心· 2025-09-24 00:04
The embodied "big brain" and "little brain" form the core content.

The field of embodied intelligence is mainly organized around two key parts: the big brain and the little brain, the robot's most important modules. By analogy with humans, the big brain handles thinking and perception (semantic understanding and task planning), while the little brain handles execution (high-precision motor control).

Within these, there are several sub-areas: simulation, VLA, diffusion policy, VLN, world models, and reinforcement learning. VLA and world models are currently gaining momentum in both autonomous driving and embodied AI, representing two different technical routes.

Current VLA research focuses on two schemes, end-to-end and hierarchical, extended respectively with large models and diffusion techniques. VLA+RL approaches are also being explored by a growing number of researchers.

Diffusion policy serves as the action module, learning concrete actions and their execution. Its main directions include state-space diffusion, action-space diffusion, and 3D-space diffusion.

The promising points in simulation right now are sim2real and real2sim2real; how to solve the poor generalization of real robots is ...
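As a concrete illustration of the diffusion-policy direction mentioned above, here is a minimal sketch of action-space diffusion at inference time: an action chunk is sampled from Gaussian noise and iteratively denoised conditioned on the current observation. The noise-prediction network `eps_model(actions, obs, t)` and its interface are assumptions for illustration, not the API of any specific diffusion-policy library.

```python
# Hypothetical sketch of action-space diffusion-policy inference.
# `eps_model` is an assumed noise-prediction network, not a real library API.
import torch

@torch.no_grad()
def sample_action_sequence(eps_model, obs, horizon=16, action_dim=7, n_steps=50):
    """Denoise a (horizon, action_dim) action chunk conditioned on `obs`."""
    # Simple DDPM-style linear beta schedule.
    betas = torch.linspace(1e-4, 0.02, n_steps)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    # Start from pure Gaussian noise over the whole action sequence.
    actions = torch.randn(1, horizon, action_dim)

    for t in reversed(range(n_steps)):
        # Predict the noise component at this step, conditioned on the observation.
        eps = eps_model(actions, obs, torch.tensor([t]))

        # Standard DDPM posterior mean update.
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        mean = (actions - coef * eps) / torch.sqrt(alphas[t])

        # Add noise at every step except the last.
        noise = torch.randn_like(actions) if t > 0 else torch.zeros_like(actions)
        actions = mean + torch.sqrt(betas[t]) * noise

    return actions.squeeze(0)  # (horizon, action_dim) denoised action chunk
```

In practice the robot executes the first few actions of the chunk and then re-plans, but the denoising loop above is the core of the idea.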
VLA and related directions account for nearly half of embodied work at top conferences, especially these few...
具身智能之心· 2025-09-23 04:00
Judging from this year's top robotics and AI conferences, VLA and its derivative directions account for nearly half of the embodied-AI output, especially long-horizon manipulation, generalization, few-shot learning, VLA+RL, and humanoid-related work.

Imagine being able to issue instructions in natural language and have any action you want executed smoothly; being able to complete long, continuous sequences of actions would be extremely convenient. So what exactly is VLA?

VLA breaks the single-task limitation of traditional methods, allowing robots to make autonomous decisions across diverse scenarios and adapt flexibly to unseen environments, with broad applications in manufacturing, logistics, and household services. VLA models have become a research hotspot, driving frontier projects such as pi0, RT-2, OpenVLA, QUAR-VLA, and HumanVLA, and fostering collaboration between academia and industry. Their adaptability spans platforms including robotic arms, quadrupeds, and humanoids, offering broad potential and practical value for intelligent robots of all kinds and becoming a key driving force in the field.

From an industry perspective, embodied intelligence is booming at home and abroad: teams such as Unitree, 智元, 星海图, 银河通用, and 逐际动力 are moving from the lab to commercialization; tech giants like Huawei, JD.com, and Tencent are actively entering the space; and companies abroad such as Tesla and Figure AI are pushing the field forward together.

Many readers have left messages in the backend asking ...
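For readers wondering what a VLA model looks like at the interface level, the sketch below shows one hypothetical way the pieces fit together: a vision encoder turns the camera image into tokens, a language backbone fuses them with the tokenized instruction, and an action head maps the fused representation to a low-level robot action. All class and method names are illustrative assumptions, not the design of pi0, RT-2, or OpenVLA.

```python
# Hypothetical interface sketch of a VLA (vision-language-action) policy:
# image observation + language instruction -> low-level robot action.
# All names and shapes are illustrative assumptions.
from dataclasses import dataclass
import torch

@dataclass
class VLAOutput:
    action: torch.Tensor  # e.g. (7,) end-effector delta pose + gripper command

class VLAPolicy(torch.nn.Module):
    def __init__(self, vision_encoder, language_backbone, action_head):
        super().__init__()
        self.vision_encoder = vision_encoder        # image -> visual tokens
        self.language_backbone = language_backbone  # fuses text + visual tokens
        self.action_head = action_head              # fused features -> action

    @torch.no_grad()
    def act(self, image: torch.Tensor, instruction_ids: torch.Tensor) -> VLAOutput:
        visual_tokens = self.vision_encoder(image)                      # (1, Nv, D)
        fused = self.language_backbone(instruction_ids, visual_tokens)  # (1, D)
        action = self.action_head(fused).squeeze(0)                     # (7,)
        return VLAOutput(action=action)
```

Hierarchical VLA variants replace the single action head with a separate low-level controller (for example, a diffusion policy), while end-to-end variants keep the mapping in one model.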