具身智能之心
Beijing Academy of Artificial Intelligence (BAAI) ZhiXing Plan: Overseas Recruitment (Embodied Intelligence / Multimodal / Brain-Like Models, etc.)
具身智能之心· 2025-11-08 00:03
Core Insights
- The "ZhiXing Plan - Overseas Recruitment" is a strategic talent recruitment initiative by the Beijing Academy of Artificial Intelligence (BAAI) aimed at attracting and nurturing high-level AI research talent from around the world [2]

Group 1: Program Overview
- The "ZhiXing Plan" targets top graduates from global universities, inviting selected candidates for a research visit of six months or more at BAAI [2]
- Successful candidates will collaborate with leading international research teams on cutting-edge topics and gain entry into the ZhiYuan talent ecosystem [2]

Group 2: Recruitment Highlights
- Exceptional performers during the visit may receive priority consideration for full-time positions at the institute, including Principal Investigator (PI), researcher, and postdoctoral roles [5]
- The program offers access to advanced research topics and supercomputing resources, with opportunities to collaborate with leading researchers in the field [5]

Group 3: Support and Resources
- Participants benefit from long-term academic development and resource support, including entry into the ZhiYuan academic network for ongoing research collaboration and career opportunities [7]
- Comprehensive support services are provided, including academic mentorship and administrative assistance for both research and daily life [7][8]

Group 4: Target Audience and Research Areas
- The recruitment targets Chinese doctoral candidates, postdoctoral researchers, and formal research staff from top universities and laboratories [9]
- Research directions include brain-like models, intelligent systems, embodied intelligence, multimodal AI, and AI for Science; candidates are expected to have published at least three first-author papers at top conferences [9]

Group 5: Application Process
- Applications are accepted year-round with rolling reviews until positions are filled [11]
- The process includes academic evaluation, interviews, and formal invitations for successful candidates [10]

Group 6: Vision and Goals
- The "ZhiXing Plan - Overseas Recruitment" is positioned as an entry point into China's leading AI research platform, with the goal of conducting globally impactful research and building long-term collaborative relationships [12]
ICML 2026 New Rules "Pitfall-Avoidance" Guide: Attendance Not Mandatory, Original Submissions Will Be Made Public, Reciprocal Reviewing Capped
具身智能之心· 2025-11-08 00:03
Core Points
- The article discusses the new submission guidelines for ICML 2026, which will take place from July 7 to July 12, 2026, in Seoul, South Korea [4]
- The conference will use a double-blind review process, and accepted papers will be presented at the conference [4][5]
- Authors can choose whether to attend the conference in person or only have their papers included in the proceedings [7][8]

Submission Requirements
- Papers must be submitted as a single file, with a maximum of 8 pages for the main text [5]
- Accepted papers may add one additional page in the final version [6]
- The original submission version of accepted papers will be made public, and authors of rejected papers can also choose to make their original submissions public [10]

Important Dates
- The submission website opens on January 8, 2026; the abstract submission deadline is January 23, 2026, and the full paper submission deadline is January 28, 2026 [15][16][17]

Review Process
- Each submission must satisfy the reciprocal reviewing requirement; failure to comply may result in rejection [19]
- Double-blind review applies, and simultaneous submission of the same paper to other conferences or journals is prohibited [20]

Ethical Guidelines
- Authors must adhere to research and review ethics, including providing a statement on potential societal impact with their papers [25]
- A plain-language summary must be submitted to communicate the significance of the research to the public [26]

Additional Notes
- Authors may use generative AI tools for writing or research but must take full responsibility for the content [23]
- Violations of the submission guidelines may lead to sanctions or rejection of the paper [24]
Struggling to find a research platform? We've quietly launched an easy-to-use one......
具身智能之心· 2025-11-07 10:01
Core Viewpoint
- Imeta-Y1 is a lightweight, cost-effective robotic arm designed specifically for beginners and researchers in the field of embodied intelligence, enabling low-cost and efficient algorithm validation and project development [2][5]

Group 1: Product Features
- The robotic arm offers a complete open-source toolchain and code examples, facilitating a seamless workflow from data collection to model deployment [3][17]
- It provides dual-language interfaces (Python/C++) to suit users' programming preferences and enable quick onboarding [3][18]
- Compatibility with ROS1 and ROS2 is provided, along with URDF models for smooth transitions between simulation and real-world applications (see the simulation sketch after this list) [3][19]
- The arm features high-precision motion control, low power consumption, and an open hardware architecture, allowing seamless migration from simulation to the real machine [5][35]

Group 2: Technical Specifications
- The arm weighs 4.2 kg, has a rated load of 3 kg and 6 degrees of freedom, a working radius of 612.5 mm, and a repeat positioning accuracy of ±0.1 mm [8][19]
- It operates at a supply voltage of 24 V and communicates via CAN, with external interfaces for power and CAN connections [8][19]
- Joint motion ranges and maximum speeds are specified, ensuring versatility across applications [8][19]

Group 3: Development and Support
- A comprehensive open-source SDK is provided, including drivers, API interfaces, sample code, and documentation, supporting rapid application development [26][29]
- The product supports multimodal data fusion and is compatible with mainstream frameworks such as TensorFlow and PyTorch, enabling end-to-end intelligent algorithm implementation [29][32]
- The company offers 24-hour rapid-response after-sales support, ensuring users receive timely assistance [3][19]

Group 4: Testing and Reliability
- Rigorous hardware testing, including precision calibration, durability, load performance, and stability verification, ensures the arm's reliability and safety across application scenarios [35][39]
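Because the article highlights URDF support as the bridge between simulation and the real arm, here is a minimal simulation sketch in PyBullet. The file name "imeta_y1.urdf", the joint indices, and the target angles are hypothetical placeholders (the vendor SDK would provide the real model and limits); the code only illustrates the general load-and-command workflow, not the product's actual API.

```python
# Minimal simulation sketch for a 6-DoF arm described by a URDF.
# NOTE: "imeta_y1.urdf" and the target angles are hypothetical placeholders;
# consult the vendor SDK for the real model file and joint limits.
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)                      # headless; use p.GUI for visualization
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.loadURDF("plane.urdf")

arm_id = p.loadURDF("imeta_y1.urdf", basePosition=[0, 0, 0], useFixedBase=True)
num_joints = p.getNumJoints(arm_id)

# Drive each joint toward a small test angle with position control.
targets = [0.2, -0.4, 0.6, 0.0, 0.3, 0.0][:num_joints]
for j, q in enumerate(targets):
    p.setJointMotorControl2(arm_id, j, p.POSITION_CONTROL, targetPosition=q, force=50)

for _ in range(240):                     # simulate ~1 s at the default 240 Hz step
    p.stepSimulation()

positions = [p.getJointState(arm_id, j)[0] for j in range(num_joints)]
print("joint positions after 1 s:", positions)
p.disconnect()
```

The same URDF can then be used by ROS1/ROS2 tooling, so a policy validated in simulation can be moved to the real arm with the vendor's drivers.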
Unbelievable: Musk's Trillion-Dollar Pay Package Approved! Celebrating with a Dance Alongside Optimus~
具身智能之心· 2025-11-07 00:45
Core Points
- Elon Musk's $1 trillion compensation plan has been approved by Tesla shareholders, with 75% voting to authorize it [1]
- The atmosphere at the meeting was electric, with chants of "Elon! Elon!" echoing through the venue [7]
- Musk must achieve 12 ambitious milestones tied to market value, revenue, and profit benchmarks to unlock the full compensation [7]

Summary by Sections
- **Compensation Approval**
  - Tesla shareholders have approved a $1 trillion compensation package for Elon Musk, marking a significant milestone in corporate compensation structures [1]
- **Shareholder Meeting Atmosphere**
  - The shareholder meeting was characterized by high energy, with attendees enthusiastically supporting Musk, indicating strong investor confidence [7]
- **Milestones for Compensation**
  - Musk must meet 12 specific milestones, including targets for market capitalization, revenue, and profitability, to qualify for the full compensation package [7]
Galbot (银河通用) & Tsinghua Introduce DexNDM: Reshaping Dexterous Manipulation with Neural Dynamics
具身智能之心· 2025-11-07 00:05
Core Insights
- The article presents DexNDM, a new method aimed at solving the sim-to-real challenge in dexterous robotic manipulation, particularly achieving stable in-hand rotation of diverse objects [2][5][31]

Group 1: Background and Challenges
- Highly dexterous teleoperation of complex tools, such as using a screwdriver or hammer, has been a long-standing challenge in robotics [4]
- Traditional direct-mapping teleoperation methods are limited to simple tasks and cannot handle complex manipulations requiring fine motor skills [4]

Group 2: DexNDM Methodology
- DexNDM proposes a semi-autonomous teleoperation paradigm that decomposes complex tasks into stable, reliable atomic skills that the robot executes autonomously [5]
- The method focuses on learning general, stable atomic skills for in-hand object rotation, covering a wide range of scenarios including challenging elongated and small objects [5][14]

Group 3: Key Features and Achievements
- DexNDM achieves unprecedented dexterity, enabling continuous rotation of elongated objects and intricate manipulation of small objects under challenging wrist postures [7][14]
- The method outperforms previous work on complex geometries, even with more general-purpose hardware [14]
- It adapts to a wide range of wrist postures and rotation axes, allowing precise control regardless of the hand's orientation [17]

Group 4: Robustness and Practical Applications
- The DexNDM system exhibits high dexterity and robustness, successfully performing complex tool-use tasks such as tightening screws and assembling furniture [21]
- Its robustness allows it to complete long-horizon assembly tasks without interruption, even when unforeseen situations arise [21]

Group 5: Innovations in Data Collection and Modeling
- DexNDM employs a joint-wise neural dynamics model that fits real-world data to bridge the gap between simulation and reality [24]
- An automated data collection strategy, termed "chaos box", gathers diverse interaction data with minimal human intervention [28]
- A residual policy network is trained to compensate for the dynamics gap between simulation and the real world [30]
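To make the modeling pattern in Group 5 concrete, below is a minimal PyTorch sketch of the general idea: a small per-joint dynamics network that predicts the next joint state from recent joint history and the commanded action, plus a residual policy head that adds a bounded correction to a simulation-trained base action. All layer sizes, input definitions, and names are illustrative assumptions, not the actual DexNDM implementation.

```python
# Illustrative sketch of a joint-wise neural dynamics model with a residual
# policy correction (general pattern only; not the DexNDM architecture).
import torch
import torch.nn as nn

class JointDynamics(nn.Module):
    """Predicts the next state of ONE joint from its recent history and command."""
    def __init__(self, history_len: int = 5, hidden: int = 64):
        super().__init__()
        # input: [q_{t-k..t}, dq_{t-k..t}, commanded action a_t] for a single joint
        in_dim = 2 * history_len + 1
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),          # predicted (q_{t+1}, dq_{t+1})
        )

    def forward(self, q_hist, dq_hist, action):
        x = torch.cat([q_hist, dq_hist, action], dim=-1)
        return self.net(x)

class ResidualPolicy(nn.Module):
    """Outputs a small correction added to the base action of a sim-trained policy."""
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 128, scale: float = 0.1):
        super().__init__()
        self.scale = scale
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),
        )

    def forward(self, obs, base_action):
        delta = self.scale * self.net(torch.cat([obs, base_action], dim=-1))
        return base_action + delta

# Usage sketch: one dynamics model per joint, one shared residual policy.
num_joints, history_len = 16, 5
dyn_models = [JointDynamics(history_len) for _ in range(num_joints)]
policy = ResidualPolicy(obs_dim=64, act_dim=num_joints)

obs = torch.randn(1, 64)
base_action = torch.randn(1, num_joints)
corrected = policy(obs, base_action)
print(corrected.shape)  # torch.Size([1, 16])
```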
具身智能之心 Double 11 Discounts Are Here!
具身智能之心· 2025-11-07 00:05
Group 1
- The core promotion period for the embodied intelligence series runs from November 1 to November 11 [2]
- Discounts include 30% off for new users and 50% off for renewals [3]
- Embodied intelligence series courses are offered at 20% off (8折) for a single course and 30% off (7折) for three courses [2]

Group 2
- Additional benefits include significant discounts on robotic arms and development kits [3]
- The company encourages inquiries for further details about the promotional activities [1][3]
The First Open-Source Diffusion VLA: Unified DVLA! SOTA Performance with a 4x Speedup
具身智能之心· 2025-11-07 00:05
Core Insights
- The article presents the Unified Diffusion VLA (UD-VLA) architecture, which integrates image generation and action prediction within a single framework, leveraging the strengths of diffusion large language models (DLLMs) [3][19]

Group 1: Unified VLA Model
- The motivation behind the unified VLA model is to exploit DLLMs' strengths in generation and understanding, focusing on the mutual benefit between image generation and action prediction [3]
- A Joint Discrete Denoising Diffusion Process (JD3P) is introduced, allowing actions and images to be generated simultaneously within one denoising process [9][10]

Group 2: Technical Mechanisms
- Unified tokenization converts text, images, and actions into a single multimodal sequence, with special tokens marking the boundaries between modalities [7]
- A hybrid attention mechanism maintains causal relationships across modalities, ensuring that action prediction benefits from the denoising of images [7]

Group 3: Training and Inference
- Training proceeds in two phases: first, post-training on large video datasets to inject future-image-generation capability; second, jointly optimizing image and action generation [10]
- Inference uses parallel decoding with adaptive masking: all positions are initialized with a mask token and refined over a small number of iterations [11][12]

Group 4: Performance Evaluation
- The UD-VLA model achieves state-of-the-art performance, with a fourfold speedup over autoregressive models while maintaining high action quality [3][19]
- Comprehensive evaluations on benchmarks such as CALVIN, LIBERO, and SIMPLER show UD-VLA's superior performance on long-horizon robotic manipulation tasks [15][16]
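As a rough illustration of the hybrid attention idea in Group 2 (not the paper's exact design), the sketch below builds an attention mask over a concatenated [text | image | action] token sequence in which tokens attend bidirectionally within their own modality block and to all earlier blocks, but never to later blocks. The block lengths and the blockwise rule are assumptions made for illustration only.

```python
# Illustrative hybrid attention mask for a unified [text | image | action] sequence.
# Within-block attention is bidirectional; across blocks it is causal (earlier only).
# Block sizes and the rule itself are assumptions, not the exact UD-VLA design.
import torch

def hybrid_attention_mask(text_len: int, image_len: int, action_len: int) -> torch.Tensor:
    lengths = [text_len, image_len, action_len]
    total = sum(lengths)
    # block_id[i] = modality block of token i (0 = text, 1 = image, 2 = action)
    block_id = torch.repeat_interleave(
        torch.arange(len(lengths)), torch.tensor(lengths)
    )
    # allowed[i, j] is True when token i may attend to token j:
    # j's block must not come after i's block.
    allowed = block_id.unsqueeze(1) >= block_id.unsqueeze(0)
    assert allowed.shape == (total, total)
    return allowed

mask = hybrid_attention_mask(text_len=4, image_len=6, action_len=3)
print(mask.shape)          # torch.Size([13, 13])
print(mask[0, 5].item())   # False: a text token cannot attend to an image token
print(mask[10, 5].item())  # True: an action token can attend to an image token
```

Under this kind of mask, action tokens see the partially denoised image tokens at every refinement step, which is the stated reason actions benefit from image denoising.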
Tsinghua and Peking University Introduce Motion Transfer: Robots Learn Skills End-to-End Directly from Human Data
具身智能之心· 2025-11-07 00:05
Core Insights
- The article discusses the release of Gemini Robotics 1.5 by Google DeepMind, highlighting its Motion Transfer (MT) mechanism for transferring skills between different robots without retraining [1][2]
- A collaborative team from Tsinghua University, Peking University, Wuhan University, and Shanghai Jiao Tong University has developed a new framework called MotionTrans, which enables zero-shot skill transfer from humans to robots using VR data [2][4]

MotionTrans Framework
- MotionTrans is an end-to-end, zero-shot, multi-task skill transfer framework that allows robots to learn human skills directly from VR data without any robot demonstrations [4][7]
- The framework supports zero-shot transfer, meaning robots can learn tasks such as pouring water and plugging/unplugging devices solely from human data collected in VR [7][16]
- It also supports fine-tuning with a small number of robot demonstrations (5-20), significantly improving success rates across 13 human skills [7][17]

Technical Details
- The framework is architecture-agnostic and can be combined with popular models such as Diffusion Policy and VLA [7][10]
- The team built a human data collection system that captures first-person video, head movement, wrist poses, and hand actions, which are then converted into a robot-compatible format [9][10]
- The framework uses techniques such as coordinate transformation and hand retargeting to bridge the gap between human and robot actions [10][11]

Performance Evaluation
- In zero-shot evaluations, the robot achieved an average success rate of 20% across 13 tasks, with some tasks such as pick-and-place reaching 60%-80% [14][16]
- After fine-tuning with a small number of robot trajectories, the average success rate improved to roughly 50% with 5 trajectories and up to 80% with 20 trajectories [17][18]
- Even tasks with initially zero success rates showed the model learning the correct action direction, demonstrating the framework's ability to capture task semantics [14][22]

Conclusion
- MotionTrans shows that models can learn new skills under zero-robot-demonstration conditions using only human VR data, shifting the role of human data from a supplement to a primary source for skill acquisition [22][23]
- The team has open-sourced all data, code, and models to support further research [23]
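To illustrate the kind of coordinate transformation mentioned under Technical Details (the general technique, not the paper's exact pipeline), the sketch below maps a wrist pose recorded in a VR headset frame into a robot base frame using homogeneous transforms. The calibration transform and all numbers are made-up placeholders; a real system would obtain them from calibration.

```python
# Illustrative coordinate transformation: wrist pose in VR/headset frame -> robot base frame.
# The calibration transform and the example pose are hypothetical placeholders.
import numpy as np

def make_transform(rotation: np.ndarray, translation: np.ndarray) -> np.ndarray:
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector translation."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

# Hypothetical calibration: headset frame expressed in the robot base frame
# (a 90-degree yaw plus an offset, purely for illustration).
yaw = np.pi / 2
R_cal = np.array([[np.cos(yaw), -np.sin(yaw), 0.0],
                  [np.sin(yaw),  np.cos(yaw), 0.0],
                  [0.0,          0.0,         1.0]])
T_base_from_headset = make_transform(R_cal, np.array([0.5, 0.0, 0.3]))

# Wrist pose captured by the VR system, expressed in the headset frame.
T_headset_from_wrist = make_transform(np.eye(3), np.array([0.2, -0.1, 0.4]))

# Chain the transforms to express the wrist pose in the robot base frame,
# which is the frame a robot arm controller expects commands in.
T_base_from_wrist = T_base_from_headset @ T_headset_from_wrist
print("wrist position in robot base frame:", np.round(T_base_from_wrist[:3, 3], 3))
```

Hand retargeting then maps the captured finger keypoints onto the robot hand's joint space, but that step depends on the specific hand hardware and is omitted here.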
From the Perspective of Career Transition and Research: Which Direction Is Better Suited for a First Paper?
具身智能之心· 2025-11-06 11:47
Group 1
- The article discusses research directions suitable for a first paper in embodied intelligence, including VLN, VLA, reinforcement learning, and real2sim2real [1]
- For researchers currently working on SLAM, VLN and VLA are recommended as good entry points, especially for those with access to a robotic arm [1]
- The article emphasizes the importance of a good research idea, noting that newcomers may need to work through a number of challenges before arriving at an innovative concept [1]

Group 2
- A new paper mentoring service has been launched, offering customized one-on-one guidance on advanced topics such as multimodal large models, VLA, reinforcement learning, and more [2]
- The mentoring team consists of PhD holders and researchers from top universities and companies, providing end-to-end support from topic selection to publication strategy [2]
- The service aims to bridge the gap between academia and industry, focusing not only on paper publication but also on practical application value [3]

Group 3
- The article promotes a free matching service for the first ten inquiries, allowing students to have in-depth meetings with mentors based on their research direction and academic background [5]
We've Started a Discussion Group for Reproducing Embodied AI Papers
具身智能之心· 2025-11-06 11:47
Group 1
- The company has established a technical discussion group focused on reproducing open-source projects such as VLA, VLN, and DP, addressing issues related to performance metrics and data collection [1]
- The aim of the group is to provide a platform for members to share experience and avoid common pitfalls during reproduction [1]

Group 2
- Interested readers can join by adding the designated assistant on WeChat, providing their name and a note indicating their interest in paper reproduction [2]