具身智能之心
Musk Announces: Mass-Produced Brain-Machine Interfaces, with Fully Automated Surgery
具身智能之心· 2026-01-04 00:32
Core Viewpoint
- Neuralink, founded by Elon Musk, aims to mass-produce brain-machine interface devices by 2026, transitioning from laboratory to clinical applications, with a focus on simplifying the surgical process for implantation [1][10][42]

Group 1: Neuralink's Development Timeline
- Neuralink was established in 2016, with significant milestones including animal experiments in 2019, demonstrations with a pig in 2020, and a monkey playing a game in 2021 [33][34][35][36]
- In 2023, Neuralink received FDA approval to conduct human clinical trials, marking a pivotal moment in its development [38]
- By September 2025, Neuralink had implanted devices in 12 patients, which increased to 20 by December of the same year [5][41]

Group 2: Surgical Process and Technology
- The current surgical procedure for implanting the brain-machine interface involves complex steps, including the removal of part of the skull and the dura mater, which complicates scalability [8][9]
- Neuralink plans to simplify this process by allowing electrode wires to penetrate the dura mater without cutting it, reducing the risks and costs associated with the surgery [12][14]
- This new "minimally invasive" approach is expected to lower the barriers to standardization and increase the accessibility of the technology [14]

Group 3: Market Potential and Applications
- The demand for brain-machine interfaces is significant, particularly for treating neurological disorders such as paralysis, muscular atrophy, Parkinson's disease, dementia, and vision impairment [6][18]
- The first human volunteer for Neuralink's device, Noland Arbaugh, was able to post on social media and play video games post-surgery, showcasing the potential life-changing impact of the technology [19][22]
- If Neuralink successfully scales production and reduces surgical costs, it could transform the lives of many individuals with neurological conditions [23]

Group 4: Broader Implications and Future Vision
- Beyond medical applications, Musk envisions Neuralink as a means for humanity to keep pace with advanced AI, suggesting that a high-bandwidth interface could prevent humans from becoming obsolete [25][27]
- The potential for individuals to update their skills through direct brain connections to the internet could lead to unprecedented advancements in human civilization [28]
What Room for Evolution Is Left in Whole-Body Control Schemes That Make Robots "Dance Better"?
具身智能之心· 2026-01-04 00:32
Click the card below to follow the "具身智能之心" official account. Editor: 具身智能之心. This article is shared for academic purposes only; contact us for removal in case of infringement.

Continuing 具身智能之心's previous roundtable, we have compiled some insights on whole-body motion control for robots. This session explores questions around RL+VLA, real2sim2real, 3DGS, and simulation — a nearly ten-thousand-word share.

刘斯坦 (Liu Sitan): We'd like to talk about RL. A lot of VLA training has already become somewhat standardized: first train a base model with imitation learning, then run reinforcement learning in simulation environments — covering the last one or ten kilometers, the final 10% or so of performance, for which there now seems to be a fairly standardized training recipe. But if we look at DeepSeek R1, or at the recent visions of superintelligence that have been sketched out, they are all about innovations in the RL training paradigm. RL is not simply a matter of taking a reinforcement-learning algorithm, running it in a simulator, and being done; it can involve a very complex pipeline. So the first sub-question of our second big topic concerns innovation in RL training paradigms and where they are headed. First, we'd like to ask Zhang ...
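The standardized recipe described above — imitation learning to train a base policy, then RL in simulation for the final stretch — can be sketched as a toy. Everything below is illustrative and hypothetical: a linear policy stands in for a VLA base model, synthetic expert data stands in for demonstrations, and a hand-written reward stands in for a simulator; none of it reflects any specific system discussed in the roundtable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1: imitation learning — fit a linear policy to expert (state, action) pairs.
states = rng.normal(size=(256, 4))
expert_W = np.array([[1.0, -0.5, 0.2, 0.0]]).T  # hypothetical "expert" policy weights
actions = states @ expert_W
W, *_ = np.linalg.lstsq(states, actions, rcond=None)  # behavior cloning via least squares

# Stage 2: RL fine-tuning — a crude zeroth-order policy-gradient loop on a toy
# reward (distance to expert behavior), standing in for a simulator rollout.
def reward(s, a):
    return -float(np.sum((a - s @ expert_W) ** 2))

lr, sigma = 0.05, 0.1
for _ in range(50):
    s = rng.normal(size=(1, 4))
    noise = rng.normal(size=W.shape) * sigma
    r_plus = reward(s, s @ (W + noise))
    r_minus = reward(s, s @ (W - noise))
    # Finite-difference (evolution-strategies-style) update of the policy weights.
    W += lr * (r_plus - r_minus) / (2 * sigma) * noise
```

The point of the sketch is only the two-stage structure: the cloning step gets the policy close to the demonstrations, and the RL loop then optimizes a reward the demonstrations alone cannot express.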
Led by a Yale PhD Born After 2000! Just Over a Year Old, the Company Mass-Produced and Delivered 100+ Robots in a Single Month
具身智能之心· 2026-01-01 02:03
How did everyone's mass-production deliveries go this year? The entire robotics industry is shedding its glamorous storytelling and returning to the essence of business: replacing lofty concept demos with concrete delivery evidence.

Founded not long ago, already delivering at scale: this December, UniXAI announced two rounds of financing totaling 300 million RMB, with investment from 川商基金, 吴中金控, 益华资本, 青域基金, 太浩创投, and other institutions, along with several listed companies and industrial players; existing shareholder 赛纳资本 added to its investment. The company, which won 2 gold medals at this year's World Humanoid Robot Games, is now in the spotlight, and its "algorithm-hardware-scenario" three-in-one development route keeps earning market endorsement.

In 2025, the humanoid-robot industry is undergoing a deep cognitive upgrade from "concept hype" to "delivery validation." Not long ago, a concept launch event and a few booth demo videos were enough to be crowned "mass production." "丰瑜 is very pragmatic and picked the right track; the post-2000 generation really is formidable," many investors say. This CEO, back from Yale, has led his team for just over a year ...
Physical Intelligence's Latest π0.5+ego! Cross-Modal Transfer from Human Videos to Robot Skills
具身智能之心· 2025-12-31 04:00
Authors: Simar Kareer et al. Editor: 具身智能之心. This article is shared for academic purposes only; contact us for removal in case of infringement.

In robotics and multimodal intelligence, human experience is the core source for endowing robots with physical intelligence, but letting robots learn skills directly from massive amounts of human video has long faced key challenges such as modality gaps and data alignment. A joint team from Physical Intelligence and Georgia Tech proposes the "π0.5+ego" framework, centered on "large-scale pretraining + cross-modal co-fine-tuning." It reveals, for the first time, the emergent pattern of human-to-robot skill transfer in vision-language-action (VLA) models, offering a new path toward scalable training of generalist robot policies.

Paper title: Emergence of Human to Robot Transfer in Vision-Language-Action Models
Core highlights: cross-modal transfer without explicit alignment; emergent capability driven by diverse pretraining; performance doubled with only tens of hours of human data; coverage of scenarios / ...
VLA-Arena: An Open-Source Benchmark Framework for Systematic Evaluation of VLAs
具身智能之心· 2025-12-31 00:50
Research Background and Motivation
- Vision-Language-Action models (VLAs) are rapidly evolving toward general robot policies, achieving capabilities such as cross-embodiment generalization, dexterous manipulation, and instruction following. However, there is a lack of quantitative understanding of these models' capability boundaries, limitations, and failure modes, and existing benchmarks have three core deficiencies [1][4]

Core Design: Structured Tasks and Benchmark Framework
- The VLA-Arena framework is proposed to address these issues, aiming to systematically characterize the capability frontiers and failure mechanisms of VLA models [1][4]
- The benchmark includes 170 tasks categorized along four dimensions, covering difficulty levels L0 to L2 [6]

Key Components and Technical Details
- The framework extends the Behavior Domain Definition Language (BDDL) into the Constraint Behavior Domain Definition Language (CBDDL), focusing on two core enhancements [6][7]
- The VLA-Arena-S/M/L datasets are provided, categorized by task level (L0/L1) and trajectory count (10/30/50 per task), constructed from human demonstration data with preprocessing steps to ensure reproducibility [8]

Experimental Design and Main Findings
- The experimental setup evaluates models across two architectural paradigms, autoregressive models and continuous action-generation models, using success rate (SR) and cumulative cost (CC) as evaluation metrics [12][13]
- Key findings:
  1. Models exhibit a strong tendency to memorize rather than generalize, with performance declining sharply on L1 and L2 tasks [14]
  2. Robustness is asymmetric: models are generally resilient to language perturbations but vulnerable to visual disturbances [15]
  3. A trade-off exists between safety and performance, with models struggling to integrate safety constraints effectively [16]
  4. The ability to handle distractors varies, with static distractors posing greater challenges than dynamic ones, and models failing on long-horizon tasks [19]
  5. Increasing data diversity can enhance near-distribution performance but may harm far-distribution generalization [17]

Comparison with LIBERO Benchmark
- VLA-Arena tasks require deeper language understanding than LIBERO, on which performance declines far less when instructions are absent, indicating that VLA-Arena demands more robust semantic grounding [22]
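The two headline metrics — success rate (SR) over a task's episodes and cumulative cost (CC) from constraint violations — can be sketched in a few lines. This is a minimal illustration of how such metrics are typically aggregated, not VLA-Arena's actual evaluation code; the rollout fields (`success`, `costs`) are placeholder names.

```python
def evaluate(rollouts):
    """Aggregate per-episode results into success rate and mean cumulative cost.

    rollouts: list of dicts with 'success' (bool) and 'costs'
    (a per-step list of constraint-violation penalties).
    """
    sr = sum(r["success"] for r in rollouts) / len(rollouts)
    cc = sum(sum(r["costs"]) for r in rollouts) / len(rollouts)
    return sr, cc

rollouts = [
    {"success": True,  "costs": [0, 0, 1]},  # succeeded, one violation
    {"success": False, "costs": [0, 2, 0]},  # failed, heavier violation
    {"success": True,  "costs": [0, 0, 0]},  # clean success
]
sr, cc = evaluate(rollouts)  # sr = 2/3, cc = 1.0
```

The safety-performance trade-off the benchmark reports falls out of tracking both numbers at once: a policy can raise SR while CC worsens, which a success-only metric would hide.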
A Global Price Low of 5,888 RMB! An Out-of-the-Box, π0.5-Ready Embodied Collaborative Arm for Home and Research Use
具身智能之心· 2025-12-31 00:50
"Bringing a 300,000-RMB research entry barrier down to 5,888 RMB — our goal is to let everyone get hands-on with embodied intelligence." At the end of 2025, 特修斯意海融硅科技 officially released the S1 series of embodied-intelligence robotic arms — both a "love letter" from engineers to embodied-AI researchers across the country and a piece of homework 意海融硅 set for itself. The "S" stands for Silicon, symbolizing the silicon-based substrate that lets intelligence truly land in the physical world.

From VLA to Aloha, from LeRobot to GR00T — these projects are defining the "present" of embodied intelligence. What 意海融硅 hopes for is to let more people join in defining the "future." The company had taken note of projects like LeRobot, which let more people get hands-on with VLA algorithms via 3D printing and hobby servos; but the servos' limits in precision, torque, and sensing form their ceiling. While reproducing Stanford's mobile-aloha work, the 意海融硅 team was stunned by the prices of robot and embodied-AI platforms: bipedal humanoids, wheeled-arm hybrids, and the like routinely cost 200,000 to 300,000 RMB or more, undeniably raising the research barrier. The explosion of humanoid robots and the breakout of DeepSeek have made one question ever clearer: now that AI is smart enough, what is it still missing? The answer: "a body." ...
Andrew Ng's Year-End Review: 2025 Is the Dawn of the AI Industrial Era
具身智能之心· 2025-12-31 00:50
Core Insights
- 2025 is marked as a pivotal year for the AI industry, characterized by rapid advances and significant developments in AI technologies and infrastructure [10][14][30]
- The competition for AI talent has intensified, with leading companies offering unprecedented salaries to attract top professionals [23][27]
- The emergence of reasoning models and programming agents has transformed software development, lowering barriers to entry and enabling more individuals to participate in AI innovation [37][40]

Group 1: AI Industry Developments
- The year 2025 is described as the dawn of the AI industrial era, with major advances in AI capabilities and infrastructure [14][30]
- AI companies are projected to spend over $300 billion in capital expenditures, primarily on building new data centers to support AI workloads [30][32]
- By 2030, the cost of building sufficient computing power for AI needs could reach $5.2 trillion, indicating a massive investment trend [30]

Group 2: Talent Acquisition and Market Dynamics
- AI firms are engaged in a fierce talent war, with salaries reaching levels comparable to professional sports stars, as companies like Meta offer packages of up to hundreds of millions [23][27]
- OpenAI, Meta, and other tech giants are implementing strategies to retain talent, including higher stock compensation and accelerated vesting schedules [27][30]
- The influx of capital and talent into the AI sector is contributing to economic growth, with evidence suggesting that the majority of U.S. GDP growth in early 2025 was driven by data-center and AI investment [30]

Group 3: Technological Advancements
- The introduction of reasoning models has significantly improved the performance of large language models (LLMs), enhancing their capabilities across a range of tasks [21][22][24]
- Programming agents have become a competitive battleground among AI giants, with advances allowing them to complete over 80% of programming tasks [31][34]
- The development of new benchmarks and evaluation methods for programming agents reflects the evolving landscape of AI capabilities [34]
Not Following the Silicon Valley Playbook! The Embodied-AI Company Morgan Stanley Has Backed Twice Keeps Doubling Down on Foundation Models and Deployment
具身智能之心· 2025-12-30 10:00
The robotics industry stands at a key inflection point. In a window like this, the judgments of top investment banks are often more instructive than flashy product launches; every heavyweight research report sketches, in advance, the contours of the industry landscape for the next few years. In 2025, Morgan Stanley's two successive deep-dive robotics reports both spotlighted, unusually, the same Chinese company: 智平方 (AI² Robotics). Once as a representative of the foundation models defining the industry's base layer, and once as a benchmark case of robot commercialization. In Morgan Stanley's research framework, this dual "technology + deployment" positioning is uncommon. It points to a capability that is especially scarce in the robotics industry: technical leadership and commercial validation holding true at the same time.

Model positioning: not a "follower of the Silicon Valley route," but one of the definers of the global VLA route. In its latest report, The Robot Almanac Vol.1: AI Gets Physical; Cambrian Explosion of Bots, published on December 5, 2025, Morgan Stanley offers a clear judgment on the global robotics landscape: China has already established a lead in robotics and embodied intelligence, and that lead is still widening. ...
Only 2 Days Left in 具身智能之心's 2025!
具身智能之心· 2025-12-30 01:11
Group 1
- The core viewpoint of the article highlights the growth and maturation of the embodied-intelligence industry, with an increase in B-end partnerships and a shift toward more specialized C-end content [1][2]
- The industry has seen a significant increase in the recruitment of candidates with around one year of experience, indicating a shift from hiring mostly inexperienced graduates [1]
- The company has established nearly 40 embodied groups and has grown its paid community to over 2,000 members, showcasing its value in cultivating talent and providing industry insights [3]

Group 2
- The company is offering various discounts on embodied courses, including a 25% discount on all courses and a 40% discount for new members joining the knowledge community [7]
- Additional promotions include a maximum discount of 1,500 on high-cost embodied research robotic arms and a free high-quality course for purchases over 3,000 [7]
- The company also provides personalized project and job-application guidance at discounted rates, further supporting the development of professionals in the field [7]
Alibaba's AstraNav-World: An End-to-End World Model for Joint Reasoning over Visual States and Actions
具身智能之心· 2025-12-30 01:11
Authors: Junjun Hu et al. Editor: 具身智能之心. This article is shared for academic purposes only; contact us for removal in case of infringement.

Core problem and motivation
The key bottleneck for embodied navigation in open, dynamic environments is this: existing methods mostly adopt a loosely coupled paradigm of "first imagine future visual states, then plan actions," which loses physical consistency, blurs causal relationships, and lets errors accumulate over time, ultimately undermining the reliability of long-horizon planning. Robust real-world navigation requires advancing two capabilities together: "foreseeing the future" — generating plausible future visual states conditioned on actions, reflecting an understanding of physical laws and causality — and "planning the future" — generating task-oriented action sequences that constrain the generated visuals to stay close to reachable real-world states. The split between these two is the core reason existing approaches underperform, so a unified framework with bidirectional constraints and co-optimization is needed.

Core contributions and method architecture
VLM central planner: built on the Wan-2.2-TI2V-5B diffusion model, with three core optimizations for navigation scenarios:
1. Conditional-encoding replacement: a VLM planner replaces the traditional text encoder ...
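The coupling described above — one model emitting both the predicted future visual state and the planned action at each step, each constraining the other — can be sketched as an interface. This is a structural toy, not AstraNav-World's architecture: the model callable, the list-based "frame" embedding, and all names are placeholders.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class JointStep:
    future_frame: List[float]  # predicted visual state (placeholder embedding)
    action: List[float]        # planned action chunk

def joint_rollout(model: Callable, obs: List[float], instruction: str,
                  horizon: int) -> List[JointStep]:
    """Loosely coupled pipelines imagine frames first, then plan separately;
    a joint world model emits both outputs from one forward pass per step,
    so the rollout closes the loop on its own predictions (sketch only)."""
    steps = []
    for _ in range(horizon):
        frame, action = model(obs, instruction)  # one pass, both outputs
        steps.append(JointStep(frame, action))
        obs = frame  # feed the predicted state back in
    return steps

# Hypothetical stand-in model: "predicts" the next frame by incrementing the
# observation and always plans a zero action.
dummy = lambda obs, inst: ([v + 1 for v in obs], [0.0])
traj = joint_rollout(dummy, [0.0], "go to the door", horizon=3)
```

The design point is that errors in the imagined frames are visible to the planner at the very next step, rather than accumulating silently in a separate visual-generation stage.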