Foundation Model
A Quick Guide to the History of 3D Scene Representation in Robotics
Jiqizhixin (机器之心)· 2026-01-23 00:45
Core Viewpoint
- The article discusses the rapid development of robotics and the need for robots to understand the world as humans do, focusing on the scene representation methods used in robotics [2][4].
Group 1: Historical Development of 3D Scene Representation
- The convergence of deep learning, computer graphics, and robotics has driven significant advances, with Neural Radiance Fields (NeRF), 3D Gaussian Splatting, and Foundation Models emerging as promising routes toward general embodied intelligence [8].
Group 2: Types of Scene Representation
- Point Cloud: Represents a scene as discrete 3D points obtained from LiDAR or camera sensors [10].
- Voxel: Discretizes 3D space into a regular grid of cubes, each storing information such as density or occupancy [10].
- Mesh: Builds a continuous geometric representation of the scene from triangulated surfaces, offering finer detail [10].
- Signed Distance Function (SDF): Stores the signed distance from each point in space to the nearest object surface, giving a continuous geometric representation [10].
Group 3: Applications in Robotics
- Mapping and localization: Existing methods already achieve strong results in SLAM, and neural scene representations enable denser, more precise modeling, which benefits obstacle avoidance [15].
- Manipulation: Traditional methods excel in real-time performance and computational efficiency for grasping tasks, while neural representations generalize better to complex tasks [15].
- Navigation: Neural scene representations provide accurate environment reconstruction and integrate semantic and language information more readily, which helps with complex navigation tasks [16].
Group 4: Challenges and Future Directions
- The article identifies three main challenges:
  1. End-to-end general networks versus modular systems: modular intelligence is limited in generalization and transferability [19].
  2. Data scarcity: robotics has far less data than large language models, which hinders the development of neural scene representations and foundation models [20].
  3. Real-time performance: deploying neural scene representations hits latency bottlenecks, raising the choice between cloud-based and onboard deployment [21].
Group 5: Contributions and Resources
- The article provides a comprehensive, up-to-date review of scene representation methods in robotics, detailing the advantages of each representation for each module [22].
- It highlights future research directions that address current technical limitations and encourages further work in this rapidly evolving field [22].
- An open-source GitHub project has been launched to collect relevant articles and to keep adding new research findings in robotics [22].
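The four representations listed under Group 2 can be made concrete with a small sketch. This is only an illustration under stated assumptions, not code from the reviewed article: the sample points, the tetrahedron mesh, and the helper names `voxelize` and `sphere_sdf` are all hypothetical.

```python
import math

# Point cloud: an unordered list of discrete 3D points (hypothetical sample data).
points = [(0.1, 0.2, 0.3), (0.9, 0.8, 0.1), (0.15, 0.25, 0.35), (0.5, 0.5, 0.5)]


def voxelize(points, voxel_size):
    """Voxel grid: map each point to the integer index of the cube that contains it."""
    occupied = set()
    for x, y, z in points:
        occupied.add((math.floor(x / voxel_size),
                      math.floor(y / voxel_size),
                      math.floor(z / voxel_size)))
    return occupied


def sphere_sdf(p, center, radius):
    """SDF of a sphere: negative inside the surface, zero on it, positive outside."""
    return math.dist(p, center) - radius


# Mesh: a vertex list plus triangles that index into it (here, one tetrahedron).
vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
triangles = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

# Occupancy of the sample cloud at 0.5-unit resolution.
voxels = voxelize(points, voxel_size=0.5)
```

The point cloud and mesh store geometry explicitly, the voxel set trades resolution for a regular grid that is easy to query, and the SDF is a function that can be evaluated at any point in space, which is the property neural implicit representations build on.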
Stepping Up Investment in Intelligent Technology, Li Auto Formally Establishes a US R&D Center | 36Kr Exclusive
36Kr· 2025-12-18 07:02
Core Insights
- Li Auto is establishing an AI R&D center in Silicon Valley to strengthen its smart technology development, focusing on advanced driver assistance systems and recruiting high-end algorithm talent [1][2]
- The expansion of Li Auto's R&D capabilities aligns with its global strategy, which includes existing centers in Germany and China, bringing the total to four R&D centers worldwide [1]
- Competition in the smart driving sector is intensifying, with major players such as Huawei and Xpeng Motors also investing heavily in autonomous driving technologies [3][6]
Group 1: Li Auto's Strategic Moves
- The new Silicon Valley center upgrades Li Auto's previous small R&D team into a full-fledged center, reflecting the importance of proximity to cutting-edge AI talent [1]
- The company already has an R&D center in Germany and two domestic centers in Beijing and Shanghai, focusing on core technology breakthroughs and vehicle development [1]
- The AI center is part of Li Auto's broader strategy to evolve into an "AI company" over the next decade, with a focus on integrating AI into all business operations [6]
Group 2: Competitive Landscape
- The establishment of Silicon Valley R&D centers by Chinese EV companies such as Li Auto, NIO, and Xpeng Motors marks a significant escalation in the competition for smart vehicle technology [2][6]
- Xpeng Motors has a well-established North American team that plays a crucial role in developing its autonomous driving technology, highlighting the importance of sustained R&D investment [4][5]
- The race for advanced driver assistance systems is marked by significant technological advances, with companies such as Huawei and Xpeng Motors already deploying L3-level autonomous driving solutions [3][6]
Stepping Up AI Investment, Li Auto Formally Establishes a US R&D Center
36Kr· 2025-12-18 02:08
Group 1
- Li Auto is establishing an AI R&D center in Silicon Valley to strengthen its smart technology development, with recruitment already underway [1][2]
- The new center will focus on advanced driver assistance systems and aims to attract high-end talent with cutting-edge AI backgrounds [1][2]
- Li Auto's global strategy includes establishing R&D centers in key international markets, with existing centers in Germany and China [2]
Group 2
- Competition in automotive intelligent driving is intensifying, with Huawei and Li Auto both advancing their autonomous driving technologies [3]
- Li Auto's VLA driver model, which integrates vision-language-action modeling, aims to improve the vehicle's understanding of complex scenarios [3][6]
- The shortage of AI talent, particularly people with expertise in both AI models and autonomous driving, makes Silicon Valley a critical location for talent acquisition [3][6]
Group 3
- Xpeng Motors has a well-established North American R&D team that plays a significant role in developing its autonomous driving technology [4]
- Xpeng's Foundation Model, with 72 billion parameters, is being developed to enhance its AI capabilities, in contrast to Li Auto's VLA model at 4 billion parameters [5]
- Li Auto is shifting its strategic focus toward becoming an AI company, establishing an AI technology committee to integrate AI into all business operations [5][6]
Group 4
- The Silicon Valley R&D center represents Li Auto's commitment to advancing its AI capabilities and staying competitive in the rapidly evolving automotive industry [6]
- Competition among Chinese automakers is shifting from traditional sales battles to a race for AI innovation and talent acquisition [6]
What Is a Humanoid Foundation Model? An Introduction to GR00T N1 - Annika & Aastha
AI Engineer· 2025-07-28 16:29
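Market Trends & Industry Dynamics
- A McKinsey report notes that in the world's 30 most developed economies there are more job openings than people able to fill them; over the past decade, job growth has outpaced population growth by 420% [2][3]
- Physical AI is essential for addressing problems in industries such as leisure, hospitality, healthcare, construction, transportation, and manufacturing, which cannot be solved by chatbots like ChatGPT alone [3][4]
- NVIDIA's Project GR00T is a strategy for bringing humanoid robots and other forms of robotics into the world, covering the compute infrastructure, software, and research required [11]
Robotics Foundation Model & Technology
- NVIDIA's GR00T N1 robotics foundation model is open-source and highly customizable; a key feature is that it is cross-embodiment, and the model has 2 billion parameters [1][12]
- The robot data strategy draws on three sources: a small amount of expensive real-world data (robots performing real tasks), a large amount of unstructured internet video data (humans solving tasks), and theoretically unlimited synthetic data [14][16][17][18]
- Project GR00T's data solution includes a data pyramid strategy that emphasizes augmenting and multiplying high-quality data through simulation and world foundation models [13][18][19]
- The GR00T N1 system introduces a dual-system architecture: System 1 executes tasks quickly (120 Hz), while System 2 plans complex tasks slowly, inspired by Daniel Kahneman's "Thinking, Fast and Slow" [23][24][25]
- GR00T N1 uses diffusion Transformer blocks together with a vision encoder, a VLM (vision-language model), and a text tokenizer to process image and text inputs, and an action decoder produces action vectors usable by a specific robot [27][28][29][30]
- The two main approaches to robot learning are imitation learning (copying human experts) and reinforcement learning (maximizing reward through trial and error); GR00T N1 uses both in combination [32][33][36]
Deployment & Compute Infrastructure
- The physical AI lifecycle comprises generating data, consuming data, and deployment, which NVIDIA calls the "three-computer problem"; each stage has distinct compute characteristics: simulation (OVX, Omniverse), training (DGX), and edge deployment (AGX) [9][10]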
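The dual-system idea in the summary (a fast System 1 at 120 Hz guided by a slower System 2) can be sketched as a two-rate control loop. This is a toy illustration, not NVIDIA's implementation: the 10 Hz planner rate, the one-dimensional state, and the functions `slow_planner` and `fast_controller` are all assumptions made for the example.

```python
FAST_HZ = 120  # System 1 rate, as stated in the summary
SLOW_HZ = 10   # System 2 rate: an assumed value, for illustration only


def slow_planner(observation):
    """Hypothetical stand-in for System 2: produce a coarse plan (here, a goal value)."""
    return {"goal": observation + 1.0}


def fast_controller(observation, plan):
    """Hypothetical stand-in for System 1: take one small step toward the current goal."""
    return 0.1 * (plan["goal"] - observation)


def run(duration_s=1.0):
    """Simulate the loop tick by tick and count how often each system runs."""
    ticks_per_plan = FAST_HZ // SLOW_HZ  # fast ticks between planner updates
    observation, plan = 0.0, None
    plan_updates = actions = 0
    for tick in range(int(duration_s * FAST_HZ)):
        if tick % ticks_per_plan == 0:
            # System 2 refreshes the plan only occasionally.
            plan = slow_planner(observation)
            plan_updates += 1
        # System 1 acts on every tick, reusing the most recent plan.
        observation += fast_controller(observation, plan)
        actions += 1
    return plan_updates, actions


plan_updates, actions = run()
```

Under these assumptions, one simulated second yields 120 controller steps but only 10 planner calls, which is the point of the split: the expensive reasoning runs rarely while the cheap action loop keeps up with the robot.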