VLA
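Four Embodied-Intelligence Companies Gather: In a Trillion-Yuan Race Where Hot Money and Bubbles Coexist, Who Will Reach the Finals?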
Bei Ke Cai Jing· 2025-06-29 08:26
Core Insights
- The embodied intelligence sector is attracting unprecedented investment and interest, prompting debate over whether a bubble is forming and which applications will mature first [1][3]

Investment Landscape
- Investment in embodied intelligence is still far smaller than in the smart automotive sector, which leaves room for growth once scalable commercial applications are identified [3][4]
- Companies argue that more capital is needed to close the financing gap with international players: leading domestic companies operate at a scale of tens of billions of RMB, compared with tens of billions of USD for their US counterparts [3][4]

Market Applications
- B-end (business-facing) applications are seen as the most suitable for initial deployment, particularly logistics, quality inspection, and manufacturing processes [6][7]
- The industry is exploring several strategies, including replacing human labor in hard-to-fill positions, and expects to expand gradually into more complex scenarios over the next few years [6][7]

Technological Development
- The VLA (Vision, Language, Action) model is considered a key framework for the future of robotics, with ongoing improvements in data collection and model training methods [7][8]
- The industry is moving toward a unified model paradigm that integrates visual, linguistic, and action capabilities in robotic systems [8]

Competitive Landscape
- The embodied intelligence sector is expected to evolve much as the smartphone and automotive industries did, with a diverse range of players including hardware manufacturers and AI developers [9][10]
- The market is expected to consolidate into a small number of major players that maintain technological barriers and establish closed-loop commercial applications [10][11]
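Peking University's Lu Zongqing: At This Stage, Neither World Models nor VLA Touch the Essence | Ten Conversations with Embodied-Intelligence Pioneers
雷峰网· 2025-06-20 11:54
"Internet video data is the only path that can scale up."

Author: Guo Haiwei | Editor: Chen Caixian

As an entrepreneur building "embodied brains," Lu Zongqing has a glittering résumé. He belongs to China's new generation of reinforcement learning researchers, following closely in DeepMind's footsteps. He is a tenured associate professor at Peking University's School of Computer Science, has headed the Multimodal Interaction Research Center at the Beijing Academy of Artificial Intelligence (智源研究院), has led the first general-purpose agent project under the National Natural Science Foundation of China's Original Exploration Program, and also serves as an area chair at top international machine learning conferences such as NeurIPS, ICLR, and ICML.

As early as 2023, his team was using multimodal models to study general-purpose agents, having an agent play Red Dead Redemption 2 and do office work, which made it the first LLM agent to complete specific tasks in an AAA game from scratch. After several twists and turns, the paper was finally accepted by ICML 2025 this year. He says, however, that he is not really satisfied with that work, because of its "insufficient generalization."

After finishing that research, Lu realized that "current multimodal models lack the ability to interact with the world." Because the models lack data from which to learn physical interaction, the generalization abilities we see are essentially "abstract": the model ultimately cannot understand the relationship between actions and the world, and so it naturally cannot predict the world.

This is now the starting point of his embodied-intelligence startup: developing a general-purpose embodied AI model. Lu ...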
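A Conversation with Lingchu Intelligent CEO Wang Qibin: Putting Robots in Factories Is Meaningful, and So Is Teaching Robots to Play Mahjong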
Sou Hu Cai Jing· 2025-06-11 08:47
Core Viewpoint
- The article discusses advances in embodied intelligence, focusing on Lingchu Intelligent's Psi R1 model, which enables robots to perform complex tasks in dynamic environments, such as playing Mahjong with humans [3][6][17].

Company Overview
- Lingchu Intelligent was founded in 2024 by a team with extensive experience in robotics and artificial intelligence, including CEO Wang Qibin, who has a background in product management, and other notable figures from Stanford University and the robotics field [5][6].
- The company has established a joint laboratory with Peking University to strengthen its research in embodied intelligence [5].

Technology and Innovation
- The Psi R1 model represents a significant advance in robot capabilities, supporting a closed loop of "action perception, environment feedback, and dynamic decision-making" [3][6].
- The article highlights the transition from Vision Language Models (VLM) to Vision Language Action Models (VLA), which let robots understand and execute physical actions based on visual and textual information [7][14].
- The company aims to address the challenges of long-range operations in semi-open environments, which are crucial for practical applications in logistics and retail [8][14].

Market Position and Strategy
- Lingchu Intelligent positions itself as a provider of stable and cost-effective robotic solutions, focusing on practical applications rather than superficial demonstrations [5][10].
- The company plans to deliver products to overseas logistics clients within six months, indicating a clear market strategy [7][21].
- The target market includes manufacturing processes and logistics operations, with a focus on tasks such as material inspection and handling [21].

Financial Outlook
- The company anticipates sales of several hundred million by the end of 2026, reflecting a strong growth trajectory [22].
- Pricing is designed to be competitive, aiming to keep the cost of a robot below two years of labor costs for a comparable position [23].

Industry Trends
- Investors increasingly expect clear commercialization pathways in embodied intelligence, in contrast with previous years [8][25].
- While investment in the sector remains significant, the focus is shifting toward sustainable and viable technological advances [25][26].
Galaxy General Founder Wang He: Get VLA Right and We Will Witness the First True Peak of Embodied Intelligence
Mei Ri Jing Ji Xin Wen· 2025-06-06 15:28
Core Insights
- The current goal of embodied intelligence is to promote its industrialization, according to Wang He, founder and CTO of Galaxy General Robotics [1][4][7]
- The GALBOT G1 robot was showcased at the event, demonstrating that it can accurately retrieve items from densely packed shelves on command [1][3]

Company Overview
- Galaxy General Robotics was established in May 2023 in Haidian, Beijing, focusing on humanoid robot hardware and large models for embodied intelligence [3]
- The company has raised over 1.2 billion yuan in the past year from strategic and industry investors, including Meituan and IDG Capital [3]

Product Development
- On June 1, 2025, Galaxy General launched TrackVLA, its self-developed end-to-end navigation large model, which relies on purely visual environmental perception and is driven by language commands [3]
- A robot dog equipped with the large model can navigate complex environments such as supermarkets and help carry heavy items [3]

Industry Trends
- Embodied intelligence has gained significant public attention, highlighted by events such as the world's first humanoid robot half-marathon and a recent robot combat competition [4]
- The industry's main challenge is practical implementation: how to deploy humanoid robots effectively in real-world scenarios [4][7]

Future Plans
- Galaxy General's robots are already operating in seven unmanned pharmacies in Beijing, with plans to open 100 more in major cities by the end of the year [8]
- The upcoming "World Humanoid Robot Sports Conference" is scheduled for August 15-17, 2025, at the National Stadium and National Speed Skating Hall [8]

Technological Insights
- Wang He emphasized that the VLA (Vision-Language-Action) model is a starting point for embodied intelligence: it takes visual observations directly and outputs actions without intermediary steps [9]
- VLA applications currently focus on mobility, grasping, and placing tasks, which are primarily vision-based and can be enhanced with tactile and mechanical sensors [9]
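The idea of a policy that maps observation and instruction straight to an action, then acts again on whatever the environment returns, can be sketched as a toy closed loop. This is only an illustration under stated assumptions: `Observation`, `toy_vla_policy`, and `run_closed_loop` are hypothetical names, the one-dimensional gripper world is invented for the example, and a hand-written rule stands in for the trained neural network a real VLA would use. It is not Galaxy General's model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Observation:
    """What the policy sees at each step (all fields are illustrative)."""
    gripper_x: float   # stand-in for what a camera image would reveal
    target_x: float    # position of the object to reach
    instruction: str   # natural-language command


def toy_vla_policy(obs: Observation) -> float:
    """Map (vision, language) directly to an action.

    There is no separate perception or planning stage in between.
    A real VLA would be a trained network; this rule is a placeholder.
    """
    if "stop" in obs.instruction.lower():
        return 0.0
    error = obs.target_x - obs.gripper_x
    return max(-0.1, min(0.1, error))  # clamp each move to 0.1 units


def run_closed_loop(
    policy: Callable[[Observation], float],
    obs: Observation,
    max_steps: int = 100,
    tol: float = 1e-6,
) -> List[float]:
    """Act, observe the environment's response, then act again."""
    trace = [obs.gripper_x]
    for _ in range(max_steps):
        action = policy(obs)
        # Toy environment: the gripper moves by exactly the commanded amount.
        obs = Observation(obs.gripper_x + action, obs.target_x, obs.instruction)
        trace.append(obs.gripper_x)
        if abs(obs.target_x - obs.gripper_x) <= tol:
            break
    return trace
```

Calling `run_closed_loop(toy_vla_policy, Observation(0.0, 0.35, "pick up the box"))` returns the gripper positions visited on the way to the target. Tactile or force readings, mentioned above as possible enhancements, would simply be extra fields on the observation in a sketch like this.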
2025 Technology Insights on China's Advanced Intelligent Assisted Driving: Compute Leap, Data Closed Loop, VLA, and World Models
EqualOcean· 2025-06-05 05:42
Investment Rating
- The report does not explicitly state an investment rating for the industry

Core Insights
- The report traces the evolution of advanced driver assistance systems (ADAS) in China, focusing on the expansion of operational design domains (ODD), technological equity, safety concerns, and supportive policies [4][21][23]
- It emphasizes that algorithms, data, and computing power all need upgrading to address safety shortcomings in high-level ADAS technologies [23][66]
- It discusses the transition from modular to end-to-end architectures in vehicle algorithms, with the aim of human-like driving capabilities [66][68]

Summary by Sections

1. Market Background
- The ODD of high-level ADAS is expanding, with a focus on technological inclusivity and on easing accident anxiety through safety redundancies [4][21]
- Policy support is highlighted as crucial for the rational promotion of ADAS technologies [4][21]

2. Technology Insights
- The report decodes the underlying logic of data, algorithms, and computing power in high-level ADAS [4][28]
- It discusses the computing power landscape, noting the shift toward higher TOPS (trillions of operations per second) in vehicle and cloud computing [42][44]
- Data challenges, including collection and positioning technologies, are identified as critical areas for development [4][28]

3. Competitive Analysis
- The competitive landscape is analyzed, detailing the tiered structure of companies and their development strategies [29][30]
- The report outlines the collaboration models between automotive manufacturers and technology providers, emphasizing the balance between in-house research and external sourcing [83]

4. Trend Insights
- The report notes progress in commercializing L3 systems for passenger vehicles, indicating a growing market for advanced ADAS [31][32]
- It highlights the importance of continuous upgrades and iterations of ADAS functions to meet evolving consumer expectations and safety standards [82][83]
How AI Became Li Auto's No. 1 Project
晚点LatePost· 2025-05-23 07:41
Core Viewpoint
- The article discusses Li Auto's strategic focus on artificial intelligence (AI) and its evolution from a vehicle-centric AI assistant to a multi-platform intelligent application, emphasizing the importance of AI in future competitiveness [4][5][6].

Group 1: Strategic Meetings and AI Prioritization
- Li Auto holds biannual closed-door strategy meetings to discuss future directions, with significant participation from top executives and industry leaders [3].
- Following a strategic meeting, Li Auto adjusted its AI-related business priorities, emphasizing the strategic importance of intelligent driving over other AI applications [4][5].
- The company aims to become a global leader in AI by 2030, with a clear focus on enhancing its AI capabilities and applications [5][6].

Group 2: Development of AI Capabilities
- Li Auto has transitioned its AI assistant, "Li Xiang," from a vehicle-only application to a multi-platform tool, including mobile and web applications [7].
- The company has invested in self-developed algorithms, achieving a full switch to in-house technology for its AI functionalities by March 2023 [7][8].
- The introduction of the multi-modal cognitive model, Mind GPT 1.0, marks a significant advancement in Li Auto's AI capabilities [7].

Group 3: Intelligent Driving and Technological Advancements
- Li Auto's intelligent driving system, AD Max, was launched to address product shortcomings and enhance competitive positioning in the market [10][11].
- The company has initiated a large-scale recruitment drive for its intelligent driving team, reflecting its commitment to advancing this technology [10].
- The shift towards an "end-to-end" model for intelligent driving aims to streamline processes and improve system performance through better data utilization [10][11].

Group 4: Organizational Changes and AI Integration
- Li Auto established an AI Technical Committee to integrate AI capabilities across various business lines, enhancing collaboration and execution [15][16].
- The committee includes leaders from key departments, ensuring that AI is a core focus in strategic decision-making [16][17].
- The company aims to develop a foundational model that serves as a core capability for all AI projects, positioning itself as a leader in the automotive AI landscape [17].
How AI Became Li Auto's No. 1 Project
晚点Auto· 2025-05-22 07:16
Core Viewpoint
- The article discusses Li Auto's strategic shift towards AI and intelligent driving, emphasizing the importance of AI in the company's long-term competitiveness and product development [3][10][12].

Group 1: AI Strategy and Development
- Li Auto held a strategic meeting in October 2022, where the priority of AI-related business was adjusted, emphasizing the strategic importance of intelligent driving [3][5].
- The company aims to become a global leader in AI by 2030, with significant investments in AI talent and technology [5][10].
- Li Auto's AI assistant, "Li Xiang," has evolved from a car-mounted system to a multi-platform application, indicating a broader vision for AI applications beyond the vehicle [7][8].

Group 2: Intelligent Driving Focus
- Intelligent driving was designated as the company's primary strategy in 2023, with plans to heavily invest in this area to compete with major players like Huawei [10][12].
- The company has expanded its intelligent driving team significantly, with over 50 new positions created in late 2023, reflecting a strong commitment to this technology [10][11].
- Li Auto is transitioning its intelligent driving technology from a modular approach to an "end-to-end" model, which is expected to enhance performance and user experience [11][12].

Group 3: Organizational Changes and AI Integration
- An AI Technical Committee was established to integrate AI capabilities across various business lines, indicating a strategic focus on AI as a core business direction [14][15].
- The committee includes leaders from product development and research departments, ensuring that AI applications are aligned with the company's overall strategy [15][16].
- Li Auto's foundational model for AI is seen as a critical component for future developments, with aspirations to rank among the top three in the industry [17][18].
TransDiffuser: The Architecture Li Auto's VLA Uses to Generate Trajectories with Diffusion
理想TOP2· 2025-05-18 13:08
Core Viewpoint
- The article discusses advances in autonomous driving, focusing on the Diffusion model and its use in generating driving trajectories, and highlights the differences between VLM and VLA systems [1][4].

Group 1: Diffusion Model Explanation
- Diffusion is a generative model that learns a data distribution through two processes: adding noise (Forward Process) and removing noise (Reverse Process), akin to solving a puzzle in reverse [4].
- In the denoising process, a neural network is trained to predict and remove the noise, ultimately generating the target data [4].
- Diffusion generates not only the ego vehicle's trajectory but also predicted trajectories for other vehicles and pedestrians, which improves decision-making in complex traffic environments [5].

Group 2: VLM and VLA Systems
- The VLM approach consists of two systems: System 1 uses imitation learning to output trajectories but has no semantic understanding, while System 2 has semantic understanding but only provides suggestions [2].
- VLA is a single system with both fast and slow thinking capabilities, and semantic reasoning is built in [2].
- VLA outputs action tokens that encode the vehicle's driving behavior and surrounding environment; the Diffusion model then decodes these tokens into driving trajectories [4][5].

Group 3: TransDiffuser Architecture
- TransDiffuser is an end-to-end trajectory generation model that integrates multi-modal perception information to produce high-quality, diverse trajectories [6][7].
- The architecture includes a Scene Encoder for processing multi-modal data and a Denoising Decoder that uses the DDPM framework for trajectory generation [7][9].
- The model uses a multi-head cross-attention mechanism to fuse scene and motion features during denoising [9].

Group 4: Performance and Innovations
- The model achieves a Predictive Driver Model Score (PDMS) of 94.85, outperforming existing methods [11].
- Key innovations include anchor-free trajectory generation and a multi-modal representation decorrelation optimization mechanism that increases trajectory diversity and reduces redundancy [11][12].

Group 5: Limitations and Future Directions
- The authors note challenges in fine-tuning the model, particularly the perception encoder [13].
- Future directions include integrating reinforcement learning and drawing on models such as OpenVLA [13].
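The forward (noise-adding) and reverse (denoising) processes described above can be sketched in a few lines. This is a minimal illustration of the standard DDPM update rules, not TransDiffuser's actual code: the function names, the linear noise schedule, and the representation of a trajectory as a flat list of waypoint coordinates are all assumptions made for the example. The reverse step takes the predicted noise as an argument, which in a real system would come from a trained network conditioned on scene features.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Noise variances beta_t, linearly spaced (one common DDPM choice)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def forward_noise(x0, t, alpha_bars, rng):
    """Forward process: sample x_t from q(x_t | x_0) in closed form.

    Returns the noisy sample and the noise that was added, which is the
    regression target when training the noise-prediction network.
    """
    a_bar = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * e
          for x, e in zip(x0, eps)]
    return xt, eps


def reverse_step(xt, t, eps_pred, betas, alpha_bars, rng):
    """One reverse step x_t -> x_{t-1}, given the predicted noise."""
    beta, a_bar = betas[t], alpha_bars[t]
    coef = beta / math.sqrt(1.0 - a_bar)
    mean = [(x - coef * e) / math.sqrt(1.0 - beta)
            for x, e in zip(xt, eps_pred)]
    if t == 0:
        return mean  # the final step adds no fresh noise
    sigma = math.sqrt(beta)  # the simple variance choice sigma_t^2 = beta_t
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
```

As a sanity check on the formulas, noising a trajectory by a single step and then calling `reverse_step` with the true noise recovers the original waypoints up to floating-point error. Generation runs the same step from the last timestep down to 0, starting from pure Gaussian noise and using the network's prediction in place of the true noise.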
Li Auto Is Under Enormous Pressure
虎嗅APP· 2025-05-09 13:14
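Produced by: Huxiu Auto Team | Author: Xiao Man | Header image: 视觉中国

Despite having cash on the order of 100 billion yuan on its books, Li Auto has not yet reached the point where it can sit back and count its money.

Li Auto's sales target for this year is 700,000 vehicles, 200,000 more than last year. Yet the only new models it has planned are the Li i8 and Li i6, both due in the second half of the year; everything else is a refresh. The burden of sales still falls on the L series. The L series, however, is already showing signs of fatigue: for the past four months it has failed to break through 40,000 units a month, falling short of its performance in the second half of last year.

After a first quarter under evident pressure, Li Auto began playing its cards in April. It first unveiled the MEGA Home edition at the Shanghai Auto Show, and on May 8 it released the smart refresh of the L series (that is, the 2025 model-year L series).

Li Auto's approach is "more equipment at the same price." Besides upgrading the intelligent driver-assistance hardware across the lineup, the L7 and L8 get dual-chamber Magic Carpet air suspension as standard on every trim and a new 52.3 kWh large battery, while the L9 gets dual-chamber, dual-valve Magic Carpet air suspension, 18-point hot-stone massage, and other features.

Li Auto has cut its trims precisely, and the added equipment is clearly meant to protect volume. With only the Li i8 and Li i6 left in its hand, how will Li Auto fight the battles ahead?

Can the Li L series still fight?

Li Auto's first hurdle is holding its base in range-extended vehicles.

Since the start of the year, L-series models have declined to varying degrees. The L6, the volume leader, has fallen back to the level of its launch period last year, and the L7's monthly sales have all been below 10,000.

Data source: 车主之家

The Li L series ...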
Li Auto Is Under Enormous Pressure
Hu Xiu· 2025-05-09 10:58
Core Viewpoint
- Despite having over 100 billion in cash, Li Auto has not yet reached a point of effortless profitability and faces significant challenges in meeting its sales target for the year [1].

Group 1: Sales Performance and Targets
- Li Auto's sales target for the year is 700,000 units, 200,000 more than the previous year, but its new models, the Li i8 and Li i6, launch only in the second half of the year, which leaves the sales burden primarily on the L series [1][2].
- The L series has shown signs of sales fatigue, failing to exceed 40,000 units per month for the past four months, below its performance in the second half of last year [1][3].
- The L series' share of the range-extended market declined by 2.1% in Q4 2024, indicating increasing competition [6].

Group 2: Competitive Landscape
- Li Auto faces growing competition from various manufacturers, including Huawei, which has launched multiple models that compete directly with the L series [6][7].
- More than 10 new mid-to-large SUVs are arriving this year, such as the AITO M8 and Lynk & Co 900, intensifying competition in the market [7].
- The AITO M8 has already accumulated over 70,000 pre-orders, positioning it as a strong competitor to the L8 and L9 [7].

Group 3: Product Strategy and Innovations
- Li Auto's strategy is to add new features to the L series while holding prices steady, with the aim of preserving sales volume [2][8].
- The L series is receiving significant upgrades in intelligent driving, including the new Thor-U chip and advanced lidar systems [8][9].
- The new VLA (Vision Language Action) model aims to differentiate Li Auto's intelligent driving technology from competitors by focusing on human-like understanding and reasoning [9][11].

Group 4: Electric Vehicle Strategy
- Li Auto's upcoming pure electric models, the i8 and i6, are seen as critical variables for the company's performance this year, with conservative internal sales expectations of 50,000 units for both [15][16].
- The company is exploring new growth avenues through changes to its sales channels and expansion into overseas markets, although the complexity of international markets poses challenges [16][17].
- The domestic market remains the primary focus, but increasing competition and uncertainty in the electric vehicle segment make the sales target difficult to achieve [17].