End-to-End
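XPeng and Li Auto Are Going All In on the VLA Route: What Research Directions Does It Actually Cover?
Autonomous Driving Heart (自动驾驶之心) · 2025-09-17 23:33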
Core Viewpoint - The article discusses the transition in intelligent driving technology from rule-driven to data-driven approaches, highlighting the limitations of end-to-end models in complex scenarios and the potential of VLA (Vision-Language-Action) as a more streamlined solution [1][2].
Group 1: Challenges in Learning and Research
- The technical stack for autonomous driving VLA has not yet converged, leading to a proliferation of algorithms and making it difficult for newcomers to enter the field [2].
- A lack of high-quality documentation and fragmented knowledge across domains raises the entry barrier for beginners in autonomous driving VLA research [2].
Group 2: Course Development
- A new course titled "Autonomous Driving VLA Practical Course" has been developed to address the challenges faced by learners, focusing on a comprehensive understanding of the VLA technical stack [3][4].
- The course aims to provide a one-stop opportunity to build knowledge across multiple fields, including visual perception, language modules, and action modules, while integrating cutting-edge technologies [2][3].
Group 3: Course Features
- The course emphasizes quick entry into the subject matter through a Just-in-Time Learning approach, using simple language and case studies to help students grasp core technologies rapidly [3].
- It aims to build a framework for research capabilities, enabling students to categorize papers and extract innovative points to form their own research systems [4].
- Practical application is a key focus, with hands-on sessions designed to close the loop from theory to practice [5].
Group 4: Course Outline
- The course covers the origins of autonomous driving VLA, foundational algorithms, and the differences between modular and integrated VLA [6][10][12].
- It includes practical sessions on dataset creation, model training, and performance enhancement, providing a comprehensive learning experience [12][14][16].
Group 5: Instructor Background
- The instructors have extensive experience in multimodal perception, autonomous driving VLA, and large model frameworks, with numerous publications in top-tier conferences [22].
Group 6: Learning Outcomes
- Upon completion, students are expected to thoroughly understand the current advancements in autonomous driving VLA and master core algorithms [23][24].
- The course is designed to benefit students in internships, job recruitment, and further academic pursuits in the field [26].
Group 7: Course Schedule
- The course is set to begin on October 20, with a structured timeline for unlocking chapters and providing support through online Q&A sessions [27].
Those Who Claim End-to-End Cures Everything Have Never Actually Done PnC...
Autonomous Driving Heart (自动驾驶之心) · 2025-09-16 23:33
Core Viewpoint - The article discusses the current state and future potential of end-to-end (E2E) autonomous driving systems, emphasizing the need for a shift from modular to E2E approaches in the industry, while acknowledging the challenges and limitations that still stand in the way of a mature technology [3][5].
Group 1: End-to-End Autonomous Driving
- An end-to-end system processes raw sensor data directly and outputs control signals for the vehicle, a significant shift from traditional modular approaches [3][4].
- E2E systems are seen as a way to provide a comprehensive representation of the information affecting vehicle behavior, which is crucial for handling the open-set scenarios of autonomous driving [4].
- The industry is currently divided, with some companies focusing on Vision-Language-Action (VLA) models and others on traditional methods, but there is a consensus that E2E systems are the future [2][5].
Group 2: Industry Trends and Challenges
- There is a growing recognition that autonomous driving is transitioning from rule-based to knowledge-driven systems, which necessitates a deeper understanding of E2E methodologies [5].
- Despite the high potential of E2E systems, significant challenges remain before they can fully replace traditional planning and control methods [5].
- The article suggests that companies should allow more time for E2E systems to mature rather than rushing to implement them without adequate understanding [5].
Group 3: Community and Learning Resources
- The "Autonomous Driving Heart Knowledge Planet" community aims to provide a platform for sharing knowledge and resources related to autonomous driving, including technical routes and job opportunities [8][18].
- The community has gathered over 4,000 members and aims to expand to nearly 10,000 within two years, offering a space for both beginners and advanced learners to engage with industry experts [8][18].
- Various learning resources, including video tutorials and technical discussions, are available to help members navigate the complexities of autonomous driving technologies [12][18].
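
The contrast drawn in Group 1, between a modular stack and an end-to-end system that maps raw sensor data straight to control signals, can be illustrated with a toy sketch. Everything here is hypothetical and not taken from the article: the `Control` type, the list-of-floats sensor frame, the 50 m scaling constant, and `toy_policy`, which merely stands in for a trained network.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stand-in for raw sensor data: a list of range readings in metres.
SensorFrame = List[float]


@dataclass(frozen=True)
class Control:
    steering: float  # assumed unit: radians, positive = left
    throttle: float  # assumed range: 0.0 to 1.0


# --- Modular stack: explicit hand-offs between hand-written stages ---
def perceive(frame: SensorFrame) -> float:
    """Toy perception: the smallest reading is the nearest obstacle distance."""
    return min(frame)


def plan(obstacle_distance: float) -> float:
    """Toy planner: the target throttle shrinks as the obstacle gets closer."""
    return max(0.0, min(1.0, obstacle_distance / 50.0))


def control(target_throttle: float) -> Control:
    """Toy controller: drive straight at the planned throttle."""
    return Control(steering=0.0, throttle=target_throttle)


def modular_stack(frame: SensorFrame) -> Control:
    return control(plan(perceive(frame)))


# --- End-to-end: one learned mapping from raw sensors to control ---
Policy = Callable[[SensorFrame], Tuple[float, float]]


def end_to_end_stack(frame: SensorFrame, policy: Policy) -> Control:
    steering, throttle = policy(frame)
    return Control(steering=steering, throttle=throttle)


def toy_policy(frame: SensorFrame) -> Tuple[float, float]:
    """Placeholder for a trained network; here it just mimics the modular logic."""
    return 0.0, max(0.0, min(1.0, min(frame) / 50.0))


if __name__ == "__main__":
    frame = [25.0, 40.0]
    print(modular_stack(frame))
    print(end_to_end_stack(frame, toy_policy))
```

The point of the sketch is the interface rather than the logic: the modular version exposes intermediate quantities (obstacle distance, target throttle) that can be inspected and tuned by hand, while the end-to-end version hides them inside a single policy.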
2025: Who Are the Heads of Autonomous Driving at China's Intelligent-Driving Players?
Autonomous Driving Heart (自动驾驶之心) · 2025-09-10 23:33
Core Viewpoint - The autonomous driving industry is undergoing a significant technological shift towards "end-to-end" solutions, driven by Tesla's leadership and advancements in large model technologies. This shift is prompting domestic automakers to increase investments and adjust their structures, making "end-to-end" a mainstream production solution by 2024 [1].
Group 1: Key Figures in Autonomous Driving
- The article highlights key figures in China's autonomous driving sector, focusing on those who directly influence technology routes and team growth [1].
- Notable leaders include:
  - **Lang Xianpeng** from Li Auto, who has led advancements in assisted driving technology, including the launch of full-scene NOA and the no-map NOA feature [5].
  - **Ye Hangjun** from Xiaomi, who has been pivotal in the development of Xiaomi's end-to-end driving system and has overseen multiple cutting-edge projects [7][9].
  - **Ren Shaoqing** from NIO, who has significantly contributed to the development of urban NOA and emphasizes the importance of data in smart driving [11].
  - **Li Liyun** from XPeng, who has taken over leadership in smart driving and focuses on a pure vision solution [14][15].
  - **Yang Dongsheng** from BYD, who has led the development of the DM-i hybrid system and is pushing for the integration of advanced driving systems across all BYD models [17][20].
  - **Su Jing** from Horizon Robotics, who is leading the development of end-to-end HSD solutions [21][22].
  - **Cao Xudong** from Momenta, who has developed a data-driven strategy for autonomous driving and is focusing on end-to-end large models [25][26].
Group 2: Technological Trends and Innovations
- The article discusses the technological evolution in the autonomous driving sector, emphasizing the transition to end-to-end architectures and the emergence of large models, world models, and VLM solutions [1][53].
- Companies are adopting various strategies:
  - Li Auto is focusing on E2E and VLA systems [5].
  - Xiaomi is heavily investing in end-to-end technology with significant output [9].
  - NIO is pursuing a world behavior model approach [11].
  - XPeng is committed to a pure vision strategy [15].
  - BYD is integrating advanced driving systems across its entire lineup [20].
  - Momenta is leveraging a dual strategy of L2 and L4 development to enhance its market position [26].
Group 3: Future Outlook
- The article concludes that the leaders in the autonomous driving industry are crucial in shaping the future of smart driving in China, with a shared goal of creating systems that are safe, reliable, and tailored to local conditions [51][53].
- The ongoing competition and collaboration among these leaders will drive the industry towards more intelligent and user-friendly solutions [51].
The Post-End-to-End Era: Must We Find a New Path?
Autonomous Driving Heart (自动驾驶之心) · 2025-09-01 23:32
Core Viewpoint - The article discusses the evolution of autonomous driving technology, particularly focusing on the transition from end-to-end systems to Vision-Language-Action (VLA) models, highlighting the differing approaches and perspectives within the industry regarding these technologies [6][32][34].
Group 1: VLA and Its Implications
- VLA, or Vision-Language-Action Model, aims to integrate visual perception and natural language processing to enhance decision-making in autonomous driving systems [9][10].
- The VLA model attempts to map human driving instincts into interpretable language commands, which are then converted into machine actions, potentially offering both strong integration and improved explainability [10][19].
- Companies like Wayve are leading the exploration of VLA, with their LINGO series demonstrating the ability to combine natural language with driving actions, allowing for real-time interaction and explanations of driving decisions [12][18].
Group 2: Industry Perspectives and Divergence
- The current landscape of autonomous driving is characterized by a divergence in approaches, with some teams embracing VLA while others remain skeptical, preferring to focus on traditional Vision-Action (VA) models [5][6][19].
- Major players like Huawei and Horizon have expressed reservations about VLA, opting instead to refine existing VA models, which they believe can still achieve effective results without the complexities introduced by language processing [5][21][25].
- The skepticism surrounding VLA stems from concerns about the ambiguity and imprecision of natural language in driving contexts, which can lead to challenges in real-time decision-making [19][21][23].
Group 3: Technical Challenges and Considerations
- VLA models face significant technical challenges, including high computational demands and potential latency issues, which are critical in scenarios requiring immediate responses [21][22].
- The integration of language processing into driving systems may introduce noise and ambiguity, complicating the training and operational phases of VLA models [19][23].
- Companies are exploring various strategies to mitigate these challenges, such as enhancing computational power or refining data collection methods to ensure that language inputs align effectively with driving actions [22][34].
Group 4: Future Directions and Industry Outlook
- The article suggests that the future of autonomous driving may not rely solely on new technologies like VLA but also on improving existing systems and methodologies to ensure stability and reliability [34].
- As the industry evolves, companies will need to determine whether to pursue innovative paths with VLA or to solidify their existing frameworks, each offering unique opportunities and challenges [34].
Autonomous Driving Heart Is Recruiting Business Partners! Model Deployment / VLA / End-to-End Tracks
Autonomous Driving Heart (自动驾驶之心) · 2025-08-28 08:17
Core Viewpoint - The article emphasizes the recruitment of business partners for the autonomous driving sector, highlighting the need for expertise in various advanced technologies and offering attractive incentives for potential candidates [2][3][5].
Group 1: Recruitment Details
- The company plans to recruit 10 outstanding partners for autonomous driving-related course development, research paper guidance, and hardware development [2].
- Candidates with expertise in large models, multimodal models, diffusion models, and other advanced technologies are particularly welcome [3].
- Preferred qualifications include a master's degree or higher from a university in the QS top 200, with priority given to candidates with significant conference contributions [4].
Group 2: Incentives and Opportunities
- The company offers resource sharing related to autonomous driving, including job recommendations, PhD opportunities, and study abroad guidance [5].
- Attractive cash incentives are part of the compensation package for successful candidates [5].
- Opportunities for collaboration on entrepreneurial projects are also available [5].
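How to Prepare for Autumn Campus Recruitment in End-to-End, Large Models, and World Models? We Built a Job-Seeker Discussion Group...
Autonomous Driving Heart (自动驾驶之心) · 2025-07-30 23:33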
Core Viewpoint - There is a growing gap between academic knowledge and the practical skills required in the workplace, particularly for job seekers preparing for campus recruitment [1].
Group 1: Industry Observations
- Many individuals with work experience are exploring opportunities in large models and world models, indicating a shift in industry focus [1].
- Traditional planning-and-control (PnC) work is being reconsidered as the industry moves towards embodied approaches [1].
Group 2: Community Building
- The company aims to create a comprehensive platform that connects talent across the industry, facilitating growth and collaboration [1].
- A new community has been established to discuss industry-related topics, including company developments, product research, and job seeking [1].
- The community encourages networking among industry peers and aims to provide timely insights into industry trends [1].
H1 Net Profit Jumps 44% as WuXi AppTec Accelerates Back onto a Growth Track
36Kr (36氪) · 2025-07-11 13:48
Core Viewpoint - WuXi AppTec is entering a growth phase, with significant revenue and profit increases expected in the first half of 2025, driven by its unique "integrated, end-to-end" CRDMO business model [4][5][21].
Financial Performance
- WuXi AppTec anticipates revenue of approximately RMB 20.799 billion for the first half of 2025, representing year-on-year growth of about 20.64%, with core business revenue expected to grow by approximately 24.24% [4].
- Adjusted net profit is projected to be around RMB 6.315 billion, a year-on-year increase of approximately 44.43% [4].
- The company expects net profit of approximately RMB 8.561 billion, a year-on-year increase of about 101.92%, largely due to the sale of equity in an associate company [4][11].
Market Reaction
- Following the positive earnings forecast, WuXi AppTec's stock surged over 10% in the Hong Kong market, indicating strong investor confidence in the company's recovery and growth potential [5][20].
Business Model and Growth Drivers
- The company's success is attributed to its focus on the "integrated, end-to-end" CRDMO model, which allows for a steady flow of early-stage projects converting into downstream projects [14][15].
- WuXi AppTec's order backlog exceeded RMB 40 billion for the first time, with a significant increase in orders expected to drive future revenue growth [8][15].
Regional and Sectoral Insights
- The overseas market remains a key revenue driver for WuXi AppTec, with biotech financing recovering faster overseas than in domestic markets [16].
- The company is expanding its capabilities in new molecular businesses, particularly in peptides and oligonucleotides, which are expected to be significant growth drivers in the coming years [16][19].
Capacity Expansion
- WuXi AppTec is actively expanding its production capacity, with plans to increase its peptide solid-phase synthesis reactor volume significantly by the end of 2025 [18][19].
- The company is also investing heavily in global D&M capacity construction, with capital expenditures projected to reach RMB 7-8 billion in 2025 [19].
Future Outlook
- With the global biopharmaceutical investment climate improving and the domestic innovative drug market remaining strong, WuXi AppTec is well-positioned for continued growth [21].
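
The year-on-year percentages under Financial Performance imply prior-year baselines that the summary does not state. As a rough check, and not as figures reported by the company, they can be backed out by dividing each forecast value by one plus its growth rate. The helper name `implied_prior` is hypothetical, and the results are approximate because the inputs are themselves rounded.

```python
def implied_prior(current: float, growth_pct: float) -> float:
    """Back out the prior-period value from a current value and its YoY growth in percent."""
    return current / (1.0 + growth_pct / 100.0)


# (H1 2025 forecast in RMB billion, year-on-year growth in percent), as quoted in the summary
figures = {
    "revenue": (20.799, 20.64),
    "adjusted net profit": (6.315, 44.43),
    "net profit": (8.561, 101.92),
}

if __name__ == "__main__":
    for name, (current, growth) in figures.items():
        prior = implied_prior(current, growth)
        print(f"{name}: implied H1 2024 value of about RMB {prior:.2f} billion")
```

For revenue this gives roughly RMB 17.24 billion for the first half of 2024. The implied figures also show why the net profit line more than doubles while adjusted net profit grows by about 44%: the summary attributes much of the difference to a one-off equity sale.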
When We Talk About Large-Model and VLA Jobs, What Do They Actually Involve? (Job Listings Included)
Autonomous Driving Heart (自动驾驶之心) · 2025-07-11 11:23
Core Viewpoint - The article discusses the differences between VLA (Vision-Language-Action) and end-to-end models in the context of autonomous driving, emphasizing the importance of large models and their applications in the industry [2].
Group 1: Job Descriptions and Requirements
- Positions related to large model development, including VLA and end-to-end roles, are highlighted, with a focus on skills in fine-tuning, lightweight models, and deployment [2].
- The job of an end-to-end/VLA engineer involves developing and implementing driving systems, optimizing model structures, and constructing high-quality training datasets [6].
- The VLA/VLM algorithm position requires a master's degree in computer science or AI, with 3-5 years of experience in autonomous driving or AI algorithms, and proficiency in VLA/VLM architectures [8][10].
Group 2: Technical Skills and Experience
- Candidates are expected to have experience with multimodal large language models, fine-tuning existing models for specific business scenarios, and familiarity with Transformer and multimodal technologies [5].
- Experience in computer vision, trajectory prediction, and decision planning is essential, along with a strong foundation in mainstream technologies and frameworks like PyTorch [9].
- The article emphasizes the need for candidates to have published papers in top conferences or achieved notable results in international competitions [9][11].
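A Conversation with Qianxun's Gao Yang: End-to-End Is the Future of Embodied Intelligence; Hierarchical Models Are Only a Short-Term Transition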
晚点LatePost· 2025-07-10 12:30
Core Viewpoint - The breakthrough in embodied intelligence will not occur in laboratories but in practical applications, indicating a shift from academic research to entrepreneurial ventures in the field [1][5].
Company Overview
- Qianxun Intelligent was founded by Gao Yang, a chief scientist and assistant professor at Tsinghua University, and Han Fengtao, a veteran in the domestic robotics industry, to explore the potential of embodied intelligence [2][3].
- The company recently demonstrated its new Moz1 robot, capable of performing intricate tasks such as organizing office supplies [4][3].
Industry Trends
- The development of embodied intelligence is currently at a critical scaling moment, similar to the advancements seen with large models like GPT-4, but it may take an additional four to five years for significant breakthroughs [2][29].
- There is a notable difference in the development of embodied intelligence between China and the U.S., with China having advantages in hardware manufacturing and faster repair times for robots [6][7].
Research and Development
- Gao Yang transitioned from autonomous driving to robotics, believing that robotics offers more versatility and challenges compared to specialized applications like self-driving cars [10][12].
- The field of embodied intelligence is experiencing a convergence of ideas, with many previously explored paths being deemed unfeasible, leading to a more focused research agenda [12][13].
Technological Framework
- Gao Yang defines the stages of embodied intelligence, with the industry currently approaching Level 2, where robots can perform a limited range of tasks in office settings [17][18].
- The preferred approach in the industry is end-to-end systems, particularly the vision-language-action (VLA) model, which integrates visual, linguistic, and action components into a unified framework [19][20].
Data and Training
- The training of VLA models involves extensive data collection from the internet, followed by fine-tuning with real-world operation data and reinforcement learning to enhance performance [23][24].
- The scaling law observed in the field indicates that increasing data volume significantly improves model performance, with a 10-fold increase in data leading to substantial performance gains [27][28].
Market Dynamics
- The demand for humanoid robots stems from the need to operate in environments designed for humans, although non-humanoid designs may also be effective depending on the application [33][34].
- The industry is moving towards a model where both the "brain" (AI) and the "body" (robotic hardware) are developed in tandem, similar to the automotive industry, allowing for specialization in various components [39][41].
Revisiting Li Auto Through an Apple Retrospective: A Smartphone, Not a Home Appliance
Tianfeng Securities· 2025-06-16 05:09
Industry Rating
- The industry investment rating is maintained at "Outperform" [1].
Core Insights
- The report emphasizes the evolution of Apple from iPhone 1 to iPhone 4, highlighting how it reshaped the smartphone standard and established a composite profit system through hardware, services, and ecosystem integration [3][10].
- The report discusses Li Auto's transition from range-extended vehicles to pure electric and AI integration, indicating a gradual enhancement of its competitive moat [4][34].
Summary by Sections
Apple Review
- The iPhone 1, launched in 2007, revolutionized smartphones by changing interaction logic, functionality, design, and the software ecosystem [10].
- By 2024, iPhone products accounted for 51.45% of Apple's revenue, while services contributed 24.59% with a high gross margin of 73.9%, indicating a significant shift towards service profitability [3][16].
Li Auto
- Li Auto's product definition capabilities have redefined the home SUV market, with the Li ONE and the popular L series establishing a strong brand identity [4][23].
- The company has successfully addressed key issues in the electric vehicle market, such as range anxiety and charging concerns, through its range-extended technology [4][33].
- Li Auto is advancing towards L4 autonomous driving capabilities, with significant developments in AI and intelligent driving systems [34][38].