End-to-End VLA
The Paper-Publishing Window for End-to-End VLA Won't Stay Open Much Longer...
自动驾驶之心· 2025-11-11 00:00
Core Viewpoint - The article traces the evolution of autonomous driving technology from rule-based systems, to end-to-end models represented by companies such as Li Auto and XPeng, and now to the world-model phase represented by NIO, noting that deep learning has remained central through every stage [1].
Group 1: Course Introduction
- The course follows the progression from modular production algorithms to end-to-end systems and now to VLA, focusing on core algorithms such as BEV perception, vision-language models (VLMs), diffusion models, reinforcement learning, and world models [5].
- Participants gain a comprehensive understanding of the end-to-end technical framework and its key technologies, so that they can reproduce mainstream algorithm frameworks such as diffusion models and VLA and apply them in their own projects [5].
Group 2: Instructor Background
- The course is led by Jason, an algorithm expert at a top domestic manufacturer, who holds an undergraduate degree from a C9 university and a PhD from a QS top-50 institution and has published multiple papers [6].
Group 3: Student Feedback and Outcomes
- According to feedback, students who complete the course reach a level roughly equivalent to one year of experience as an end-to-end autonomous driving algorithm engineer, which helps with internships and job recruitment [5].
Group 4: Research Guidance
- The program takes a structured approach to research, guiding students through topic selection, literature review, methodology development, and paper writing, with a high publication success rate [11][15].
- The service matches each student with an experienced mentor according to research direction and goals, so that the guidance is tailored to the individual [18].
Group 5: Additional Opportunities
- Outstanding students may receive recommendation letters from prestigious institutions and direct referrals to research positions at leading companies such as Alibaba and Huawei [19].
Zhi Ping Fang (智平方) Founder Guo Yandong: Without Technological Confidence, Chinese Robotics Will See No Innovative Breakthroughs
晚点LatePost· 2025-09-28 15:25
Core Viewpoint - The article argues for building general-purpose robots that can learn and adapt across diverse real-world environments, rather than specialized machines that serve only limited functions [2][4][55].
Group 1: Company Background
- Guo Yandong, a prominent figure in the robotics industry, holds a PhD in artificial intelligence from Purdue University and has worked at Microsoft, XPeng Motors, and OPPO [2][3][19].
- In 2023, Guo founded Zhi Ping Fang to focus on the Vision-Language-Action (VLA) model for robotics, positioning the company as a leader in this niche [3][4].
Group 2: Business Strategy
- Zhi Ping Fang has taken a pragmatic approach to business, generating significant revenue from model services while many competitors focus on flashy demonstrations [4].
- The company has secured more than a thousand commercial orders across high-end manufacturing sectors, including the automotive and semiconductor industries, which indicates strong market demand [4][28].
Group 3: Product Development
- The second-generation robot, Aibao, uses a wheeled chassis for stability and efficiency, in contrast to the bipedal designs pursued by competitors such as Tesla [4][33].
- The robots are designed to learn tasks quickly, adapting to a new role within hours to days, which makes them more useful in manufacturing environments [30].
Group 4: Market Position and Future Plans
- The robotics industry is growing rapidly: more than 400 companies currently operate in China, more than in the early days of the electric vehicle sector [47].
- Zhi Ping Fang aims to deliver tens of thousands of robots by 2028, focusing on practical applications in sectors including retail and logistics [51].
Group 5: Industry Insights
- The article stresses that robots must be versatile and able to learn from real-world interaction if they are to be integrated into everyday environments [36][49].
- Guo advocates balancing price and performance, arguing that the focus should be on building high-quality, functional robots rather than competing solely on cost [55].
Autonomous Driving: Should You Take a Job, Pursue a PhD, or Switch Fields?
自动驾驶之心· 2025-09-22 10:30
Core Viewpoint - The article discusses how people in the autonomous driving field should decide whether to pursue a PhD, keep working, or switch careers, emphasizing the importance of foundational knowledge and practical industry experience [2][3].
Group 1: Career Decisions
- The article poses two critical questions for anyone considering a career in autonomous driving: whether their current environment provides foundational knowledge and practical experience, and, if they pursue a PhD, whether they are ready to take on pioneering research [2][3].
- It points out that many academic mentors lack deep expertise in autonomous driving, which can hold back students who do not already have a solid foundation [2].
- It suggests that students assess how prepared they are to explore and solve problems independently, especially in cutting-edge research areas where few references exist [2][3].
Group 2: Community and Resources
- The "Autonomous Driving Heart Knowledge Planet" community is introduced as a resource for beginners, offering a platform for learning, sharing knowledge, and networking within the autonomous driving field [3][5].
- The community has more than 4,000 members and aims to grow to nearly 10,000 over the next two years, providing a space for technical sharing and job-seeking exchanges [3][5].
- The community addresses a range of practical questions and topics, including how to get started with end-to-end systems, multi-modal models, and the latest industry trends [5][16].
Group 3: Learning and Development
- The community offers a structured learning system with more than 40 technical routes covering areas of autonomous driving such as perception, simulation, and planning and control [7][14].
- It provides access to video tutorials, technical discussions, and job opportunities, aimed both at beginners and at those looking to advance their skills [8][18].
- The community also connects members with industry leaders and experts, improving their understanding of the latest developments and job market trends in autonomous driving [12][92].
Robots Cross the "Three Gates": Realities and Trends as Experienced by Embodied-Intelligence Innovators | Discussion Hall
Xinhuanet · 2025-09-15 03:44
Group 1
- The humanoid robot industry is pulling in two directions: practical applications are advancing significantly, while scaling production and securing orders remain difficult [1][5][36].
- Investment in humanoid robotics has surged, with more than 20 companies in the sector pursuing IPOs, marking a transformative year for mass production of humanoid robots [1][5].
- The development of embodied intelligence is at a crossroads and requires a balance between technological innovation and practical profitability [1][15].
Group 2
- Companies such as Beijing Galaxy General Robotics are leading the deployment of humanoid robots across sectors, reaching significant milestones in industrial and retail applications [5][8].
- The key challenge for humanoid robots is operating autonomously without remote control, which depends on advanced data and model training [10][12].
- High-quality data is crucial to improving the capabilities of humanoid robots, with an emphasis on diverse and rich datasets that improve performance in real-world scenarios [12][30].
Group 3
- The success of humanoid robots in competitive settings such as soccer demonstrates their potential for real-world use and helps refine their operational capabilities [36][41].
- The industry faces a "chicken or egg" dilemma: technological progress and market demand must align before a sustainable business model can emerge [37][42].
- Moving from demonstration to practical application is essential for the industry, with a focus on building a commercial ecosystem that supports ongoing development and deployment [35][42].
A Senior Labmate Published His Own End-to-End VLA Paper and Got Into a TOP2 PhD Program...
自动驾驶之心· 2025-08-21 11:24
Core Viewpoint - The article describes a research guidance program on Vision-Language-Action (VLA) models for autonomous driving, intended to help students build research skills and produce publishable papers in the field [5][36].
Group 1: Program Overview
- The VLA research guidance program consists of 12 weeks of online group research, 2 weeks of paper guidance, and 10 weeks of paper maintenance [15][36].
- The program addresses problems students commonly face, such as lack of direction, weak hands-on skills, and difficulty writing and submitting papers [38].
Group 2: Course Structure
- The course runs for 14 weeks, covering topics from introductory lessons to advanced VLA models and paper-writing methodology [10][12][37].
- Key topics include traditional end-to-end autonomous driving, modular VLA models, and reasoning-enhanced VLA models [10][12][36].
Group 3: Target Audience and Requirements
- The program targets bachelor's, master's, and doctoral students who want to strengthen their research capabilities in autonomous driving and AI [16][36].
- Basic requirements are familiarity with deep learning, Python programming, and PyTorch [22][36].
Group 4: Course Benefits
- Participants study classic and cutting-edge papers, build coding skills, and learn methods for writing and revising papers [21][36].
- The program aims to give each student a research idea and to strengthen their ability to conduct independent research [21][36].
Group 5: Teaching Methodology
- The program uses a "2+1" teaching model, with a main instructor and additional support staff to ensure comprehensive learning [24][25].
- Continuous assessment and feedback are used to improve the learning experience and address individual student needs [25][36].