Visual Perception
Starting Soon! Sharing a Full-Stack Learning Roadmap for Autonomous Driving VLA~
自动驾驶之心· 2025-10-15 23:33
Core Insights
- The focus of academia and industry has shifted toward VLA (Vision-Language-Action) in autonomous driving, which provides human-like reasoning capabilities for vehicle decision-making [1][4]
- Traditional methods in perception and lane detection have matured, leading to decreased attention in these areas, while VLA is now a critical area of development among major autonomous driving companies [4][6]

Summary by Sections

Introduction to VLA
- VLA is categorized into modular VLA, integrated VLA, and reasoning-enhanced VLA, categories that are essential for improving the reliability and safety of autonomous driving [1][4]

Course Overview
- A comprehensive course on autonomous driving VLA has been designed, covering foundational principles through practical applications, including cutting-edge techniques such as CoT, MoE, RAG, and reinforcement learning [6][12]

Course Structure
- The course consists of six chapters: an introduction to VLA algorithms, foundational algorithms, VLM as an interpreter, modular and integrated VLA, reasoning-enhanced VLA, and a final project [12][20]

Chapter Highlights
- Chapter 1 provides an overview of VLA algorithms and their development history, along with benchmarks and evaluation metrics [13]
- Chapter 2 focuses on foundational knowledge of the Vision, Language, and Action modules, including the deployment of large models [14]
- Chapter 3 discusses VLM's role as an interpreter in autonomous driving, covering classic and recent algorithms [15]
- Chapter 4 delves into modular and integrated VLA, emphasizing the evolution of language models in planning and control [16]
- Chapter 5 explores reasoning-enhanced VLA, introducing new modules for decision-making and action generation [17][19]

Learning Outcomes
- The course aims to deepen understanding of VLA's current advancements, core algorithms, and project applications, benefiting participants in internships and job placements [24]
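The modular-versus-integrated distinction above can be made concrete with a minimal sketch. Everything here is illustrative: the class names, stub logic, and control fields are assumptions for exposition, not code from the course or any production VLA stack.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    camera_frames: list   # multi-view images (stand-ins here)
    instruction: str      # natural-language command

@dataclass
class Action:
    steering: float
    acceleration: float

def perceive(obs: Observation) -> str:
    """Vision module: summarize the scene as text (stub)."""
    return f"{len(obs.camera_frames)} views observed"

def reason(scene: str, instruction: str) -> str:
    """Language module: turn scene summary + instruction into a plan (stub)."""
    return f"plan for '{instruction}' given {scene}"

def act(plan: str) -> Action:
    """Action module: decode the plan into control commands (stub)."""
    return Action(steering=0.0, acceleration=0.1)

# Modular VLA chains three separate stages; an integrated VLA would
# instead fuse them into a single end-to-end model, and a
# reasoning-enhanced VLA would insert explicit reasoning (e.g. CoT)
# between perception and action.
obs = Observation(camera_frames=[None] * 6, instruction="turn left at the junction")
action = act(reason(perceive(obs), obs.instruction))
```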
South China Sea Buoy Observation Network Captures Typhoon "Ragasa" Firsthand
Core Insights
- The article highlights the response to Typhoon "Ragasa" through the deployment of a buoy observation network, marking a significant advance in marine extreme-environment monitoring from "data observation" to "visual perception" [1][2]

Group 1: Typhoon Monitoring and Data Collection
- The South China Sea Investigation Center used the buoy observation network to monitor Typhoon "Ragasa" in real time after it entered the South China Sea on September 22 [1]
- The MF14006 buoy recorded extreme conditions, including a maximum wave height of 14.4 meters and a peak wind speed of 38 meters per second during the typhoon [1]
- The buoy's environmental sensing system provided visual data of sea conditions during the typhoon, aiding the validation of ocean numerical forecasting models [1]

Group 2: Technological Advancements
- The environmental sensing system was designed and developed by a technical team from the South China Sea Investigation Center, achieving breakthroughs in image target recognition, efficient compression algorithms, and narrowband satellite transmission [1]
- The system achieved roughly hundredfold image compression and rapid retransmission of lost packets, significantly enhancing data transmission reliability under extreme sea conditions [1][2]
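The "rapid retransmission of lost packets" mentioned above is, in essence, a selective-repeat scheme: number the packets, let the receiver report which sequence numbers never arrived, and resend only those. The sketch below illustrates that idea under stated assumptions; the packet size, loss rate, and function names are hypothetical, not details of the buoy system.

```python
import random

PACKET_SIZE = 256  # bytes per packet; narrowband links favor small frames

def packetize(payload: bytes, size: int = PACKET_SIZE) -> dict:
    """Split a (compressed) image payload into sequence-numbered packets."""
    return {i: payload[o:o + size]
            for i, o in enumerate(range(0, len(payload), size))}

def transmit(packets: dict, loss_rate: float = 0.3, rng=None) -> dict:
    """Simulate a lossy satellite link: each packet survives independently."""
    rng = rng or random.Random(0)
    return {seq: data for seq, data in packets.items()
            if rng.random() > loss_rate}

def retransmit_missing(packets: dict, received: dict) -> dict:
    """Receiver reports missing sequence numbers; sender resends only those."""
    missing = set(packets) - set(received)
    received.update({seq: packets[seq] for seq in missing})
    return received

payload = bytes(3000)  # stand-in for a compressed sea-state image
packets = packetize(payload)
received = transmit(packets)                  # first pass over the lossy link
received = retransmit_missing(packets, received)  # one selective-repeat round
reassembled = b"".join(received[i] for i in sorted(received))
assert reassembled == payload
```

Resending only the missing sequence numbers, rather than the whole image, is what makes recovery fast on a narrowband channel.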
Compete This Summer! Registration for the PRCV 2025 Spatial Intelligence and Embodied Intelligence Visual Perception Challenge Closes Soon~
自动驾驶之心· 2025-08-04 07:31
Group 1
- The competition aims to advance research in spatial intelligence and embodied intelligence, critical technologies for applications in autonomous driving, smart cities, and robotics [5][7]
- The integration of reinforcement learning and computer vision is highlighted as a driving force for breakthroughs in the field [5][7]

Group 2
- The competition is organized by a team of experts from institutions including Beijing University of Science and Technology and Tsinghua University, with sponsorship from Beijing Jiuzhang Yunjing Technology Co., Ltd. [9][10]
- Participants can register as individuals or teams of up to five members and must submit their registration by August 10 [11][12]

Group 3
- The competition consists of two tracks, Spatial Intelligence and Embodied Intelligence, each with specific tasks and evaluation criteria [20][23]
- The Spatial Intelligence track requires participants to construct a 3D reconstruction model from multi-view aerial images, while the Embodied Intelligence track involves completing tasks in dynamic occlusion scenarios [20][23]

Group 4
- Evaluation for Spatial Intelligence covers rendering quality and geometric accuracy, with scores combined by a weighted formula [22][21]
- The Embodied Intelligence track evaluates task completion and execution efficiency, with scores also combined by a weighted system [23][25]

Group 5
- Prizes for each track include cash rewards and computing-resource vouchers, with a total of 12 awards distributed among the top teams [25][27]
- The competition emphasizes intellectual property rights and requires participants to ensure their submissions are original and self-owned [31][28]
Compete This Summer! The PRCV 2025 Spatial Intelligence and Embodied Intelligence Visual Perception Challenge Launches~
自动驾驶之心· 2025-07-17 07:29
Core Viewpoint
- The competition aims to advance research in spatial intelligence and embodied intelligence, focusing on visual perception as a key technology for applications in autonomous driving, smart cities, and robotics [2][4]

Group 1: Competition Purpose and Significance
- Visual perception is crucial for achieving spatial and embodied intelligence, with significant applications in various fields [2]
- The competition seeks to promote high-efficiency and high-quality research in spatial and embodied intelligence technologies [4]
- It aims to explore innovations in cutting-edge methods such as reinforcement learning, computer vision, and graphics [4]

Group 2: Competition Organization
- The competition is organized by a team of experts from institutions including Beijing University of Science and Technology, Tsinghua University, and the Chinese Academy of Sciences [5]
- The competition is supported by sponsors and technical-support units, including Beijing Jiuzhang Yunjing Technology Co., Ltd. [5]

Group 3: Competition Data and Resources
- Participants will have access to real and simulated datasets, including multi-view drone aerial images and task-specific simulation environments [11]
- The sponsor will provide free computing resources, including H800 GPU power, for validating and testing submitted algorithms [12][13]

Group 4: Task Settings
- The competition consists of two tracks, Spatial Intelligence and Embodied Intelligence, each with specific tasks and evaluation methods [17]
- The Spatial Intelligence track involves constructing a 3D reconstruction model from multi-view aerial images [17]
- The Embodied Intelligence track focuses on completing tasks in dynamic-occlusion simulation environments [17]

Group 5: Evaluation Methods
- Evaluation for Spatial Intelligence covers rendering quality and geometric accuracy, using metrics such as PSNR and F1-Score [19][20]
- For Embodied Intelligence, evaluation assesses task completion and execution efficiency, with metrics such as success rate and average pose error [23][21]

Group 6: Awards and Recognition
- Each track carries awards, including cash prizes and computing vouchers, sponsored by Beijing Jiuzhang Yunjing Technology Co., Ltd. [25]
- Awards include a first prize of 6,000 RMB plus 500 computing vouchers, with additional prizes for second and third places [25]

Group 7: Intellectual Property and Data Usage
- Participants must sign a data-usage agreement ensuring that the provided datasets are used solely for the competition and deleted afterward [29]
- Teams must guarantee that their submitted results are reproducible and that all algorithms and related intellectual property belong to them [29]

Group 8: Conference Information
- The 8th China Conference on Pattern Recognition and Computer Vision (PRCV 2025) will be held October 15–18, 2025, in Shanghai [27]
- The conference will feature keynote speeches from leading experts and various forums to promote academic and industry collaboration [28]
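A weighted evaluation of this kind typically normalizes each raw metric into a common range before combining. The sketch below shows the general pattern; the weights, the PSNR normalization range, and the pose-error cap are hypothetical placeholders, since the competition's actual formula is not given in this summary.

```python
def normalize(value: float, lo: float, hi: float) -> float:
    """Clamp-and-scale a raw metric into [0, 1]."""
    return min(max((value - lo) / (hi - lo), 0.0), 1.0)

def spatial_score(psnr: float, f1: float,
                  w_render: float = 0.5, w_geom: float = 0.5,
                  psnr_range: tuple = (10.0, 40.0)) -> float:
    """Weighted mix of rendering quality (PSNR) and geometric accuracy (F1)."""
    return w_render * normalize(psnr, *psnr_range) + w_geom * f1

def embodied_score(success_rate: float, avg_pose_error: float,
                   w_success: float = 0.5, w_acc: float = 0.5,
                   max_error: float = 1.0) -> float:
    """Weighted mix of task success and pose accuracy (lower error is better)."""
    return (w_success * success_rate
            + w_acc * (1.0 - normalize(avg_pose_error, 0.0, max_error)))

# e.g. a 25 dB render with F1 = 0.8 scores 0.65 under these example weights
print(spatial_score(25.0, 0.8))
```

Inverting the pose-error term (so that lower error yields a higher score) is the standard trick for mixing "higher is better" and "lower is better" metrics in one weighted sum.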
Compete This Summer! The PRCV 2025 Spatial Intelligence and Embodied Intelligence Visual Perception Challenge Officially Launches~
自动驾驶之心· 2025-06-30 12:51
Core Viewpoint
- The competition aims to advance research in spatial intelligence and embodied intelligence, focusing on visual perception as a key supporting technology for applications in autonomous driving, smart cities, and robotics [2][4]

Group 1: Competition Purpose and Significance
- Visual perception is crucial for achieving spatial and embodied intelligence, with significant applications in various fields [2]
- The competition seeks to promote high-efficiency and high-quality research in spatial and embodied intelligence technologies [4]
- It aims to explore innovations in cutting-edge methods such as reinforcement learning, computer vision, and graphics [4]

Group 2: Competition Organization
- The competition is organized by a team of experts from institutions including Beijing University of Science and Technology, Tsinghua University, and the Chinese Academy of Sciences [5]
- The competition is sponsored by Beijing Jiuzhang Yunjing Technology Co., Ltd., which also provides technical support [5]

Group 3: Competition Data and Resources
- Participants will have access to real and simulated datasets, including multi-view drone aerial images and task-specific simulation environments [11]
- The sponsor will provide free computing resources, including H800 GPU power, for validating and testing submitted algorithms [12][13]

Group 4: Task Settings
- The competition consists of two tracks, Spatial Intelligence and Embodied Intelligence, each with specific tasks and evaluation methods [17]
- Spatial Intelligence requires building a 3D reconstruction model from multi-view aerial images, while Embodied Intelligence involves completing tasks in dynamic occlusion scenarios [17]

Group 5: Evaluation Methods
- Evaluation for Spatial Intelligence covers rendering quality and geometric accuracy, with scores based on PSNR and F1-Score metrics [19][20]
- For Embodied Intelligence, evaluation focuses on task completion and execution efficiency, with metrics such as success rate and average pose error [23][21]

Group 6: Submission and Awards
- Results must be submitted in a specified format, and top-ranking teams will have their results reproduced for evaluation [24]
- Awards for each track include cash prizes and computing vouchers, with a total of 12 awards distributed among the top teams [25]
From "Behind the Scenes" to "Center Stage": Ledong Robotics' R&D Spending Falls While Marketing Spending Triples
Beijing Business Daily · 2025-06-16 14:36
Core Viewpoint
- Ledong Robotics, established in 2017, launched its first generation of lawn-mowing robots in 2024, with a second generation planned for the following year. The company has filed for an IPO in Hong Kong, reporting revenues of 234 million RMB, 277 million RMB, and 467 million RMB for 2022, 2023, and 2024 respectively, with corresponding net losses of 73.13 million RMB, 68.49 million RMB, and 57.48 million RMB [1][2]

Financial Performance
- Revenue increased from 234 million RMB in 2022 to 467 million RMB in 2024, a compound annual growth rate (CAGR) of roughly 41.3% [1][2]
- Net losses narrowed from 73.13 million RMB in 2022 to 57.48 million RMB in 2024, indicating a steady reduction in losses [1][2]
- The gross profit margin declined from 27.3% in 2022 to 19.5% in 2024, while sales costs rose from 169.89 million RMB to 376.03 million RMB over the same period [2][9]

Product Segmentation
- Visual perception products, including sensors and algorithm modules, accounted for 97.8%, 99.1%, and 94% of revenue from 2022 to 2024, highlighting the importance of this segment [3][4]
- Sensor sales grew sharply, from 133.89 thousand units in 2022 to 695.83 thousand units in 2024, while algorithm-module sales peaked at 1.07 million units in 2023 before declining [5][6]

Market Position
- Ledong Robotics holds a 1.6% share of the global market for visual-perception-focused intelligent robots, ranking first among competitors [2][3]
- The global market for intelligent lawn-mowing robots is projected to reach 6.1 billion RMB in 2024, and Ledong Robotics has sold over 15,000 units since launch [6][7]

Cost Structure
- Research and development (R&D) expenses remained stable, dipping slightly from 96.7 million RMB in 2022 to 94.86 million RMB in 2024, while marketing expenses surged by 323% in 2024 [9][10]
- Overall operating expenses rose gradually, from 156 million RMB in 2022 to 158 million RMB in 2023 and 163 million RMB in 2024 [9][10]

Future Outlook
- Ledong Robotics plans to keep increasing R&D and marketing expenditures, which may mean continued net losses as the company expands its global footprint [9][10]
- The company sees significant growth potential in the lawn-mowing robot segment, particularly in Europe, North America, and Australia [6][7]
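Applying the standard CAGR formula to the reported revenue figures gives the two-year growth rate directly; the snippet below is just that arithmetic made explicit, using the revenues stated in the filing summary.

```python
def cagr(begin: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end / begin) ** (1 / years) - 1

# Reported revenue, RMB (from the IPO filing summary above)
revenue = {2022: 234e6, 2023: 277e6, 2024: 467e6}

growth = cagr(revenue[2022], revenue[2024], years=2)
print(f"{growth:.1%}")  # prints 41.3%
```

Note that 2022 to 2024 spans two compounding periods, not three, which is the usual source of error when quoting a CAGR across three fiscal years.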