World Models
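Intelligent Driving in 2025: Regulators Hit the Brakes, Technology Races Ahead, and the "Di-Da-Hua-Mo" (地大华魔) Big Four Battle for Supremacy
36Kr · 2025-12-11 09:55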
Core Insights
- In 2025 the automotive industry shifted markedly toward safety and responsibility, moving away from exaggerated claims about autonomous driving technology [1][3]
- The Chinese Ministry of Industry and Information Technology has banned the term "autonomous driving," pushing car manufacturers toward a more realistic portrayal of the technology [3][5]

Industry Developments
- The narrative has changed: companies now talk about "assisted driving" and "intelligent driving assistance" instead of "autonomous driving" [3][5]
- Two main trends characterize the industry: technological advancement and the democratization of intelligent driving [5][11]

Key Players and Innovations
- XPeng has introduced a second-generation VLA model that eliminates the "middleman" translation step, allowing the machine to understand the physical environment directly [6][7]
- BYD launched the "God's Eye" (天神之眼) high-level intelligent driving system, targeting the 100,000 yuan market with several versions and features such as highway NOA and automatic parking [11][13]
- Geely has also entered the market with its own intelligent driving system, offered in multiple versions with varying capabilities [11][13]

Competitive Landscape
- Tesla's role has evolved: Chinese companies no longer view it as the sole leader in intelligent driving technology [13][14]
- Horizon Robotics has gained traction with its end-to-end architecture, aims to make urban NOA widely available, and has achieved significant market share in the sector [19][21]
- DJI's subsidiary, Zhuoyu Technology (卓驭), has focused on practical applications and made strides in the European market, showcasing its urban NOA capabilities [22][24]

Strategic Collaborations
- Huawei has formed numerous partnerships across the automotive industry, providing comprehensive intelligent driving solutions to various manufacturers [25][28]
- Momenta has expanded its collaboration network significantly, working with multiple brands to implement its driving assistance solutions [29][31]

Challenges and Future Outlook
- Despite these advances, the industry faces challenges around user trust and the potential misuse of driving assistance systems [33][34]
- Intelligent driving technology is expected to keep evolving, with a focus on reaching a broader market while addressing safety and ethical concerns [35][36]
The Paper Window for Autonomous-Driving World Models Won't Stay Open Much Longer...
Autonomous Driving Heart (自动驾驶之心) · 2025-12-11 00:05
Core Insights
- The article highlights the recent surge in research papers on world models for autonomous driving, indicating a trend toward localized breakthroughs and verifiable improvements in the field [1]
- It emphasizes the importance of refining submissions to top conferences, suggesting that the final 10% of polishing can significantly affect a paper's overall quality and acceptance [2]
- The platform "Autonomous Driving Heart" is presented as a leading AI technology media outlet in China, with a strong focus on autonomous driving and related interdisciplinary fields [3]

Summary by Sections

Research Trends
- Numerous recent works in autonomous driving, such as MindDrive and SparseWorld-TC, focus on world models, which are expected to dominate upcoming conferences [1]
- The article suggests that the main themes for the end of this year and the first half of next year will likely revolve around world models, indicating a strategic direction for researchers [1]

Guidance and Support
- The platform offers personalized guidance for students, helping them navigate the complexities of research and paper submission [7][13]
- It claims a 96% acceptance rate for students who have received guidance over the past three years [5]

Faculty and Resources
- The platform says it has over 300 dedicated instructors from top global universities to mentor students [5]
- The instructors have extensive experience publishing at top-tier conferences and journals [5]

Services Offered
- The article outlines services including personalized paper guidance, real-time interaction with mentors, and support throughout the research process [13]
- It also mentions the potential for students to receive recommendations from prestigious institutions and direct job placements at leading tech companies [19]
China's AI Charts a Differentiated, Pragmatic Path
China Youth Daily · 2025-12-10 07:28
Core Viewpoint
- The article discusses the contrasting approaches of the United States and China in artificial intelligence (AI), highlighting concerns about a potential AI bubble in the U.S. and China's focus on practical applications and cost-effectiveness [1][3].

Group 1: AI Bubble Concerns
- Voices warning of a possible AI bubble are growing louder in the U.S., with Goldman Sachs reporting five warning signs similar to those seen before the internet bubble burst [1].
- Lin Yifu, a prominent economist, predicts that the U.S. may experience an AI bubble burst during the 14th Five-Year Plan period, potentially leading to a financial crisis similar to the 2008 housing market collapse [1].

Group 2: China's Practical Approach
- Chinese AI development emphasizes "cost-effectiveness, industrial application, and practical results," focusing on foundational innovation rather than speculative concepts [2][3].
- The Chinese strategy follows a "low-cost, high-adaptability, strong implementation" path, in contrast to the U.S. approach of massive investment in AGI [3].

Group 3: Technological Innovations
- Breakthroughs in embodied intelligence are highlighted, with teams in China achieving significant advances in robot dexterity and data accuracy through innovative techniques [4][6].
- AI-driven optimization solvers are expected to capture a significant share of the global optimization market, which is projected to reach $107 billion by 2025, with a compound annual growth rate of about 10% from 2026 to 2033 [4].

Group 4: Industrial Integration and Ecosystem Building
- The integration of AI with hardware is emphasized as a key advantage for China, which can leverage its robust industrial base to create a distinctive "AI + hardware" path [7].
- The article notes the importance of overcoming data fragmentation and building a cohesive ecosystem to ensure the successful commercialization of AI technologies [8].

Group 5: Funding and Innovation Challenges
- There is a call for more patient capital and larger investments in disruptive innovation, particularly in fields like quantum computing and controlled nuclear fusion [9].
- The article suggests that current financing models may not suit groundbreaking innovation, and advocates a shift toward supporting long-term goals in AI development [9].
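Understanding Where China's AI Is Heading in 2025: Companies × Products × People × Solutions, Everything Most Worth Watching Is Here
QbitAI (量子位) · 2025-12-10 04:26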
Core Insights
- 2025 is marked by significant advances in AI, bookended by the emergence of DeepSeek-R1 and the release of the V3.2 series, which together encapsulate the year's technological narrative [1]
- The main storyline is the competition between open-source and closed-source AI models, centered on inference efficiency, training paradigms, and cost structures, while world models evolve from theoretical concepts into real products [1]
- 2025 is referred to as the "Agent Year": AI agents moved from passive responders to proactive planners, driving transformative changes across industries [1]

Group 1: AI Development and Trends
- The AI landscape is evolving into an "Agent Internet Era," indicating a shift in how AI technologies are integrated into everyday applications [2]
- AI is becoming critical infrastructure in sectors like healthcare, meteorology, and industry, moving beyond plugins to become an essential component of existing systems [3]
- The boundary between open-source and closed-source technologies is blurring, with agents, embodied intelligence, and world models overlapping and facilitating cross-industry collaboration [3]

Group 2: AI Awards and Recognition
- The "2025 AI Annual List" was unveiled at the MEET2026 Smart Future Conference, recognizing leading companies, promising startups, outstanding products, solutions, and key figures in the AI sector [6][8]
- The selection process involved hundreds of companies and individuals, with results based on real data and expert opinions, reflecting the most representative forces in China's AI ecosystem [7][8]
- The awards highlight companies that have played dual roles as "wave makers" and "steady navigators," continuously introducing new paradigms, tools, and models to the industry [12][14]

Group 3: Notable Companies and Products
- The "2025 AI Annual Leading Enterprises" list features companies that excel in technology, long-term investment, product implementation, and industry reputation, showcasing a diverse range of approaches to AI [12][18]
- The "2025 AI Annual Outstanding Products" list includes applications that integrate AI into daily communication, search, and creative processes, as well as tools embedded in enterprise workflows [24]
- The "2025 AI Annual Outstanding Solutions" list emphasizes solutions that build cutting-edge algorithms into mature product forms, enhancing real business processes and accelerating the integration of AI technologies [30][31]

Group 4: Key Figures in AI
- The "2025 AI Annual Focus Figures" list includes entrepreneurs and leaders who have made significant contributions to the AI field, demonstrating the importance of human influence in technological advancement [35][36]
- These individuals are recognized for driving product and business growth, advancing scientific research, and fostering collaboration across the industry [35][36]
An Xiangjing: Embodied Mobility for Driverless Terminals Is a New Track Full of Imagination
Sina Finance · 2025-12-10 02:37
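During the 2025 Horizon Technology Ecosystem Conference on December 8-9, An Xiangjing, CEO of Xingshen Intelligent (行深智能), visited Sina Auto's executive interview studio and said: in the future, the job will no longer be delivering specific items one by one, but managing a spatial-transfer platform, turning logistics into spatial transfer. Such a platform can serve different logistics companies across the broad urban distribution system, including express parcels, fresh produce, tobacco, and prepared meals. Going a step further, it can serve sanitation, security, and even gas-leak inspection. Every terminal-mobility or embodied-mobility application can be covered and empowered by driverless capability, and that makes this a space and a track full of imagination.

The following is the interview transcript.

Sina Auto: Thank you, Mr. An, for joining the Sina Auto live studio. Please say a quick hello to everyone.

An Xiangjing: Hello, everyone! I'm An Xiangjing of Xingshen Intelligent. The company was founded in 2017, eight years ago now, and we focus on the L4 last-leg unmanned logistics track.

Sina Auto: You just mentioned the L4 last-leg unmanned track. How does that relate to the "last mile" as we understand it, and what specific scenarios does it cover?

[Photo: Xingshen Intelligent CEO An Xiangjing (right)]

An Xiangjing: Right. Logistics is roughly divided into trunk-line logistics, branch-line logistics, and last-leg logistics. Last-leg logistics basically covers urban distribution as well as the last mile you just mentioned, and even the last 50 meters, so the concept is somewhat broader than the last mile. The last-mile concept is generally ...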
University of Macau Presents the First World-Model-Driven Visual Grounding Framework!
Autonomous Driving Heart (自动驾驶之心) · 2025-12-10 00:04
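Paper authors | Haicheng Liao et al. Editor | Autonomous Driving Heart

In interactive autonomous-driving scenarios, the most awkward moment is this one: the passenger points at the complex intersection ahead and says, "Follow that SUV." The autonomous driving system looks at three nearly identical vehicles and thinks: "Which one? The one on the left? Or the one changing lanes?"

Most existing visual grounding models for autonomous driving behave like a novice who can only "describe the picture." They stare at the current frame and try to find the answer in the pixels. Once the instruction is ambiguous or the target is occluded, they easily mistake one object for another and can even trigger faulty reasoning.

Why don't human drivers get this wrong? Because we "anticipate." When we hear the instruction, our brain instantly plays out the future: the car on the left is about to turn, which does not fit the context of "follow"; only the car in the middle, accelerating straight ahead, is the most likely intended target.

"Think about the future before acting." Inspired by this, a research team from the University of Macau proposed a new framework, ThinkDeeper. It is the first study to bring a world model into visual grounding for autonomous driving. This work not only ...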
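The "anticipate, then choose" intuition described above can be illustrated with a toy sketch. This is not the ThinkDeeper method, whose details are not given in this excerpt: the `Candidate` fields, the constant-velocity `rollout` (a crude stand-in for a learned world model), the lane width, and the scoring rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    x: float   # lateral offset from the ego lane center, meters (hypothetical)
    y: float   # distance ahead of the ego vehicle, meters
    vx: float  # lateral velocity, m/s
    vy: float  # longitudinal velocity, m/s


def rollout(c: Candidate, horizon_s: float = 3.0, dt: float = 0.5):
    """Constant-velocity prediction of future (x, y) positions.

    A stand-in for a world model; a real one would be learned.
    """
    steps = int(horizon_s / dt)
    return [(c.x + c.vx * dt * k, c.y + c.vy * dt * k) for k in range(1, steps + 1)]


def follow_score(c: Candidate, lane_half_width: float = 1.75) -> float:
    """Fraction of predicted points that stay ahead of the ego, inside its lane."""
    future = rollout(c)
    ok = sum(1 for x, y in future if abs(x) <= lane_half_width and y > 0)
    return ok / len(future)


def ground_follow_target(candidates):
    """Pick the candidate whose predicted future best fits 'follow'."""
    return max(candidates, key=follow_score)


suvs = [
    Candidate("left SUV (turning)", x=-1.0, y=15.0, vx=-2.0, vy=5.0),
    Candidate("middle SUV (straight)", x=0.0, y=20.0, vx=0.0, vy=12.0),
    Candidate("right SUV (changing lanes)", x=1.0, y=18.0, vx=1.0, vy=10.0),
]

if __name__ == "__main__":
    print(ground_follow_target(suvs).name)
```

In the current frame all three SUVs sit inside the assumed lane bounds, so a single-frame check cannot separate them. Only the predicted futures single out the middle vehicle, which is the point the passage makes about anticipation.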
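World Models for Autonomous Driving Small-Class Course! A Crash Course on Tesla's World Model and Video & OCC Generation
Autonomous Driving Heart (自动驾驶之心) · 2025-12-09 19:00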
Core Viewpoint
- The article introduces a new course titled "World Models and Autonomous Driving Small Class," focusing on advanced algorithms in autonomous driving, including general world models, video generation, and OCC generation [1][3].

Course Overview
- The course is developed in collaboration with industry leaders and follows the success of a previous course on end-to-end and VLA autonomous driving [1].
- It aims to build understanding and practical skills in world models, which are crucial to the advancement of autonomous driving technology [11].

Course Structure

Chapter 1: Introduction to World Models
- Covers the relationship between world models and end-to-end autonomous driving, the history of world models, and current application cases [6].
- Discusses the main types of world models: pure simulation, simulation plus planning, and models that generate sensor inputs and perception results [6].

Chapter 2: Background Knowledge of World Models
- Focuses on foundational knowledge for world models, including scene representation, Transformer technology, and BEV perception [6][12].
- Highlights the technical terms most frequently encountered in job interviews related to world models [7].

Chapter 3: Discussion on General World Models
- Covers popular general world models and recent hot topics in autonomous driving jobs, including models from Fei-Fei Li's team and DeepMind [7].
- Provides insight into the core technologies and design philosophies behind these models [7].

Chapter 4: Video Generation-Based World Models
- Focuses on video generation algorithms, showcasing significant works such as GAIA-1 & GAIA-2 and recent advances from various institutions [8].
- Includes hands-on practice with open-source projects like OpenDWM [8].

Chapter 5: OCC-Based World Models
- Explores OCC generation algorithms, discussing three major papers and a practical project that extends to vehicle trajectory planning [9].

Chapter 6: World Model Job Topics
- Shares practical experience from the instructor's career, covering industry applications, pain points, and interview preparation for related positions [10].

Target Audience and Learning Outcomes
- The course is designed for people who want to deepen their understanding of end-to-end autonomous driving and world models [11].
- On completion, participants are expected to reach a level equivalent to one year of experience as a world-model autonomous driving algorithm engineer, mastering the key technologies and being able to apply them in projects [14].
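Learn Anytime! The End-to-End and VLA Autonomous Driving Small-Class Course Has Officially Concluded
Autonomous Driving Heart (自动驾驶之心) · 2025-12-09 19:00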
Core Viewpoint
- 2023 marked the year end-to-end systems reached production, and 2024 is expected to be a significant year for end-to-end production in the automotive industry, as leading new-entrant automakers and established manufacturers have already put end-to-end systems into production [1][3].

Group 1: End-to-End Production Development
- End-to-end approaches in the industry follow two main paradigms, single-stage and two-stage. UniAD is a representative single-stage approach that models vehicle trajectories directly from sensor inputs [1].
- Since last year, single-stage end-to-end development has advanced rapidly, producing derivatives such as perception-based, world model-based, diffusion model-based, and VLA-based single-stage methods [3][5].
- Major players in autonomous driving, both solution providers and car manufacturers, are focusing on in-house development and production of end-to-end autonomous driving technologies [3].

Group 2: Course Overview
- A course titled "End-to-End and VLA Autonomous Driving" has been launched to teach cutting-edge single-stage and two-stage end-to-end algorithms, with a focus on the latest developments in industry and academia [5][14].
- The course is organized into several chapters, starting with an introduction to end-to-end algorithms, followed by background knowledge on technologies such as VLA, diffusion models, and reinforcement learning [8][9].
- The second chapter is highlighted as containing the technical keywords expected to be asked most frequently in job interviews over the next two years [9].

Group 3: Technical Focus Areas
- The course covers the subfields of single-stage end-to-end methods: perception-based (UniAD), world model-based, diffusion model-based, and the currently popular VLA-based approaches [10][12].
- The curriculum includes practical assignments, such as RLHF fine-tuning, and aims to give students hands-on experience building and experimenting with pre-training and reinforcement learning modules [11][12].
- The course emphasizes understanding BEV perception, multi-modal large models, and the latest advances in diffusion models, which are crucial for the future of autonomous driving [12][16].
End-to-End Deployment Small-Class Course: Core Algorithms & Hands-On Walkthroughs (7 Projects)
Autonomous Driving Heart (自动驾驶之心) · 2025-12-09 19:00
Core Insights
- The article discusses the evolving recruitment landscape in the autonomous driving sector, highlighting a shift in demand from perception roles to end-to-end, VLA, and world model positions [2]
- A new advanced course focused on end-to-end production in autonomous driving has been designed, emphasizing practical applications and real-world experience [2][4]

Course Overview
- The course covers a range of core algorithms, including one-stage and two-stage end-to-end methods, navigation information applications, reinforcement learning, and trajectory optimization [2]
- It aims to provide the in-depth knowledge and practical skills needed for production in autonomous driving, with a focus on real-world applications and challenges [2][4]

Chapter Summaries
- **Chapter 1: Overview of End-to-End Tasks** Discusses the integration of perception tasks and the learning-based design of control algorithms, which are essential skills for companies in the end-to-end era [7]
- **Chapter 2: Two-Stage End-to-End Algorithm Framework** Introduces the modeling methods of two-stage frameworks and the information transfer between perception and planning, including practical examples [8]
- **Chapter 3: One-Stage End-to-End Algorithm** Focuses on one-stage frameworks that allow for lossless information transfer, presenting various methods and practical learning experiences [9]
- **Chapter 4: Production Application of Navigation Information** Covers the critical role of navigation information in autonomous driving, detailing mainstream navigation map formats and their integration into models [10]
- **Chapter 5: Introduction to RL Algorithms in Autonomous Driving** Explains why reinforcement learning is needed alongside imitation learning to improve the model's ability to generalize [11]
- **Chapter 6: Trajectory Output Optimization** Engages participants in practical projects on algorithms based on imitation learning and reinforcement learning [12]
- **Chapter 7: Safety Net Solutions - Spatiotemporal Joint Planning** Discusses post-processing logic that ensures accuracy and stability in trajectory outputs, introducing common smoothing algorithms [13]
- **Chapter 8: Experience Sharing on End-to-End Production** Provides insights from practical production experience, addressing data, models, scenarios, and strategies for enhancing system capability [14]

Target Audience
- The course is aimed at advanced learners with a foundational understanding of autonomous driving algorithms, reinforcement learning, and programming skills [15][17]
Khosla's Biggest Bet Since OpenAI: General Intuition Builds World Models from 3.8 Billion Game Highlight Clips
Overseas Unicorn (海外独角兽) · 2025-12-09 12:05
Core Insights
- General Intuition has raised $134 million in seed funding, led by Vinod Khosla, marking his largest seed investment since OpenAI in 2019 and signaling a significant bet on the next generation of intelligence paradigms [2][5][6]
- The company aims to create a unique world model that combines human-like intuition and physical common sense, differentiating itself from traditional LLMs [3][6][28]

Funding and Investment
- The $134 million seed round is the largest single seed investment by Khosla Ventures since its initial investment in OpenAI, which was approximately $50 million [5][6]
- Khosla's investment logic is based on first principles, identifying the transformative technological path that General Intuition is pursuing [6][28]

Unique Data Assets
- General Intuition has access to over 3.8 billion game highlight video clips, a unique dataset that is difficult to replicate [7][11]
- The data is filtered to retain only meaningful human actions, providing a rich source of episodic memory for training AI models [12][11]

Technological Framework
- The company envisions a three-phase AI competition landscape: Bits to Bits (text generation), Atoms to Bits (robotic perception), and Atoms to Atoms (physical task execution) [4][5]
- General Intuition aims to drive 80% of atomic-level physical interactions globally by 2030 [5]

AI Model and Training
- The model is designed to understand all possible outcomes given current states and actions, moving beyond traditional video generation models [20][21]
- Training uses imitation learning from millions of human players, allowing the AI to replicate nuanced human behaviors [23][24]

Market Strategy
- The initial focus is the gaming industry, providing a universal AI layer that replaces traditional scripting systems for game developers [34][36]
- Future phases include applications in simulation environments, such as autonomous driving, leveraging low-cost data from virtual worlds [38][39]

Team and Leadership
- The CEO, Pim de Witte, has a strong technical background and a history in the gaming community, which informs the company's strategic direction [42][44]
- The team comprises experts with significant contributions to world models and AI research, enhancing the company's innovative capabilities [46][47]