World Model
A 4,000-Member "Whampoa Academy" for Autonomous Driving, Devoted to Technical Sharing and Job-Hunting Exchange
自动驾驶之心· 2025-07-12 14:43
Core Viewpoint - The smart driving industry is growing rapidly, and companies are investing heavily in research and talent acquisition, which points to a robust job market and opportunities for new entrants [2][3].
Group 1: Industry Trends
- The smart driving sector continues to attract substantial R&D funding, and companies offer competitive salaries to attract talent [2].
- Technology iteration cycles in autonomous driving are getting shorter, with attention shifting to advanced approaches such as vision-language-action (VLA) models and end-to-end systems [7][11].
Group 2: Community and Learning Resources
- The "Autonomous Driving Heart Knowledge Planet" aims to be a comprehensive knowledge-sharing community focused on the academic and engineering challenges of the autonomous driving industry [3][11].
- The community has established a structured learning path covering perception, planning, control, and other areas of autonomous driving technology [13][15].
Group 3: Educational Offerings
- The community offers video courses, hardware tutorials, and live sessions with industry experts, aimed at both newcomers and experienced professionals [3][15].
- Dedicated job-preparation modules, including resume sharing and interview experiences, help members navigate the job market [5][12].
Group 4: Technical Focus Areas
- Key technical focus areas include vision-language models, world models, and end-to-end autonomous driving systems, with ongoing discussion of how to integrate and apply them in real-world scenarios [11][36].
- The community stresses keeping up with the latest algorithms and models, such as diffusion models and generative techniques, as a basis for future work in autonomous driving [16][36].
Fei-Fei Li: University Students Should Pursue AI's "North Star" Problems
Hu Xiu· 2025-07-08 08:15
Core Insights - The article traces Fei-Fei Li's journey from her early academic achievements to her current role as a company CEO, emphasizing her passion for starting from scratch and building innovative AI solutions [1][2][24].
Group 1: ImageNet and AI Development
- ImageNet was conceived around 18 years ago to address the lack of data in AI and machine learning, particularly in computer vision, where data was essential for developing algorithms [4][6].
- The project aimed to download 1 billion images from the internet to build a global visual classification system, which became a cornerstone for training and testing machine learning algorithms [6][7].
- ImageNet's breakthrough moment came in 2012 with AlexNet, which used a convolutional neural network (CNN) and significantly reduced the error rate on image recognition tasks [8][10].
Group 2: Vision and Future of AI
- Li emphasizes the importance of spatial intelligence for achieving artificial general intelligence (AGI), arguing that AGI remains incomplete without it [14].
- AI has progressed from object recognition to scene understanding and now to generating 3D worlds, which brings a new set of challenges [12][16].
- Integrating language models with visual understanding is seen as a critical area for future research and application, particularly in robotics and the metaverse [20][21].
Group 3: Advice for Students and Researchers
- Li advises students to pursue fundamental "North Star" problems in AI that are not necessarily tied to industrial applications, because the balance of resources between academia and industry has shifted significantly [34][35].
- She encourages interdisciplinary AI research, particularly in scientific discovery, and highlights the importance of curiosity and problem-solving in graduate studies [38][39].
- The article underscores the need for a new generation of researchers who are fearless and willing to tackle complex challenges in AI [32][33].
The 2025 Autumn Recruitment Season Has Started, and I've Been Feeling a Bit Lost Lately...
自动驾驶之心· 2025-07-08 07:53
Core Viewpoint - The article discusses current trends and opportunities in autonomous driving and embodied intelligence, emphasizing that job seekers in these areas need strong technical skills and knowledge of cutting-edge technologies [3][4].
Group 1: Job Market Insights
- The job market for autonomous driving and embodied intelligence is competitive, with high demand for candidates who have strong backgrounds and technical skills [2][3].
- Companies increasingly look for expertise in advanced areas such as end-to-end models, vision-language models (VLMs), and reinforcement learning [3][4].
- Talent in traditional robotics is saturated, but many robotics startups are growing rapidly and attracting significant funding [3][4].
Group 2: Learning and Development
- The article encourages readers to strengthen their technical skills, particularly in SLAM (Simultaneous Localization and Mapping) and ROS (Robot Operating System), which are relevant to robotics and embodied intelligence [3][4].
- It mentions a community platform that offers video courses, hardware learning materials, and job information, with the aim of building a large network of professionals in intelligent driving and embodied intelligence [5].
Group 3: Technical Trends
- The article highlights four major technical directions in the industry: vision-language models, world models, diffusion models, and end-to-end autonomous driving [8].
- It links to resources and papers on these technologies, with a focus on the latest advances and applications in the field [9][10].
Fei-Fei Li's Latest Interview: Without Spatial Intelligence, AGI Is Incomplete
量子位· 2025-07-02 09:33
Core Viewpoint - The article emphasizes the importance of spatial intelligence in achieving Artificial General Intelligence (AGI), as articulated by AI expert Fei-Fei Li, who believes that understanding and interacting with the 3D world is fundamental to AI development [1][4][29].
Group 1: Spatial Intelligence and AGI
- Fei-Fei Li asserts that AGI is incomplete without spatial intelligence, and that it is necessary to build world models that capture the structure and dynamics of the 3D world [29].
- She identifies 3D world modeling as a critical challenge, stating that understanding, generating, reasoning, and acting within a 3D environment are essential problems for AI [7][29].
- Li frames the pursuit of spatial intelligence as a lifelong goal: developing algorithms that can tell the stories of the world by understanding complex scenes [20][29].
Group 2: Historical Context and Breakthroughs
- The article discusses the inception of ImageNet, a pivotal project initiated by Li to create a vast dataset for training AI in visual recognition, addressing the data scarcity of AI's early days [11][14].
- The success of ImageNet led to significant advances in computer vision, particularly with the introduction of AlexNet, which used convolutional neural networks and marked a turning point in AI capabilities [19][22].
- Li reflects on the evolution of AI from object recognition to scene understanding, emphasizing the importance of integrating natural language with visual signals so that AI can describe complex environments [15][20].
Group 3: Future Directions and Applications
- Li is excited about potential applications of spatial intelligence in design, architecture, gaming, and robotics, which suggests broad utility for world models [35].
- The article notes the difficulty of acquiring data for spatial intelligence: language data is abundant online, but spatial data is less accessible and often resides within human cognition [33][50].
- Li's new venture, World Labs, aims to tackle these challenges by developing solutions for understanding and generating 3D environments [29][35].
A Master's Student from a Non-Elite ("Double-Non") University, Feeling a Bit Lost in This Year's Job Hunt...
自动驾驶之心· 2025-06-30 05:51
Core Viewpoint - The article emphasizes the importance of advanced skills and knowledge in autonomous driving and embodied intelligence, and the need for candidates with strong backgrounds to meet industry demands.
Group 1: Industry Trends
- Demand for talent in autonomous driving and embodied intelligence is increasing, with a focus on technologies such as SLAM, ROS, and large models [3][4].
- Many companies are moving from traditional methods to more advanced techniques, which changes the skill sets required of job seekers [3][4].
- Talent is saturated in certain areas, but the growth of robotics startups presents new opportunities for learning and development [3][4].
Group 2: Learning and Development
- The article encourages readers to strengthen their technical skills, particularly in robotics and embodied intelligence, which it sees as the forefront of technology [3][4].
- It mentions resources and community support for learning, including access to courses, hardware, and job information through platforms like Knowledge Planet [5][6].
- The community aims to create a comprehensive ecosystem for knowledge sharing and recruitment in intelligent driving and embodied intelligence [5][6].
Group 3: Technical Directions
- The article outlines four major technical directions in the industry: visual large language models, world models, diffusion models, and end-to-end autonomous driving [7].
- It stresses staying up to date with the latest research in these areas and links to resources and papers for further exploration [8][9].
100+ Autonomous Driving Datasets: You Should at Least Know These 5
自动驾驶之心· 2025-06-22 01:35
Core Viewpoint - The article emphasizes the growing importance of autonomous driving technology and notes that more than 100 high-quality datasets are available to developers and researchers in the field. It introduces five key datasets covering tasks from perception to visual odometry, useful to both beginners and experienced engineers [2].
Dataset Summaries
1. KITTI Dataset - KITTI is one of the most classic and widely used benchmark datasets in autonomous driving. It was collected in Karlsruhe, Germany, using high-precision sensors: color and grayscale stereo cameras, a Velodyne 3D LiDAR, and GPS/IMU. The dataset includes annotations for perception tasks such as stereo vision, optical flow, visual odometry, and 3D object detection and tracking, making it a standard for evaluating vehicle vision algorithms [3].
2. nuScenes Dataset - nuScenes is a large-scale multi-sensor dataset released by Motional, covering 1,000 continuous driving scenes in Boston and Singapore, totaling approximately 15 hours of data. It includes a full sensor suite: six cameras, five millimeter-wave radars, one top-mounted LiDAR, and IMU/GPS. The dataset provides around 1.4 million high-resolution camera images and 390,000 LiDAR scans, annotated with 3D bounding boxes for 23 object categories, which makes it suitable for research on complex urban road scenarios [5][7].
3. Waymo Open Dataset - The Waymo Open Dataset, released by Google Waymo, is one of the largest open data resources for autonomous driving. It consists of two main parts: a perception dataset with 2,030 scenes of high-resolution camera and LiDAR data, and a motion dataset with 103,354 vehicle trajectories and corresponding 3D map information. This extensive multi-sensor dataset covers various times of day, weather conditions, and urban environments, and serves as a benchmark for object detection, tracking, and trajectory prediction research [10][12].
4. PathTrack Dataset - PathTrack is a dataset focused on person tracking, containing over 15,000 trajectories across 720 sequences. It uses a re-trained existing person matching network, which significantly reduces the classification error rate. The dataset is suitable for 2D/3D object detection, tracking, and trajectory prediction tasks [13][14][15].
5. ApolloScape Dataset - ApolloScape, released by Baidu Apollo, is a massive autonomous driving dataset characterized by its large volume and high annotation accuracy. It is reportedly more than ten times the size of comparable datasets and contains hundreds of thousands of high-resolution images with pixel-level semantic segmentation annotations. ApolloScape defines 26 semantic categories and includes complex road scenarios, making it applicable to perception, map construction, and simulation training [17][19].
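As an illustration of working with one of the datasets above, here is a minimal sketch of reading a KITTI Velodyne LiDAR scan using only the Python standard library. It assumes the raw scan layout of consecutive little-endian float32 values, four per point (x, y, z, reflectance). The function names `read_kitti_velodyne` and `write_demo_scan` and the file name `000000.bin` are hypothetical, chosen for this example, and the demo writes synthetic points rather than using real KITTI data.

```python
import array
import os
import sys
import tempfile


def read_kitti_velodyne(path):
    """Read a KITTI-style Velodyne scan into a list of (x, y, z, reflectance) tuples.

    Assumption: the file holds consecutive little-endian float32 values,
    four per point.
    """
    values = array.array("f")  # "f" is a C float (4 bytes on common platforms)
    with open(path, "rb") as f:
        values.frombytes(f.read())
    if sys.byteorder == "big":
        values.byteswap()  # file is little-endian; convert on big-endian hosts
    if len(values) % 4 != 0:
        raise ValueError("scan does not contain a whole number of 4-float points")
    return [tuple(values[i:i + 4]) for i in range(0, len(values), 4)]


def write_demo_scan(path, points):
    """Write synthetic points in the same assumed layout (demonstration only)."""
    flat = array.array("f", [v for point in points for v in point])
    if sys.byteorder == "big":
        flat.byteswap()  # store as little-endian regardless of host
    with open(path, "wb") as f:
        f.write(flat.tobytes())


if __name__ == "__main__":
    # Synthetic points; values are chosen to be exactly representable in float32.
    demo_points = [(1.0, 2.0, 0.5, 0.25), (-3.0, 4.5, 0.0, 1.0)]
    with tempfile.TemporaryDirectory() as tmp:
        scan_path = os.path.join(tmp, "000000.bin")
        write_demo_scan(scan_path, demo_points)
        for x, y, z, r in read_kitti_velodyne(scan_path):
            print(f"x={x} y={y} z={z} reflectance={r}")
```

For real workloads a NumPy-based reader would likely be faster; the standard-library version is used here only to keep the sketch dependency-free.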
Meta launches AI 'world model' to advance robotics, self-driving cars
CNBC· 2025-06-11 14:17
Mark Zuckerberg, CEO of Meta Platforms. Artificial intelligence has been an integral focus for the tech giant's leader amid competition from players like OpenAI, Microsoft and Google.
Meta on Wednesday announced it's rolling out a new AI "world model" that can better understand the 3D environment and movements of physical objects.
The tech giant, which owns popular social media apps Facebook and Instagram, said its new open-source AI model V-JEPA 2 can understand, predict and plan in the physical world. Known ...
Packed with Substance! Guo Chunchao, Head of Tencent Hunyuan 3D: The Real 3D AIGC Revolution Hasn't Begun Yet!
AI科技大本营· 2025-05-16 01:33
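Speaker | Guo Chunchao (郭春超)  Editor | Meng Yidan (梦依丹)  Produced by | AI科技大本营 (ID: rgznai100)
In this era, the games we play, the films we watch, and even the virtual worlds of the future all depend on detailed, realistic three-dimensional (3D) models. Producing this 3D content, however, has often meant weeks or even months of manual modeling, at high cost and low efficiency. Just as Photoshop once transformed graphic design, artificial intelligence is now taking aim at 3D, trying to thoroughly overhaul how digital content is produced.
In this wave of AI-driven 3D generation, Hunyuan 3D, the open-source project from the Tencent Hunyuan team, has become a focus of the global developer community. It has quickly accumulated more than 9.6k stars on GitHub, placing it in the top tier of open-source 3D generation projects, and its strong generation quality has earned user comments such as "Image to 3D with almost no deformation, terrifyingly good."
How far has AI's 3D generation capability come? How far is it from truly changing industries such as games, film and television, and digital humans? At the 2025 Global Machine Learning Technology Conference (ML-Summit), held April 18-19, Guo Chunchao, head of Tencent Hunyuan 3D, gave a detailed account and was interviewed by CSDN afterward.
Surprisingly, although current 3D AIGC ...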
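Pony.ai's First Earnings Report Since Listing: 2024 Revenue Hits a Record High of About RMB 550 Million, Sticking to Its "Three Priorities" Strategy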
IPO早知道· 2025-03-25 13:24
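The L4 autonomous driving company with the highest revenue in China.
This article is an IPO早知道 original. Author | Stone Jin  WeChat official account | ipozaozhidao
According to IPO早知道, Pony.ai (小马智行) released its fourth-quarter and full-year 2024 results before the US market opened on March 25. This is its first earnings report since it listed on Nasdaq on November 27, 2024 and became the "world's first Robotaxi stock."
The report shows that Pony.ai's 2024 revenue reached RMB 548 million (USD 75.03 million), a new record, making it the highest-revenue L4 autonomous driving company in China. Fourth-quarter 2024 revenue was RMB 259 million (USD 35.5 million).
Peng Jun (彭军), co-founder and CEO of Pony.ai, said that, driven by both maturing technology and ample funding, the company is accelerating toward the inflection point for commercializing autonomous driving. He stressed that Pony.ai adheres to a business strategy of "Robotaxi business first, China market first, first-tier cities first." In 2024 it kept expanding its autonomous driving service deployments in China's first-tier cities of Beijing, Shanghai, Guangzhou and Shenzhen, building strong operational capabilities while pursuing more opportunities in the global market.
Commercialization of autonomous mobility services continues to advance
Driving Robotaxis into city centers, airports and high-speed rail stations
Specifically, in 2024 Pony.ai continued to push the expansion of its core business, autonomous mobility services (Robotaxi); for the full year, autonomous mobility ...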