Reinforcement Learning
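Shanghai Auto Show | Momenta Forms Strategic Partnerships with Six Major Brands; Cumulative Cooperative Mass-Production Models Exceed 130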
Guan Cha Zhe Wang· 2025-04-29 01:48
Core Insights
- Momenta announced further strategic collaborations with six major brands during the Shanghai Auto Show: General Motors Buick, FAW Toyota, Honda China, Cadillac, SAIC Audi, and Zhiji [1][3]
- The number of mass-produced models delivered has risen sharply, from 1 model in 2022 to 8 in 2023, and is projected to reach 26 models in 2024 [3]
- Momenta's cumulative number of cooperative mass-produced models has exceeded 130, with an accelerating growth rate in successful deliveries [3]

Delivery and Growth Metrics
- The first 100,000 units equipped with Momenta's technology took two years to achieve, while the second 100,000 units were completed in just six months [3]
- The company expects to complete the third batch of nearly 100,000 units by May of this year [3]

Global Partnerships
- Momenta's partners now include major global automakers such as Honda, Nissan, Chery, Audi, Volkswagen, and Cadillac, indicating a broad market reach [3]

Technological Advancements
- The "Flywheel Model" is a key upgrade in Momenta's algorithm capabilities, with plans to launch the end-to-end Momenta R6 Flywheel Model based on reinforcement learning in the second half of this year [5]
- Momenta's intelligent driving solutions do not require high-precision maps, providing an advantage for deployment in various global markets [5]

Focus on Robotaxi Development
- Momenta is focusing on the development of autonomous Robotaxi services, addressing the challenge of safety standards for large-scale deployment [7]
- The company aims to achieve safety levels for Robotaxi operations that are equivalent to or exceed human driving standards as fleet sizes grow [7]
- The first mass-produced Robotaxi solution is set to launch this year, utilizing existing sensors and computing units to reduce costs [7]
- The initial batch of unmanned Robotaxis is expected to enter trial operations by the end of 2025, offering users automated driving services [7]
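Tiny Tic-Tac-Toe Stumps Large Models?? Karpathy Gets Challenged Online by OpenAI
QbitAI (量子位) · 2025-04-28 03:43
By 克雷西, reporting from Aofeisi. QbitAI | WeChat official account QbitAI

After Pokémon, getting large models to play tic-tac-toe has become the next popular challenge.

It started when a user on X complained that large models were not playing Pokémon well enough, and Karpathy replied: stop focusing on Pokémon; having large models play tic-tac-toe would be more interesting, because they can't.

Karpathy's remark drew a crowd. Some were surprised, some analyzed the reasons, and others said that a classic observation keeps gaining weight: tasks that are simple for humans are hard for machines, while tasks that are hard for humans are easy for machines.

Not everyone agreed, though. OpenAI's Noam Brown said that o3 has no trouble playing tic-tac-toe and can even play from an image of the board.

Large models take on tic-tac-toe

We tried it ourselves, playing against o3 in several ways.

In the first approach, O and X represent the pieces and - represents an empty cell. On every turn we fed the complete board to o3 and asked it to reply in the same format.

After thinking for about 12 seconds, o3 first took the center of the board. After our move, o3 thought for another 23 seconds and placed its second X.

The next two rounds went as follows. In fact, once o3 held two positions on a diagonal, it had already locked in the win.

Interestingly, even after it had completed a line, o3 did not notice that it had won.

| | | | | XOO ...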
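The text-based board format described above (X and O for pieces, - for an empty cell) can be sketched in a few lines of Python. This is only an illustration: the article specifies the three symbols but not the exact prompt layout, so the three-line layout, the sample position, and the helper names `parse_board`, `winner`, and `render` are assumptions.

```python
# Hypothetical helpers; the article only specifies the symbols X, O and -.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def parse_board(text):
    """Turn a three-line board such as 'X-O\n-XO\n--X' into a flat list of 9 cells."""
    cells = [ch for ch in text if ch in "XO-"]
    if len(cells) != 9:
        raise ValueError("expected exactly 9 cells made of X, O or -")
    return cells


def winner(cells):
    """Return 'X' or 'O' if that player has completed a line, else None."""
    for a, b, c in WIN_LINES:
        if cells[a] != "-" and cells[a] == cells[b] == cells[c]:
            return cells[a]
    return None


def render(cells):
    """Format the 9 cells back into the three-line text form."""
    return "\n".join("".join(cells[i:i + 3]) for i in range(0, 9, 3))


# Assumed sample position: X holds the diagonal from top-left to bottom-right.
board = parse_board("X-O\n-XO\n--X")
print(render(board))
print(winner(board))  # prints X
```

A check like `winner` run on the harness side after every move would catch the situation reported above, where the model completes a line without announcing the win.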
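Major Release | Fudan's "Large Language Models: From Theory to Practice (2nd Edition)" Fully Upgraded, Focusing on the AI Frontier
Jiqizhixin (机器之心) · 2025-04-28 01:26
Released by Jiqizhixin. Jiqizhixin Editorial Department

"Large Language Models: From Theory to Practice (2nd Edition)" is a professional technical book that balances theory and practice, and an indispensable reference for the AI era. Any reader can find a path for growth in it.

As the AI wave sweeps the globe, large language models are driving technological progress and industrial change at unprecedented speed. From ChatGPT to industry applications, LLMs have not only reshaped human-computer interaction but also become a key technology for academic research and industrial innovation.

Faced with this rapidly evolving body of technology, systematically understanding its theoretical foundations and mastering its core algorithms and engineering practice has become required learning for every AI practitioner, researcher, and university student.

In September 2023, the Fudan University research team of Zhang Qi, Gui Tao, Zheng Rui, and Huang Xuanjing formally released "Large Language Models: From Theory to Practice" to academia and industry worldwide. In the two short years since, large language models have made important progress in theoretical research, pre-training methods, post-training techniques, and interpretability. Research on large language models has deepened, gradually revealing many characteristics that differ from the traditional deep learning and natural language processing paradigms. For example, a large language model needs only 60 data samples to learn and exhibit strong question-answering ability, which shows striking generalization. However, the book's authors also found that large language models have a certain fragility. For example, in a model with 13 billion parameters ...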
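In Depth | Tsinghua Yao Class Top Student and OpenAI's Yao Shunyu: AI's Second Half Shifts from "Algorithm Competition" to "Utility Definition," Rebuilding Evaluation Frameworks and Turning Technical Capability into Real-World Value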
Z Potentials· 2025-04-25 03:05
Core Insights
- The article discusses the transition of AI from a phase focused on model innovation and benchmark testing to a new phase emphasizing problem definition and evaluation [3][23][30]
- It highlights the importance of reinforcement learning achieving generalization capabilities, allowing it to tackle diverse tasks previously thought to be unrelated [3][4][21]

Group 1: AI's First Half
- The first half of AI's development was characterized by significant breakthroughs in training methods and models, such as Transformer and GPT-3, which focused on improving model performance on benchmarks [4][5][7]
- The emphasis was on creating new models rather than defining tasks, leading to a cycle of developing increasingly difficult benchmarks that could be solved with existing methods [7][8][23]

Group 2: Breakthrough Formula
- The effective formula for AI's success includes large-scale language pre-training, scaling (data and compute), and the integration of reasoning and action [9][14]
- The realization that prior knowledge is crucial for generalization has shifted the focus from solely algorithm development to understanding the environment and prior knowledge [15][21]

Group 3: Transition to the Second Half
- The second half of AI will focus on redefining evaluation frameworks and creating new assessment methods that reflect real-world applications rather than just benchmark performance [26][27][29]
- The industry faces the "utility problem," where existing evaluation frameworks do not align with real-world tasks, necessitating a reevaluation of how AI's effectiveness is measured [27][29]

Group 4: Future Directions
- The new game in AI's second half involves leveraging existing formulas to solve real-world tasks while innovating new components to enhance these formulas [32]
- Companies will need to create new hypotheses that challenge existing paradigms to achieve significant breakthroughs and develop valuable products worth billions or trillions [30][32]
Zhuoyu Technology Integrates the Tongyi Large Model to Jointly Build an End-to-End World Model
Alibaba Cloud (阿里云) · 2025-04-24 09:13
Core Insights
- The article highlights the collaboration between Zhuoyu Technology and Alibaba Cloud, focusing on the integration of the Tongyi large model and the development of an end-to-end world model [1][2]
- Zhuoyu's end-to-end world model incorporates reinforcement learning and chain reasoning technology, enhancing safety in urban navigation and enabling personalized driving styles and natural language interaction [2]

Summary by Sections
- **Integration with Alibaba Cloud**
  - Zhuoyu Technology has fully migrated its core business systems, including big data and intelligent manufacturing, to Alibaba Cloud [1]
  - The company has established a GPU resource pool on the Alibaba Cloud PAI platform to meet the high computational demands of its model training [2]
- **Model Training Efficiency**
  - The training method combines pre-training and post-training, resulting in a training efficiency improvement of over 50% compared to single GPU clusters [2]
  - The utilization rate of GPUs has been increased to over 95% due to the serverless capabilities of the Alibaba Cloud PAI platform, which simplifies cluster operations and ensures full observability of the training process [2]
- **Development Acceleration**
  - In the research and development domain, Zhuoyu has integrated Tongyi Lingma and Tongyi Qianwen to accelerate development, achieving a code adoption rate of 29% [2]
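AI Agents Keep "Crashing"? A Former DeepSeek Employee Teams Up with Fei-Fei Li and Others to Open-Source a New Framework That Teaches Models to Truly Reason
AI前线 (AI Frontline) · 2025-04-24 03:03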
Core Viewpoint
- The article discusses the current state of AI agents: despite expectations that 2025 will be the "year of AI agents," most are still in the "pilot purgatory" phase and have not yet moved into real-world applications [1][2].

Group 1: Current State of AI Agents
- In a poll on the social platform X, 64.2% of respondents described enterprise AI agents as stuck in pilot purgatory, while only 6.4% considered them smarter than the hype [2].
- The article highlights the need for advancements in AI systems to enhance their stability and reliability in enterprise applications [2].

Group 2: Introduction of RAGEN
- A new system called RAGEN, developed by a team including researchers from Northwestern University, Microsoft, Stanford University, and the University of Washington, aims to improve AI agents' performance in real-world scenarios [2][5].
- RAGEN focuses on multi-turn interaction scenarios, requiring agents to reason under uncertainty and remember historical dialogues [5].

Group 3: StarPO Framework
- RAGEN is built on a custom reinforcement learning framework named StarPO, which emphasizes learning through experience rather than rote memorization [5][7].
- The StarPO framework consists of two alternating phases: rollout, where the LLM generates complete interaction sequences, and update, where the model updates parameters based on normalized cumulative rewards [7].

Group 4: Training Challenges and Solutions
- The article discusses the "Echo Trap" phenomenon, where agents generate repetitive responses due to early high rewards, leading to a decline in reasoning ability [12].
- To address training stability, the enhanced version StarPO-S introduces three key mechanisms: uncertainty-based rollout filtering, removal of KL penalty, and asymmetric PPO clipping [19].

Group 5: Evaluation Environments
- RAGEN includes three symbolic testing environments to evaluate decision-making capabilities: Bandit, Sokoban, and Frozen Lake, each designed to assess different aspects of agent performance [15][17].
- These environments aim to minimize prior knowledge interference, allowing agents to rely solely on learned strategies for decision-making [15].

Group 6: Future Implications
- RAGEN represents a significant step towards developing AI agents with autonomous reasoning capabilities, although challenges remain in applying these methods to real-world business processes [24].
- The article emphasizes the importance of optimizing reward mechanisms to focus on the quality of reasoning processes, not just the correctness of outcomes [24].
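The update phase and the three StarPO-S mechanisms summarized above can be illustrated with a small, framework-free Python sketch. It is not RAGEN's actual code: the function names, the use of the standard deviation of cumulative rewards as the uncertainty measure, the `keep_fraction` default, and the clip bounds 0.2 and 0.28 are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def normalize_returns(returns, eps=1e-8):
    """Normalize the cumulative rewards of a rollout batch to zero mean, unit std."""
    mu, sigma = mean(returns), pstdev(returns)
    return [(r - mu) / (sigma + eps) for r in returns]


def filter_by_uncertainty(groups, keep_fraction=0.5):
    """Keep the prompt groups whose rollout returns vary the most.

    Assumption: "uncertainty" is the standard deviation of cumulative
    rewards across rollouts that share the same initial state.
    """
    ranked = sorted(groups, key=lambda g: pstdev(g["returns"]), reverse=True)
    return ranked[:max(1, int(len(ranked) * keep_fraction))]


def asymmetric_clipped_term(ratio, advantage, clip_low=0.2, clip_high=0.28):
    """PPO-style clipped surrogate with a wider upper bound than lower bound.

    No KL penalty term is added, mirroring the "removal of KL penalty" item.
    The bounds 0.2 and 0.28 are illustrative values, not taken from the article.
    """
    clipped = min(max(ratio, 1.0 - clip_low), 1.0 + clip_high)
    return min(ratio * advantage, clipped * advantage)


# Toy batch: two initial states with four rollouts each (made-up rewards).
groups = [
    {"prompt": "easy", "returns": [1.0, 1.0, 1.0, 1.0]},
    {"prompt": "mixed", "returns": [0.0, 1.0, 0.0, 1.0]},
]
kept = filter_by_uncertainty(groups)
advantages = normalize_returns(kept[0]["returns"])
print(kept[0]["prompt"], advantages)
```

In this toy batch the "easy" group is dropped because all of its rollouts receive the same reward and therefore carry no learning signal, which is the intuition behind filtering rollouts by uncertainty.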
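Many people expect 2025 to be the "first year of AI agents," meaning task-specific agent systems built on large language models from providers such as OpenAI, Anthropic, Google, and DeepSeek.

However, a recent poll on the social platform X shows that most agents are still at the "dabbling" stage: they have not truly left the lab and are generally stuck in "enterprise pilot" status.

Compiled by Tina

Training framework for reasoning agents is now open source

Unlike static tasks such as problem solving or code generation, RAGEN focuses on training agents in multi-turn interactive scenarios, requiring them to reason under uncertainty, remember the dialogue history, and respond flexibly to change.

| AI agents in the enterprise right now are ... | |
| --- | --- |
| Smarter than the hype | 6.4% |
| Stuck in pilot purgatory | 64.2% |
| Powerful, but high effort | 24.8% |
| Nearing real scale | 4.6% |

However, a team that includes Fei-Fei Li may be about to change this: together with researchers from Northwestern University, Microsoft, Stanford University, and the University of Washington ...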
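SenseTime Jueying (商汤绝影) Sets a New Milestone for Intelligent Driving: Generative Driving R-UniAD Makes Safety More Certain and Surpasses the Limits of Human Driving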
Guan Cha Zhe Wang· 2025-04-24 01:18
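- Reinforcement learning + world model: Jueying builds the VLAR technical architecture to break through the end-to-end bottleneck
- R-UniAD's innovation pipeline: mining complex scenarios, 4D simulation reproduction, reinforcement learning, and generalization validation
- "Jueying Kaiwu 2.0" (绝影开悟2.0), a 4D world model with near-real-time online interaction, is the cornerstone of generative driving R-UniAD
- Jueying's assisted driving has so far partnered with 4 automakers and shipped in 7 models; solutions built on the Horizon Journey 6 and NVIDIA DRIVE AGX Thor platforms will ship this year with automaker partners such as Dongfeng and Chery

As assisted driving becomes more widespread, the public is paying more attention to the safety of driving systems and expects assisted driving to deliver a safer and smoother smart-mobility experience. Yet many assisted-driving solutions struggle to handle new scenarios properly, and accidents occur from time to time, exposing the bottlenecks of current technical approaches.

To improve safety, end-to-end models need massive amounts of high-quality training data. However, even with the data returned from a million production vehicles, the rate at which useful information can be extracted for extreme scenarios is under 1%.

Moreover, because the end-to-end paradigm is imitation learning, its driving decisions in unseen scenarios carry substantial uncertainty and its safety boundary is blurry. This poses risks to driving safety and makes it even harder to surpass human driving ability.

SenseTime Jueying has therefore released the generative driving solution R-UniAD. It brings reinforcement learning into intelligent driving, deepens the end-to-end system's interaction with the world, and faithfully reconstructs and deeply understands the driving environment through generation, thereby proactively predicting and handling ...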
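Agents, DeepSeek, and Multimodal Hot Topics Take the Stage! 60+ Prominent Speakers Explore the Future of AI as the 2025 Global Machine Learning Technology Conference Concludes
AI科技大本营 (AI Tech Base Camp) · 2025-04-21 10:24
Source: CSDN. By the 《新程序员》 (New Programmer) editorial team | Produced by CSDN (ID: CSDNnews)

In 2025, as everything grows toward "intelligence," the AI boom keeps heating up and is leading a new wave of technological innovation and industrial exploration.

... new ideas for solving them? To explore these key questions, you are welcome to watch the replay of the conference's first day and see how the technical experts offered in-depth analysis spanning theory, algorithms, and practical applications, and to learn more about the latest progress in AI technology:

AI Ecosystem and Application Evolution Driven by Large-Model Technical Innovation
Li Jianzhong, Senior Vice President of CSDN and Chief Technology Expert at Boolan

On April 18-19, the 2025 Global Machine Learning Technology Conference (ML-Summit 2025), hosted by CSDN together with Boolan, a high-end IT consulting and education platform, opened at the 上海虹桥西郊庄园丽笙大酒店 in Shanghai. Centered on AI's latest trends and real-world deployment, the conference focused on 12 tracks, including the evolution of large language model technology, AI agents, embodied intelligence, and DeepSeek technical analysis and industry practice. It brought together more than 60 prominent speakers from leading technology companies and academic institutions worldwide to present the technical direction and application frontier of AI.

Amid the wave of generative AI reshaping technical boundaries, industry ...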
Machinery and Equipment Industry Commentary: The First Humanoid Robot Marathon Wraps Up; How Did Each Team's Locomotion Perform?
Soochow Securities· 2025-04-21 09:33
Investment Rating
- The report maintains an "Accumulate" rating for the mechanical equipment industry [1]

Core Insights
- The first humanoid robot marathon took place on April 19, 2025, in Beijing, with 21 robot teams participating in a 21-kilometer race [1][2]
- The event showcased the capabilities of humanoid robots, with notable performances from TianGong Ultra and SongYan Power N2, highlighting advancements in robotic movement and control [4][6]
- The use of reinforcement learning technology was prevalent among the participating robots, indicating a promising direction for future development in humanoid robotics [5][36]

Summary by Sections

Event Overview
- The first humanoid robot marathon was held on April 19, 2025, in Beijing, featuring 21 robot teams competing in a half marathon [1][14]

Participating Teams
- A total of 21 humanoid robot teams participated, including notable entries like TianGong Ultra, Kuavo, and SongYan Power N2 [2][16]

Race Format and Rules
- Robots participated in the marathon through remote operation, with engineers accompanying them. Each robot started at one-minute intervals, maintaining a distance of over one meter from each other [3][19]

Race Results
- TianGong Ultra won the marathon with a time of 2 hours 40 minutes 42 seconds, benefiting from advanced technology and design [4][22]
- SongYan Power N2 secured second and third places, demonstrating excellent stability and humanoid gait without requiring dedicated support [4][26]

Future Development Directions
- The marathon set three world records, emphasizing the need for improved robustness and hardware stability for commercial viability [32][35]
- The report suggests that enhancing the robots' endurance and joint cooling capabilities is crucial for their long-term operational success [37]

Investment Recommendations
- The report recommends focusing on the supply chains of TianGong Robotics and SongYan Power, highlighting specific companies for potential investment [6][38]