World Models

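Doing Backflips ≠ Doing Real Work: How Far Are We from General-Purpose Robots?
36 Ke · 2025-05-22 02:28
Embodied intelligence has been one of the hottest topics in artificial intelligence in recent years and a major focus for both industry and academia. The humanoid robot in particular, as a carrier for perception, motion, and decision-making capabilities, has helped move embodied intelligence from concept toward real-world deployment. At the same time, several questions deserve closer examination: Why does the development of embodied intelligence seem to favor the humanoid form so strongly? Is imitating the human body really the best path to intelligence? Given the practical challenges of data, compute, and model architecture, what stage are we actually at? And how many "miles" remain before truly general-purpose robots?
With these questions in mind, CSDN's 《万有引力》 column organized an in-depth conversation on the theme "Ten Questions on Embodied Intelligence: How Far Are We from General-Purpose Robots?" The guests were Chen Guang (@爱可可-爱生活), associate professor at the School of Artificial Intelligence, Beijing University of Posts and Telecommunications; Xia Xuan, associate researcher at the Shenzhen Institute of Artificial Intelligence and Robotics for Society; and Huang Yu, CEO of Roboraction.AI. Hosted by Tang Xiaoyin, the column's lead and executive editor-in-chief of CSDN & 《新程序员》, the three experts approach the topic from the angles of technical evolution, the state of research, and industrial applications, breaking down the "key questions" facing embodied intelligence and mapping the road toward future robots.
Xia Xuan: In terms of professional background, my early research focused mainly on computer vision (CV), covering UAV image processing, industrial image processing, and generative models. Before diffusion models took off, I also ...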
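Commentary on Google I/O
2025-05-21 15:14
Commentary on Google I/O, 2025-05-21
... the number of tokens is two to three times that of the traditional AI Overview. In addition, an augmented-reality try-on feature is rolling out across the United States, letting shoppers virtually try on clothing by taking a full-body photo.
What progress has Google made on native multimodality? Google demonstrated a native language understanding feature that supports native speech and audio output, so that in machine conversation the voice can drop from loud to soft, whisper, and switch languages seamlessly. It also demonstrated further updates to ImageFour, its video and image generation product. These advances show Google's continued innovation in multimodal technology.
What new features does the Google Lens app add?
Summary
Google is actively responding to the challenge from competitors such as ChatGPT. Through application-layer innovation, such as raising the share of AI search and launching an upgraded AI Mode, it has markedly strengthened the competitiveness of its AI search products, whose monthly active users have reached 1.5 billion.
Google has made notable progress in native multimodal technology, including the native language understanding feature and the ImageFour update, demonstrating continued innovation in speech, audio, video, and image generation.
The Google Lens app has added Project Xt ...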
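Robotaxi News Arrives Thick and Fast: Who Is Leading as the First Year of Mass Production Begins?
美股研究社 · 2025-05-21 11:59
Source | 美股研究社  Author | 在辉
In 2025, undercurrents are surging in the Robotaxi sector. Both sides of the Pacific are sprinting toward commercialization at the same time, opening yet another "race". In recent years that race played out in large models, robotics, and semiconductors. Now it is finally Robotaxi's turn.
Recently, Musk confirmed in a CNBC interview that Tesla's Robotaxi will be on the road by the end of June. Waymo has chosen to team up with Uber to commercialize Robotaxi and has announced that it will double Robotaxi production capacity at its Arizona factory next year.
In China, the L4 model jointly built by Didi and GAC Aion will enter mass production and delivery at the end of the year; Momenta, together with SAIC, plans to operate several hundred vehicles in 2026; and Pony.ai, which went public last year, has just released its first-quarter report, in which passenger fares from its core Robotaxi business surged about 800%. Pony.ai will begin mass production of its seventh-generation Robotaxi in the second quarter and expects its fleet to reach 1,000 vehicles by year-end.
At first glance, both sides may need another year or two of preparation before they truly transform. Yet, much like DeepSeek's bombshell at the start of the year, a careful look back at the Robotaxi industry's trajectory may let us call the outcome early: this time, too, Chinese Robotaxi companies will seize the lead.
L4 driverless technology validation, small-scale pilot operation, public operation trials, commercial validation, and large-scale mass-production deployment. We are now at the critical transition from the fourth stage to the fifth ...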
见谈 | SenseAuto's Wang Xiaogang: Having Crossed the Hills, How Do I Sprint for the High Ground of Intelligent Driving?
21 Shi Ji Jing Ji Bao Dao · 2025-05-20 12:31
Core Insights
- The article discusses the evolution of SenseAuto, a subsidiary of SenseTime, focusing on its advancements in end-to-end autonomous driving technology and the challenges faced in the automotive industry [2][3][4].
Group 1: Company Background and Innovations
- Wang Xiaogang, CEO of SenseAuto, was among the first to propose the "end-to-end" approach in computer vision, aiming to reduce errors in intermediate module transmissions [2][3].
- SenseAuto launched its first product, the SenseDrive DMS driver monitoring system, in 2018, and secured partnerships with major Tier 1 suppliers and over 10 OEMs [4][5].
- The company introduced the SenseAuto Pilot-P solution in 2021, achieving L2+ level advanced driver assistance functions [4][5].
Group 2: Market Position and Competition
- SenseAuto's entry into the automotive sector was marked by a focus on intelligent cockpit solutions, while the autonomous driving sector was still in a chaotic phase with no consensus on the future direction [3][4].
- The emergence of Tesla and its successful implementation of end-to-end autonomous driving models in 2022 shifted industry dynamics, prompting other companies like Xiaopeng and Li Auto to adopt similar strategies [5][6].
Group 3: Strategic Development and Challenges
- Wang Xiaogang emphasized the need for cost reduction and efficiency improvement to compete effectively in mass production, which poses a significant challenge for SenseAuto [6][7].
- The company is focusing on talent acquisition and platformization to address the challenges of adapting to various hardware platforms and software [7][8].
Group 4: Future Outlook and Business Strategy
- SenseAuto aims to expand its delivery range in the mid-to-low-end market by 2025, with plans to collaborate with new partners like GAC Aion and FAW Hongqi [11][12].
- The company is also developing a multi-modal large model, DriveAGI, to enhance its autonomous driving technology, which is expected to exceed human capabilities [11][12].
- SenseAuto positions itself as an AI platform company in the automotive sector, focusing on building AI infrastructure and data pipelines for enterprises [11][12].
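Shanghai's 码极客 and Tongji University Release a Multimodal Spatial-Intelligence World Model
news flash · 2025-05-20 06:57
On May 20, at the 人工智能赋能学科创新行动发展大会 (a conference on the initiative for AI-empowered innovation across academic disciplines), Shanghai AI company 码极客 and Chengdu-based 考拉悠然, together with Tongji University, released a multimodal spatial-intelligence world model, the 悠然无界 large model, along with a series of spatial-agent products. ...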
Fourth Paradigm's Q1 Total Revenue Tops RMB 1 Billion, but Consumer Electronics Revenue Goes Undisclosed | 钛媒体AGI
Tai Mei Ti APP · 2025-05-16 04:31
Core Insights
- Fourth Paradigm (06682.HK) reported a total revenue of 1.077 billion yuan for Q1 of FY2025, marking a year-on-year increase of 30.1% [2]
- The company's gross profit reached 444 million yuan, also reflecting a 30.1% year-on-year growth, with a gross margin of 41.2% [2]
- Following the positive earnings report, the stock opened 4% higher and surged over 8% during trading on May 16, reaching 42.9 HKD per share and a market capitalization of 21.1 billion HKD [2]
Business Segment Performance
- The "Prophet AI Platform," which constitutes 74.8% of total revenue, generated 805 million yuan in Q1, showing a significant year-on-year growth of 60.5% [5]
- The SHIFT intelligent solutions segment reported revenue of 212 million yuan, down 14.9% year-on-year, with its revenue share decreasing to 19.7% due to strategic business expansion [5]
- The AIGS service segment contributed 60 million yuan, accounting for 5.6% of total revenue [5]
R&D and Future Plans
- R&D expenses for Q1 amounted to 368 million yuan, an increase of 5.7% year-on-year, with an R&D expense ratio of 34.2%, down 8 percentage points [5]
- The company plans to establish Paradigm Group, with the original Fourth Paradigm business becoming a core subsidiary, while also entering new sectors like consumer electronics [6]
- The focus remains on enhancing AI capabilities across various industries, with a commitment to not pivoting away from enterprise services [6][7]
Market Position and Profitability Outlook
- Fourth Paradigm's overall R&D and revenue scale is smaller compared to peers like SenseTime, but it has a larger profit margin potential [7]
- Based on current trends, the company is projected to achieve breakeven or positive net profit for FY2025, potentially becoming the third domestic AI software company to report profitability [7]
- The vision is to leverage accumulated experience in vertical world models to expand AI capabilities beyond enterprise software, aiming for a broader market reach [8]
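The margin and revenue-share figures quoted in this entry can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the rounded RMB-million figures from the summary as inputs; the variable names and the `pct` helper are illustrative and not part of the company's report. Recomputing from these rounded inputs puts the Prophet AI Platform share at roughly 74.7%, within a tenth of a point of the quoted 74.8%, a gap that is presumably down to rounding in the published figures.

```python
# Figures in RMB million, as rounded in the summary (assumed inputs).
total_revenue = 1077
gross_profit = 444
segments = {
    "Prophet AI Platform": 805,
    "SHIFT intelligent solutions": 212,
    "AIGS services": 60,
}


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Gross margin = gross profit / total revenue.
gross_margin = pct(gross_profit, total_revenue)

# Revenue share of each segment = segment revenue / total revenue.
shares = {name: pct(revenue, total_revenue) for name, revenue in segments.items()}

print(f"Gross margin: {gross_margin}%")
for name, share in shares.items():
    print(f"{name}: {share}% of total revenue")
```

The three segment figures sum exactly to the reported total, which suggests the summary lists every revenue line for the quarter.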
In-Depth Company Report: The "Greatest Common Divisor" of Equal Access to Intelligent Driving, Riding the Penetration Tailwind to Accelerate Its All-Scenario Journey
Xinda Securities · 2025-05-16 00:30
Horizon Robotics in depth: the "greatest common divisor" of equal access to intelligent driving, riding the penetration tailwind to accelerate its all-scenario journey. In-depth company report on Horizon Robotics (9660.HK), May 15, 2025. Pang Qianqian, Chief Analyst, Computer Industry. License No.: S1500522110006. Email: pangqianqian@cindasc.com ...
Fourth Paradigm Q1 2025 Report: Prophet AI Platform Revenue Up 60.5%, Agent Strategy Bears Fruit
Jing Ji Guan Cha Bao · 2025-05-15 11:25
Core Insights
- Fourth Paradigm (06682.HK) reported a total revenue of RMB 1.077 billion for Q1 FY2025, marking a year-on-year growth of 30.1% [2]
- The gross profit reached RMB 444 million, also reflecting a 30.1% increase, with a gross margin of 41.2% [2]
- The core business, the Prophet AI platform, generated revenue of RMB 805 million, showing a significant year-on-year growth of 60.5% [2]
Business Expansion
- The Prophet platform underwent a major upgrade, introducing an AI Agent full-process development platform, allowing enterprise clients to integrate over 150 mainstream large models [3]
- The platform includes a comprehensive suite of AI applications covering various enterprise scenarios such as AIGC, intelligent office, digital employees, and more [3]
- The AI Agent has been implemented in over 14 industries, including finance, aviation, automotive, healthcare, and retail [3]
Revenue Breakdown
- The SHIFT intelligent solutions business generated RMB 212 million, accounting for 19.7% of total revenue, despite a year-on-year decline of 14.9% [4]
- The AIGS service revenue was RMB 60 million, representing 5.6% of total revenue, with enhancements in programming Agent capabilities [4]
Strategic Focus
- The company is focusing on enhancing the core capabilities of the Prophet AI platform and the deployment of enterprise-level Agents, leading to a strategic resource reallocation [4]
- The company has established a dual-core business structure, maintaining its enterprise services while launching a consumer electronics segment, Phancy, which focuses on integrated AI Agent solutions [6]
- Phancy has released several user-level Agent solutions aimed at improving daily work efficiency for enterprise users [6]
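Fourth Paradigm's Q1 Revenue Rises 30% Year on Year to Over RMB 1 Billion: Enterprise Agents Scale Up
IPO早知道 · 2025-05-15 10:19
Fourth Paradigm is working to empower more industries with its "Agent + world model" technology philosophy.
This is an original article by IPO早知道. Author | Stone Jin. WeChat official account | ipozaozhidao
According to IPO早知道, Fourth Paradigm (06682.HK) on May 15 released its core business progress report for the first quarter of fiscal year 2025, ended March 31, 2025.
The report shows that Fourth Paradigm's first-quarter total revenue was RMB 1.077 billion (all amounts in RMB), up 30.1% year on year; gross profit was RMB 444 million, up 30.1% year on year, for a gross margin of 41.2%. In the first quarter, Fourth Paradigm had 59 benchmark customers, with average revenue per benchmark customer of RMB 11.67 million, up 31.3% year on year.
Specifically, the Prophet AI platform, the core of all of Fourth Paradigm's businesses, posted first-quarter revenue of RMB 805 million, up 60.5% year on year.
In the first quarter, the Prophet platform received a major upgrade: building on its existing AI model development toolchain, it launched an AI Agent full-process development platform. Enterprise customers can use Model Hub, the large-model application platform built into Prophet, to integrate 150+ mainstream large models with ease, and can use dozens of built-in Agent frameworks to build a complete LLM Ops system that covers the full AI Agent development lifecycle. The Prophet AI platform is equipped ...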
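Three Questions, Three Answers | VLA
Zhong Guo Zhi Liang Xin Wen Wang · 2025-05-15 07:56
In autonomous driving, technology has evolved like a relay race: from early rule-based systems, to end-to-end models, to vision-language models (VLM), and now to the vision-language-action model (VLA) stage. Each leap is not merely a technical iteration but also an example of artificial intelligence being put to substantive use.
What is VLA?
VLA (Vision-Language-Action Model) is a large vision-language-action model. It unifies vision, language, and action capabilities in a single model and provides an end-to-end mapping in which inputs fed to the machine directly yield executable actions. This gives the model strong 3D spatial understanding, logical reasoning, and behavior generation, so that an autonomous driving system can perceive, think, and adapt to its environment.
A VLA model consists of several key modules: a visual encoder, a language encoder, a cross-modal fusion module, and an action generation module. The visual encoder extracts high-level visual features from images or video; the language encoder processes natural-language input; the cross-modal fusion module integrates the visual and language features; and the action generation module produces the vehicle's control commands from the fused information.
VLA's core characteristics include multimodal perception and decision-making, global context understanding, and system transparency. It can perceive in real time on the basis of visual and language information, and it uses "chain of thought" techniques to build human-like logic and reason out the best driving decision in complex scenarios. In addition, VLA can understand global road-condition information spanning tens of seconds, which for construction zones, tidal lanes ...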
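The four-module structure described in this entry (visual encoder, language encoder, cross-modal fusion, action generation) can be illustrated with a toy data flow. The sketch below is a hedged illustration only: every function name, feature, weight, and value range is hypothetical, and the hand-written rules stand in for what would be learned neural networks in a real VLA model. It is not any vendor's implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Action:
    steering: float  # hypothetical range [-1, 1]; negative means steer left
    throttle: float  # hypothetical range [0, 1]


def encode_vision(image: List[List[float]]) -> List[float]:
    """Toy visual encoder: mean brightness of the left and right image halves.

    Assumes a rectangular grayscale image at least two pixels wide.
    """
    half = len(image[0]) // 2
    left = [p for row in image for p in row[:half]]
    right = [p for row in image for p in row[half:]]
    return [sum(left) / len(left), sum(right) / len(right)]


def encode_language(instruction: str) -> List[float]:
    """Toy language encoder: indicators for the words 'left', 'right', 'slow'."""
    words = instruction.lower().split()
    return [float(keyword in words) for keyword in ("left", "right", "slow")]


def fuse(vision: List[float], language: List[float]) -> List[float]:
    """Toy cross-modal fusion: concatenate the two feature vectors."""
    return vision + language


def generate_action(fused: List[float]) -> Action:
    """Toy action head: hand-picked linear rules instead of a learned policy."""
    left_bright, right_bright, go_left, go_right, slow = fused
    # Follow the instruction, and lean toward the brighter image half
    # (treated here, purely for illustration, as the clearer side).
    steering = 0.5 * (go_right - go_left) + 0.5 * (right_bright - left_bright)
    steering = max(-1.0, min(1.0, steering))
    throttle = 0.3 if slow else 0.8
    return Action(steering=steering, throttle=throttle)


def vla_step(image: List[List[float]], instruction: str) -> Action:
    """End-to-end mapping: (image, instruction) -> control command."""
    return generate_action(fuse(encode_vision(image), encode_language(instruction)))


# Usage: a frame that is brighter on the right, plus a spoken-style instruction.
frame = [[0.2, 0.2, 0.8, 0.8], [0.2, 0.2, 0.8, 0.8]]
print(vla_step(frame, "keep right and slow down"))
```

The point of the sketch is the shape of the pipeline: the two modalities are encoded separately, merged into one representation, and only then mapped to a control command, which is what lets a single model condition its driving behavior on both what it sees and what it is told.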