World Models
The world model lead at SenseTime's Jueying (SenseAuto) unit has resigned...
自动驾驶之心· 2025-06-21 13:15
Core Viewpoint
- The article discusses the challenges and opportunities faced by SenseTime's autonomous driving division, particularly focusing on the competitive landscape and the importance of technological advancements in the industry.

Group 1: Company Developments
- The head of the world model development for SenseTime's autonomous driving division has left the company, which raises concerns about the future of their cloud technology system and the R-UniAD generative driving solution [2][3].
- SenseTime's autonomous driving division has successfully delivered a mid-tier solution based on the J6M model to GAC Trumpchi, but the mid-tier market is expected to undergo significant upgrades this year [4].

Group 2: Market Dynamics
- The mid-tier market will see a shift from highway-based NOA (Navigation on Autopilot) to full urban NOA, which represents a major change in the competitive landscape [4].
- Leading companies are introducing lightweight urban NOA solutions based on high-tier algorithms, targeting chips with around 100 TOPS computing power, which are already being demonstrated to OEM clients [4].

Group 3: High-Tier Strategy
- The key focus for SenseTime this year is the one-stage end-to-end solution, which has shown impressive performance and is a requirement for high-tier project tenders from OEMs [5].
- Collaborations with Dongfeng Motor aim for mass production and delivery of the UniAD one-stage end-to-end solution by Q4 2025, marking a critical opportunity for SenseTime to establish a foothold in the high-tier market [5][6].

Group 4: Competitive Landscape
- SenseTime's ability to deliver a benchmark project in the high-tier segment is crucial for gaining credibility with OEMs and securing additional projects [6][7].
- The current window of opportunity for SenseTime in the high-tier market is limited, as many models capable of supporting high-tier software and hardware costs are being released this year [6][8].
Humanoid robots steal the show at expos: easy to mass-produce, hard to deploy
36Kr· 2025-06-20 12:15
As large AI models spread like wildfire across every industry, embodied intelligence — one of their most important deployment vehicles — has become the star of tech expos, posing as a "real-life Iron Man."

From communication technology, and back into the communications world. Humanoid robots have always been the biggest draw at tech expos. Early in the morning, the AgiBot (智元机器人) booth was already packed with visitors. Yuanzheng A2 held a brush and wrote the character "福" stroke by stroke; Lingxi X2 not only interacted with the audience in its "inner monologue" mode but also performed a round of tai chi. Behind these capabilities lie both AgiBot's innovations in model architecture and the support of communication technology.

AgiBot has built a "body–cerebellum–brain" hardware and software architecture that gives its humanoid robots motion intelligence, interaction intelligence, and task intelligence. "We put basic abilities, such as limb movement, into the body and cerebellum, so the robot can still perform basic operations when offline," AgiBot COO Qiu Heng told IT Times. The "brain," the key to the robot's intelligence, is built from a cloud platform plus embodied algorithms, with communication technology woven in. "With communications on board, it is as if the humanoid robot carries a phone that fetches information in real time. Once connected, it gains more intelligence, complex problems are handed off to the cloud, and interaction becomes 'smarter.'"

With these capabilities in place, humanoid robots will move into communications scenarios. AgiBot's Yuanzheng A2, Jingling G1, Lingxi X2, and other robots will enter exhibition halls, service halls, and equipment rooms ...
PKU's Lu Zongqing: at this stage, neither world models nor VLA touch the essence | Embodied Pioneers: Ten Conversations
雷峰网· 2025-06-20 11:54
" 互联网视频数据是唯一可以 scale up 的道路 。 " 作者丨 郭海惟 编辑丨 陈彩娴 作为一名具身大脑的创业者,卢宗青有着金光闪闪的履历: 他是紧随 DeepMind之后,中国新生代的强化学习研究者。北京大学计算机学院长聘副教授,担任过智源 研究院多模态交互研究中心负责人,负责过首个国家自然科学基金委原创探索计划通用智能体项目,还同 时在NeurIPS、ICLR、ICML等机器学习的国际顶级会议担任领域主席。 早在 2023年,他旗下团队便有利用多模态模型研究通用 Agent 的研究尝试,让 Agent 玩《荒野大镖客 2》和办公,使其成为第一个从零开始在AAA级游戏中完成具体任务的 LLM 智能体。相关论文几经波折, 今年终于被 ICML 2025 录用。不过他自述对那份研究其实不够满意,因为"泛化性不足"。 当完成那些研究以后,卢宗青意识到 "当前的多模态模型缺乏与世界交互的能力"。因为模型缺少学习物 理交互的数据,所以 我们看到的那些泛化的能力本质都是 "抽象"的,它终究无法理解动作和世界的关 系,自然也无法预测世界 。 这如今成为他想在具身智能创业的起点:开发一个通用的具身人工智能模型。 卢 ...
Midjourney releases its video model: not competing on resolution, but netizens call the visuals stunning
虎嗅APP· 2025-06-20 09:47
This article is from the WeChat account APPSO (ID: appsolution), "the No. 1 AI new-media outlet, an inspiration guide for 'super individuals.'" Original title: "This AI image-generation tool releases its first video model: not competing on resolution, but netizens say the visuals exceed expectations | prompts included." Header image: AI-generated.

Facing copyright lawsuits from Disney and Universal, veteran text-to-image "unicorn" Midjourney has not slowed down; instead, in the early hours of today, it pushed out its first video model, V1, under pressure.

Precise color grading, careful composition, rich emotion — the signature style is intact. Midjourney is not competing on resolution or long takes; what it competes on is a distinctive sense of atmosphere and aesthetic identity. Midjourney is ambitious, aiming squarely at a "world model," but whether its currently rough feature design can carry it that far remains an open question.

You chase resolution; I go surreal. Midjourney has always excelled at fantastical, surreal visuals, and user tests so far show that its video model continues this aesthetic: a stable, highly recognizable style.

TL;DR: after uploading or generating an image, click "Animate"; by default, a single job outputs four 5-second clips ...
This week's highlights: Meta releases a world model — when will the next ChatGPT moment arrive?
老徐抓AI趋势· 2025-06-19 16:47
The key points in this article come from Monday's June 16 livestream (the video highlights are strongly recommended; the reasoning there is more complete).

An autonomous driving system needs to understand complex traffic scenes the way a veteran driver does — not merely recognizing road conditions, but anticipating latent risks. For example, when a pedestrian crossing the street is occluded beside the car ahead, the system should predict where that pedestrian might emerge, keeping the drive safe and smooth. Without a deep understanding of the physical world and its events, autonomous driving cannot be truly safe or intelligent.

More broadly, robots equipped with mature world models will greatly raise productivity and drive rapid economic growth, transforming transport, logistics, and public and private mobility. In my view, companies that hold this technological edge will be the market's biggest beneficiaries, so positioning early for these opportunities matters.

Quantum computing is also accelerating. Jensen Huang said in a recent speech in Europe that quantum computing's inflection point is near, which will further boost scientific research and AI progress and quicken the pace of the technological revolution. I believe this revolution will keep speeding up; in the next few years we may see multiple breakthroughs on the scale of the steam engine or electricity, profoundly reshaping the global economy and social structure.

The above is for illustration only and does not constitute investment advice; investing involves risk, trade with caution. Note: fund advisory services are provided by Yingmi's Xiaobang advisory team! Investing ...
Studying end-to-end large models, and still not quite clear on the difference between VLM and VLA...
自动驾驶之心· 2025-06-19 11:54
Core Insights
- The article emphasizes the growing importance of large models (VLM) in the field of intelligent driving, highlighting their potential for practical applications and production [2][4].

Group 1: VLM and VLA
- VLM (Vision-Language Model) focuses on foundational capabilities such as detection, question answering, spatial understanding, and reasoning [4].
- VLA (Vision-Language Action) is more action-oriented, aimed at trajectory prediction in autonomous driving, requiring a deep understanding of human-like reasoning and perception [4].
- It is recommended to learn VLM first before expanding to VLA, as VLM can predict trajectories through diffusion models, enhancing action capabilities in uncertain environments [4].

Group 2: Community and Resources
- The article invites readers to join a knowledge-sharing community that offers comprehensive resources, including video courses, hardware, and coding materials related to autonomous driving [4].
- The community aims to build a network of professionals in intelligent driving and embodied intelligence, with a target of gathering 10,000 members in three years [4].

Group 3: Technical Directions
- The article outlines four cutting-edge technical directions in the industry: Visual Language Models, World Models, Diffusion Models, and End-to-End Autonomous Driving [5].
- It provides links to various resources and papers that cover advancements in these areas, indicating a robust framework for ongoing research and development [6][31].

Group 4: Datasets and Applications
- A variety of datasets are mentioned that are crucial for training and evaluating models in autonomous driving, including pedestrian detection, object tracking, and scene understanding [19][20].
- The article discusses the application of language-enhanced systems in autonomous driving, showcasing how natural language processing can improve vehicle navigation and interaction [20][21].

Group 5: Future Trends
- The article highlights the potential for large models to significantly impact the future of autonomous driving, particularly in enhancing decision-making and control systems [24][25].
- It suggests that the integration of language models with driving systems could lead to more intuitive and human-like vehicle behavior [24][25].
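The idea mentioned above — predicting trajectories with a diffusion model — can be made concrete with a toy denoising sampler. The sketch below is hypothetical and deliberately minimal (plain NumPy, no learned network): the "data distribution" is a single known straight-line trajectory, so the optimal noise predictor has a closed form. A real driving stack would replace `predict_noise` with a network conditioned on perception and language features.

```python
import numpy as np

# Toy DDPM-style sampler over 2D waypoint trajectories.
T = 50                                   # diffusion steps
H = 8                                    # trajectory horizon (waypoints)
betas = np.linspace(1e-4, 0.2, T)        # noise schedule
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

# "Dataset" = one straight-line trajectory (hypothetical stand-in).
target = np.stack([np.linspace(0, 7, H), np.zeros(H)], axis=1)

def predict_noise(x_t, t):
    # For a point-mass data distribution, the optimal noise prediction
    # is analytic: x_t = sqrt(abar_t)*target + sqrt(1-abar_t)*eps.
    return (x_t - np.sqrt(alpha_bar[t]) * target) / np.sqrt(1.0 - alpha_bar[t])

def sample(rng):
    x = rng.standard_normal((H, 2))      # start from pure noise
    for t in range(T - 1, -1, -1):
        eps = predict_noise(x, t)
        # DDPM posterior mean update
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:                        # inject noise except at the last step
            x += np.sqrt(betas[t]) * rng.standard_normal((H, 2))
    return x

traj = sample(np.random.default_rng(0))
print(np.abs(traj - target).max())       # deviation from target: near zero
```

With a perfect noise predictor the sampler recovers the target trajectory; with a learned, conditioned predictor, the same loop instead samples diverse plausible trajectories, which is what makes diffusion attractive for action generation under uncertainty.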
Midjourney releases its video model: not competing on resolution, but netizens call the visuals stunning
Hu Xiu· 2025-06-19 06:56
Facing copyright lawsuits from Disney and Universal, veteran text-to-image "unicorn" Midjourney has not slowed down; instead, in the early hours of today, it pushed out its first video model, V1, under pressure.

Precise color grading, careful composition, rich emotion — the signature style is intact. Midjourney is not competing on resolution or long takes; what it competes on is a distinctive sense of atmosphere and aesthetic identity. Midjourney is ambitious, aiming squarely at a "world model," but whether its currently rough feature design can carry it that far remains an open question.

TL;DR: after uploading or generating an image, click "Animate"; by default, a single job outputs four 5-second clips, extendable to 21 seconds at most. Both manual and automatic modes are supported, and users can shape the generated footage through prompts; low-motion and high-motion options suit static, atmospheric shots and strongly dynamic scenes, respectively.

[Midjourney official promo demo]

Competing on atmosphere: Midjourney's video model is officially live. You chase resolution; I go surreal. Midjourney has always excelled at fantastical, surreal visuals, and user tests so far show that its video model continues this aesthetic: a stable, highly recognizable style.

Prompt: The train passing through the station. | @PJaccetturo

Well-known X blogger @ ...
Midjourney launches its first image-to-video model, V1: continuing its aesthetic style, with the goal of building a "world model"
Founder Park· 2025-06-19 05:52
Reposted from "AI寒武纪" (AI Cambrian).

In the early hours of today, Midjourney launched its video generation model V1, billed as a cost-effective, easy-to-use video generator and the first step toward its vision of "simulating the world in real time." Users can now create short videos by animating Midjourney images or their own pictures; the product is positioned as fun, easy to use, good-looking, and affordable. As always, Midjourney has put real work into the aesthetic details; see the official promo video.

01 Image-to-video, with both manual and automatic modes

Core workflow: an image-to-video approach. The user first generates a satisfying image, then clicks the new "Animate" button to bring it to life.

External images supported: users can upload their own pictures and enter a motion prompt to generate the video.

Two animation modes:
Automatic: the AI writes a "motion prompt" for you — quick and simple.
Manual: users can write their own ...
4Paradigm (06682): Q1 2025 results beat expectations; a surging Agent business puts the company on a fast-growth track
Investment Rating
- The report maintains an "Outperform" rating for the company [4][8].

Core Insights
- The company has entered a high-growth trajectory supported by its Agent business, with a forecasted revenue growth of 30.85% in 2025, 28.75% in 2026, and 27.22% in 2027 [4][8].
- The first quarter of 2025 saw revenue of 1.08 billion RMB, a year-on-year increase of 30.1%, with a gross profit of 444 million RMB, also up 30.1% [4][8].
- The average revenue per key user reached 11.67 million RMB, reflecting a 31.3% year-on-year increase, indicating strong performance despite macroeconomic pressures [4][8].

Financial Summary
- Revenue projections for 2025-2027 are 6.88 billion RMB, 8.86 billion RMB, and 11.28 billion RMB respectively, with EPS expected to be 0.11 RMB, 0.56 RMB, and 1.19 RMB [3][4][8].
- The company's gross profit margin (GPM) for Q1 2025 was 41.2%, maintaining stability compared to the previous year [4][8].
- The Prophet AI platform generated 805 million RMB in revenue for Q1 2025, marking a 60.5% increase year-on-year [4][8].

Business Development
- The company has upgraded to a dual 2B+2C business model, enhancing its capabilities in both enterprise and consumer sectors [4][8].
- The launch of the AI Agent development platform has enabled the company to cover the full lifecycle of AI Agent development, with applications across over 14 industries [4][8].
- The establishment of the Phancy consumer electronics sector aims to provide AI Agent solutions for devices, further diversifying the company's offerings [4][8].
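As a quick arithmetic check, the revenue projections above compound consistently with the stated growth rates (figures in billions of RMB are taken from the report; the small residual differences are rounding in the published numbers):

```python
# Each year's projected revenue should equal the prior year's revenue
# times (1 + that year's growth rate). Inputs are from the report.
rev_2025 = 6.88
growth = {2026: 0.2875, 2027: 0.2722}

rev_2026 = rev_2025 * (1 + growth[2026])
rev_2027 = rev_2026 * (1 + growth[2027])

print(round(rev_2026, 2), round(rev_2027, 2))  # close to the reported 8.86 and 11.28
```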
The first of the EV "new forces" to reposition as an AI company shows off a next-generation autonomous driving model at a top global AI conference
机器之心· 2025-06-17 04:50
Core Viewpoint
- The article emphasizes the significance of high computing power, large models, and extensive data in achieving Level 3 (L3) autonomous driving, highlighting the advancements made by XPeng with its G7 model and its proprietary AI chips [3][18][19].

Group 1: Technological Advancements
- XPeng's G7 is the world's first L3 level AI car, featuring three self-developed Turing AI chips with over 2200 TOPS of effective computing power [3][18].
- The G7 introduces the VLA-OL model, which incorporates a "motion brain" for decision-making in intelligent assisted driving [4].
- The VLM (Vision Large Model) serves as the AI brain for vehicle perception, enabling new interaction capabilities and future functionalities like local chat and multi-language support [5][19].

Group 2: Industry Positioning
- XPeng was the only invited Chinese car company to present at the global computer vision conference CVPR 2025, showcasing its advancements in autonomous driving models [6][13].
- The company has established a comprehensive system from computing power to algorithms and data, positioning itself as a leader in the autonomous driving sector [8][18].

Group 3: Model Development and Training
- The next-generation autonomous driving base model developed by XPeng has a parameter scale of 72 billion and has been trained on over 20 million video clips [20].
- The model utilizes a large language model backbone and extensive multimodal driving data, enhancing its capabilities in visual understanding and reasoning [20][21].
- XPeng employs a distillation approach to adapt large models for vehicle-side deployment, ensuring core capabilities are retained while optimizing performance [27][28].

Group 4: Future Directions
- The development of a world model is underway, which will simulate real-world conditions and enhance the feedback loop for continuous learning [36][41].
- XPeng aims to leverage its AI advancements not only for autonomous driving but also for AI robots and flying cars in the future [43][64].
- The transition to an AI company involves building a robust AI infrastructure, with a focus on optimizing the entire production process from cloud to vehicle [50][62].
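The distillation approach mentioned in Group 3 — compressing a large cloud model into a vehicle-side one — can be illustrated with a toy logit-distillation loop. Everything below is a hypothetical sketch, not XPeng's method (the article does not describe the recipe): teacher and student are both single linear layers so the student can match the teacher exactly, whereas a real student would be a smaller network trained on softened teacher outputs.

```python
import numpy as np

rng = np.random.default_rng(0)
D_in, D_out, N, TEMP = 16, 4, 256, 2.0   # toy sizes; TEMP = distillation temperature

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q):
    # mean KL(p || q) over the batch
    return float(np.mean(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=1)))

X = rng.standard_normal((N, D_in))             # unlabeled inputs
W_teacher = rng.standard_normal((D_in, D_out)) # frozen "teacher" weights
p_teacher = softmax(X @ W_teacher / TEMP)      # softened teacher targets

W_student = np.zeros((D_in, D_out))            # "student" to be trained
kl_init = kl(p_teacher, softmax(X @ W_student / TEMP))

for _ in range(3000):
    q = softmax(X @ W_student / TEMP)
    grad = X.T @ (q - p_teacher) / N           # gradient of KL wrt logits
    W_student -= 0.5 * grad                    # (1/TEMP factor folded into the lr)

kl_final = kl(p_teacher, softmax(X @ W_student / TEMP))
print(kl_init, kl_final)                       # KL to the teacher drops sharply
```

The temperature softens the teacher's distribution so the student learns relative class preferences, not just the argmax; that is the standard motivation for distilling with soft targets.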