Spatial Intelligence
New SOTA in complex spatial reasoning with a 55% performance gain: SpatialDreamer, new work from Sun Yat-sen University
具身智能之心· 2025-12-22 01:22
Core Insights
- The article introduces SpatialDreamer, a framework from researchers at Sun Yat-sen University and MBZUAI that improves performance on complex spatial tasks through active mental imagery and spatial reasoning [1][4].
Group 1: Limitations of Current Models
- Despite significant advances in scene understanding, multimodal large language models (MLLMs) remain limited on complex spatial reasoning tasks that require mental simulation [2].
- Existing methods rely mainly on passive observation of spatial data and lack the distinctly human ability to imagine actively and update internal representations dynamically [3].
Group 2: SpatialDreamer Framework
- SpatialDreamer simulates human spatial cognition through a closed-loop reasoning process with three steps: exploration, imagination, and reasoning [6].
- In the exploration phase, the model chooses the best egocentric action for the current scene, such as "move forward 0.75 meters" or "turn left 45 degrees" [6].
- In the imagination phase, a world model generates the new viewpoint image that results from executing the action [6].
- In the reasoning phase, the model integrates all accumulated visual evidence to produce a final answer [6].
Group 3: GeoPO Policy Optimization
- To address sparse rewards in long-sequence reasoning tasks, the research team introduced GeoPO, a policy optimization method that combines a tree-structured sampling scheme with geometric consistency constraints [8].
- Tree sampling allows multiple action branches at each step, supporting backtracking and multi-path exploration [8].
- A multi-level reward design merges task-level and step-level rewards to provide fine-grained feedback [8].
- A geometric penalty mechanism penalizes redundant or conflicting actions, encouraging efficient trajectories [8].
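The reward design described under GeoPO can be illustrated with a small sketch. This is not the authors' implementation: the `Action` type, the weights, and the specific rules for what counts as "redundant" or "conflicting" are assumptions made only for illustration, and tree sampling is not modeled here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Action:
    """Hypothetical egocentric action: kind is 'forward' (meters) or 'turn' (degrees)."""
    kind: str
    amount: float


def geometric_penalty(actions: List[Action], weight: float = 0.1) -> float:
    """Assumed rule: penalize no-op actions and consecutive actions that cancel out."""
    penalty = 0.0
    for action in actions:
        if abs(action.amount) < 1e-9:
            penalty += weight  # redundant: the action changes nothing
    for prev, cur in zip(actions, actions[1:]):
        if prev.kind == cur.kind and abs(prev.amount + cur.amount) < 1e-9:
            penalty += weight  # conflicting: e.g. turn +45 followed by turn -45
    return penalty


def trajectory_reward(task_correct: bool,
                      step_rewards: List[float],
                      actions: List[Action],
                      step_weight: float = 0.5,
                      penalty_weight: float = 0.1) -> float:
    """Merge a task-level reward, an averaged step-level reward, and the penalty."""
    task_term = 1.0 if task_correct else 0.0
    step_term = 0.0
    if step_rewards:
        step_term = step_weight * sum(step_rewards) / len(step_rewards)
    return task_term + step_term - geometric_penalty(actions, penalty_weight)


efficient = [Action("forward", 0.75), Action("turn", 45.0)]
wasteful = [Action("turn", 45.0), Action("turn", -45.0), Action("forward", 0.75)]
r_efficient = trajectory_reward(True, [1.0, 1.0], efficient)
r_wasteful = trajectory_reward(True, [1.0, 1.0, 1.0], wasteful)
```

Under these assumed weights, both trajectories answer correctly, but the one that turns and immediately turns back scores lower, which is the behavior the geometric penalty is meant to encourage.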
Group 4: Performance Validation
- SpatialDreamer was validated on multiple spatial reasoning benchmarks and achieved state-of-the-art (SOTA) results, with average accuracies of 93.9% on real images and 92.5% on synthetic images in the SAT benchmark [13].
- On the MindCube-Tiny benchmark, it achieved an overall accuracy of 84.9%, surpassing the baseline Qwen2.5-VL-7B by over 55% [13].
- On VSI-Bench, it led on tasks such as object counting, relative direction, and path planning, with an average accuracy of 62.2% [13].
Group 5: Significance of SpatialDreamer
- The significance of SpatialDreamer lies not only in improving spatial reasoning accuracy but also in demonstrating that MLLMs can strengthen their reasoning through "imagination", a significant step toward human-like spatial intelligence [14].
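The three-step closed loop described under the SpatialDreamer framework (exploration, imagination, reasoning) can be sketched as below. The function names and the string-based "views" are hypothetical stand-ins: in the real system the policy and answerer would be an MLLM and the imagination step a world model that renders images.

```python
from typing import Callable, List, Optional


def spatial_reasoning_loop(question: str,
                           first_view: str,
                           propose_action: Callable[[str, List[str]], Optional[str]],
                           imagine_view: Callable[[str, str], str],
                           answer: Callable[[str, List[str]], str],
                           max_steps: int = 3) -> str:
    """Run explore -> imagine repeatedly, then reason over all collected views."""
    views = [first_view]
    for _ in range(max_steps):
        action = propose_action(question, views)        # exploration
        if action is None:                              # policy decides it has seen enough
            break
        views.append(imagine_view(views[-1], action))   # imagination
    return answer(question, views)                      # reasoning


# Toy stand-ins so the loop can be run end to end.
def demo_propose(question: str, views: List[str]) -> Optional[str]:
    plan = ["move forward 0.75 meters", "turn left 45 degrees"]
    taken = len(views) - 1
    return plan[taken] if taken < len(plan) else None


def demo_imagine(view: str, action: str) -> str:
    return f"{view} -> [{action}]"


def demo_answer(question: str, views: List[str]) -> str:
    return f"answered from {len(views)} views"


result = spatial_reasoning_loop("Where is the chair?", "view0",
                                demo_propose, demo_imagine, demo_answer,
                                max_steps=5)
```

Passing the three stages in as callables keeps the loop itself independent of any particular model, which is a design choice of this sketch rather than something the article specifies.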
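"One Brain, Many Forms" roundtable: What concrete progress have world models and spatial intelligence made in embodied intelligence? | GAIR 2025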
雷峰网· 2025-12-20 04:07
Core Viewpoint
- The article discusses the current state and future potential of embodied intelligence, focusing on the challenges and opportunities that world models and spatial intelligence present for robotics and AI [2][4][10].
Group 1: Development of Embodied Intelligence
- The technology route for embodied intelligence is still exploratory and has not yet converged, which is seen as a positive sign for innovation [3][4].
- Experts agree that the core problems of embodied intelligence, such as interaction and human-machine collaboration, should be addressed by academic institutions, while industry focuses on practical applications [4][5].
- Integrating AI with physical entities is expected to bring significant advances in intelligence, but the field must avoid reverting to industrial automation without achieving generalized intelligence [4][5][30].
Group 2: World Models in Autonomous Driving
- Leading companies such as Tesla currently use world models to enhance data generation and to improve decision-making through closed-loop testing [11][12].
- World models have gained traction in autonomous driving because scenarios are simpler to generate than in robotics, and advances in generative AI make it possible to create realistic training samples [12][13].
- The definition and application of world models in both autonomous driving and robotics are still debated, with differing opinions on whether pixel-level reconstruction is necessary or a latent state representation suffices [12][13][14].
Group 3: Spatial Intelligence in Robotics
- Spatial intelligence is a critical aspect of robotics, centered on perception and the understanding of spatial relationships, and it has evolved from traditional SLAM techniques toward more learning-based approaches [20][21].
- Current challenges in spatial intelligence include the need for better data representation and for understanding complex spatial relationships, both of which remain underdeveloped in robotic systems [22][23].
- Integrating visual and semantic information is essential for enhancing robots' spatial capabilities, but the field is still in its early stages [22][23][24].
Group 4: Commercialization and Future Applications
- Drone applications are expected to expand significantly across various sectors, but the timeline for widespread adoption remains uncertain [26][27].
- The gap between technological capabilities and market needs challenges entrepreneurs, as innovative ideas often do not match practical industrial requirements [30][31].
- The shift toward learning-based control paradigms is expected to make drones and robots more applicable in real-world scenarios, moving beyond traditional automation [28][29].
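Letting AI "open its eyes to the world" at the forefront of global technological change: Shanghai's quantum city blueprint is unfolding from Fuxing Island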
Jie Fang Ri Bao· 2025-12-20 00:59
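Reporter: Xiao Tong
In November, Fei-Fei Li, Stanford University professor and co-founder of World Labs, published a long essay arguing that "spatial intelligence" is the next frontier of artificial intelligence and will define the direction of the coming decade. A day later, Turing Award winner and former Meta chief AI scientist Yann LeCun announced his departure to found a new company focused on "world models".
On December 18, the launch of Shanghai Fuxing Island as a Global Maker Island and the 2025 Shanghai Quantum City Annual Conference were held. According to the announcement, Fuxing Island will build intelligent infrastructure, deploying smart sensing facilities across the island in phases at a standard of 100,000 per square kilometer. It will also strengthen spatiotemporal agent capabilities and build an integrated online-offline training ground for new-quality industries.
As next-generation AI technology evolves rapidly, a city blueprint carrying boundless imagination is about to unfold from Fuxing Island to the world.
Building a "world model" for AI
AI technology is iterating ever faster, and only by seizing the opportunity can one capture frontier technological change.
In December 2024, the "Shanghai Quantum City Spatiotemporal Innovation Base" opened on Fuxing Island. Yang Tao, associate professor at the School of Architecture of Tsinghua University and deputy director of the Ministry of Natural Resources' Technology Innovation Center for Smart Human Settlements and Spatial Planning Governance, believes that by starting its quantum city construction from spatiotemporal intelligence, Shanghai is at the forefront of international technological change.
Over the past few years, AI has appeared to grow ever "smarter". Yet scientists have found that these models still have major limitations: language models have only read books and have never come into contact with the real physical world.
To this end, Shanghai is continuously building ...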
[Jinyuan (金猿) People Showcase] DTStack (袋鼠云) CEO Ning Haiyuan: The survival and leap of the data middle platform in the AI wave
Sou Hu Cai Jing· 2025-12-18 12:20
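Over the past decade, the data middle platform went through the "everyone builds a middle platform" boom and then the confusion of platforms that were built but never used. With the explosion of AI technology, and especially the urgent demand of large models for a high-quality data supply, the role of the data middle platform is being reshaped: it is no longer only the "manager" of data but must become the "enabler" that puts AI capabilities into production. The data middle platform of the future has only two paths: become core support for AI Infra, or be marginalized and pushed out as technology iterates. After ten years of deep work in the big data industry, this is my firmest conviction.
Ten years ago, I was working deep in big data infrastructure at Alibaba: building platforms, building data warehouses, and doing real-time computing to serve core businesses such as e-commerce and finance. One judgment became increasingly clear at that time: data infrastructure would never serve only internet companies and would eventually become "public infrastructure" for every industry. That judgment led me to leave Alibaba Cloud and co-found DTStack (袋鼠云), fully committing to the cause of "bringing big data into industry". The decision was not widely understood at the time: "getting off the ship" of a leading platform to do something high-investment, long-cycle, and with no short-term return carried obvious risk. But for me, big data had already proven technically feasible, and the next question had to be answered: does it really create value on the industrial front line? I wanted to be the one to verify that.
Ning Haiyuan
[Notice] The 8th annual Jinyuan (金猿) Awards Ceremony 2025 will be held in Shanghai. Selection for the lists and awards will again go through three rigorous rounds (preliminary review, public review, and final review), and the results will be widely published through domestic and international channels. Submissions are welcome.
Looking back at DTStack (袋鼠云) ...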
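Xiaomi MiMo large model put into real-world use; Xiaomi's "Human-Vehicle-Home Full Ecosystem" partner conference presents new progress in its IoT platform ecosystem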
Sou Hu Wang· 2025-12-18 10:06
Core Insights
- Xiaomi held the "Human-Vehicle-Home Ecosystem" Partner Conference in Beijing, showcasing its latest IoT platform capabilities and user experience innovations [1][3].
Group 1: IoT Platform Progress
- As of Q3, Xiaomi's IoT platform has surpassed 1 billion connected devices, reaching 1.04 billion units, and the Mi Home app has more than 110 million monthly active users [3].
- Annual shipments of Xiaomi IoT modules have exceeded 10 million units for the first time, solidifying its position as a leading global smart ecosystem platform [3].
- Xiaomi has partnered with over 15,000 companies globally, including well-known brands such as Miele, Bosch, Siemens, and LG, while also pursuing social responsibility initiatives [3].
Group 2: Future Innovations
- Xiaomi introduced the Xiaomi Miloco smart home exploration plan, which integrates visual perception into smart home systems and lets users create smart rules in natural language [3].
- The company is collaborating with leading brain-computer interface firms to expand interaction options for people with mobility impairments [4].
Group 3: AI and Ecosystem Integration
- The IoT Future Summit 2026 highlighted the role of AI in driving innovation across the entire ecosystem, moving beyond isolated breakthroughs to a comprehensive approach [6].
- Various partners presented advances in smart solutions, emphasizing user experience improvements and seamless device integration [6][7].
- Xiaomi's IoT platform is moving toward "spatial intelligence", focusing on proactive decision-making through multi-modal perception and distributed computing technologies [7][11].
Group 4: User Experience Enhancements
- The IoT Ecosystem Access and Experience Innovation Forum focused on the new capabilities of the Mi Home 11.0 experience, addressing user demands for comfort, safety, and energy efficiency [9].
- Xiaomi upgraded its scene capabilities and 3D central control interactions, enhancing the experience for its more than 110 million monthly active users [9].
Group 5: Technical Developments
- The IoT Platform Technology Forum showcased a full-stack upgrade of Xiaomi's IoT capabilities, including the launch of the IoT-BLE 2.0 module matrix and advances in AI-driven device interactions [11].
- The forum discussed strategies for AIoT developers in the context of global trends in security and privacy compliance [11].
Group 6: Exhibition Highlights
- The conference featured an IoT exhibition area displaying various smart home solutions, IoT connection technologies, and the overall capabilities of Xiaomi's IoT platform [13].
With Amap (高德) integrated, Qwen (千问) closes the last mile of "AI getting work done"
华尔街见闻· 2025-12-18 09:58
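In 2025, with mobile internet traffic dividends peaking and AI technology surging, Alibaba, which positioned itself a step ahead, is playing its cards in ever quicker succession.
On December 17, Qwen (千问) was connected to Amap (高德地图), gaining the ability to find its way and get things done in the real world. This means that whether the request is a restaurant recommendation, route planning, property site selection, or a travel itinerary, Qwen can answer directly from real-time map data.
Amap is only the first step. The Qwen app itself is quietly absorbing Alibaba's entire ecosystem to become the ALL IN ONE super entry point, realizing a vision that even Manus did not achieve.
Now that it has grown "hands and feet", Qwen is stepping out of the chat box and starting to take action. This consumer-facing super app also indirectly rebuts the AI bubble narrative: it is turning large models and compute into tangible productivity value and problem-solving capability.
This marks the point at which Alibaba's AI strategy finally shifts from the technological "cloud" to a commercial "ground war". Alibaba is now using AI to bind together the group's rich resources, and the vast moat built this way is almost unrivaled.
Large models get "hands and feet"
"Without spatial intelligence, AGI is incomplete," "AI godmother" Fei-Fei Li said in July this year at the YC global founders summit. This view has in fact long since become consensus. Amap CEO Guo Ning said that any action is inseparable from time and ...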
Tesla once again anticipates the direction of the tide
自动驾驶之心· 2025-12-18 09:35
Core Viewpoint
- Tesla's AI leader Ashok Elluswamy laid out the technical methodology behind Tesla's Full Self-Driving (FSD) in a recent article, explaining the choice of an end-to-end neural network model and the challenges encountered in practice [4][6].
Group 1: End-to-End Neural Network Model
- Tesla adopted an end-to-end neural network model to handle complex driving scenarios that cannot be pre-defined by rules, such as the "trolley problem" and second-order effects [6][10].
- The end-to-end model is described as a complete overhaul of previous architectures that fundamentally changes design, coding, and validation, leading to a more human-like driving experience [11][19].
- The model outputs driving instructions alongside interpretable "intermediate results", using technologies such as generative Gaussian splatting to build dynamic 3D models of the environment in real time [8][17].
Group 2: VLA and World Model Concepts
- VLA (Vision-Language-Action) extends the end-to-end model by incorporating language information, making driving behavior easier to interpret [12][14].
- The world model aims to establish a high-bandwidth cognitive system based on video and image data, addressing the limitations of language models in understanding complex, dynamic environments [15][19].
- The article clarifies how the three relate: end-to-end is the foundation, VLA is an upgrade, and the world model is the ultimate form of understanding spatial dynamics [12][19].
Group 3: Industry Perspectives and Trends
- The industry is divided into three main technical routes: end-to-end, VLA, and world model. Companies such as Horizon Robotics and Bosch primarily adopt end-to-end because of its lower cost and higher stability [13][19].
- VLA has been criticized by industry leaders who argue that relying on language models may not be essential for effective autonomous driving and that spatial understanding matters more [16][19].
- Tesla's publication has reignited discussion in the industry, positioning the company at the forefront of current technical directions and providing a systematic analysis of practical applications [20].
The evolutionary direction of large models: Words to Worlds | A conversation with SenseTime's Lin Dahua
量子位· 2025-12-17 09:07
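By Jin Lei, from Aofeisi
量子位 | WeChat official account QbitAI
Cambrian-S, the latest spatial intelligence model from Fei-Fei Li's team, has been surpassed for the first time by a Chinese open-source AI.
In the radar chart of spatial perception capabilities, a model named SenseNova-SI has capability scores that enclose those of Cambrian-S on multiple dimensions.
The concrete numbers tell the same story: whether compared against open-source or closed-source models, and whether at 2B or 8B scale, SenseNova-SI takes SOTA results on the major spatial intelligence benchmarks:
| Model | VSI | MMSI | MindCube-Tiny | ViewSpatial | SITE |
| --- | --- | --- | --- | --- | --- |
| Open-source Models (~2B) | | | | | |
| InternVL3-2B | 32.9 | 26.5 | 37.5 | 32.5 | 30.0 |
| Qwen3-VL-2B-Instruct | 50.3 | 28.9 | 34.5 | 36.9 | 35.6 |
| MindCube-3B-RawQA-SFT | 17.2 | 1.7 | 51.7 | 24.1 | 6. ...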
Digital and Home Appliance Industry Weekly Market Observation - 20251217
Ai Rui Zi Xun· 2025-12-17 08:38
Investment Rating
- The report does not explicitly provide an investment rating for the industry.
Core Insights
- The digital home appliance industry is being transformed by AI technology, with significant developments in sectors including education, retail, and robotics [1][2][3][4][6][9][10].
Industry Trends
- The education sector is using generative AI to improve personalized services, with companies like Fenbi exploring AI-driven products despite competition and the need for continuous investment [1].
- New retail is shifting from supply-driven to demand-driven management through AI, addressing issues such as inventory backlog and customer loyalty [2].
- The "human-vehicle-home" ecosystem is evolving with 5G, AI, and IoT technologies, improving user experience and creating new business models [3].
- AI video content is becoming longer and more sophisticated, democratizing the creative process in the film industry [4].
- The AI terminal ecosystem is developing rapidly, with significant growth in AI smartphones and smart wearables, driven by advances in domestic computing chips [4].
- The humanoid robot market is projected to grow significantly, driven by labor shortages and technological advances, although challenges remain [4][6].
- AI entrepreneurship is shifting from model competition to scenario-based applications, as showcased at the World Internet Conference [6].
- The home appliance market is shifting toward quality and innovation, with air conditioners performing well despite price wars, while the black appliance (TV and audio) sector faces challenges [9].
- The coffee machine market is growing on consumer demand for high-quality coffee experiences, reflecting a shift toward premium products [9].
- The "Double 11" shopping festival highlighted the significant role of AI in driving sales and changing consumer decision-making in the home appliance sector [10].
Top Brand News
- Soul App is preparing for an IPO, focusing on AI-driven emotional value services, with a strong user base among Generation Z [13].
- Alibaba is launching new AI products aimed at the consumer market, seeking to strengthen its ecosystem and address internal strategic challenges [14].
- Yushun Technology is on the verge of going public, having established itself as a leader in the humanoid robot sector [14].
- Rokid is gaining traction in the smart glasses market, collaborating with various partners to improve product functionality and user experience [16].
- Kuaishou reported strong revenue growth, attributing part of its success to AI technology that improves online marketing [17].
- Black Sesame Intelligence is addressing challenges in robot mass production with a new intelligent computing platform [18].
Digital Technology Industry Watch | Biweekly Highlights (2025.12.02-12.16)
Mei Ri Jing Ji Xin Wen· 2025-12-16 10:45
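01 Ministry and Commission Updates
(1) MIIT revises and issues the Measures for the Administration of Public Service Platforms for Industrial Technology Foundations
To accelerate new-type industrialization and consolidate the foundations of industrial technology, the Ministry of Industry and Information Technology recently issued the newly revised Measures for the Administration of Public Service Platforms for Industrial Technology Foundations. The Measures comprise 22 articles in 6 chapters (general provisions, application, review and release, operation, dynamic management, and supplementary provisions) and take effect on December 5, 2025. The Measures state that applicants for service platform status must specify the industry sectors and service scope they are applying for. Key industries and sectors served include equipment, petrochemicals and chemicals, steel, nonferrous metals, building materials, light industry, textiles, food, pharmaceuticals, next-generation information technology, biotechnology, new energy, new materials, new energy vehicles, artificial intelligence, the metaverse, and brain-computer interfaces. The service scope mainly covers metrology and testing, standards verification and testing, quality and reliability testing, certification and accreditation, industrial information, intellectual property, and the commercialization of technological achievements. (Source: Department of Science and Technology, Ministry of Industry and Information Technology)
On December 2, the Jiangsu Provincial Metaverse Standardization Technical Committee was established in Nanjing. Its establishment fills a gap in the province's standardization system for the metaverse field. The committee will focus on top-level design work such as planning the metaverse standardization roadmap, formulating development strategy, and conducting preliminary research on frontier standards, drawing the "standard line" and clarifying the "construction blueprint" for high-quality industrial development. (Source: Xinhua Daily, Jiaohuidian)
(2) National Development and Reform Commission, National Data Administration, Ministry of Education, Ministry of Science and Technology, and Organization Department of the CPC Central Committee on ...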