Active Perception
A Tribute to the Inventors: Why Is the "Wind" of the Air Conditioning Industry Always Stirred Up by Haier?
Xin Lang Cai Jing· 2025-10-29 09:24
Core Insights
- The air conditioning industry has evolved from "mechanical obedience" to "humanized service," with Haier playing a pivotal role as an "inventor" [1][3]
- The launch of the Haier Mairang Comfortable Wind Pro series marks a significant technological leap, featuring UWB human-perception radar and WiFi sensing technology, enhancing user comfort through "active perception" [1][3]

Innovation Pathway
- Haier's innovation strategy is driven by a "dual helix" of user demand for "healthy wind quality" and "comfortable wind sensation," positioning the company at the forefront of industry advancements [3][6]
- The introduction of self-cleaning air conditioners in 2015, and their subsequent iterations, have kept Haier in the top global sales position in this category for seven consecutive years [3][6]
- The Mairang Comfortable Wind Pro is recognized as the industry's first air conditioner capable of perceiving breathing, representing a shift from "wind avoiding people" to "wind understanding people" [3][6]

Product Evolution
- The Mairang Comfortable Wind Pro marks the transition of air conditioners from "functional machines" to "intelligent entities," incorporating a complete "perception-decision-execution" system [6][9]
- The UWB human-perception radar enables rapid, high-precision detection of human movement, covering an extensive area within a living space [6][9]

Redefining the Human-Machine Relationship
- Haier's technological advances redefine the relationship between users and air conditioners, transforming them from mere machines into intelligent partners that understand user states [8][9]
- The "self-adaptive air supply" feature balances energy efficiency with user comfort, embodying the vision of "active care" through intelligent decision-making [9]

Industry Leadership
- Haier's 40-year history in the air conditioning sector reflects a continuous evolution of technology, from the first split air conditioner in 1985 to the latest innovations in heat-exchanger technology [9][10]
- The company leads in shaping industry standards, having influenced 31 of the 34 major national standards for room air conditioners and participating in international standard development [10]
- Haier emphasizes user experience over technical prowess, demonstrating that technology can be intuitive and responsive to user needs [10]
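The "perception-decision-execution" system described above can be pictured as a simple control loop: a sensor reading is mapped to an airflow decision, which then drives the actuators. The sketch below is purely illustrative; the `OccupantReading` fields, mode names, and threshold are hypothetical stand-ins, not Haier's actual interfaces.

```python
from dataclasses import dataclass

@dataclass
class OccupantReading:
    """One hypothetical frame from a UWB presence radar."""
    distance_m: float      # distance to the nearest occupant
    is_moving: bool        # gross body motion detected
    breathing_rate: float  # breaths per minute (0.0 if none sensed)

def decide_airflow(reading: OccupantReading) -> str:
    """Decision step: map the perceived state to an airflow mode."""
    if reading.breathing_rate and not reading.is_moving:
        # A still, breathing occupant is likely resting: avoid direct wind.
        return "indirect-soft"
    if reading.distance_m < 1.5:
        return "indirect-soft"
    return "direct-normal"

def execute(mode: str) -> str:
    """Execution step: a real unit would drive louvers and fan speed here."""
    return f"louvers set to {mode}"

# One pass through the perception -> decision -> execution loop.
frame = OccupantReading(distance_m=2.0, is_moving=False, breathing_rate=14.0)
print(execute(decide_airflow(frame)))  # louvers set to indirect-soft
```

In a real appliance this loop would run continuously, with the decision step replacing the fixed rules above with whatever learned or tuned policy the product ships.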
Zebra Zhixing's Si Luo: Smart Cockpits Are Undergoing Paradigm Reconstruction, with End-to-End Models Plus Active Perception as the Key to a Breakthrough
Zhong Guo Jing Ji Wang· 2025-09-22 09:07
Core Insights
- The core argument presented by the CTO of Zebra Zhixing is that smart cockpits are becoming a crucial entry point for user experience and the Internet AI ecosystem in smart vehicles, representing a golden track with both technological depth and commercial value [3][4]

Industry Overview
- Smart cars are identified as a significant testing ground for Physical AI, with the potential value of AI in physical spaces exceeding that in digital realms [3]
- The smart cockpit is characterized by three core features: high complexity, high safety requirements, and high commercial value; Zebra Zhixing's technology has shipped on over 8 million vehicles, validating the feasibility of large-scale deployment [3]

Technical Architecture
- The smart cockpit's five-layer integration architecture comprises:
1. Chip and computing-power layer, centered on vendors such as NVIDIA and Qualcomm
2. System layer, led by companies such as Zebra Zhixing and Huawei, providing efficient system-level services
3. Large-model layer, integrating general and vehicle-specific models to address multi-modal processing and data privacy
4. Intelligent-agent layer, responsible for central decision-making and service-module coordination
5. Platform-service layer, enabling AI-native services through natural language interaction [4]

Development Phases
- The development of smart cockpits falls into three phases:
1. "Verification Period" (2024 to early 2025), focused on whether large models can be integrated into vehicles at all
2. "Application Period" (2025), emphasizing intelligent-agent systems for practical service delivery
3. "Reconstruction Period" (now through 2026), in which the industry shifts from traditional assembly-line architectures to end-to-end models [4][5]

Interaction Experience
- Smart cockpits are transitioning from "passive response" to "active perception": intelligent assistants proactively identify user needs through sensory inputs, evolving from mere tools into supportive partners [5]
- Zebra Zhixing aims to drive the smart cockpit toward a trillion-level commercial market, positioning it as a core hub in the Physical AI ecosystem [5]
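The five-layer architecture above is essentially an ordered stack with one responsibility per layer. As a minimal sketch, assuming nothing about Zebra Zhixing's actual software, it can be encoded as plain data; the layer names and descriptions below simply restate the article's list.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CockpitLayer:
    name: str
    responsibility: str

# The five-layer stack from the article, ordered bottom (hardware)
# to top (user-facing services). Names are illustrative shorthand.
COCKPIT_STACK = (
    CockpitLayer("chip", "compute platforms, e.g. NVIDIA and Qualcomm SoCs"),
    CockpitLayer("system", "system-level services, e.g. Zebra Zhixing, Huawei"),
    CockpitLayer("model", "general plus vehicle-specific large models"),
    CockpitLayer("agent", "central decision-making and service coordination"),
    CockpitLayer("service", "AI-native services via natural language"),
)

def describe(stack) -> list:
    """Render the stack as numbered lines, bottom layer first."""
    return [f"{i + 1}. {layer.name}: {layer.responsibility}"
            for i, layer in enumerate(stack)]

for line in describe(COCKPIT_STACK):
    print(line)
```

Keeping the stack as immutable data like this makes the layering explicit: each layer only consumes what the layer below it provides.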
Farewell to Passive Perception! DriveAgent-R1: An Advanced Hybrid-Thinking Agent for Active Visual Exploration
自动驾驶之心· 2025-08-01 07:05
Core Insights
- DriveAgent-R1 is an advanced autonomous-driving agent designed to tackle long-horizon, high-level behavioral decision-making, leveraging a hybrid-thinking framework and active perception to enhance decision-making in complex environments [3][4][32]

Innovation and Methodology
- DriveAgent-R1 introduces two core innovations: a novel three-stage progressive reinforcement-learning strategy and a mode-grouping algorithm (MP-GRPO) that strengthens the agent's dual-mode capabilities, laying the groundwork for autonomous exploration [4][13]
- The agent's decision-making is driven by active perception: it proactively seeks information to reduce uncertainty, which is crucial for safe and reliable driving [5][6][32]

Performance Metrics
- DriveAgent-R1 achieved state-of-the-art (SOTA) performance on the challenging SUP-AD dataset, surpassing leading multimodal models such as Claude Sonnet 4 and Gemini 2.5 Flash [4][13][27]
- The model showed significant accuracy gains when using visual tools: first-frame accuracy increased by 14.2% and sequence-average accuracy by 15.9% [27][28]

Training Strategy
- The training strategy consists of three phases: dual-mode supervised fine-tuning (DM-SFT), forced comparative mode reinforcement learning (FCM-RL), and adaptive mode selection reinforcement learning (AMS-RL), which together teach the agent to choose the optimal thinking mode for the context [24][30]
- This gradual training approach turned the potential distraction of visual tools into a performance amplifier, significantly improving the agent's decision-making capabilities [28][30]

Active Perception and Visual Tools
- Active perception is built into DriveAgent-R1 through a robust visual toolkit that lets the agent actively explore its environment, enhancing its perceptual robustness [5][19]
- The toolkit includes high-resolution view retrieval, region-of-interest inspection, depth estimation, and 3D object detection, which together improve the agent's ability to make informed decisions under uncertainty [19][20]

Experimental Results
- Experiments confirmed that reinforcement learning (RL) is critical for unlocking the agent's potential: RL-trained variants significantly outperformed those trained solely with supervised fine-tuning [29][30]
- DriveAgent-R1's performance relies heavily on visual input; accuracy drops drastically when visual information is removed, underscoring the importance of its active-perception mechanism [31]
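The visual toolkit described above amounts to a set of callable perception tools the agent can invoke on demand. The paper's actual tool signatures are not given here, so the stubs below are hypothetical; only the four tool names mirror the article's list, and the point of the sketch is the dispatch pattern, not the perception itself.

```python
from typing import Callable, Dict

# Hypothetical stubs for the four visual tools named in the article.
def retrieve_high_res_view(camera: str) -> str:
    return f"high-res frame from {camera}"

def inspect_roi(x: int, y: int, w: int, h: int) -> str:
    return f"zoomed crop at ({x},{y},{w},{h})"

def estimate_depth(camera: str) -> str:
    return f"depth map for {camera}"

def detect_objects_3d(camera: str) -> str:
    return f"3D boxes in {camera}"

TOOLS: Dict[str, Callable[..., str]] = {
    "high_res_view": retrieve_high_res_view,
    "roi_inspect": inspect_roi,
    "depth": estimate_depth,
    "detect_3d": detect_objects_3d,
}

def call_tool(name: str, *args, **kwargs) -> str:
    """Active-perception step: the agent requests extra evidence on demand
    instead of passively consuming a fixed sensor stream."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](*args, **kwargs)

print(call_tool("roi_inspect", 120, 80, 64, 64))  # zoomed crop at (120,80,64,64)
```

In the real system each call would return pixels or detections that are fed back into the agent's reasoning; the registry pattern is what lets the policy decide, per scene, which evidence is worth gathering.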
The Autonomous Driving Agent Has Arrived! DriveAgent-R1: An Intelligent-Thinking and Active-Perception Agent (Shanghai Qi Zhi Institute & Li Auto)
自动驾驶之心· 2025-07-29 23:32
Core Viewpoint
- DriveAgent-R1 represents a significant advance in autonomous-driving technology, addressing long-horizon, high-level decision-making through a hybrid-thinking framework and an active-perception mechanism [2][31]

Group 1: Innovations and Challenges
- DriveAgent-R1 introduces two core innovations: a novel three-stage progressive reinforcement-learning strategy and MP-GRPO (Mode-Grouped Reinforcement Policy Optimization), which strengthens the agent's dual-mode capabilities [3][12]
- The potential of vision-language models (VLMs) in autonomous driving is currently limited by short-sighted decision-making and passive perception, particularly in complex environments [2][4]

Group 2: Hybrid Thinking and Active Perception
- The hybrid-thinking framework lets the agent adaptively switch between efficient text-based reasoning and in-depth tool-assisted reasoning according to scene complexity [5][12]
- The active-perception mechanism equips the agent with a powerful visual toolbox for actively exploring the environment, improving decision-making transparency and reliability [5][12]

Group 3: Training Strategy and Performance
- A complete three-stage progressive training strategy covers dual-mode supervised fine-tuning, forced comparative mode reinforcement learning, and adaptive mode selection reinforcement learning [24][29]
- DriveAgent-R1 achieves state-of-the-art (SOTA) performance on challenging datasets, surpassing leading multimodal models such as Claude Sonnet 4 and Gemini 2.5 Flash [12][26]

Group 4: Experimental Results
- DriveAgent-R1 significantly outperforms baseline models, with first-frame accuracy increasing by 14.2% and sequence-average accuracy by 15.9% when using visual tools [26][27]
- Introducing visual tools also enhances the decision-making of state-of-the-art VLMs, demonstrating the value of actively acquired visual information for driving intelligence [27]

Group 5: Active Perception and Visual Dependency
- Active perception fosters deep visual reliance: DriveAgent-R1's performance drops drastically when visual inputs are removed, confirming that its decisions are genuinely driven by visual data [30][31]
- The training strategy effectively turns the potential distraction of tools into a performance amplifier, showing the importance of structured training for exploiting visual tools [27][29]
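The three-stage curriculum and the mode switching it produces can be sketched schematically. The stage names below come from the article; the threshold rule is a hypothetical stand-in for the mode-selection policy the real agent learns with reinforcement learning (MP-GRPO), included only to show the shape of the final behavior.

```python
# Stage names from the article; goal descriptions paraphrase its summary.
STAGES = (
    ("DM-SFT", "dual-mode supervised fine-tuning: imitate both modes"),
    ("FCM-RL", "forced comparative mode RL: sharpen each mode separately"),
    ("AMS-RL", "adaptive mode selection RL: learn when to use which mode"),
)

def choose_mode(scene_complexity: float, threshold: float = 0.5) -> str:
    """After AMS-RL the agent picks a thinking mode per scene; here a
    fixed threshold stands in for the learned policy: cheap text-based
    reasoning for simple scenes, tool-assisted reasoning for hard ones."""
    return "tool-assisted" if scene_complexity >= threshold else "text-only"

for name, goal in STAGES:
    print(f"{name}: {goal}")
print(choose_mode(0.8))  # tool-assisted
```

The ordering matters: only after both modes are individually competent (stages one and two) is the selection policy trained, so the agent is choosing between two strong options rather than compensating for a weak one.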