Grounding

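Breaking the High-Resolution Image Reasoning Bottleneck: Fudan and Nanyang Technological University Propose MGPO, a Multi-Turn Reinforcement Learning Framework Based on Visual Grounding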
机器之心· 2025-07-21 04:04
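To address this problem, researchers from Fudan University and Nanyang Technological University propose MGPO, a multi-turn reinforcement learning method based on visual grounding. Over multiple turns of interaction, it lets an LMM automatically predict the coordinates of key regions according to the question, crop sub-images, and integrate the historical context, ultimately achieving accurate reasoning over high-resolution images. Whereas supervised fine-tuning (SFT) requires expensive grounding annotations as supervision, MGPO shows that in the reinforcement learning (RL) paradigm, even without grounding annotations, a model can develop robust visual grounding ability from feedback on whether the final answer is correct alone.
The core innovations of MGPO are:
1) Top-down, interpretable visual reasoning: it gives LMMs a top-down, question-driven visual search mechanism for high-resolution scenes and produces interpretable visual grounding outputs.
2) Breaking the maximum-pixel limit: even when a high-resolution image becomes blurry after being downscaled to fit the visual token budget, the model can still accurately identify the coordinates of the relevant region and crop a sharp sub-image from the original high-resolution image for subsequent analysis.
3) No additional grounding annotations required: RL training can be run directly on standard VQA datasets, and answer supervision alone is enough for the model to develop robust visual grounding ability.
Title: High-Resolution ...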
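The crop step in point 2 can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the model outputs a pixel box `(left, top, right, bottom)` in the coordinate frame of the downscaled image, and the names `map_box_to_original` and `crop` are hypothetical. The image is represented as a plain list of pixel rows so the example has no dependencies.

```python
import math
from typing import List, Sequence, Tuple

# Assumed box format (not specified in the summary): (left, top, right, bottom) in pixels.
Box = Tuple[int, int, int, int]


def map_box_to_original(
    box: Box,
    resized_size: Tuple[int, int],
    original_size: Tuple[int, int],
) -> Box:
    """Rescale a box predicted on the downscaled image to original-image pixels."""
    resized_w, resized_h = resized_size
    original_w, original_h = original_size
    scale_x = original_w / resized_w
    scale_y = original_h / resized_h
    left, top, right, bottom = box
    # Round outward so the crop never loses part of the predicted region,
    # then clamp to the bounds of the original image.
    mapped_left = min(max(math.floor(left * scale_x), 0), original_w)
    mapped_top = min(max(math.floor(top * scale_y), 0), original_h)
    mapped_right = min(max(math.ceil(right * scale_x), 0), original_w)
    mapped_bottom = min(max(math.ceil(bottom * scale_y), 0), original_h)
    return (mapped_left, mapped_top, mapped_right, mapped_bottom)


def crop(image_rows: Sequence[Sequence[int]], box: Box) -> List[List[int]]:
    """Cut the sub-image out of an image stored as a list of pixel rows."""
    left, top, right, bottom = box
    return [list(row[left:right]) for row in image_rows[top:bottom]]
```

In a multi-turn loop, the cropped sub-image would be appended to the conversation history and passed back to the model for the next turn. With a real image library, the same box could be handed to that library's crop routine instead of the list-based `crop` above.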
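A SOTA Approach from Wuhan University, Beijing Institute of Technology, and Others! DEGround: Enhancing Contextual Understanding in Embodied 3D Environments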
具身智能之心· 2025-07-12 13:59
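1. Does your 3D grounding model really work?
In embodied AI systems, an agent relies on first-person 3D perception algorithms to understand its surroundings. Embodied 3D grounding, one of the core tasks, means locating a target object in 3D space from an ego-centric RGB-D image sequence and a language description. It requires the model to fuse language with 3D visual information and accurately identify the object the sentence refers to. Current mainstream methods mostly adopt a two-stage strategy: first extract 3D region features with a detection model, then perform language-guided grounding fine-tuning. This naturally raises a question: how effective is this second-stage, grounding-specific fine-tuning, and does it really work?
Surprisingly, empirical results show that even the most advanced grounding models fall well short of expectations. By contrast, detection models that receive no language supervision at all and simply filter by target category achieve better results on the grounding evaluation. Specifically, because the language instructions in the task are generated from templates, the paper extracts the target object's category label by rule-based parsing, then uses that category to select the matching predicted boxes from the detection model and outputs them directly as the grounding result. In theory, this approach lacks the language-understanding ...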
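The category-filtering baseline described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the rule used here (the first category name mentioned in the instruction is the target), the detection format (a dict with a `label` key), and the function names `parse_category` and `ground_by_category` are all hypothetical.

```python
import re
from typing import Dict, List, Optional, Sequence


def parse_category(instruction: str, categories: Sequence[str]) -> Optional[str]:
    """Return the category name mentioned first in the instruction, or None.

    Assumption: in a template-generated instruction, the target object is
    the first category mentioned. When two names start at the same position,
    the longer one wins (e.g. "coffee table" over "coffee").
    """
    text = instruction.lower()
    best = None  # ((start, -length), name) of the best match so far
    for name in categories:
        match = re.search(r"\b" + re.escape(name.lower()) + r"\b", text)
        if match is None:
            continue
        key = (match.start(), -len(name))
        if best is None or key < best[0]:
            best = (key, name)
    return None if best is None else best[1]


def ground_by_category(
    instruction: str,
    detections: List[Dict],
    categories: Sequence[str],
) -> List[Dict]:
    """Keep only the detector's boxes whose label equals the parsed category."""
    target = parse_category(instruction, categories)
    if target is None:
        return []
    return [det for det in detections if det["label"] == target]
```

No learned language model is involved: the instruction is reduced to a single class label, and every detected box of that class is returned as the grounding output.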
Factory Co-Founder & CTO on Building Reliable AI Agents | LangChain Interrupt
LangChain· 2025-06-18 18:40
Core Idea
- Factory believes software development is transitioning to agent-driven from human-driven [1]
- To achieve significant productivity gains (5-20x), a shift from collaborating with AI to delegating tasks entirely to AI is needed [3]
- Factory is building a platform for managing and scaling AI agents, integrating various engineering systems [3][4][5]

Agentic System Characteristics
- Agentic systems require planning to decide future actions [11]
- Decision-making is crucial for agents to make calls based on the existing state [13][14]
- Environmental grounding is necessary for agents to interact with and adapt to the external environment [14]

Human-AI Collaboration
- Humans will remain in software development, focusing on the outer loop (reasoning, requirements) [15][16]
- Agents will handle the inner loop (coding, testing, code review) [17]
- AI UX should blend delegation with control for situations where agents cannot complete tasks [17]

Agent Reliability
- Clear planning and boundaries are essential for reliable agents [32]
- Subtask decomposition, model predictive control, and explicit plan templating can improve planning [19][20]
- Control over the tools agents use is the most important differentiator in agent reliability [28]

Environmental Interaction
- New AI computer interfaces are needed for agents to interact with the world [28]
- Processing information from the environment is crucial for complex systems [29][30]
- Agents need to ground themselves in the environment to perform full software development work [32]

Call to Action
- Factory encourages teams not delegating at least 50% of engineering tasks to AI agents to engage with them [34]
Tesla Grounding Launches as Wearable Scalar Energy Device for EMF Overload, Grounding Support & Vibrational Balance
GlobeNewswire News Room· 2025-05-24 18:05
Core Insights
- Tesla Grounding is a wearable energy harmonization device utilizing scalar wave technology inspired by Nikola Tesla, aimed at individuals experiencing energetic imbalances due to EMF exposure and modern lifestyle stressors [2][10][12]
- The product offers a non-invasive, portable solution for grounding without the need for wires or batteries, making it suitable for various environments [5][13][17]

Group 1: Product Overview
- Tesla Grounding operates on the principle of bioenergetic coherence, designed to harmonize the body's natural electrical state with environmental frequencies [14][20]
- The device is lightweight, waterproof, and requires no maintenance, allowing for easy integration into daily routines [17][65]
- It is positioned as a modern alternative to traditional grounding methods, which often require physical contact with the Earth [29][63]

Group 2: Target Audience
- Ideal users include professionals in high-EMF environments, frequent travelers, energy-sensitive individuals, holistic wellness practitioners, students, and seniors seeking gentle wellness support [70][72][76]
- The product is particularly beneficial for those experiencing symptoms of energetic fatigue, such as tension and mental fog, due to prolonged exposure to electronic devices [26][69]

Group 3: User Experiences
- User testimonials highlight benefits such as improved energetic alignment, stress relief, and enhanced vibrational coherence, although these experiences are anecdotal [48][49][56]
- Many users report feeling more grounded and emotionally centered, especially in environments with high electromagnetic exposure [54][56]

Group 4: Pricing and Warranty
- Tesla Grounding is available in various pricing packages, with a total price ranging from $399 for a single unit to $999 for a family pack [80][81]
- The product comes with a 30-day money-back guarantee, allowing customers to evaluate its compatibility with their wellness needs risk-free [84][89]