Vision
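Multimodal reasoning driven by visual perception: Alibaba Tongyi proposes VRAG, defining the next generation of retrieval-augmented generation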
机器之心· 2025-06-03 08:57
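In the digital era, visual information plays an increasingly important role in knowledge transfer and decision support. However, traditional retrieval-augmented generation (RAG) methods face many challenges when handling visually rich information. On one hand, traditional text-based methods cannot process visual data; on the other, existing visual RAG methods are constrained by fixed, predefined pipelines and struggle to activate the model's reasoning ability. The latest work from Alibaba's Tongyi Lab, VRAG-RL (Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning), brings reinforcement learning into multimodal agent training and uses iterative reasoning and a visual perception space to improve the ability of vision-language models (VLMs) to retrieve, reason over, and understand visual information, offering an effective solution for purely visual retrieval-augmented generation tasks. The code and models are fully open source. Paper: arxiv.org/pdf/2505.22019 GitHub: https://github.com/Alibaba-NLP/VRAG To address the challenges that existing RAG methods face when processing visually rich documents, in particular ...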
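Jikang Instruments: In-depth report on a Beijing Stock Exchange company: a domestic leader in intelligent safety monitoring, benchmarked against overseas leader Keyence-20250603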
KAIYUAN SECURITIES· 2025-06-03 06:23
Investment Rating - The investment rating for the company is "Accumulate" (maintained) [3]
Core Viewpoints
- The company, Jikang Instruments, is a leading player in the domestic intelligent safety monitoring sector, comparable to the overseas leader Keyence. The company has shown steady growth in recent years, with a Q1 2025 net profit of 20 million yuan, representing a 44.91% increase year-on-year. Revenue for the same period reached 78 million yuan, up 19.07% year-on-year [5][20].
- The company is expected to benefit from the accelerating infrastructure development in the energy and water conservancy sectors, which will drive demand for safety monitoring solutions. The company continues to invest in R&D, with new products like the machine vision deformation monitoring system gaining traction in various industries [6][7].
Summary by Relevant Sections
Company Overview - Jikang Instruments, established in 1998, is one of the largest suppliers of outdoor safety monitoring instruments and system solutions in China. The company is recognized as a national high-tech enterprise and a "specialized and innovative" small giant enterprise [20][21].
Product Offerings - The company's main products include intelligent monitoring terminals (precision sensors and smart data acquisition devices) and safety monitoring IoT solutions. These products are widely applied in various sectors, including energy, water conservancy, transportation, smart cities, and geological disaster monitoring [21][22].
Financial Performance - In Q1 2025, the company achieved a revenue of 78 million yuan and a net profit of 20 million yuan. The company maintains its profit forecasts for 2025-2026 and has added a forecast for 2027, expecting net profits of 89 million, 103 million, and 116 million yuan for the years 2025, 2026, and 2027, respectively [5][6].
Industry Outlook - The energy sector is expected to see long-term growth, driven by the development of hydropower, nuclear power, and wind power. The water conservancy sector is also expanding, with significant investments planned for the reinforcement of dams and reservoirs [6][7].
Competitive Positioning - Jikang Instruments is positioned to compete with Keyence, a global leader in industrial automation. The company's machine vision products have high gross margins and are increasingly being adopted in various applications, indicating a strong potential for market share growth [7][8].
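Jikang Instruments (830879): In-depth report on a Beijing Stock Exchange company: a domestic leader in intelligent safety monitoring, benchmarked against overseas leader Keyence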
KAIYUAN SECURITIES· 2025-06-03 05:45
Investment Rating - The investment rating for the company is "Accumulate" (maintained) [3].
Core Views
- The company, Jikang Instruments, is a leading player in the domestic intelligent safety monitoring sector, comparable to the overseas leader Keyence. The company has shown steady growth in recent years, with a Q1 2025 net profit of 20 million yuan, representing a 44.91% increase year-on-year. Revenue for the same period reached 78 million yuan, up 19.07% [5][20].
- The company is expected to benefit from the accelerating infrastructure development in the energy and water conservancy sectors, which will drive demand for safety monitoring solutions. The company has a strong focus on R&D, with new products like the machine vision deformation monitoring system gaining traction in various industries [6][7].
Summary by Sections
Company Overview - Jikang Instruments, established in 1998, is one of the largest suppliers of outdoor safety monitoring instruments and system solutions in China. The company is recognized as a national high-tech enterprise and a "little giant" enterprise in Beijing, specializing in precision sensors, data collectors, and intelligent sensing terminals [20][21].
Product Offerings - The company's main products include intelligent monitoring terminals (precision sensors and intelligent data collection devices) and safety monitoring IoT solutions. These products are widely applied in various sectors, including energy, water conservancy, transportation, smart cities, and geological disaster monitoring [21][22].
Industry Insights - The energy sector is expected to see long-term growth, particularly in hydropower, nuclear power, and wind power, which will drive the company's performance. The water conservancy sector is also expanding, with significant investments planned for the reinforcement of dams and reservoirs [6][7].
Competitive Positioning - Jikang Instruments is positioned against Keyence, a global leader in industrial automation. The company aims to leverage the high-margin attributes of machine vision products and capitalize on the trend of domestic substitution in the market [7][18].
Financial Projections - The company forecasts net profits of 89 million yuan, 103 million yuan, and 116 million yuan for 2025, 2026, and 2027, respectively. Corresponding EPS estimates are 0.53 yuan, 0.62 yuan, and 0.70 yuan per share, with P/E ratios of 31.7, 27.5, and 24.4 times [5][6].
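Because a P/E ratio is the share price divided by EPS, the quoted EPS and P/E figures jointly imply a reference share price. The following is a minimal sketch of that arithmetic, assuming all three multiples were computed against the same price; the summary does not state the price, and the names `forecasts` and `implied_price` are illustrative only.

```python
# Forecast EPS (yuan per share) and P/E multiples quoted in the summary above.
forecasts = {
    2025: {"eps": 0.53, "pe": 31.7},
    2026: {"eps": 0.62, "pe": 27.5},
    2027: {"eps": 0.70, "pe": 24.4},
}


def implied_price(eps: float, pe: float) -> float:
    """P/E = price / EPS, so the implied price is P/E * EPS."""
    return pe * eps


# Implied reference price per forecast year, rounded to two decimals.
implied = {
    year: round(implied_price(f["eps"], f["pe"]), 2)
    for year, f in forecasts.items()
}

for year, price in implied.items():
    print(f"{year}: implied price of about {price} yuan")
```

The three implied prices fall between roughly 16.8 and 17.1 yuan. The small spread is what one would expect from EPS values rounded to two decimals, so the quoted figures are internally consistent under this assumption.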
RoboSense releases Q1 2025 results; robotics product sales grow strongly
Ge Long Hui· 2025-06-02 18:10
Performance Summary
- In Q1 2025, the company's revenue was 328 million yuan, a decrease of 12.4% year-on-year and 30.7% quarter-on-quarter, with a gross margin of 23.5%, an increase of 11.2 percentage points year-on-year and 1.4 percentage points quarter-on-quarter [1]
- The company posted a net loss of 100 million yuan in Q1 2025, compared with net losses of 130 million yuan in Q1 2024 and 130 million yuan in Q4 2024 [1]
Product Sales Analysis
- Total sales of LiDAR products for ADAS applications decreased from approximately 120,400 units in Q1 2024 to about 108,600 units in Q1 2025, primarily due to a decline in sales of automotive-grade solid-state LiDAR [1]
- Sales of LiDAR products for robotics and other applications increased from approximately 4,200 units in Q1 2024 to about 11,900 units in Q1 2025, driven by increased sales of the E1R and Airy products [2]
Company Developments
- As of March 2025, the company had secured mass production orders for over 100 models from 30 automotive manufacturers and tier-one suppliers, and had reached start of production (SOP) for 38 models from 12 clients [2]
- In January 2025, the company launched the world's first 1,000-line ultra-long-range digital LiDAR EM4, featuring 1080 line emission capability and a detection range of up to 600 meters, along with new LiDAR products E1R and Airy, and the second-generation dexterous hand Papert 2.0 [2]
- In February 2025, the company held a roll-off ceremony for its one-millionth LiDAR unit, making it the first company globally to reach this milestone in high-line-count LiDAR production [2]
New Product Launches
- In March 2025, the company introduced the AC1, a new robotic vision product based on the Active Camera platform, which integrates LiDAR, cameras, and an IMU, with a 120°×60° field of view and a measurement range of 70 meters [3]
- In April 2025, the company released the next-generation digital LiDAR EMX, featuring true 192 lines, high-resolution point cloud output of 2.88 million points per second, a detection range of 300 meters, and an angular resolution of 0.08°×0.1° [3]
Strategic Partnerships
- The company has achieved significant breakthroughs in the lawn care robot market, securing exclusive partnerships with two leading lawn care robot clients [3]
- In May 2025, the company announced a strategic collaboration with Kuka Technology, setting a record for the largest LiDAR order in the lawn care robot industry, with plans to deliver 1.2 million automotive-grade solid-state LiDAR units over the next three years for developing advanced intelligent lawn care robot perception systems [3]
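The unit-sales figures above can be turned into year-on-year percentage changes with one line of arithmetic. This is a rough sketch using only the approximate unit counts quoted in the summary; the function name `pct_change` and the dictionary layout are illustrative, not taken from the company's report.

```python
def pct_change(previous: float, current: float) -> float:
    """Percentage change from `previous` to `current`."""
    return (current - previous) / previous * 100.0


# Approximate LiDAR unit sales quoted above: (Q1 2024, Q1 2025).
unit_sales = {
    "ADAS": (120_400, 108_600),
    "Robotics and other": (4_200, 11_900),
}

# Year-on-year change per segment, rounded to one decimal place.
yoy = {
    segment: round(pct_change(prev, curr), 1)
    for segment, (prev, curr) in unit_sales.items()
}

for segment, change in yoy.items():
    print(f"{segment}: {change:+.1f}% year-on-year")
```

On these rounded inputs, ADAS units fall by about 9.8% while robotics units rise by about 183.3%, which is the contrast the headline refers to.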
Leading visual-perception robotics company pushes for Hong Kong listing; 15,000 robotic lawn mowers already sold
Core Viewpoint - Shenzhen Ledong Robotics Co., Ltd. has submitted its prospectus for an IPO on the Hong Kong Stock Exchange, aiming to become a leading player in the intelligent robotics industry, with a particular focus on visual perception technology [1][2][3]
Group 1: Company Overview
- Founded in 2017, Ledong Robotics is a full-stack intelligent robotics company built on perception intelligence, specializing in visual perception technology and its applications across various intelligent robotics scenarios [1]
- As of 2024, Ledong Robotics ranked as the largest intelligent robotics company globally based on visual perception technology, with over 6 million smart robots equipped with its technology [1][2]
- The company achieved the highest shipment volume of dToF LiDAR in the industry, exceeding 720,000 units in 2024 [1]
Group 2: Growth Strategies
- Ledong Robotics is developing a second growth curve through smart lawn mowers, using its visual perception technology to enter markets in Europe, America, and Australia [2]
- The first generation of smart lawn mowers reached mass production in 2024, with first-year sales exceeding 10,000 units [2]
- The second generation of smart lawn mowers, which incorporates AI models for scene recognition and boundary detection, is set for mass production in 2025; monthly sales have grown steadily since early 2025 [2]
Group 3: Financial Performance
- The company's revenue grew from 234 million RMB in 2022 to 467 million RMB in 2024, while net losses narrowed from 73.13 million RMB to 56.48 million RMB over the same period [2]
- Ledong Robotics has completed multiple rounds of financing, attracting investments from notable institutions, including a 1.79% stake held by a company controlled by Alibaba's CEO [2]
Group 4: IPO Fund Utilization
- The net proceeds from the IPO will primarily be used to enhance R&D in visual perception technology for intelligent robotics, upgrade AI algorithms, and optimize product offerings [3]
- Funds will also support brand building and international expansion to increase the global customer base and strengthen the company's position as a leading robotics firm [3]
- Additional allocations include optimizing production capacity, exploring potential investments and acquisitions, and general corporate purposes [3]
5,700 QA pairs put AI's spatial sense to the test! A new spatial intelligence benchmark arrives | ZJU, UESTC & CUHK
量子位· 2025-06-02 04:13
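Contributed by ZJU REAL Lab. 量子位 | WeChat official account QbitAI. Is the cup on my left or on my right? This question is trivial for humans, yet even vision-language models (VLMs) at the level of GPT-4o can get it wrong. The ViewSpatial-Bench evaluation set contains 5,700 question-answer pairs covering five spatial localization recognition tasks under two frames of reference, the camera perspective and the human perspective. The root cause is that the spatial information current VLMs learn from large-scale image-text data tends to be fragmentary and limited to static-viewpoint understanding, without multi-dimensional, multi-perspective spatial reasoning ability. As a result, these models frequently stall on tasks that require multi-perspective spatial reasoning. Yet only AI systems with robust spatial reasoning and perspective understanding can truly become agents that collaborate with humans. To this end, a research team from Zhejiang University, the University of Electronic Science and Technology of China, and The Chinese University of Hong Kong proposed ViewSpatial-Bench, the first benchmark to systematically evaluate VLMs' spatial localization ability across multiple perspectives and multiple tasks. It covers five task types and evaluates models' spatial reasoning from both the camera perspective and the human perspective. It also comes with an automated 3D annotation pipeline that generates precise direction labels. With this efficient 3D orientation annotation process, the team produced more than 5,700 question-answer pairs covering a rich variety of 3D scenes. On the multi-perspective spatial dataset, through ...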
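To make the camera-versus-human distinction concrete, here is a small geometric sketch of why "left or right?" depends on whose frame of reference is used. It is not the ViewSpatial-Bench annotation pipeline; it assumes a simplified top-down 2D scene, and the `Observer` class, `side_of` function, and coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observer:
    """A viewpoint in a top-down 2D scene: position plus facing direction."""
    x: float
    y: float
    facing_x: float
    facing_y: float


def side_of(observer: Observer, obj_x: float, obj_y: float) -> str:
    """Classify an object as left or right of the observer's facing direction.

    Uses the sign of the 2D cross product between the facing vector and the
    vector from the observer to the object (x to the right, y upward):
    positive means counterclockwise from the facing direction, i.e. left.
    """
    to_x = obj_x - observer.x
    to_y = obj_y - observer.y
    cross = observer.facing_x * to_y - observer.facing_y * to_x
    if cross > 0:
        return "left"
    if cross < 0:
        return "right"
    return "straight ahead or behind"


# Hypothetical scene: the camera looks "up" the map, the person faces the camera.
camera = Observer(x=0.0, y=-2.0, facing_x=0.0, facing_y=1.0)
person = Observer(x=0.0, y=2.0, facing_x=0.0, facing_y=-1.0)
cup = (1.0, 0.0)

print("camera perspective:", side_of(camera, *cup))
print("person perspective:", side_of(person, *cup))
```

In this scene the same cup is on the camera's right but on the person's left, which is the kind of perspective shift the benchmark's human-perspective questions probe.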
Frontline Survey | From 4 hours to 20 minutes: technology upgrades at Qingdao Port bring "instant customs clearance" for cargo
Yang Shi Xin Wen· 2025-06-02 03:01
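Reflecting new opportunities in foreign trade. At the Qianwan port area of Shandong Port's Qingdao Port, a container ship bound for the US East Coast has finished loading and unloading and set sail four hours ahead of schedule. The ship was originally scheduled only to unload at Qingdao Port, not to load. Facing the new market situation, the company immediately adjusted its operating strategy: the ship will load containers quickly and continue on to the United States. As this route returned from one sailing every two weeks to one sailing per week, freight forwarders' business volume has also surged. As a major international shipping hub in northern China, Qingdao Port has routes reaching more than 700 ports worldwide, and its container throughput grew 7.4% year-on-year in the first quarter. In today's complex and volatile international trade environment, how does this port find its bearings and "escort" foreign trade companies? Port "traffic" surges. Yue Fuhai, route manager at Qingdao Ruizhihang International Logistics Co., Ltd. (青岛瑞之航国际物流有限公司): The first thing is finding cargo, and then there is the question of shipping space. Everything comes down to speed, and ocean freight rates are rising fast too. Since the latest China-US tariffs took effect on May 14, Qingdao Port's terminals have received many shipping orders from customers exporting to the US, and June through September is also the traditional US stocking season. Along the nearly 10,000-meter container quay line of the Qianwan port area, all 26 berths are occupied by cargo ships, containers are hoisted without pause, and trucks full of goods follow one after another. From home furnishings to footwear, foreign trade companies in different industries are also racing against the clock to fill orders. Company heads told us that amid this shift they are also shedding their reliance on a single market and stepping up efforts to open up overseas markets such as Europe, Africa, and Southeast Asia. 青岛锦 ...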
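Is SFT doing more harm than good? New study: going straight to reinforcement learning gives models a higher ceiling on multimodal reasoning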
机器之心· 2025-06-01 03:30
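Report by 机器之心. Editor: Zhang Qian. "Although models trained with SFT may appear to be reasoning, their behavior is closer to pattern imitation, a form of pseudo-reasoning that lacks generalizable reasoning ability." With the arrival of large language models with strong reasoning capabilities, such as OpenAI's o1/o3 and DeepSeek-R1, the research community has widely adopted a two-stage "supervised fine-tuning + reinforcement learning" training paradigm: first supervised fine-tuning (SFT) on reasoning data, then reinforcement learning (RL) to further improve performance. This successful recipe inspired researchers to extend its benefits from the text-only domain to large vision-language models (LVLMs). But a recent study reports a surprising finding: "SFT may hinder learning, often leading to pseudo-reasoning paths, whereas RL promotes genuine multimodal reasoning!" The finding comes from a research team at the University of California, Santa Cruz, the University of Texas at Dallas, and other institutions. They examined in depth how well the classic "SFT+RL" paradigm applies to vision-language model development, focusing on two core questions: 1) What distinct roles do SFT and RL each play in multimodal reasoning? 2) Is this two-stage training really necessary for an LVLM's reasoning ability? Paper title: SFT or RL? An Early Investigation into Training ...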
Multi-party collaboration: Panjiayuan eyewear pushes for consumption over the two holidays
Bei Jing Shang Bao· 2025-05-30 08:01
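Beijing Business Today (reporter Liu Zhuolan): With the two holidays approaching, Beijing's commercial sector is buzzing, and themed events of all kinds are launching. On May 30, the second Panjiayuan Eyewear Festival opened. The event runs from May 30 to June 2, covering the Dragon Boat Festival holiday and Children's Day. According to the organizers, this year's festival uses Beijing Glasses City as its main venue, linked with six sub-venues including Mingjingyuan Glasses City and Yourui Glasses City, so that consumers can select and compare eyewear of different brands and styles within the Panjiayuan area and enjoy a better buying experience.
Beijing Aier Intech Eye Hospital will provide professional eye health checks for the Panjiayuan Eyewear Festival, helping consumers understand their own vision and choose eyewear on a scientific basis. The two sides also plan to carry out public-welfare assistance for special groups as part of their social responsibility. On science education, lectures and online outreach will be held regularly in the future to spread knowledge about healthy eye use and raise public awareness of visual health.
At the opening ceremony, the Panjiayuan eyewear mascot IP, a canary named "Jingjing" (镜镜), made its debut, giving the Panjiayuan eyewear brand a lively, tech-flavored image. The IP is intended to bring the brand closer to consumers. At the same time, the Panjiayuan Eyewear Industry Alliance and Beijing Aier Intech Eye Hospital signed the "Visual Health Co-construction Cooperation Agreement". The two sides will draw on their respective strengths in industry resources and medical resources to cooperate closely on eye health checks, public-welfare assistance, and science education.
According to Panjiayuan Subdistrict, the model of three-way collaboration among government, enterprises, and medical institutions is, in Panjiayuan's development of the eyewear industry, ...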
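A robot dog can now be your badminton partner! Self-taught from scratch using reinforcement learning alone, it even shows emergent human-like return-to-center behavior | Science sub-journal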
量子位· 2025-05-30 07:10
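By Heng Yu, from Aofeisi. 量子位 | WeChat official account QbitAI. Want to exercise with a robot dog? Your badminton partner is here! With no human assistance and relying on reinforcement learning alone, the robot dog learned to rally at badminton, like this: outdoors or indoors, it handles both with ease. Using reinforcement learning, the researchers developed a whole-body visuomotor control policy for the robot dog that simultaneously controls leg locomotion (18 degrees of freedom) and the arm's racket swing. The resulting performance is respectable: the dog's swing speed peaks at 12 m/s. In cooperative play with human players, it hit the shuttlecock 10 times in a row in one rally, and human-like behaviors even emerged, such as returning to the center of the court after a hit. The study ran extensive experiments in a variety of environments, validating the quadruped robot's ability to predict shuttlecock trajectories, navigate the service area effectively, and deliver the most accurate hits to human players. This demonstrates the feasibility of using legged mobile robots in complex, dynamic sports scenarios. The team behind the research is from ETH Zurich. The paper has just been published in Science Robotics, a journal in the Science family. It then generates key commands to control the quadruped base. Human-like behavior emerges in the badminton "battle". What is the configuration of the robot dog that learned to play badminton? The public data are as follows: the main body consists of a quadruped ANYmal-D base and a dynamic arm, DynaArm. It is equipped with a ZED X stereo camera with a global shutter for ...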