Computer Vision
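Hunyuan 3D Open-Sources an End-to-End Panoramic Depth Estimator: Code and Curated Panoramic Data Released, Online Demo Available
QbitAI (量子位) · 2025-10-14 04:08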
Core Insights
- The article covers DA, a novel end-to-end panoramic depth estimator from Tencent's Hunyuan 3D team, which tackles two problems: the scarcity of panoramic depth data and limited zero-shot generalization [2][8].

Group 1: Background and Challenges
- Panoramic images provide a 360°×180° immersive view, essential for advanced applications such as AR/VR and 3D scene reconstruction [5][6].
- Traditional depth estimation methods for panoramic images are limited by the scarcity of panoramic depth data and by the spherical distortion inherent in panoramic images [10][12].
- The team aims to expand panoramic data and build a robust data foundation for DA [8].

Group 2: Data Curation Engine
- The team developed a data curation engine that converts high-quality perspective depth data into panoramic data, significantly increasing the quantity and diversity of panoramic samples [11][14].
- Approximately 543K panoramic samples were created, expanding the total from about 63K to approximately 607K samples and easing the data scarcity problem [14].

Group 3: Model Architecture and Training
- The SphereViT architecture was introduced to mitigate the effects of spherical distortion, allowing the model to focus on the spherical geometry of panoramic images [16][17].
- Training combines a distance loss for global accuracy with a normal loss for local surface smoothness [18].

Group 4: Experimental Results
- DA achieved state-of-the-art (SOTA) performance, with an average AbsRel improvement of 38% over the strongest zero-shot methods [23][24].
- In qualitative comparisons, DA produced more accurate geometric predictions than UniK3D; DA was trained on approximately 21 times more panoramic data [27].

Group 5: Application Scenarios
- DA's strong zero-shot generalization enables a wide range of 3D reconstruction applications, such as panoramic multi-view reconstruction [28].
- The model can reconstruct globally aligned 3D point clouds from panoramic images of different rooms in a house or apartment, ensuring spatial consistency across multiple panoramic views [29].
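The AbsRel figure cited above is the mean absolute relative error between predicted and ground-truth depth, where lower is better. A minimal sketch of the metric and of how a relative improvement could be computed, assuming flat lists of per-pixel depths; the function names and all sample numbers are illustrative and are not taken from the DA code or paper:

```python
def abs_rel(pred, gt):
    """Mean of |pred - gt| / gt over pixels with valid (positive) ground truth."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def relative_improvement(baseline_error, new_error):
    """Fractional reduction in an error metric (lower error is better)."""
    return (baseline_error - new_error) / baseline_error


# Hypothetical depths in meters, chosen only to make the arithmetic easy to follow.
pred = [1.1, 2.0, 2.7, 4.4]
gt = [1.0, 2.0, 3.0, 4.0]
score = abs_rel(pred, gt)  # (0.1 + 0.0 + 0.1 + 0.1) / 4

# Hypothetical baseline and new AbsRel values; a drop from 0.100 to 0.062 is a 38% improvement.
gain = relative_improvement(0.100, 0.062)
print(f"AbsRel = {score:.3f}, improvement = {gain:.0%}")
```

Because AbsRel divides by the ground-truth depth, an error of 10 cm counts more at 1 m than at 10 m, which is why it is commonly reported alongside scale-sensitive metrics.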
Apple Plans to Acquire Prompt AI, Whose Seemour App Specializes in Surveillance Recognition and Analysis
Huan Qiu Wang Zi Xun· 2025-10-11 04:31
Core Viewpoint
- Apple is in the final stages of negotiations to acquire Prompt AI, a startup specializing in computer vision, in a move seen as significant for enhancing its smart home security and visual perception ecosystem [1][3].

Group 1: Acquisition Details
- The acquisition aims to integrate Prompt AI's core technology and talent into Apple, enhancing its capabilities in smart home security [1].
- Prompt AI's leadership has informed employees about the transaction, indicating that employees who do not join Apple will receive compensation and can apply for other positions within the company [1][3].
- Investors in Prompt AI will recover some funds from the transaction, but not all of their initial investment [1][3].

Group 2: Company Background
- Prompt AI was founded in 2023 and raised $5 million in seed funding led by AIX and Abstract Ventures [3].
- Its main product, Seemour, connects with home security cameras and offers advanced recognition and analysis features, including real-time alerts for unusual activities [3].
- Despite the technology's success, Prompt AI's CEO stated that the current business model did not meet expectations, leading to the decision to discontinue the Seemour application and delete user data for privacy [3].

Group 3: Industry Trends
- Silicon Valley tech giants increasingly acquire AI talent through "acquihire" deals, which strengthen R&D capabilities while mitigating regulatory pressure [3].
- Compared with deals by other major tech companies, Apple's acquisition is relatively small; for instance, Meta invested $14.3 billion in Scale AI, and Google spent $2.4 billion on Windsurf [3].

Group 4: Apple's Acquisition Strategy
- Historically, Apple has maintained a cautious acquisition strategy, with its largest deal being the $3 billion purchase of Beats in 2014 [4].
- Apple prefers to acquire smaller tech teams and integrate their technology and talent into its product lines for organic upgrades [4].
- The company has been slow in the generative AI space, partly due to its reluctance to engage in large-scale acquisitions, resulting in a 2% decline in its stock price this year [4].

Group 5: Future Implications
- If the acquisition is completed, Prompt AI's technology and team are expected to be integrated into Apple's HomeKit smart home division, strengthening its visual perception and security ecosystem [4].
Insta360's Latest Panoramic Survey: Challenges, Methods, and the Future of Panoramic Vision
Synced (机器之心) · 2025-10-04 03:38
Core Insights
- The article discusses the transition from perspective vision to panoramic vision, highlighting the "perspective-panorama gap" as a central theme for understanding the challenges and opportunities in this field [6][19].
- It emphasizes the need for a systematic upgrade across data, models, and applications to make panoramic vision technologies more usable [16][19].

Research Background and Motivation
- The paper, titled "One Flight Over the Gap: A Survey from Perspective to Panoramic Vision", systematically analyzes the differences between perspective and panoramic vision, covering over 300 papers and 20 representative tasks [4][19].
- The article gives a comprehensive overview of the challenges in panoramic vision, grouped into three main gaps: geometric distortion, non-uniform sampling, and boundary continuity [6][9].

Strategies Overview
- Four main strategies are identified for adapting tasks to panoramic vision, all of which must cope with the three gaps:
  1. Geometric Distortion: Projecting a spherical image onto a plane distorts shapes [7].
  2. Non-uniform Sampling: Pixel density varies significantly across regions, affecting resolution [7].
  3. Boundary Continuity: The left and right edges of the flattened 2D image are adjacent on the sphere but separated in the image, which makes it harder to learn content that continues across the seam [7].
- The article outlines a cross-method comparison to clarify which strategies apply to which tasks [9][15].

Task Toolbox
- The article lists over 20 tasks in four main areas (enhancement and assessment, understanding, multi-modal, and generation), along with representative methods and key papers for each task [12][15].
- It highlights the rapid emergence of new paradigms such as diffusion and generative models, particularly in text-to-image/video and novel view synthesis [15].

Future Directions
- To move from "usable" to "user-friendly," advances are needed in three main areas: data, model paradigms, and downstream applications [16][21].
- Key challenges include:
  1. Data Bottlenecks: The lack of large-scale, diverse, and high-quality 360° datasets limits general training and reproducible evaluation [21].
  2. Model Paradigms: Robust models are needed that can adapt from perspective to panoramic vision while maintaining performance across tasks [21].
  3. Downstream Applications: Applications in spatial intelligence, XR, 3D reconstruction, and various industry sectors require effective deployment and compliance [21][22].
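Two of the gaps described above, non-uniform sampling and boundary continuity, can be made concrete for the common equirectangular layout. The sketch below assumes an equirectangular image in which rows map linearly to latitude and columns to longitude; the function names are hypothetical and this is a generic illustration, not a method taken from the survey:

```python
import math


def row_latitude(row, height):
    """Latitude in radians at the center of a pixel row; the top row is near +pi/2."""
    return math.pi / 2 - (row + 0.5) * math.pi / height


def row_weight(row, height):
    """Relative area on the sphere covered by one pixel of this row.

    In an equirectangular image every row has the same number of pixels, but the
    circle of latitude it represents shrinks toward the poles, so each pixel
    covers an area proportional to cos(latitude).
    """
    return math.cos(row_latitude(row, height))


def left_right_neighbors(col, width):
    """Horizontal neighbors with wrap-around: column 0 and column width-1 touch on the sphere."""
    return (col - 1) % width, (col + 1) % width


height, width = 180, 360
pole_weight = row_weight(0, height)          # top row, heavily oversampled
equator_weight = row_weight(89, height)      # row next to the equator
print(f"pole weight {pole_weight:.4f}, equator weight {equator_weight:.4f}")
print(left_right_neighbors(0, width))        # the left edge wraps to the right edge
```

Weights like these are often used to reweight losses or metrics so that the oversampled polar rows do not dominate, and the wrap-around indexing is the idea behind circular padding along the horizontal axis.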
Lingyun Guang Sees 5 Block Trades Totaling 19.1909 Million Yuan
Core Viewpoint
- Lingyun Guang recorded five block trades on September 18, totaling 446,300 shares and 19.19 million yuan, at a 14.03% discount to the day's closing price [2][4].

Group 1: Trading Activity
- The five block trades totaled 19.19 million yuan, at an average price of 43.00 yuan per share [2].
- Over the last three months, Lingyun Guang has seen six block trades with a cumulative transaction amount of 24.11 million yuan [3].
- The stock closed at 50.02 yuan that day, up 12.46%, with a daily turnover rate of 9.95% and turnover of 2.23 billion yuan [3].

Group 2: Institutional Participation
- Institutional proprietary seats appeared in three of the block trades, with a total transaction amount of 12.04 million yuan and net purchases of 12.04 million yuan [2].
- The latest margin financing balance for the stock is 610 million yuan, up 141 million yuan over the past five days, a growth of 29.93% [4].
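The average price and discount quoted above follow directly from the reported totals. A short check, using the headline amount of 19.1909 million yuan, the 446,300 shares, and the 50.02 yuan close; the variable names are illustrative:

```python
total_amount_yuan = 19_190_900   # 1,919.09 ten-thousand yuan, from the headline
shares_traded = 446_300
closing_price = 50.02            # yuan, closing price on the trade date

# Volume-weighted average price across the five block trades.
average_price = total_amount_yuan / shares_traded

# Discount of the block-trade price relative to the close.
discount = (closing_price - average_price) / closing_price

print(f"average price: {average_price:.2f} yuan")   # 43.00
print(f"discount to close: {discount:.2%}")         # 14.03%
```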
Lingyun Guang: Assurance Report on the Use of Previously Raised Funds of Lingyun Optical Technology Co., Ltd.
Zheng Quan Zhi Xing· 2025-08-29 17:47
Core Viewpoint
- The report gives a detailed account of the fundraising activities of Lingyun Optical Technology Co., Ltd. and its use of the raised funds as of June 30, 2025, confirming compliance with regulatory guidelines and reflecting the company's financial management practices [1][2][3].

Fundraising and Storage
- The company raised a total of RMB 1,973.70 million by issuing 90 million shares at RMB 21.93 per share, with net proceeds of RMB 1,805.28 million after deducting underwriting and other fees [3].
- As of June 30, 2025, the company had a total of RMB 427.21 million in its fundraising accounts, with RMB 399.50 million invested in financial products [16].

Fund Utilization
- The company has not changed the investment projects for the raised funds, and it has approved the use of funds by its wholly-owned subsidiary to implement specific projects [8][9].
- The company has allocated funds to various projects, including the Industrial Artificial Intelligence Taihu Industrial Base and the development of intelligent visual equipment, with a total investment of RMB 150 million planned for these initiatives [18].

Project Performance
- Actual investment in projects has not deviated from the commitments made during fundraising, and no external transfers or replacements of the investment projects were reported [11][12].
- The report gives a cumulative utilization figure of 28,083.06 for the Industrial Artificial Intelligence Taihu Industrial Base project (unit not stated here; such reports typically use RMB 10,000, which would make this about RMB 280.83 million). The project is still under construction and has not yet generated profits [20].

Cash Management
- The company has been authorized to use up to RMB 170 million of idle funds for cash management, investing in safe and liquid financial products to optimize the use of funds [12][13].
- As of June 30, 2025, the company had not used any of the raised funds for share subscriptions, indicating a focus on project investment rather than equity financing [12][16].
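The gross and net proceeds reported above are linked by simple arithmetic, and the difference gives the implied issuance costs. A minimal check using exact decimal arithmetic; the variable names are illustrative, and the cost figure is derived here rather than quoted from the report:

```python
from decimal import Decimal

shares_issued = 90_000_000
issue_price = Decimal("21.93")            # RMB per share
net_proceeds_million = Decimal("1805.28") # RMB million, as reported

# Gross proceeds = shares issued x issue price, expressed in RMB million.
gross_million = shares_issued * issue_price / Decimal(1_000_000)

# Implied underwriting and other issuance costs (derived, not quoted).
issuance_costs_million = gross_million - net_proceeds_million
cost_share = issuance_costs_million / gross_million

print(f"gross proceeds: RMB {gross_million:.2f} million")
print(f"implied issuance costs: RMB {issuance_costs_million:.2f} million "
      f"({cost_share:.2%} of gross)")
```

Decimal is used instead of float so that the product of 21.93 and 90 million comes out as exactly 1,973.70 million, matching the reported figure to the cent.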
ArcSoft (688088): 2025 Interim Report Review: Automotive Mass Production Drives Growth, AI Glasses and Commercial Photography Hold Promise
Minsheng Securities· 2025-08-26 08:59
Investment Rating
- The report maintains a "Recommended" rating for the company [4][7].

Core Viewpoints
- The company achieved revenue of 410 million yuan in the first half of 2025, up 7.73% year-on-year, with net profit attributable to the parent company of 89 million yuan, up 44.06% [1].
- The mobile intelligent terminal segment generated revenue of 339 million yuan, up 2.23%, while the smart automotive segment grew 49.09% to 65 million yuan [2][3].
- The company is positioned to benefit from the recovery of emerging markets and aims to tap the blue-ocean market of smart commercial photography; the report projects revenues of 1.011 billion yuan, 1.274 billion yuan, and 1.618 billion yuan for 2025, 2026, and 2027 respectively [4].

Summary by Sections

Mobile Intelligent Terminals
- The Turbo Fusion technology has improved stability and efficiency in mobile devices, leading to better image processing and lower power consumption [2].
- The company has established partnerships with leading manufacturers, solidifying its position in the AI glasses market [2].

Smart Automotive
- The company has successfully launched core products for in-cabin applications and is progressing steadily with driver assistance systems [3].
- The Tahoe product, a comprehensive vehicle-mounted visual solution, has been delivered in mass production to renowned European luxury brands [3].

AI Vision
- The ArcMuse 2025 V1.1 model has been upgraded, enhancing capabilities across various business sectors [4].
- The PSAI product has introduced new features tailored to the apparel industry, significantly expanding its presence on major e-commerce platforms [4].

Financial Projections
- The company is forecast to reach revenue of 1.011 billion yuan in 2025, with net profit of 234 million yuan, a growth rate of 32.2% [6][12].
- The projected PE ratios for 2025, 2026, and 2027 are 91X, 68X, and 48X respectively, indicating a favorable valuation trend [4][6].
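The year-on-year figures above can be inverted to recover the comparison-period values they imply, which is a quick way to sanity-check a summary. A small sketch; the helper name is hypothetical, and the outputs are derived estimates limited by the rounding of the inputs, not figures from the report:

```python
def implied_base(current, growth):
    """Prior-period value implied by a current value and its year-on-year growth rate."""
    return current / (1 + growth)


# (value in million yuan, year-on-year growth) for the first half of 2025, from the summary above.
h1_2025 = {
    "revenue": (410, 0.0773),
    "net_profit": (89, 0.4406),
    "mobile_terminal_revenue": (339, 0.0223),
    "smart_automotive_revenue": (65, 0.4909),
}

h1_2024 = {name: implied_base(value, growth) for name, (value, growth) in h1_2025.items()}
for name, value in h1_2024.items():
    print(f"implied H1 2024 {name}: {value:.1f} million yuan")

# Full-year forecast: 234 million yuan net profit at 32.2% growth implies the 2024 base.
implied_2024_net_profit = implied_base(234, 0.322)
print(f"implied 2024 net profit: {implied_2024_net_profit:.0f} million yuan")
```

The implied 2024 net profit of roughly 177 million yuan agrees with the 2024 figure given in the SINOLINK SECURITIES summary of the same company elsewhere on this page.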
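36Kr Guangdong Exclusive | China's First Company Focused on Computational Imaging R&D: "Xihu Intelligent" Completes a Pre-A Round of Over 50 Million Yuan
36Kr · 2025-07-14 10:15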
Company Overview
- Xihu Intelligent Vision Technology (Hangzhou) Co., Ltd. is a leading domestic enterprise in computational imaging, focusing on innovation in hyperspectral imaging equipment [1].
- The company was established in 2022 and provides integrated imaging system solutions that combine sensing, storage, and computation [1][3].
- Xihu Intelligent has completed a Pre-A financing round of over 50 million yuan, led by Dongfang Fuhai, to strengthen its technology and product matrix, increase production capacity, and accelerate global expansion [1].

Technology and Product Development
- The company aims to develop the next generation of low-cost, low-power, low-bandwidth, and high-throughput intelligent visual imaging systems, focusing on chip integration [1][4].
- Xihu Intelligent's core technology matrix includes:
  1. Video-level hyperspectral imaging, which acquires complete hyperspectral data in a single exposure [4].
  2. Ultra-high-speed dynamic capture at tens of thousands of frames per second [4].
  3. Micron-level 3D imaging for high-precision visual perception [4].

Market Potential and Applications
- The computational imaging market is projected to be worth at least 100 billion yuan, with the industry currently in a sprint phase [3].
- Applications of Xihu Intelligent's technology include smart agriculture, forest protection, medical early screening, industrial quality inspection, consumer electronics, and autonomous driving [3][5].
- The company is positioned to reshape many sectors by using AI and computational imaging to collect multi-dimensional information efficiently [5].

Future Plans
- Xihu Intelligent's three-year plan focuses on expanding its product line and maintaining a technological edge based on its three core technologies [5].
- The five-year plan includes building standardized production lines to enable scalable delivery and promoting the establishment of standards in the computational imaging field [5].
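New Breakthrough in Industrial Anomaly Detection: Multi-Modal Fusion Detection from Fudan and Others Accepted to CVPR 2025
QbitAI (量子位) · 2025-06-16 06:59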
Core Viewpoint
- The article discusses a significant breakthrough in industrial anomaly detection: the Real-IAD D³ dataset and a novel multi-modal fusion detection method called D³M, which improves detection performance by integrating several data types [1][11][12].

Group 1: Dataset Overview
- The Real-IAD D³ dataset was developed to address limitations in existing anomaly detection methods, providing a comprehensive resource that includes high-resolution RGB images, pseudo 3D photometric images, and micron-level precision 3D point cloud data [3][4].
- The dataset covers 20 industrial product categories and 69 defect types, totaling 8,450 samples: 5,000 normal and 3,450 abnormal [4].
- Real-IAD D³ significantly outperforms existing datasets such as MVTec 3D-AD and Real3D-AD in data scale, defect diversity, and point cloud precision, achieving a point cloud precision of 0.002 mm compared with 0.11 mm and 0.011-0.015 mm respectively [4].

Group 2: Methodology and Performance
- The D³M method builds on the Real-IAD D³ dataset by integrating RGB, point cloud, and pseudo 3D depth information, which improves anomaly detection performance [6][11].
- Experimental results indicate that D³M outperforms single-modal and dual-modal methods on both image-level and pixel-level anomaly detection metrics, underscoring the importance of multi-modal fusion in industrial anomaly detection [6][8].
- A comparative analysis of different modality combinations shows that D³M achieves the highest detection accuracy, validating the effectiveness of the multi-modal approach [8][9].

Group 3: Implications and Future Directions
- The research is expected to advance the field of industrial anomaly detection, providing more reliable solutions for quality control in manufacturing [12].
- This study is part of the Real-IAD series, with the first work also being recognized at CVPR 2024, indicating ongoing contributions to the field [13].
ArcSoft: Turbo Fusion Progressing Smoothly, Smart Driving Sustains High Growth (2025-04-15)
SINOLINK SECURITIES· 2025-04-15 01:35
Investment Rating
- The report maintains a "Buy" rating for the company, expecting a price increase of over 15% in the next 6-12 months [5][13].

Core Insights
- In 2024, the company achieved revenue of 815 million RMB, up 21.6% year-on-year, and net profit attributable to shareholders of 177 million RMB, up 99.7% year-on-year [2].
- For Q1 2025, the company reported revenue of 209 million RMB, up 13.8% year-on-year, and net profit of 50 million RMB, up 45.4% year-on-year [2].
- The mobile intelligent terminal visual solutions segment generated revenue of 675 million RMB in 2024, up 16.2% year-on-year, while the smart driving and other IoT intelligent devices visual solutions segment generated 127 million RMB, up 71.2% year-on-year [3].
- The company has brought its Turbo Fusion technology to a range of smartphone models, including mid-range and low-end devices, and has launched the first domestic AI glasses [3][4].

Financial Performance
- Revenue projections for 2025-2027 are 992 million RMB, 1.218 billion RMB, and 1.5 billion RMB, with year-on-year growth rates of 21.74%, 22.77%, and 23.15% respectively [5][10].
- Net profit forecasts for the same period are 220 million RMB, 270 million RMB, and 339 million RMB, with growth rates of 24.54%, 22.81%, and 25.62% respectively [5][10].
- The company's P/E ratios are projected to be 79.24, 64.52, and 51.36 for 2025, 2026, and 2027 respectively [5][10].
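The growth rates and P/E ratios above are tied together: with the share price held fixed, the ratio of two consecutive forward P/E values equals the earnings growth between those years. A short sketch that recomputes both from the figures quoted above; small gaps against the quoted rates are expected because the inputs are rounded, and the helper name is illustrative:

```python
def year_on_year(series):
    """Growth rate of each year over the previous one, keyed by the later year."""
    years = sorted(series)
    return {later: series[later] / series[earlier] - 1
            for earlier, later in zip(years, years[1:])}


# Million RMB; 2024 is the reported figure, 2025-2027 are the report's forecasts.
revenue = {2024: 815, 2025: 992, 2026: 1218, 2027: 1500}
revenue_growth = year_on_year(revenue)

# Forward P/E at a fixed price: PE_t = price / EPS_t, so PE_t / PE_(t+1) = EPS_(t+1) / EPS_t.
pe = {2025: 79.24, 2026: 64.52, 2027: 51.36}
pe_years = sorted(pe)
implied_profit_growth = {later: pe[earlier] / pe[later] - 1
                         for earlier, later in zip(pe_years, pe_years[1:])}

for year, growth in revenue_growth.items():
    print(f"{year} revenue growth: {growth:.2%}")
for year, growth in implied_profit_growth.items():
    print(f"{year} profit growth implied by P/E: {growth:.2%}")
```

The profit growth implied by the P/E ratios for 2026 and 2027 lands on the quoted 22.81% and 25.62%, which suggests the ratios were computed from the same earnings forecasts at a single reference price.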