Deep Learning

Major livestream! Tsinghua and Bosch open-source a pure VLA with SOTA performance: Impromptu-VLA says goodbye to dual systems
自动驾驶之心· 2025-07-01 12:58
Core Viewpoint - The article discusses advances and open challenges in autonomous driving, particularly in unstructured environments, and introduces Impromptu VLA, a framework developed by Tsinghua AIR and Bosch Research Institute to close the data gap in these scenarios [1].
Group 1: Advancements in Autonomous Driving
- Current autonomous driving systems have made significant progress in structured environments such as cities and highways, but still struggle in unstructured scenarios such as rural roads and construction zones [1].
- Existing large-scale autonomous driving datasets focus mainly on conventional traffic conditions, so specialized, large-scale, finely annotated data for complex unstructured environments is lacking [1].
Group 2: Impromptu VLA Framework
- Impromptu VLA aims to provide an open-weight, open-data driving vision-language-action (VLA) model: a fully end-to-end system that extracts multimodal features directly from driving video segments [1].
- It generates driving commands in natural-language format, without manually designed perception modules or intermediate representations [1].
- In the NeuroNCAP closed-loop safety evaluation, Impromptu VLA shows strong decision robustness and generalization, significantly outperforming BridgeAD, a recent system presented at CVPR 2025 (2.15 vs. 1.60) [1].
Your CamScanner makes a dash for a Hong Kong IPO at a valuation of 21.7 billion
量子位· 2025-06-27 10:57
Core Viewpoint - Shanghai Hehe Information Technology aims to become the "first stock of intelligent text recognition" in Hong Kong, following its earlier listing on the A-share Sci-Tech Innovation Board. The company has shown significant growth in revenue and user engagement and positions itself as a leading AI company focused on text intelligence technology [2][3][4].
Financial Performance
- In 2024, the company reported revenue of 1.438 billion RMB, net profit of 400 million RMB, and a gross margin of 84.3% [4][25].
- Revenue grew at a CAGR of approximately 21% from 2022 to 2024: 989 million RMB, 1.187 billion RMB, and 1.438 billion RMB respectively [25].
- The consumer (C-end) business contributed most of total revenue: 82.2%, 84.3%, and 83.8% in 2022, 2023, and 2024 [27].
User Engagement
- Monthly active users (MAU) of C-end products reached 171 million in 2024, with a paid user ratio of 4.3% [21].
- Among efficiency AI companies with MAU above 100 million, the company ranks first in China and fifth globally [21][22].
Product Portfolio
- Products target both C-end and enterprise (B-end) markets: "Scan All-in-One" (CamScanner) and "Business Card All-in-One" (CamCard) for C-end, and "TextIn" and "Qixin Huayan" for B-end [8][12].
- The core technology is multi-modal text intelligence, which improves efficiency across a range of applications [14][15].
Market Position
- The company is positioned as a leading AI firm focused on text recognition and processing, competing with major players such as OpenAI, Google, Adobe, and Microsoft [5][6][21].
- The global AI product market is estimated at 46.5 billion USD in 2024 and projected to reach 228 billion USD by 2029 [66].
Research and Development
- R&D spending has been rising: 280 million RMB, 323 million RMB, and 390 million RMB in 2022, 2023, and 2024, or about 27% of total revenue [33].
- The workforce consists of 1,053 employees, 60.6% of them in R&D roles [35].
Future Plans
- Funds raised from the Hong Kong listing will be used primarily for R&D, international expansion, and exploring investment and acquisition opportunities [50].
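The growth and R&D-intensity figures above can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name `cagr` and the dictionary layout are assumptions made for this example, and the inputs are simply the rounded figures quoted in the summary.

```python
def cagr(first: float, last: float, periods: int) -> float:
    """Compound annual growth rate across `periods` year-over-year steps."""
    if first <= 0 or periods < 1:
        raise ValueError("first must be positive and periods at least 1")
    return (last / first) ** (1.0 / periods) - 1.0


# Figures quoted in the summary, in millions of RMB.
revenue = {2022: 989.0, 2023: 1187.0, 2024: 1438.0}
rd_spend = {2022: 280.0, 2023: 323.0, 2024: 390.0}

# 2022 -> 2024 spans two compounding periods.
revenue_cagr = cagr(revenue[2022], revenue[2024], periods=2)

# R&D spending as a fraction of the same year's revenue.
rd_share = {year: rd_spend[year] / revenue[year] for year in revenue}

print(f"Revenue CAGR 2022-2024: {revenue_cagr:.1%}")
for year, share in rd_share.items():
    print(f"R&D share of revenue in {year}: {share:.1%}")
```

With these inputs the growth rate comes out a little above 20%, which is consistent with the rounded 21% in the summary, and the R&D share falls between 27% and 29% in each year.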
Cell sub-journal: Sheng Bin / Dai Rongping team develops DeepSLE, a new AI model that detects systemic lupus erythematosus from retinal images
生物世界· 2025-06-27 03:38
Core Viewpoint - The article discusses DeepSLE, a deep learning system for detecting systemic lupus erythematosus (SLE) from retinal images, and its potential to improve early diagnosis and management of the disease and its complications [4][5][12].
Group 1: Disease Overview
- SLE is a severe autoimmune disease affecting approximately 3.4 million people globally, an estimated 3 million of them women [2].
- Women are several times more likely than men to develop SLE, with incidence typically peaking between the ages of 15 and 45 [2].
Group 2: Screening Challenges
- Early detection of SLE is difficult because there is no widely accepted, standardized, non-invasive, and cost-effective screening tool, especially for asymptomatic or mildly symptomatic individuals [3].
- Screening for SLE-related complications such as lupus retinopathy (LR) and lupus nephritis (LN) is not routinely carried out in primary care, particularly in resource-limited settings [7].
Group 3: DeepSLE Development
- DeepSLE was pre-trained on 666,383 retinal images from 173,346 participants, then trained and validated on over 254,246 images from 91,598 participants of diverse ethnic backgrounds [9].
- In a multi-ethnic validation dataset, the system detected SLE with an area under the receiver operating characteristic curve (AUC) ranging from 0.822 to 0.969 [11].
Group 4: Clinical Implications
- DeepSLE offers a digital approach to detecting SLE and its related complications from retinal images, with significant potential for clinical application [12].
- In a prospective reader study, the system showed higher sensitivity than primary care physicians [11].
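For readers unfamiliar with the two metrics quoted above, the following sketch shows how AUC and sensitivity are computed from model scores. It says nothing about DeepSLE's actual evaluation code: the function names, the 0.5 threshold, and the toy labels and scores are all hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outscores
    a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity(labels, scores, threshold):
    """True-positive rate: share of actual positives flagged at `threshold`."""
    pos_scores = [s for y, s in zip(labels, scores) if y == 1]
    if not pos_scores:
        raise ValueError("need at least one positive case")
    return sum(s >= threshold for s in pos_scores) / len(pos_scores)


# Hypothetical toy data: 1 = disease present, 0 = disease absent.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
sens = sensitivity(labels, scores, threshold=0.5)
print(f"AUC = {auc:.3f}, sensitivity at 0.5 = {sens:.3f}")
```

AUC is threshold-free, while sensitivity depends on the operating threshold, which is why a reader study comparing a model with physicians reports sensitivity at a chosen operating point rather than AUC alone.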
ICCV 2025 results are out! Acceptance rate 24%: did you get your ticket to Hawaii?
机器之心· 2025-06-26 06:10
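Report by 机器之心. Editor: +0
ICCV 2025 will be held in Hawaii, USA, from October 19 to 25. ICCV has just sent authors the notifications of this year's paper decisions.
According to the figures, the conference received 11,239 valid submissions, all of which entered the review process. The program committee recommended 2,699 papers for acceptance, for a final acceptance rate of 24%.
Compared with previous editions, the 2025 submission count is nearly three times that of 2019, reflecting the rapid expansion of computer vision and increasingly active academic research.
Despite the sharp rise in submissions, ICCV's acceptance rate has stayed relatively stable over the past few years, mostly within the 25%-26% range.
Following CVPR 2025, ICCV 2025 also introduced a new policy intended to strengthen accountability and integrity. The program chairs identified 25 highly irresponsible reviewers and desk-rejected the 29 papers associated with them. Of these rejected papers, 12 would otherwise have been accepted, which has stirred controversy.
ICCV 2023: 8,260 submissions, 2,160 accepted, an acceptance rate of about 26.15%.
ICCV 2021: 6,152 submissions, 1,612 accepted, an acceptance rate of 26.20%.
ICCV 2019: submissions 43 ...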
Kaiyuan Morning Meeting - 20250625
KAIYUAN SECURITIES· 2025-06-25 14:44
Core Insights - The report highlights the significant growth of the semiconductor third-party testing industry, with a projected domestic market space reaching 180-200 billion yuan by 2027, driven by rapid technological iterations and increased R&D investments in the semiconductor sector [15][16].
Company Overview
- The company, Victory Nano (688757.SH), is recognized as a leading semiconductor third-party testing service provider in China, often referred to as the "chip general hospital." It has grown rapidly, with a CAGR of 35% in revenue and 43% in net profit from 2021 to 2024 [4][14].
- In 2023, the company achieved a market share of 7.86% in the failure analysis and material analysis sectors, solidifying its position as a top player in the industry [14][16].
- The company's testing capabilities extend to 3nm process technology, with nearly 80% of its advanced process revenue coming from the first half of 2024. Future investment projects are expected to further enhance revenue from advanced processes [14][16].
Industry Analysis
- The semiconductor third-party testing industry has a "small, scattered, and weak" competitive landscape, but leading companies are expected to benefit significantly from industry demand growth and the deepening of the Labless model [15][16].
- Key drivers of industry demand include the rapid iteration of semiconductor technology, which increases R&D spending, and the rising requirements for fault tolerance due to advanced process iterations [15][16].
- Leading companies in the sector are well positioned to capitalize on the growth of the semiconductor industry and can achieve counter-cyclical growth by relying on resilient R&D demand during market fluctuations [15][16].
Technology Trends
- The report discusses the emergence of AI glasses as the next generation of personal smart devices, with major companies such as Meta and Xiaomi leading the innovation [5][18].
- Key trends in AI glasses include electrochromic technology, SIP packaging, AR/VR displays, and bone conduction technology, which are expected to enhance user experience and functionality [19][21][22].
The Evolution of NVIDIA Tensor Cores: From Volta to Blackwell
半导体行业观察· 2025-06-24 01:24
Core Insights
- The article emphasizes the rapid evolution of GPU computing capabilities in artificial intelligence and deep learning, driven by Tensor Core technology, which significantly outpaces Moore's Law [1][3].
- It highlights the importance of understanding the architecture and programming models of Nvidia's GPUs to grasp the advancements in Tensor Core technology [3].
Group 1: Performance Principles
- Amdahl's Law defines the maximum speedup achievable through parallelization, emphasizing that performance gains are limited by the serial portion of a task [5].
- Strong and weak scaling are discussed: strong scaling refers to improving performance on a fixed problem size, while weak scaling addresses solving larger problems in constant time [6][8].
Group 2: Data Movement and Efficiency
- Data movement is identified as a significant performance bottleneck, with the cost of moving data being much higher than computation, leading to the concept of the "memory wall" [10].
- Efficient data handling is crucial for maximizing GPU performance, particularly in the context of Tensor Core operations [10].
Group 3: Tensor Core Architecture Evolution
- The article outlines the evolution of Nvidia's Tensor Core architecture, including Tesla V100, A100, H100, and Blackwell GPUs, detailing the enhancements in each generation [11].
- The introduction of specialized instructions such as HMMA for half-precision matrix multiplication is highlighted as a key development in Tensor Core technology [18][19].
Group 4: Tensor Core Generations
- The first generation of Tensor Core in the Volta architecture supports FP16 input and FP32 accumulation, optimizing for mixed-precision training [22][27].
- The Turing architecture introduced the second generation of Tensor Core with support for INT8 and INT4 precision, enhancing capabilities for deep learning applications [27].
- The Ampere architecture further improved performance with asynchronous data copying and introduced new MMA instructions that reduce register pressure [29][30].
- The Hopper architecture introduced Warpgroup-level MMA, allowing for more flexible and efficient operations [39].
Group 5: Memory and Data Management
- The introduction of Tensor Memory (TMEM) in the Blackwell architecture aims to alleviate register pressure and improve data access efficiency [43].
- The article discusses the importance of structured sparsity in enhancing Tensor Core throughput, particularly in the context of the Ampere and Hopper architectures [54][57].
Group 6: Performance Metrics
- The article provides comparative metrics for Tensor Core performance across different architectures, showing significant improvements in FLOP/cycle and memory bandwidth [59].
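Amdahl's Law, mentioned under Performance Principles, is easy to make concrete. The sketch below uses the standard textbook form, speedup = 1 / ((1 - p) + p / n), where p is the parallelizable fraction and n the number of workers; the function name and the example value p = 0.95 are assumptions chosen for illustration and do not come from the article.

```python
def amdahl_speedup(parallel_fraction: float, n_workers: int) -> float:
    """Speedup predicted by Amdahl's Law for a fixed problem size."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_workers)


# Hypothetical workload in which 95% of the runtime parallelizes.
p = 0.95
for n in (1, 8, 64, 1024):
    print(f"{n:>5} workers -> speedup {amdahl_speedup(p, n):.2f}x")

# The serial 5% caps the speedup at 1 / (1 - p), however many workers are added.
upper_bound = 1.0 / (1.0 - p)
print(f"asymptotic limit: {upper_bound:.1f}x")
```

This is the strong-scaling view: the problem size is fixed and only the worker count changes, so the serial fraction eventually dominates the runtime.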
More than a hiking gadget: a "power-up" that augments the limbs
红杉汇· 2025-06-22 05:03
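The real technical breakthrough did not come until 1967, when the "Hardiman" exoskeleton robot prototype developed by General Electric in the United States made its debut. The prototype used a semi-bionic configuration, was hydraulically driven, included a force-feedback system, and had more than 30 powered joints; it could help an ordinary person easily lift objects weighing over one hundred kilograms. However, Hardiman's own weight of 680 kg, its sluggish movements, and its enormous energy consumption severely limited the project's practical deployment. Even so, its arrival pointed the way for later exploration of exoskeleton robots.
On the steep stone steps of the Eighteen Bends of Mount Tai, a white-haired climber easily overtakes a line of young tourists. His waist and legs are wrapped in streamlined metal braces, and his steps are steady and brisk. This is not a scene from a science-fiction film but a common sight in the Mount Tai scenic area. Exoskeleton robots, rented at 80 yuan for 3 hours, are bringing the once-unattainable "mechanical armor" into ordinary people's lives.
An exoskeleton robot is an intelligent assistive device whose mechanical structure is tightly coupled to the body's joints in order to enhance or replace the movement ability of the upper and lower limbs. It works like a "physical power-up" installed on the body, giving people extraordinary capacity to meet all kinds of physical challenges.
Just as Tony Stark's powered armor makes him Iron Man in the film Iron Man, and the powered armor in The Wandering Earth gives humans strong support for surviving and working in extreme environments, in reality exoskeleton robots are used not only in outdoor sports but also in industry, medicine, ...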
[广发金工] ETF Rotation Strategy Based on AGRU Factor Aggregation
广发金融工程研究· 2025-06-19 05:03
Core Viewpoint - The rapid development of ETFs in the A-share market has led to a significant increase in their scale and number, surpassing actively managed funds, indicating a growing preference for passive investment strategies among investors [4][5].
Group 1: ETF Growth and Market Dynamics
- As of June 15, 2025, the total scale of stock ETFs (including off-market linked funds) reached 3.81 trillion yuan, with the number of ETFs totaling 2,031, exceeding the scale of actively managed funds at 2.84 trillion yuan [4][5].
- The A-share market exhibits significant industry and style differentiation, suggesting that merely holding a single ETF for the long term may not yield an optimal investment experience [4][6].
- The investment objective of an ETF is to closely track the net value performance of a specific index, making the choice of index crucial for investors seeking substantial returns [6][10].
Group 2: ETF Rotation Strategy Development
- A common method for constructing ETF rotation strategies is to aggregate effective stock factors at the index level, which produces an index rotation signal [2][11].
- The AGRU model, based on daily K-line volume and price data, has identified high-performing stock selection factors in the A-share market [12][16].
- Monthly rebalancing of the strategy yielded an average IC of 7.80%, with an annualized excess return of 4.92% and a maximum drawdown of -14.02% [31][39].
Group 3: Performance of Fixed Number ETF Rotation Strategies
- Limiting the number of held ETFs to 5, 10, or 15 resulted in annualized excess returns of 12.34%, 8.75%, and 8.13%, with corresponding maximum drawdowns of -12.17%, -8.83%, and -8.66% [59][65].
- The strategy consistently achieved positive excess returns annually, with a notable 8.74% excess return year-to-date [63][65].
Group 4: Factor Testing and Adjustments
- The factor's performance was enhanced by adjusting the loss function, leading to improved multi-directional return performance [17][19].
- The AGRU factor demonstrated strong stock selection effects across various stock pools, with annualized excess returns of 21.97% for the CSI 300 pool and 11.46% for the CSI 500 pool [64][65].
Group 5: MMR Algorithm and Risk Diversification
- The MMR (Maximum Marginal Relevance) algorithm was employed to reduce the correlation among selected investment targets, enhancing the stability of the strategy's performance [45][50].
- The strategy's annualized excess return improved from 7.94% to 8.43% after implementing the MMR adjustments, with a corresponding increase in the information ratio [50][52].
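The summary does not spell out how MMR is applied to ETF selection, so the following is only a sketch of the generic greedy MMR rule: at each step, pick the candidate that maximizes lam * score minus (1 - lam) times its highest similarity to anything already chosen. The function name, the weight `lam`, the factor scores, and the correlation matrix are all hypothetical; the report's actual formulation may differ.

```python
def mmr_select(scores, similarity, k, lam=0.7):
    """Greedy Maximum Marginal Relevance selection.

    scores[i]        -- relevance of candidate i (here, an aggregated factor score)
    similarity[i][j] -- similarity between candidates i and j (here, a correlation)
    lam              -- 1.0 ranks by score only; lower values penalize redundancy
    """
    selected = []
    candidates = set(range(len(scores)))
    while candidates and len(selected) < k:

        def mmr_value(i):
            # Redundancy is the highest similarity to any already-selected item.
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return lam * scores[i] - (1.0 - lam) * redundancy

        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(candidates), key=mmr_value)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical example: ETFs 0 and 1 score highest but are almost collinear.
scores = [0.90, 0.85, 0.60, 0.50]
similarity = [
    [1.00, 0.95, 0.10, 0.20],
    [0.95, 1.00, 0.15, 0.10],
    [0.10, 0.15, 1.00, 0.30],
    [0.20, 0.10, 0.30, 1.00],
]

by_score_only = mmr_select(scores, similarity, k=2, lam=1.0)
diversified = mmr_select(scores, similarity, k=2, lam=0.5)
print(by_score_only, diversified)
```

In this toy case, ranking by score alone would hold two near-duplicate ETFs, while the diversified pick replaces the second one with a lower-scoring but weakly correlated candidate. That trade-off is the kind of correlation reduction the summary attributes to MMR.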
[广发金工] Reinforcement Learning and Price Timing
广发金融工程研究· 2025-06-18 01:33
Core Viewpoint - The article discusses the potential of Reinforcement Learning (RL) in quantitative investment, particularly in developing timing strategies that can maximize cumulative returns through trial and error learning mechanisms [1][2].
Summary by Sections
1. Introduction to Reinforcement Learning - Reinforcement Learning (RL) is a machine learning method that enables decision-making systems to learn optimal actions in specific situations to maximize cumulative rewards. This method is particularly suitable for environments with clear goals but no direct guidance on achieving them [6][12].
2. Timing Strategy - The article focuses on the Double Deep Q-Network (DDQN) model, which uses 10-minute frequency price and volume data as input. The goal is for the model to learn to provide buy/sell/hold signals at various time points to maximize end-period returns. The backtesting phase outputs timing signals every 10 minutes, adhering to a t+1 trading rule [2][3].
3. Empirical Analysis - The strategy was tested on various liquid ETFs and stocks from January 1, 2023, to May 31, 2025. The results showed that the strategy generated 72, 30, 73, and 188 timing signals for different assets, with average win rates of 52.8%, 53.3%, 54.8%, and 51.6%, respectively. Cumulative returns outperformed benchmark assets by 10.9%, 35.5%, 64.9%, and 37.8% [3][74][80].
4. Summary and Outlook - Despite the impressive performance of RL in various fields, challenges such as stability issues remain in the quantitative investment domain. Future reports will explore more RL algorithms to develop superior strategies [5].
5. Data Description - The timing strategy was applied to the CSI 300 Index, CSI 500 Index, CSI 1000 Index, and a specific stock, utilizing liquid ETFs corresponding to these indices. The training data spanned from January 1, 2014, to December 31, 2019, with validation and testing periods defined [74][75].
6. Performance Metrics - The performance metrics for the RL timing strategy included total returns, annualized returns, maximum drawdown, annualized volatility, Sharpe ratio, information ratio, and return-to-drawdown ratio, demonstrating the strategy's effectiveness compared to benchmark assets [77][80].
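The distinguishing step of a Double DQN is how the learning target is built: the online network chooses the next action and the target network evaluates it, which reduces the overestimation that comes from taking a plain maximum. The sketch below shows only that target computation in the standard textbook form; the action encoding (0 = sell, 1 = hold, 2 = buy), the discount factor, and the Q-values are hypothetical and are not taken from the report.

```python
def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN target: r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    if done:
        # A terminal transition has no future value to bootstrap from.
        return reward
    # Action selection uses the online network...
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # ...while action evaluation uses the target network.
    return reward + gamma * q_target_next[best_action]


def dqn_target(reward, done, q_target_next, gamma=0.99):
    """Plain DQN target for comparison: r + gamma * max_a Q_target(s', a)."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


# Hypothetical Q-values for the next state; index 0 = sell, 1 = hold, 2 = buy.
q_online_next = [0.2, 0.5, 0.9]
q_target_next = [0.3, 0.6, 0.4]

ddqn = double_dqn_target(0.01, False, q_online_next, q_target_next, gamma=0.9)
dqn = dqn_target(0.01, False, q_target_next, gamma=0.9)
print(f"Double DQN target: {ddqn:.3f}, plain DQN target: {dqn:.3f}")
```

In this example the online network prefers "buy", but the target network values that action at only 0.4, so the Double DQN target is lower than the plain DQN target, which would have bootstrapped from the largest target value regardless of which action the online network would take.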
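Countdown to the preliminary-round registration deadline! A 750,000 prize pool plus attractive job offers: Qiyuan Lab's major competition awaits you!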
机器之心· 2025-06-16 05:16
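Editor: Wu Xin
Registration closes on June 25, 2025; interested teams should sign up as soon as possible.
A hundred boats race: the "Qizhi Cup" preliminary round is in full swing
As AI technology keeps breaking through, the wave of intelligent technology is profoundly changing every industry, and China has entered a period of accelerated AI adoption. To move intelligent algorithms from theoretical innovation to practical deployment, Qiyuan Lab officially launched the "Qizhi Cup" algorithm competition on May 20.
This year's competition is built around three tasks: "robust instance segmentation of satellite remote-sensing images", "UAV ground-target detection for embedded platforms", and "adversarial challenges for multimodal large models". It focuses on three key technologies (robust perception, lightweight deployment, and adversarial defense) and aims to align technical innovation with real-world scenarios and to speed up the transfer and large-scale application of algorithmic capabilities.
Once announced, the competition quickly fired up the technical community nationwide; more than 500 teams from universities, research institutes, and technology companies have registered so far. They include teams from top universities such as Tsinghua, Peking University, Fudan, Shanghai Jiao Tong, Nanjing University, Wuhan University, Huazhong University of Science and Technology, USTC, Harbin Institute of Technology, National University of Defense Technology, Xi'an Jiaotong University, and UESTC, as well as research teams from the Institute of Automation and the Aerospace Information Research Institute of the Chinese Academy of Sciences, bringing strong research capability to the competition.
The competition is now at a key point in the preliminary round. Contestants on all three tracks are doing intensive modeling and tuning on the core tasks, racing to solve technical difficulties and iterating on their model designs; competition on some tasks has become white-hot.
The three ...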