Medical Image Segmentation with Ultra-Low Annotation Demand: UCSD Proposes the Three-Stage Framework GenSeg
36Kr · 2025-08-12 03:24
Core Insights
- GenSeg uses AI to generate high-quality medical images together with their segmentation labels, greatly reducing the manual labeling burden on medical professionals [1][20]
- The framework targets the core challenge of medical image semantic segmentation: its dependence on large amounts of high-quality annotated data [1][20]

Summary by Sections

Technology Overview
- GenSeg is a three-stage framework that tightly couples optimization of the data-augmentation model with training of the segmentation model, ensuring that generated samples actually improve segmentation performance [2][10]
- It can be applied to various segmentation models, such as UNet and DeepLab, improving their performance in both in-domain and out-of-domain settings [4][20]

Methodology
- The framework has two main components: a semantic segmentation model that predicts segmentation masks, and a mask-to-image generation model that predicts the corresponding images [9]
- Training proceeds in three phases: (1) train the generation model on real image-mask pairs, (2) augment real segmentation masks and synthesize matching images to create synthetic image-mask pairs, and (3) evaluate the segmentation model on a validation set and use the result to update the generation model [9][10]

Experimental Results
- GenSeg is highly sample-efficient, matching or exceeding baseline segmentation performance while drastically reducing the number of training samples required [11][20]
- In-domain: GenSeg-UNet reaches a Dice score of roughly 0.6 with only 50 images, versus 600 images for standard UNet, a 12-fold reduction in data [13]
- Out-of-domain: GenSeg-DeepLab reaches a Jaccard index of 0.67 with only 40 images, a level standard DeepLab fails to reach even with 200 images [13]

Comparative Analysis
- GenSeg's end-to-end data-generation mechanism outperforms the traditional strategy of training the generator and segmenter separately, as shown by improved metrics across segmentation tasks [15]
- Regardless of the type of generation model used, end-to-end training consistently outperforms separate training [17]

Generalization and Efficiency
- GenSeg generalizes well across 11 medical image segmentation tasks and 19 datasets, achieving absolute performance improvements of 10-20% while requiring only 1/8 to 1/20 of the training data of existing methods [20]
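The three-phase training loop described above can be sketched end to end. The sketch below is illustrative only: a toy model with a scalar "skill" score stands in for the real generator and segmenter, and the feedback step is a crude stand-in for the paper's gradient-based update, not GenSeg's actual algorithm.

```python
import random

random.seed(0)

class ToyModel:
    """Stand-in for a neural net; a scalar score fakes model quality."""
    def __init__(self):
        self.score = 0.0
    def fit(self, pairs):
        self.score += 0.01 * len(pairs)      # pretend more data -> better model
    def predict(self, mask):
        return [v * 0.5 for v in mask]       # fake image synthesized from a mask
    def evaluate(self, val_set):
        return max(0.0, 1.0 - self.score)    # fake validation loss

def augment(mask):
    """Toy mask augmentation: random flip."""
    return mask[::-1] if random.random() < 0.5 else mask

def train_genseg(gen_model, seg_model, real_pairs, val_set, n_rounds=3):
    for _ in range(n_rounds):
        # Phase 1: train the mask-to-image generator on real (image, mask) pairs
        gen_model.fit(real_pairs)
        # Phase 2: augment real masks and synthesize matching images
        synth_pairs = []
        for _, mask in real_pairs:
            m = augment(mask)
            synth_pairs.append((gen_model.predict(m), m))
        # Phase 3: train the segmenter on real + synthetic pairs, then feed its
        # validation result back into the generator (the end-to-end coupling)
        seg_model.fit(real_pairs + synth_pairs)
        val_loss = seg_model.evaluate(val_set)
        gen_model.score -= 0.001 * val_loss  # crude stand-in for a gradient update
    return seg_model

real = [([1.0, 2.0], [0, 1]), ([3.0, 4.0], [1, 0])]
seg = train_genseg(ToyModel(), ToyModel(), real, val_set=None)
```

The point the sketch preserves is the coupling: the segmenter's validation result flows back into the generator, so augmentation is optimized for segmentation performance rather than trained in isolation.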
Li Auto's VLA Is in Essence Reinforcement-Learning-Dominated Continuous Prediction of the Next Action Token
理想TOP2· 2025-08-11 09:35
Core Viewpoints
- The article presents four logical chains for understanding "predict the next token," reflecting different views of the potential and essence of LLMs and AI [1]
- Those who believe next-token prediction is more than fitting probability distributions are more likely to recognize the large potential of LLMs and AI [1]
- Insufficiently deep thinking about AI can lead to underestimating the value of what Li Auto has accomplished [1]
- Li Auto's VLA is essentially reinforcement-learning-dominated continuous prediction of the next action token, analogous to OpenAI's o1/o3, and assisted driving is better suited to reinforcement learning than chatbots are [1]

Summary by Sections

Introduction
- The article emphasizes the weight of Ilya Sutskever's viewpoints, highlighting his significant contributions to AI over the past decade [2][3]
- Ilya's background includes pivotal roles in major AI advances such as AlexNet, AlphaGo, and TensorFlow [3]

Q&A Insights
- Ilya challenges the notion that next-token prediction cannot surpass human performance, suggesting that a sufficiently advanced neural network could extrapolate the behavior of an idealized person [4][5]
- He argues that predicting the next token well requires understanding the underlying reality that produces that token, which goes beyond mere statistics [6][7]

Li Auto's VLA and Reinforcement Learning
- Li Auto's VLA continuously predicts the next action token from sensor input, which the article takes as evidence of genuine understanding of the physical world rather than mere statistical probabilities [10]
- Extending Ilya's view, the article posits that the reasoning process in Li Auto's VLA can be seen as a form of consciousness, differing from human consciousness in significant ways [11]

Comparisons and Controversial Points
- The article asserts that assisted driving is better suited to reinforcement learning than chatbots are, because its reward functions are clearer [12][13]
- It highlights the fundamental differences between the skills required for AI software versus hardware development, emphasizing the distinct challenges and innovations in AI software [13]
High-Frequency Stock-Selection Factor Weekly: High-Frequency Factors Diverged Last Week While Deep Learning Factors Stayed Strong; AI-Enhanced Portfolios All Posted Positive Excess Returns - 20250810
Quantitative Factors and Models Summary

Quantitative Factors and Construction Process
- **Intraday Skewness Factor**
  - Construction idea: captures the skewness of intraday stock returns, reflecting asymmetry in the return distribution [13][16][18]
  - Construction process: the third moment of the intraday return distribution, normalized by the cube of the standard deviation; methodology per "Stock Selection Factor Series Research (19) - High-Frequency Factors on Stock Return Distribution Characteristics" [13][16][18]
- **Downside Volatility Proportion Factor**
  - Construction idea: measures the proportion of downside volatility in a stock's total realized volatility [18][19][20]
  - Construction process: decomposes realized volatility into upside and downside components; methodology per "Stock Selection Factor Series Research (25) - High-Frequency Factors on Realized Volatility Decomposition" [18][19][20]
- **Post-Open Buying Intention Proportion Factor**
  - Construction idea: quantifies the proportion of buying intention in the early trading period after the market open [22][23][24]
  - Construction process: uses high-frequency data to identify and aggregate buying signals in the post-open period; methodology per "Stock Selection Factor Series Research (64) - Low-Frequency Applications of High-Frequency Data Based on Intuitive Logic and Machine Learning" [22][23][24]
- **Post-Open Buying Intensity Factor**
  - Construction idea: measures the intensity of buying activity in the early post-open period [27][28][29]
  - Construction process: aggregates the magnitude of buying signals during the post-open period, normalized by trading volume [27][28][29]
- **Post-Open Large Order Net Buying Proportion Factor**
  - Construction idea: captures the proportion of large-order net buying in the early post-open period [32][34][35]
  - Construction process: sums large-order net buying during the post-open period and divides by total trading volume [32][34][35]
- **Post-Open Large Order Net Buying Intensity Factor**
  - Construction idea: measures the intensity of large-order net buying in the early post-open period [37][39][40]
  - Construction process: aggregates large-order net buying during the post-open period, normalized by the total number of large orders [37][39][40]
- **Improved Reversal Factor**
  - Construction idea: captures the reversal effect in stock returns, adjusted for high-frequency data characteristics [40][43][44]
  - Construction process: identifies stocks with extreme short-term returns and measures their subsequent reversal performance [40][43][44]
- **Deep Learning Factor (Improved GRU(50,2)+NN(10))**
  - Construction idea: a deep learning model combining a GRU and a neural network head to predict stock returns [63][65][66]
  - Construction process: 50 GRU units and 10 neural network layers, trained on historical high-frequency data to predict short-term returns [63][65][66]
- **Deep Learning Factor (Residual Attention LSTM(48,2)+NN(10))**
  - Construction idea: an LSTM with residual attention mechanisms to improve prediction accuracy [65][66][68]
  - Construction process: 48 LSTM units and 10 neural network layers, with residual connections to capture long-term dependencies in high-frequency data [65][66][68]
- **Multi-Granularity Model Factor (5-Day Label)**
  - Construction idea: predicts stock returns over a 5-day horizon with a multi-granularity deep learning model [68][69][70]
  - Construction process: trained with a bidirectional AGRU (Attention-Gated Recurrent Unit) to capture multi-scale temporal patterns in stock data [68][69][70]
- **Multi-Granularity Model Factor (10-Day Label)**
  - Construction idea: same as the 5-day label factor but over a 10-day horizon [69][70][71]
  - Construction process: the same AGRU architecture, trained with a 10-day prediction horizon [69][70][71]

Factor Backtesting Results

| Factor | IC (2025 / hist.) | e^(-RankMAE) (2025 / hist.) | Long-Short Return (2025 YTD / last week) | Long-Only Excess (2025 YTD / last week) |
| --- | --- | --- | --- | --- |
| Intraday Skewness | 0.024 / 0.019 | 0.327 / 0.324 | 16.90% / -0.66% | 1.84% / -0.79% [9][10][13] |
| Downside Volatility Proportion | 0.020 / 0.016 | 0.325 / 0.323 | 12.93% / -1.19% | -0.12% / -1.07% [9][10][18] |
| Post-Open Buying Intention Proportion | 0.026 / 0.026 | 0.322 / 0.321 | 13.98% / 0.27% | 7.20% / 0.28% [9][10][22] |
| Post-Open Buying Intensity | 0.029 / 0.030 | 0.327 / 0.326 | 18.53% / 0.05% | 7.09% / 0.43% [9][10][27] |
| Post-Open Large Order Net Buying Proportion | 0.027 / 0.036 | 0.319 / 0.322 | 18.25% / 0.31% | 9.48% / 0.43% [9][10][32] |
| Post-Open Large Order Net Buying Intensity | 0.019 / 0.025 | 0.318 / 0.321 | 10.50% / 0.31% | 7.08% / 0.24% [9][10][37] |
| Improved Reversal | 0.025 / 0.031 | 0.331 / 0.330 | 17.44% / 0.12% | 6.14% / 0.33% [9][10][40] |
| Improved GRU(50,2)+NN(10) | 0.045 / 0.066 | 0.335 / 0.336 | 28.86% / 1.36% | 2.19% / 0.06% [9][10][63] |

- **Deep Learning Factor (Residual
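The first two factor constructions above can be illustrated with a few lines of NumPy, assuming plain minute-bar returns as input; the report's exact variants (e.g. demeaning conventions or scaling) may differ.

```python
import numpy as np

def intraday_skewness(returns):
    """Third central moment of intraday returns divided by the cube of the
    standard deviation (the textbook skewness formula the report references)."""
    r = np.asarray(returns, dtype=float)
    d = r - r.mean()
    return float((d**3).mean() / (d**2).mean() ** 1.5)

def downside_volatility_proportion(returns):
    """Share of realized variance contributed by negative-return bars,
    i.e. the downside component of a realized-volatility decomposition."""
    r = np.asarray(returns, dtype=float)
    total = float((r**2).sum())
    return float((r[r < 0] ** 2).sum() / total) if total > 0 else 0.0

# e.g. one day of minute returns for a single stock
minute_returns = [0.001, -0.002, 0.0005, -0.001, 0.003]
skew = intraday_skewness(minute_returns)
down_share = downside_volatility_proportion(minute_returns)
```

A perfectly symmetric return series yields zero skewness, and a series with only negative returns yields a downside proportion of 1, which makes the two constructions easy to sanity-check.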
Yesterday's Gaokao Top Scorer, Today's Top AI Scientist: Kaiming He's "Cheat-Mode" Life
First in a video series on Chinese AI scientists. Zuckerberg has recently been aggressively poaching AI scientists, especially Chinese ones, routinely offering compensation packages of $100 million or even $200 million. One AI giant seems to have been overlooked. In March this year, Facebook's Chief AI Scientist Yann LeCun mentioned in an interview "a little-known fact": the most-cited paper in (AI) science is in deep learning, dates back ten years to 2015, and originated in Beijing. The paper's lead author is Kaiming He. Nature's latest Top 25 most-cited papers of the 21st century puts "Deep Residual Learning for Image Recognition" in first place, the ResNets paper whose authors include Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. So who is Kaiming He? Born in Guangzhou in 1984, he won first prize in the national physics competition while at Zhixin High School, earning guaranteed admission to Tsinghua University, yet he still sat the Gaokao to prove himself, scoring a full 900 on the standardized scale and becoming one of Guangdong's nine perfect-score top scorers that year. In 2007 he began graduate study at the Chinese University of Hong Kong under Xiaoou Tang. Everyone at CUHK who knew him describes him as extraordinarily hard-working, out the door after six in the morning and back to his dorm at midnight; when even a genius works that hard, "ordinary people don't stand a chance." After finishing his PhD in 2011, he joined ...
Frontline | RoboMaster 2025 Mecha Master Super Battle Concludes, Promoting Learning Through Competition Starting from Universities
36Kr · 2025-08-05 07:54
Core Insights
- The RoboMaster 2025 competition concluded with Shanghai Jiao Tong University's team winning the national championship, highlighting the event's role in fostering innovation and talent in robotics [1]
- The competition emphasizes practical applications of robotics technology, with advances in machine vision, embedded systems, and autonomous navigation [1][2]

Group 1: Technological Advancements
- The 2025 season focused on iterating robot technologies, particularly terrain adaptability and control algorithms, enhancing capabilities such as overcoming height differences and self-righting after falls [1]
- Teams used advanced technologies such as edge computing and neural networks for real-time target recognition and predictive analytics, improving combat performance and exploring applications in complex environments [2]
- Aerial robots targeted light weight and high payload capacity, with notable performance from the China University of Petroleum (East China) team, which developed a drone weighing only 12.4 kg with a 5 kg payload capacity [3]

Group 2: Educational and Employment Impact
- Participation in RoboMaster significantly boosts students' employment prospects, with near-100% employment rates reported among participants and many tech companies prioritizing candidates with RoboMaster experience [3]
- The competition encourages students to turn innovative ideas into commercial ventures, leading to startups built on technologies and experience gained during the competition [3]
- RoboMaster has contributed to educational reform, promoting the "learning through competition" model as a valuable complement to traditional education [3]
Behind the Record Number of CVPR 2025 Acceptances, an Acceptance Rate of Only 22.1%...
自动驾驶之心· 2025-08-04 03:23
Core Viewpoint
- The article highlights the challenges AI researchers face in the paper-submission process, where issues such as writing quality, methodological flaws, and misalignment with a venue's focus drive high rejection rates [1][2]

Group 1: Submission Challenges
- Pain point 1: 60% of desk rejections are due to misalignment with the journal's focus [3]
- Pain point 2: lack of innovation is a critical issue, with reviewers faulting submissions for not addressing relevant problems [3]
- Pain point 3: 65% of rejections stem from methodological flaws, with many experiments not reproducible [3]
- Pain point 4: 78% of papers are rejected for poor writing structure, with many authors failing to communicate their research effectively [3]
- Pain point 5: 23% of initial rejections are caused by formatting errors at submission [2]

Group 2: Support and Solutions
- The company offers personalized guidance from over 300 experienced mentors in autonomous driving and embodied intelligence, with a 96% success rate for students [4]
- Mentoring covers the full pipeline from topic selection to submission, preparing students for the publication process [11]
- The program aims to help students build a clear research framework, improve coding skills, and strengthen their overall research ability [9][12]
Autumn Recruitment Interview Notes! DJI Zhuoyu (大疆卓驭) Perception Algorithm Engineer Interview
自动驾驶之心· 2025-08-03 23:32
Core Viewpoint
- The article walks through the recruitment process and job responsibilities for a perception algorithm engineer in the autonomous driving industry, emphasizing skills in computer vision, deep learning, and sensor fusion [1][5][6]

Group 1: Job Responsibilities
- The role involves processing large volumes of autonomous driving data, building automated ground-truth labeling systems, and designing cutting-edge AI and vision technologies [6]
- Algorithms and code developed in the role are deployed in millions of mass-produced vehicles [6]
- Key tasks include detecting static scene elements, tracking dynamic targets, and developing calibration methods for various sensors [10]

Group 2: Job Qualifications
- A master's degree or higher in a relevant field such as computer science, automation, or mathematics [7]
- Proficiency in C++ or Python, plus solid knowledge of algorithms and data structures [7]
- Familiarity with multi-view geometry, computer vision, deep learning, and sensor technology applications [7]

Group 3: Preferred Qualifications
- Experience developing perception algorithms for autonomous driving systems or ADAS, such as lane detection and obstacle tracking [9]
- Experience with sensor fusion across cameras, LiDAR, and millimeter-wave radar [9]
- Publications at top conferences or in journals in computer vision, machine learning, or robotics [9]

Group 4: Community and Resources
- The article mentions a community platform for job seekers in autonomous driving and robotics, offering interview questions, industry reports, and salary negotiation tips [12][13]
- The community aims to help members prepare job applications and track industry trends [12][21]
RoboMaster 2025 Mecha Master Super Battle National Finals Conclude
Huanqiu Wang Zixun · 2025-08-03 13:44
Core Insights
- The RoboMaster 2025 National Finals concluded with Shanghai Jiao Tong University's team winning the championship, while teams from the University of Science and Technology of China, South China University of Technology, and Northeastern University took second, third, and fourth place respectively [1][3]

Group 1: Technological Advancements
- The competition showcased significant advances in robot technology, particularly bipedal robots capable of navigating complex terrain, including climbing stairs and self-righting after falls [3]
- Teams focused on optimizing structures and algorithms to improve robot load capacity, mobility, and terrain adaptability, capabilities that map directly to practical applications in security inspection and disaster rescue [3][5]

Group 2: Application of AI and Algorithms
- The champion team's robot used edge computing and neural networks for enemy armor recognition and motion prediction, strengthening its combat capability and exploring deep learning in dynamic environments [5]
- The runner-up team implemented LiDAR and advanced algorithms for autonomous navigation, obstacle avoidance, and multi-robot communication, addressing complex offensive and defensive strategies [5]

Group 3: Educational Impact
- The competition gave participants a comprehensive engineering experience spanning requirements analysis, design, prototyping, and iterative optimization, simulating real-world challenges in research and industrial development [5][7]
- This "full-process practical" model prepares students for future careers in intelligent security, disaster rescue, and the low-altitude economy, laying a solid foundation for practical application [7]
AI Godfather Hinton Can Sit Down Again
Huxiu · 2025-08-03 04:53
Group 1
- Geoffrey Hinton, the AI pioneer, recently sat down comfortably in Shanghai, a significant moment after nearly 18 years in which discomfort prevented him from sitting for extended periods [1][6][30]
- Hinton's journey in AI began in 1972, when he chose to pursue neural networks, a path largely dismissed by his peers at the time [12][20]
- His persistence in the field led to breakthroughs in deep learning, most notably at the 2012 ImageNet competition, where his team achieved a remarkable error rate of 15.3% [30][31][32]

Group 2
- Hinton's contributions to AI were recognized with the Turing Award in 2019, which he received while standing, a reflection of his long-standing inability to sit [59][63]
- After resigning from Google in May 2023, Hinton voiced concerns about the risks of AI, saying he regretted his role in its development [67][68]
- In recent interviews Hinton has been able to sit for longer periods, suggesting his health has improved, and he has been vocal about AI's dangers, putting the chance of human extinction due to AI in the next 30 years at 10%-20% [70][76]
DeepTiming: Timing Driven by Intraday Information and Similarity Learning
Minsheng Securities· 2025-07-31 09:02
Quantitative Models and Construction Methods

1. Model Name: Deep Learning Stock Return Prediction Model
- **Model Construction Idea**: built on a deep learning framework tailored to the current market environment; integrates daily and minute-frequency inputs to predict stock returns and generates trading signals from historical rolling thresholds [1][10][22]
- **Model Construction Process**:
  - Input layer: combines 51 technical/sentiment daily features, 7 basic daily price-volume indicators, 10 enhanced style factors, and 52 minute-frequency features aggregated to daily frequency [22]
  - Training layer: uses meta-learning to adapt dynamically to new market data, avoiding overfitting to historical data [14]
  - Output layer: employs LinSAT neural networks to impose constraints on the output, ensuring objectives such as controlled style and industry exposures [18]
  - Loss function: multi-period mean squared error (MSE), used to stabilize predictions for timing strategies [22]
  - Output shape: the multi-period return prediction is \( y \in \mathbb{R}^{n \times 1} \), where \( n \) is the number of stocks [22]
- **Model Evaluation**: robust in adapting to market changes and controlling exposures, with significant predictive power for timing strategies [10][22]

2. Model Name: SimStock
- **Model Construction Idea**: uses self-supervised learning to predict stock similarities, incorporating both static and dynamic correlations; contrastive learning dynamically captures time-series information beyond traditional industry and style classifications [2][47][48]
- **Model Construction Process**:
  - Input: past 40-day price-volume data, Barra style factors, and capital-flow indicators [52]
  - Positive and negative sample construction: positive samples are mixed as \( X_{pos} = \alpha X + (1-\alpha) X_{rand} \) with \( \alpha = 0.75 \), where \( X_{rand} \) is a random feature sample [52]
  - Embedding: an LSTM initializes dynamic attention weights, and a CLS token aggregates sequence information into a stock attribute vector [52]
  - Similarity calculation: stock similarity is measured by cosine similarity between attribute vectors [52]
- **Model Evaluation**: effectively identifies highly similar stocks, primarily within the same industry, but with no clear pattern by market capitalization or sub-industry [56]

3. Model Name: Improved GRU Model with SimStock Integration
- **Model Construction Idea**: enhances the GRU-based return-prediction model by initializing hidden states with SimStock-generated attribute vectors, improving stability across stock types [57][59]
- **Model Construction Process**: SimStock attribute vectors replace the GRU's initial hidden state [57]; training otherwise retains the baseline GRU setup, adjusted for the new initialization [59]
- **Model Evaluation**: improved predictive performance and stability, particularly for timing strategies across diverse stocks [60][63]

4. Model Name: Index Timing Model
- **Model Construction Idea**: aggregates individual stock signals into index signals via market-cap-weighted predictions, then generates signals by thresholds [77]
- **Model Construction Process**:
  - Aggregation: combines stock return predictions into an index return prediction using market-cap weights [77]
  - Signal generation: the 60th percentile of the past year's predictions is the buy threshold and the 40th percentile the sell threshold [77]
  - Holding period: positions are held at least 5 trading days to reduce turnover [77]
- **Model Evaluation**: effective at generating excess returns, particularly in high-volatility sectors [79][82][84]

Model Backtest Results
1. Deep Learning Stock Return Prediction Model: cumulative excess return 77% over 5 years; annualized return 27%; excess return vs. stocks 11.3% (pre-cost) [33]
2. SimStock: cumulative excess return 109% over 5 years; annualized return 30%; excess return vs. stocks 14.8% (pre-cost); daily win rate 57.4%; holding probability 45.7% [60]
3. Index Timing Model: HS300 annualized return 5.1%, excess return 5.6%, max drawdown 7.7% [79]; CSI500 annualized return 12.4%, excess return 12.2%, max drawdown 7.1% [82]; CSI1000 annualized return 15.1%, excess return 14.9%, max drawdown 11.3% [84]
4. Sector Timing: best sector Electric Power Equipment & New Energy, annualized return 36%, excess return 31.1% [101]

Quantitative Factors and Construction Methods

1. Factor Name: Reinforced Style Factor (PPO Model)
- **Factor Construction Idea**: uses PPO reinforcement learning to predict market style preferences, yielding more interpretable and robust risk factors than traditional deep learning [12]
- **Factor Construction Process**: inputs are traditional style factors and recent stock price-volume data [12]; the reward function is a stability-penalized goodness-of-fit to market returns [12]; the output is an enhanced style factor representing the AI's market preferences [12]
- **Factor Evaluation**: provides a stable and interpretable representation of market style dynamics [12]

Factor Backtest Results
1. Reinforced Style Factor: weekly average RankIC of 4.5% since 2019 [36]; annualized return 23.2% for long-only portfolios, excess return 18.3% vs. CSI800 [36]
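The Index Timing Model's aggregation and thresholding step can be sketched in a few lines. The function name and return-value convention below are illustrative assumptions, and the 5-day minimum holding period is noted in a comment but not enforced here.

```python
import numpy as np

def index_signal(stock_preds, mkt_caps, past_year_preds):
    """Market-cap-weighted aggregation of stock return predictions, then
    the 60th/40th-percentile buy/sell thresholds described in the report."""
    w = np.asarray(mkt_caps, dtype=float)
    w = w / w.sum()                                # market-cap weights
    idx_pred = float(np.dot(w, stock_preds))       # index-level prediction
    buy_thr = np.percentile(past_year_preds, 60)   # buy above 60th percentile
    sell_thr = np.percentile(past_year_preds, 40)  # sell below 40th percentile
    if idx_pred > buy_thr:
        return 1       # go / stay long
    if idx_pred < sell_thr:
        return 0       # exit to flat
    return None        # in between: hold the current position
                       # (the report also enforces a 5-day minimum hold)

history = list(range(100))   # stand-in for a year of daily index predictions
signal = index_signal([80.0, 80.0], [1.0, 1.0], history)
```

The gap between the two thresholds acts as a dead zone: predictions between the 40th and 60th percentiles leave the position unchanged, which, together with the minimum holding period, is what keeps turnover low.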