Factor Mining
Market Microstructure Series (32): Deep Learning Empowers Factor Mining 2.0: A Comprehensive Application Scheme
KAIYUAN SECURITIES · 2026-01-28 09:14
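Financial Engineering Special Report · Financial Engineering Research Team · January 28, 2026
Analysts: Gao Peng (高鹏, analyst), certificate no. S0790520090002; Su Junhao (苏俊豪, analyst), certificate no. S0790522020001; Wei Jianrong (魏建榕, chief analyst), certificate no. S0790519120001; Fu Kaibo (傅开波, analyst), certificate no. S0790520090003; Hu Liangyong (胡亮勇, analyst), certificate no. S0790522030001; Wang Zhihao (王志豪, analyst), certificate no. S0790522070003; Sheng Shaocheng (盛少成, analyst), certificate no. S0790523060003; Jiang Tao (蒋韬, analyst), certificate no. S0790123070037
Related research reports:
- "Genetic Algorithms Empower Trading Behavior Factors: Market Microstructure (20)", 2023.8.6
- "Deep Learning Empowers Trading Behavior Factors: Market Microstructure (24)", 2024.5.24
- "Deep Learning Empowers Style Rotation and Multi-Strategy Fusion: Kaiyuan Quantitative Review (103)", 2024.12.12
- "Deep Learning Empowers Technical Analysis: Kaiyuan Quantitative Review (109)", 2025.6.25
Deep Learning Empowers Factor Mining 2.0: A Comprehensive Application Scheme (Market Microstructure Series (32))
Wei Jianrong (analyst), Sheng Shaocheng (analyst) weijianrong@kysec.cn ...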
Financial Engineering Special Report 20260110: Deep Learning Series, Part 1: AI Reshapes Quant, LLM-Driven Factor Improvement and Sentiment Alpha Mining
Soochow Securities · 2026-01-10 11:09
Core Insights
- The report presents a systematic framework for automated factor research based on Large Language Models (LLM) and Prompt Engineering, aiming to explore the potential applications of AI in the entire quantitative investment chain [1]
- The framework was first applied to low-frequency price-volume factors, optimizing the classic Alpha158 factor library and transitioning from an "optimization" paradigm to a "generation" paradigm [1]
- AI demonstrated strong factor discovery capabilities in both fundamental and high-frequency data domains, successfully generating new factors and enhancing traditional factor libraries [1]
- The report also explores AI's application in unstructured text analysis, utilizing the Gemini model to interpret sentiment from extensive research memos, creating unique sentiment indicators that effectively integrate into stock selection strategies [1]

Group 1: Low-Frequency Price-Volume Factor Optimization
- The framework was initially applied to the optimization of low-frequency price-volume factors, using the Alpha158 factor library as a foundation for optimization experiments [1]
- AI identified logical flaws in original factors and proposed effective improvements, with optimization effects being consistent across multiple time windows from 5 to 60 days [1]
- New factors generated by AI, with low correlation to sample factors, showed robust out-of-sample performance, with some factors achieving an Information Coefficient Information Ratio (ICIR) above 1.0 [1]

Group 2: Fundamental and High-Frequency Factor Discovery
- In the fundamental dimension, AI not only generated enhanced versions of classic factors but also innovatively expanded value, quality, and growth factors from novel perspectives [1]
- In the high-frequency dimension, AI was empowered to directly generate Python code, uncovering a set of novel and high-performing high-frequency factors, with some strong signal factors achieving annualized returns exceeding 60% [1]
- Integrating the AI-generated high-frequency factor library into the AGRU neural network model significantly improved annualized excess returns from 18.24% to 25.28% [1]

Group 3: Alternative Data Processing and Sentiment Analysis
- The report investigates AI's potential in processing alternative data, analyzing nearly one million words of research memos using the Gemini 2.5 Pro model [1]
- A weekly sentiment factor was constructed, revealing unique asymmetric predictive capabilities, where negative sentiment strongly predicted future price declines, achieving annualized excess returns of 8.26% [1]
- This sentiment factor exhibited low correlation with traditional price-volume and fundamental factors, serving as an independent and effective supplementary information source [1]

Group 4: Comprehensive Strategy Development
- A multi-dimensional information fusion strategy was developed, integrating AI-discovered high-frequency factors with low-frequency market data into the AGRU neural network to form a core Alpha [1]
- The final strategy, enhanced by AI sentiment factors for risk adjustment, improved annualized excess returns from 11.15% to 11.81% while maintaining turnover rates [1]
- The strategy demonstrated a significant increase in the information ratio from 2.18 to 2.31, validating AI's potential to empower quantitative research across multiple stages and achieve a "1+1>2" effect [1]
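The summary above quotes ICIR figures without saying how they are computed. As a rough illustration only, the sketch below shows one common convention: a cross-sectional rank IC per date, and ICIR as the mean IC divided by its sample standard deviation, without annualization. The function names, the (dates x stocks) input layout, and the choice of rank correlation are all assumptions made for this example; the report may use a different definition.

```python
import pandas as pd


def rank_ic_series(factor: pd.DataFrame, fwd_ret: pd.DataFrame) -> pd.Series:
    """Cross-sectional rank IC for each date.

    Both inputs are hypothetical (dates x stocks) frames with the same
    index and columns: factor values and the matching forward returns.
    """
    # Keep only cells where both the factor and the return are available,
    # so the ranks on each date are computed over the same set of stocks.
    valid = factor.notna() & fwd_ret.notna()
    f_rank = factor.where(valid).rank(axis=1)
    r_rank = fwd_ret.where(valid).rank(axis=1)
    # Pearson correlation of ranks, row by row, is the Spearman rank IC.
    return f_rank.corrwith(r_rank, axis=1)


def icir(ic: pd.Series) -> float:
    """Mean IC over its sample standard deviation (assumed: not annualized)."""
    return float(ic.mean() / ic.std(ddof=1))


if __name__ == "__main__":
    # Toy data: the factor orders stocks perfectly on day 0, inversely on day 1.
    factor = pd.DataFrame([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]])
    fwd_ret = pd.DataFrame([[0.1, 0.2, 0.5, 0.9], [0.9, 0.5, 0.2, 0.1]])
    print(rank_ic_series(factor, fwd_ret))
```

Under this convention an ICIR above 1.0 means the average IC exceeds its own period-to-period standard deviation; whether the report annualizes the ratio is not stated in the summary.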
Massive Level 2 Data Factor Mining Series (6): Improving Minute-Frequency Factors with Tick-by-Tick Order Data
GF SECURITIES · 2025-12-04 14:05
Quantitative Factors and Construction

Factor Name: KeyPeriod_ret_zero
- **Construction Idea**: This factor focuses on the return characteristics during horizontal trading periods within key intraday timeframes, leveraging Level 2 tick data to refine minute-frequency factors[7][25][41]
- **Construction Process**:
  - Identify horizontal trading periods based on minimal price fluctuations
  - Calculate returns during these periods using tick-level data
  - Aggregate and smooth the data over different time horizons (e.g., 5-day, 20-day)[25][27]
- **Evaluation**: Demonstrates strong predictive power for stock selection, with high IC stability and win rates[7][25]

Factor Name: KeyPeriod_ret_low5pct
- **Construction Idea**: This factor captures return characteristics during significant downward price movements within key intraday timeframes[7][25][64]
- **Construction Process**:
  - Identify periods where returns fall within the bottom 5% of all intraday returns
  - Calculate and aggregate these returns over different time horizons
  - Apply smoothing techniques to enhance signal stability[25][27]
- **Evaluation**: Exhibits robust performance in identifying underperforming stocks, with high IC values and win rates[7][25]

Factor Name: KeyPeriod_price_low5pct
- **Construction Idea**: This factor focuses on price levels during periods of low prices (bottom 5%) within key intraday timeframes[7][25][88]
- **Construction Process**:
  - Identify periods where prices fall within the bottom 5% of all intraday prices
  - Aggregate and smooth the data over different time horizons
  - Incorporate buy/sell distinctions for further refinement[25][32]
- **Evaluation**: Effective in capturing undervalued stocks, with strong IC performance and high win rates[7][25]

Factor Name: KeyPeriod_amount_top30pct
- **Construction Idea**: This factor targets periods of high transaction amounts (top 30%) within key intraday timeframes[7][25][110]
- **Construction Process**:
  - Identify periods where transaction amounts are in the top 30% of all intraday amounts
  - Aggregate and smooth the data over different time horizons
  - Differentiate between buy and sell transactions for enhanced granularity[25][35]
- **Evaluation**: Demonstrates strong predictive power for high-liquidity stocks, with high IC values and win rates[7][25]

Factor Name: KeyPeriod_amount_low50pct
- **Construction Idea**: This factor captures periods of low transaction amounts (bottom 50%) within key intraday timeframes[7][25][133]
- **Construction Process**:
  - Identify periods where transaction amounts are in the bottom 50% of all intraday amounts
  - Aggregate and smooth the data over different time horizons
  - Incorporate buy/sell distinctions for further refinement[25][35]
- **Evaluation**: Useful for identifying low-liquidity stocks, though performance is less consistent compared to other factors[7][25]

Factor Name: KeyPeriod_sync_low50pct
- **Construction Idea**: This factor measures volume-price divergence during periods of low synchronization (bottom 50%) within key intraday timeframes[7][25][155]
- **Construction Process**:
  - Identify periods where volume and price movements are least synchronized
  - Aggregate and smooth the data over different time horizons
  - Differentiate between buy and sell transactions for enhanced granularity[25][38]
- **Evaluation**: Effective in capturing unique market dynamics, with strong IC performance and high win rates[7][25]

---

Backtesting Results

KeyPeriod_ret_zero
- **IC Mean**: -5.36% (20-day horizon)[27]
- **Win Rate**: 85.1% (20-day horizon)[27]
- **IR**: 1.34 (2020-2025)[55]

KeyPeriod_ret_low5pct
- **IC Mean**: 5.47% (20-day horizon)[27]
- **Win Rate**: 84.1% (20-day horizon)[27]
- **IR**: 1.41 (2020-2025)[77]

KeyPeriod_price_low5pct
- **IC Mean**: 5.59% (20-day horizon)[32]
- **Win Rate**: 85.3% (20-day horizon)[32]
- **IR**: 2.22 (2020-2025)[97]

KeyPeriod_amount_top30pct
- **IC Mean**: 11.23% (20-day horizon)[35]
- **Win Rate**: 84.8% (20-day horizon)[35]
- **IR**: 1.37 (2020-2025)[123]

KeyPeriod_amount_low50pct
- **IC Mean**: -10.50% (20-day horizon)[35]
- **Win Rate**: 75.0% (20-day horizon)[35]
- **IR**: 0.77 (2020-2025)[145]

KeyPeriod_sync_low50pct
- **IC Mean**: 6.00% (20-day horizon)[38]
- **Win Rate**: 81.5% (20-day horizon)[38]
- **IR**: 1.44 (2020-2025)[172]
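To make the "identify, aggregate, smooth" steps and the IC statistics above more concrete, here is a minimal sketch for a factor in the style of KeyPeriod_ret_low5pct. It is not the report's implementation. It uses plain minute returns rather than tick-by-tick order data, and it makes three assumptions that the summary does not confirm: the key-period returns are aggregated by summing, the smoothing is a rolling mean, and the win rate is the share of periods in which the IC has the same sign as its mean (which would be consistent with a negative-IC factor still reporting a high win rate). All function names are hypothetical.

```python
from typing import Tuple

import numpy as np
import pandas as pd


def key_period_ret_low_pct(minute_ret: pd.Series, pct: float = 0.05) -> float:
    """One stock-day value: sum of the minute returns in the bottom `pct`.

    `minute_ret` holds one stock's minute returns for a single day.
    Summing is an assumption; the source only says "aggregate".
    """
    threshold = minute_ret.quantile(pct)
    return float(minute_ret[minute_ret <= threshold].sum())


def smooth_daily(daily: pd.Series, window: int = 20) -> pd.Series:
    """Rolling mean of the daily values (assumed smoothing; e.g. 5 or 20 days)."""
    return daily.rolling(window, min_periods=window).mean()


def ic_mean_and_win_rate(ic: pd.Series) -> Tuple[float, float]:
    """Mean IC and the share of periods whose IC sign matches the mean's sign."""
    mean_ic = float(ic.mean())
    win_rate = float((np.sign(ic) == np.sign(mean_ic)).mean())
    return mean_ic, win_rate


if __name__ == "__main__":
    # Toy day with 20 evenly spaced minute returns from -10% to +9%.
    day = pd.Series(np.linspace(-0.10, 0.09, 20))
    print(key_period_ret_low_pct(day))  # only the single lowest return qualifies
    print(ic_mean_and_win_rate(pd.Series([-0.05, -0.10, 0.02, -0.07])))
```

The other KeyPeriod factors could follow the same pattern with a different selection rule (for example, top 30% of transaction amounts instead of bottom 5% of returns); the buy/sell split described in the summary would need order-level data that this sketch does not model.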