Kaiyuan Securities Financial Engineering: Factor Mining 2.0 Model
Market Microstructure Series (32): Deep Learning-Empowered Factor Mining 2.0: An Integrated Application Framework
KAIYUAN SECURITIES · 2026-01-28 09:14
Quantitative Models and Construction Methods

1. Model Name: GRU+GAT_SA Weighted Model
   - **Model Construction Idea**: Combines a GRU for time-series information extraction with a GAT for cross-sectional information extraction, and applies SA weighting to optimize factor performance[24][28][32]
   - **Model Construction Process**:
     - A GRU extracts time-series features from the input data; it replaces LSTM because it is faster and yields more effective factors[24]
     - A GAT captures cross-sectional relationships among stocks using three types of networks: industry, financial, and capital flow[25][28]
     - SA weighting adds an MLP layer during training that takes Barra style factor returns as input and determines the weights of the three networks[32]
     - Financial indicators are incorporated at the output layer, before the final factor output, to improve long-side performance[36]
   - **Model Evaluation**: The model delivers higher long-side returns and a higher information ratio than simpler models such as GRU or GAT alone[31][32]

2. Model Name: GRU+GAT_SA Weighted Model with Financial Consideration
   - **Model Construction Idea**: Extends the GRU+GAT_SA Weighted Model by incorporating financial indicators to improve factor performance[36]
   - **Model Construction Process**:
     - Financial indicators are grouped into nine dimensions: growth, profitability, quality, solvency, capital structure, turnover, goodwill, R&D, and valuation[36][37]
     - The indicators are standardized and combined with the GRU and GAT outputs at the output layer[36]
   - **Model Evaluation**: Including financial indicators significantly improves long-side performance, most visibly in the long-only results[36][39]

---

Model Backtesting Results

GRU+GAT_SA Weighted Model
- **10-day RankIC**: 11.5%[44]
- **Annualized RankICIR**: 5.9[44]
- **Long-Short Annualized Return**: 58.3%[44]
- **Long-Short Information Ratio**: 5.3[44]
- **Maximum Drawdown**: -5.8%[44]
- **Long-Only Annualized Return**: 22.1%[44]
- **Long-Only Information Ratio**: 2.8[44]

GRU+GAT_SA Weighted Model with Financial Consideration
- **10-day RankIC**: 11.7%[44]
- **Annualized RankICIR**: 5.7[44]
- **Long-Short Annualized Return**: 58.9%[44]
- **Long-Short Information Ratio**: 5.1[44]
- **Maximum Drawdown**: -4.8%[44]
- **Long-Only Annualized Return**: 24.1%[44]
- **Long-Only Information Ratio**: 3.0[44]

---

Quantitative Factors and Construction Methods

1. Factor Name: PV (Price-Volume)
   - **Factor Construction Idea**: Derived from basic market data: open, high, low, and close prices, average price, and trading volume[19]
   - **Factor Construction Process**:
     - Basic market data is used directly as input features for the GRU+GAT_SA Weighted Model[19]
   - **Factor Evaluation**: Performs strongly in both long-short and long-only strategies[44]

2. Factor Name: G (Technical Indicators and K-line State Variables)
   - **Factor Construction Idea**: Derived from technical indicators and K-line state variables computed from basic market data[19][45]
   - **Factor Construction Process**:
     - Technical indicators and K-line state variables are calculated and encoded as input features for the model[45][46]
   - **Factor Evaluation**: Shows an incremental performance improvement when combined with financial indicators[49]

3. Factor Name: C (Capital Flow)
   - **Factor Construction Idea**: Based on large-order and small-order capital flow data, including the original data, derived indicators, and state variables[19][52]
   - **Factor Construction Process**:
     - Capital flow data is processed into state variables, such as net buy/sell and active buy/sell proportions, and used as input features[54][55]
   - **Factor Evaluation**: Outperforms traditional manually constructed capital flow factors[58]

4. Factor Name: HF (High-Frequency Features)
   - **Factor Construction Idea**: Derived from high-frequency data aggregated into daily features[19][59]
   - **Factor Construction Process**:
     - High-frequency data, such as minute-level returns and trading volume, is aggregated and used as input features for the model[59]
   - **Factor Evaluation**: Performs strongly in both long-short and long-only strategies[62]

5. Factor Name: DP (Genetic Algorithm Factors)
   - **Factor Construction Idea**: Derived from genetic algorithm-based factor mining and further enhanced with deep learning[19][60]
   - **Factor Construction Process**:
     - A genetic algorithm generates 185 factors, of which 48 are selected based on historical performance and completeness[60]
     - The selected factors are used as input features for the GRU+GAT_SA Weighted Model[65]
   - **Factor Evaluation**: Deep learning significantly improves the performance of the genetic algorithm factors[66]

6. Factor Name: ML_C (Composite Deep Learning Factor)
   - **Factor Construction Idea**: Combines factors from multiple dimensions using SA weighting and multi-dimensional optimization[69]
   - **Factor Construction Process**:
     - Factors from the different dimensions (e.g., PV, G, C, HF, DP) are combined using SA weighting and long-only return weighting to create a composite factor[69]
   - **Factor Evaluation**: Achieves the best overall performance among all tested factors[72]

---

Factor Backtesting Results

PV Factor
- **10-day RankIC**: 11.7%[68]
- **Annualized RankICIR**: 5.7[68]
- **Long-Short Annualized Return**: 58.9%[68]
- **Long-Short Information Ratio**: 5.1[68]
- **Maximum Drawdown**: -4.8%[68]
- **Long-Only Annualized Return**: 24.1%[68]
- **Long-Only Information Ratio**: 3.0[68]

G Factor
- **10-day RankIC**: 11.0%[68]
- **Annualized RankICIR**: 5.8[68]
- **Long-Short Annualized Return**: 59.9%[68]
- **Long-Short Information Ratio**: 6.2[68]
- **Maximum Drawdown**: -2.5%[68]
- **Long-Only Annualized Return**: 23.3%[68]
- **Long-Only Information Ratio**: 3.3[68]

C Factor
- **10-day RankIC**: 10.6%[68]
- **Annualized RankICIR**: 5.1[68]
- **Long-Short Annualized Return**: 56.4%[68]
- **Long-Short Information Ratio**: 5.2[68]
- **Maximum Drawdown**: -4.4%[68]
- **Long-Only Annualized Return**: 19.5%[68]
- **Long-Only Information Ratio**: 2.8[68]

HF Factor
- **10-day RankIC**: 11.6%[68]
- **Annualized RankICIR**: 5.9[68]
- **Long-Short Annualized Return**: 57.5%[68]
- **Long-Short Information Ratio**: 5.8[68]
- **Maximum Drawdown**: -5.2%[68]
- **Long-Only Annualized Return**: 19.1%[68]
- **Long-Only Information Ratio**: 2.6[68]

DP Factor
- **10-day RankIC**: 11.4%[68]
- **Annualized RankICIR**: 6.2[68]
- **Long-Short Annualized Return**: 49.2%[68]
- **Long-Short Information Ratio**: 4.4[68]
- **Maximum Drawdown**: -4.7%[68]
- **Long-Only Annualized Return**: 20.3%[68]
- **Long-Only Information Ratio**: 2.8[68]

ML_C Factor
- **10-day RankIC**: 14.2%[72]
- **Annualized RankICIR**: 6.3[72]
- **Long-Short Annualized Return**: 72.7%[72]
- **Long-Short Information Ratio**: 6.1[72]
- **Maximum Drawdown**: -4.8%[72]
- **Long-Only Annualized Return**: 26.1%[72]
- **Long-Only Information Ratio**: 3.1[72]
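
---

Illustrative Sketch: SA Weighting of the Three Networks

The SA weighting step described for the GRU+GAT_SA Weighted Model (an MLP that takes Barra style factor returns and determines the weights of the industry, financial, and capital flow networks) might look roughly like the sketch below. The summary does not give the layer sizes, the activation, or how the weights are normalized, so the single ReLU hidden layer, the softmax normalization, the weighted-sum combination, and every function and parameter name here are assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax; returns weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(x, weights, bias):
    """Dense layer y = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def network_weights(style_returns, w1, b1, w2, b2):
    """Map style factor returns to one weight per network.

    Assumption: one ReLU hidden layer followed by a softmax output.
    """
    hidden = [max(0.0, h) for h in linear(style_returns, w1, b1)]
    return softmax(linear(hidden, w2, b2))


def combine(embeddings, weights):
    """Weighted sum of per-network stock embeddings.

    embeddings[n][s][d] is dimension d of stock s from network n.
    """
    n_stocks = len(embeddings[0])
    dim = len(embeddings[0][0])
    return [
        [sum(w * emb[s][d] for w, emb in zip(weights, embeddings))
         for d in range(dim)]
        for s in range(n_stocks)
    ]
```

In a real training setup the MLP parameters would be learned jointly with the GRU and GAT layers; here they are plain lists so that the data flow is easy to follow.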
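
---

Illustrative Sketch: RankIC and Annualized RankICIR

The backtesting results report a 10-day RankIC and an annualized RankICIR for each model and factor. A common way to compute these, which may differ in detail from the report's procedure, is to take the Spearman rank correlation between factor values and forward returns on each date, then divide the mean of that series by its standard deviation and scale by the square root of the number of periods per year. The helper names and the `periods_per_year` parameter below are assumptions; the summary does not state the annualization convention.

```python
import math
from statistics import mean, stdev


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def rank_ic(factor_values, forward_returns):
    """Cross-sectional RankIC for one date (Spearman correlation)."""
    return pearson(average_ranks(factor_values), average_ranks(forward_returns))


def annualized_rank_icir(ic_series, periods_per_year):
    """Mean RankIC over its sample standard deviation, annualized.

    Assumption: scaling by sqrt(periods_per_year).
    """
    return mean(ic_series) / stdev(ic_series) * math.sqrt(periods_per_year)
```

For a 10-day horizon, `forward_returns` would hold each stock's return over the following 10 trading days, and `rank_ic` would be evaluated once per cross-section date to build `ic_series`.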
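
---

Illustrative Sketch: Long-Short Return, Information Ratio, and Maximum Drawdown

The remaining backtest metrics (long-short annualized return, information ratio, maximum drawdown) can be illustrated with a simple quantile portfolio. The summary does not specify the grouping, weighting, rebalancing frequency, or benchmark, so the equal-weighted top-minus-bottom construction, the default 10% quantile, and all names below are assumptions rather than the report's actual backtest settings. For a long-only information ratio, the input series would typically be returns in excess of a benchmark.

```python
import math
from statistics import mean, stdev


def long_short_return(scores, returns, quantile=0.1):
    """Equal-weighted top-minus-bottom return for one rebalancing period."""
    n = len(scores)
    k = max(1, int(n * quantile))  # number of stocks in each leg
    order = sorted(range(n), key=lambda i: scores[i])
    short_leg = mean(returns[i] for i in order[:k])
    long_leg = mean(returns[i] for i in order[-k:])
    return long_leg - short_leg


def annualized_return(period_returns, periods_per_year):
    """Geometric annualized return from a series of period returns."""
    nav = 1.0
    for r in period_returns:
        nav *= 1.0 + r
    return nav ** (periods_per_year / len(period_returns)) - 1.0


def information_ratio(period_returns, periods_per_year):
    """Annualized mean-over-volatility ratio of a return series."""
    return mean(period_returns) / stdev(period_returns) * math.sqrt(periods_per_year)


def max_drawdown(period_returns):
    """Largest peak-to-trough loss of the compounded NAV (zero or negative)."""
    nav, peak, worst = 1.0, 1.0, 0.0
    for r in period_returns:
        nav *= 1.0 + r
        peak = max(peak, nav)
        worst = min(worst, nav / peak - 1.0)
    return worst
```

`max_drawdown` returns a non-positive number, matching the sign convention of the figures listed in the backtesting results.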