[Guoxin Securities Financial Engineering] Expansion and Enhancement of Alpha Factors in Financial Statements
量化藏经阁· 2025-08-12 00:08
Financial Factor Extraction
- Financial factors are generated by defining operational rules (operators) and applying them to financial indicators, which yields approximately 100,000 factors drawn from financial statements, forecasts, and notes [1][27][40]
- A preliminary screening on RankIC mean and annualized RankICIR identifies 4,427 effective factors, a substantial number of factors with potential predictive power [27][35][40]

Multi-Dimensional Financial Factor Expansion
- New operators and data sources enhance classic factors; in particular, a newly developed cross-sectional percentile difference operator improves their performance significantly [1][42][45]
- Financial note data provides incremental information: the composite factor derived from notes has a RankIC mean of 4.78% and an annualized RankICIR of 2.69, indicating strong predictive capability [1][62][69]

Alpha Factor Library Expansion and Enhancement
- Traditional factor synthesis combines many factors without considering their styles, which causes style drift and unstable performance [1][3]
- The proposed "clustering-expansion-synthesis" method groups factors by their correlation into eight major categories, and the resulting composite outperforms direct synthesis of all factors [1][3][6]

Performance of Enhanced Factors
- Empirically, the cluster-enhanced factor performs best, with a monthly RankIC mean of 12.08% and an annualized RankICIR of 5.32 since 2013 [4][6]
- The enhanced factors show significantly higher monthly excess returns than traditional composite factors, particularly in the value and growth categories [4][6][62]
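The screening step above relies on two statistics, RankIC mean and annualized RankICIR. A minimal sketch of how they are commonly computed follows. It assumes RankIC is the Spearman rank correlation between factor values and next-month returns, and that annualization multiplies the monthly mean/standard-deviation ratio by sqrt(12); the source does not spell out either convention. The function names are hypothetical, and the 2% and 1.5 thresholds are the ones quoted in the report summary (its return-based criteria are left out here).

```python
from math import sqrt
from statistics import mean, stdev


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    result = []
    for v in values:
        below = sum(1 for x in values if x < v)
        equal = sum(1 for x in values if x == v)
        result.append(below + (equal + 1) / 2)
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rank_ic(factor_values, next_returns):
    """Spearman rank correlation for one cross-section (one month)."""
    return pearson(average_ranks(factor_values), average_ranks(next_returns))


def annualized_rank_icir(monthly_ics, periods_per_year=12):
    """Assumed convention: mean(IC) / std(IC) * sqrt(periods per year)."""
    return mean(monthly_ics) / stdev(monthly_ics) * sqrt(periods_per_year)


def passes_ic_screen(monthly_ics, min_ic_mean=0.02, min_icir=1.5):
    """IC-based part of the screen only; return-based criteria are omitted."""
    return (mean(monthly_ics) > min_ic_mean
            and annualized_rank_icir(monthly_ics) > min_icir)


# Illustrative (made-up) monthly RankIC series for one candidate factor.
example_ics = [0.03, 0.05, 0.04]
print(passes_ic_screen(example_ics))
```

In practice `rank_ic` would be evaluated once per month over the whole stock universe, and the resulting monthly series fed to the screen.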
Financial Engineering Special Research: Expansion and Enhancement of Alpha Factors in Financial Statements
Guoxin Securities· 2025-08-05 14:26
Quantitative Models and Factor Construction

Quantitative Factors and Construction Methods

- **Factor Name**: Financial Statement Alpha Factors
  **Construction Idea**: Define an operator to calculate factors using financial indicators from financial statements, forecasts, quick reports, and financial notes[1][11][175]
  **Construction Process**:
  1. Use 14 operators (e.g., ratio, YOY growth) to combine financial indicators[1][29]
  2. Generate approximately 100,000 factors[1][175]
  3. Filter factors based on criteria: RankIC mean > 2%, annualized RankICIR > 1.5, long-only monthly excess return > 0.3%, long-short monthly return > 0.6%[1][45][175]
  **Evaluation**: Effective in identifying 4,427 valid factors from the initial pool[1][42][175]

- **Factor Name**: Percentile Difference Operator (EPRank)
  **Construction Idea**: Address the distortion caused by extreme denominator values in ratio-based factors by using percentile differences[54][176]
  **Construction Process**:
  1. Calculate the percentile of numerator and denominator indicators
  2. Compute the difference between the two percentiles
  Formula: $\text{PercentileA2B} = \text{Percentile}(A) - \text{Percentile}(B)$, for example $\text{EPRank} = \text{Percentile}(\text{NetProfit}) - \text{Percentile}(\text{MV})$[54][176]
  **Evaluation**: Reduces the impact of extreme values and improves factor performance[54][176]

- **Factor Name**: Financial Notes Composite Factor
  **Construction Idea**: Utilize financial notes data to capture incremental information not included in traditional factors[69][176]
  **Construction Process**:
  1. Extract sub-items from financial notes (e.g., inventory details)
  2. Construct factors such as sub-item ratios, growth rates, and changes in ratios[70][73]
  3. Combine 390 financial note factors into a composite factor using rolling 12-month RankICIR weighting[78][176]
  **Evaluation**: Demonstrates low correlation with traditional factors and strong predictive ability[80][86]

- **Factor Name**: Income Tax Composite Factor
  **Construction Idea**: Reflect the "cash nature" of income tax to verify the authenticity of profits[91][176]
  **Construction Process**:
  1. Use various operators (e.g., ratio, industry share, YOY growth) to construct income tax factors
  2. Combine factors using rolling 12-month RankICIR weighting[94][95]
  **Evaluation**: Provides stable stock selection ability and low correlation with traditional factors[96][99]

- **Factor Name**: NPQYOY with Forecast and Quick Report Data
  **Construction Idea**: Enhance the timeliness of traditional factors by incorporating forecast and quick report data[101][176]
  **Construction Process**:
  1. Replace formal financial data with forecast/quick report data (e.g., median of forecasted net profit range)
  2. Compare the performance of the updated factor with the original[108][109]
  **Evaluation**: Significant improvement in RankIC mean, annualized RankICIR, and excess returns[109][112]

Composite Factor Construction and Enhancement

- **Factor Name**: Weighted Composite Factor
  **Construction Idea**: Combine multiple factors using rolling 12-month RankICIR weighting[115][176]
  **Construction Process**:
  1. Select factors from the existing factor library
  2. Weight factors based on their RankICIR performance[115][116]
  **Evaluation**: Strong stock selection ability but prone to style bias when the number of factors increases[118][122]

- **Factor Name**: Clustered Composite Factor
  **Construction Idea**: Address style bias by clustering factors based on their correlation[123][176]
  **Construction Process**:
  1. Define factor correlation using "group-weighted method"
  2. Apply Leiden clustering algorithm to group factors into eight categories (e.g., value, growth, low volatility)[130][134]
  3. Combine factors within each category and then across categories[137][141]
  **Evaluation**: Outperforms weighted composite factors in RankIC mean, annualized RankICIR, and excess returns[141][142]

- **Factor Name**: Cluster-Enhanced Factor
  **Construction Idea**: Expand clustered factors by incorporating newly discovered factors and applying incremental screening[146][176]
  **Construction Process**:
  1. Assign new factors to existing categories based on correlation
  2. Use incremental screening to select effective factors within each category
  3. Combine factors within categories and across categories[146][149]
  **Evaluation**: Achieves the best performance among all composite factors, with significant improvements in RankIC mean, annualized RankICIR, and excess returns[150][158]

Backtest Results of Factors and Models

- **Financial Statement Alpha Factors**: RankIC mean 2%-5%, annualized RankICIR > 1.5, long-only monthly excess return > 0.3%, long-short monthly return > 0.6%[1][45]
- **EPRank**: RankIC mean 5.46%, annualized RankICIR 2.01, long-only monthly excess return 0.64%, long-short monthly return 1.37%[60][64]
- **Financial Notes Composite Factor**: RankIC mean 4.78%, annualized RankICIR 2.69, long-only monthly excess return 0.77%, long-short monthly return 1.79%[78][86]
- **Income Tax Composite Factor**: RankIC mean 4.62%, annualized RankICIR 2.60, long-only monthly excess return 0.67%, long-short monthly return 1.14%[95][99]
- **NPQYOY with Forecast Data**: RankIC mean 4.26%, annualized RankICIR 2.60, long-only monthly excess return 0.72%, long-short monthly return 1.36%[109][112]
- **Weighted Composite Factor**: RankIC mean 11.38%, annualized RankICIR 4.07, long-only monthly excess return 1.21%, long-short monthly return 2.96%[116][158]
- **Clustered Composite Factor**: RankIC mean 11.43%, annualized RankICIR 4.54, long-only monthly excess return 1.31%, long-short monthly return 3.18%[141][158]
- **Cluster-Enhanced Factor**: RankIC mean 12.08%, annualized RankICIR 5.32, long-only monthly excess return 1.62%, long-short monthly return 3.53%[150][158]
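The percentile difference operator described above (EPRank = Percentile(NetProfit) - Percentile(MV)) can be illustrated with a short sketch. It assumes a percentile is the average cross-sectional rank divided by the number of stocks; the report's exact percentile definition is not given in this summary, and the function names and sample figures are made up for illustration.

```python
def cross_sectional_percentile(values):
    """Percentile in (0, 1]: average rank within the cross-section divided by n.

    Assumed definition; ties share the mean of the ranks they occupy.
    """
    n = len(values)
    result = []
    for v in values:
        below = sum(1 for x in values if x < v)
        equal = sum(1 for x in values if x == v)
        result.append((below + (equal + 1) / 2) / n)
    return result


def percentile_difference(numerator, denominator):
    """PercentileA2B = Percentile(A) - Percentile(B), stock by stock."""
    pct_a = cross_sectional_percentile(numerator)
    pct_b = cross_sectional_percentile(denominator)
    return [a - b for a, b in zip(pct_a, pct_b)]


# Made-up cross-section of four stocks; the second has a near-zero market value.
net_profit = [5.0, 1.0, 3.0, 0.2]
market_value = [100.0, 0.001, 50.0, 20.0]

raw_ratio = [p / m for p, m in zip(net_profit, market_value)]
ep_rank = percentile_difference(net_profit, market_value)
print(raw_ratio)
print(ep_rank)
```

In this toy cross-section the raw ratio of the second stock is 1000 because its denominator is close to zero, while the percentile difference stays bounded: the result is `[0.0, 0.25, 0.0, -0.25]`. That boundedness is the property the operator uses to limit the distortion from extreme denominators.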
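Several of the composite factors above are combined with rolling 12-month RankICIR weighting. The sketch below shows one plausible reading of that step. It assumes each factor's weight is its RankICIR over the trailing 12 monthly RankIC values, normalized by the sum of absolute weights, and that factor scores are already standardized per stock; the report's exact normalization and handling of edge cases are not stated in this summary. All names are hypothetical.

```python
from statistics import mean, stdev


def rolling_icir_weights(ic_history, window=12):
    """Weight each factor by mean(IC) / std(IC) over its last `window` months.

    ic_history maps a factor name to its monthly RankIC values, oldest first.
    Weights are scaled so their absolute values sum to 1 (an assumption).
    """
    raw = {}
    for name, ics in ic_history.items():
        recent = ics[-window:]
        spread = stdev(recent) if len(recent) >= 2 else 0.0
        # A factor with too little history or zero dispersion gets no weight.
        raw[name] = mean(recent) / spread if spread > 0 else 0.0
    total = sum(abs(w) for w in raw.values())
    if total == 0:
        return {name: 1.0 / len(raw) for name in raw}
    return {name: w / total for name, w in raw.items()}


def composite_scores(factor_scores, weights):
    """Weighted sum of per-stock factor scores (scores assumed standardized)."""
    n_stocks = len(next(iter(factor_scores.values())))
    combined = [0.0] * n_stocks
    for name, scores in factor_scores.items():
        for i, score in enumerate(scores):
            combined[i] += weights[name] * score
    return combined


# Made-up RankIC histories for two factors over 12 months.
history = {"factor_a": [0.02, 0.04] * 6, "factor_b": [0.01, 0.03] * 6}
weights = rolling_icir_weights(history)
scores = composite_scores({"factor_a": [1.0, -1.0], "factor_b": [0.0, 1.0]}, weights)
print(weights, scores)
```

Because both example factors have the same RankIC volatility, their weights are proportional to their mean RankIC. Under the clustered approach, the same weighting would be applied first within each category and then across categories.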