Random Forest Model

Monthly Tracking of Style Rotation from a Micro Perspective - 20251103

Soochow Securities · 2025-11-03 05:04

Quantitative Models and Construction Methods

- **Model Name**: Style Rotation Model
- **Model Construction Idea**: The model starts from four basic style factors (valuation, market capitalization, volatility, and momentum) and builds up a style timing and scoring system step by step[4][9]
- **Model Construction Process**:
  1. Construct 640 micro features from 80 basic micro indicators[9]
  2. Use common indices as style stock pools, instead of dividing stocks by absolute proportion on each style factor, and use the resulting style returns as labels[4][9]
  3. Use a random forest model for style timing and obtain the current score for each style[4][9]
  4. Combine the timing results and the scoring results into a monthly style rotation model[4][9]
- **Model Evaluation**: Rolling training of the random forest model is used to avoid overfitting risk, and the framework runs from style timing to style scoring and from style scoring to actual investment[9]

Model Backtesting Results

1. **Style Rotation Model**:
   - Annualized Return: 16.18%[10][11]
   - Volatility: 20.28%[10][11]
   - Information Ratio (IR): 0.80[10][11]
   - Win Rate: 59.43%[10][11]
   - Maximum Drawdown: 25.20%[11]
2. **Style Rotation Model, Hedged Against the Market Benchmark**:
   - Annualized Return: 10.36%[10][11]
   - Volatility: 10.85%[10][11]
   - Information Ratio (IR): 0.95[10][11]
   - Win Rate: 54.72%[10][11]
   - Maximum Drawdown: 8.53%[11]
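The backtest statistics above (annualized return, volatility, IR, win rate, maximum drawdown) can be computed from a monthly return series roughly as follows. This is a minimal sketch, not the report's own code: the function name `performance_metrics` is hypothetical, and the definitions (geometric annualization, IR taken as annualized return divided by annualized volatility) are assumptions, although the reported IR of 0.80 is consistent with 16.18% / 20.28%.

```python
import math


def performance_metrics(monthly_returns):
    """Summarize a list of monthly simple returns (e.g. 0.02 for +2%).

    Definitions here are assumptions for illustration; the report does not
    state its exact formulas.
    """
    n = len(monthly_returns)
    if n < 2:
        raise ValueError("need at least two monthly returns")

    # Compound growth over the whole period, annualized with 12 months per year.
    growth = 1.0
    for r in monthly_returns:
        growth *= 1.0 + r
    annualized_return = growth ** (12.0 / n) - 1.0

    # Sample standard deviation of monthly returns, scaled by sqrt(12).
    mean = sum(monthly_returns) / n
    variance = sum((r - mean) ** 2 for r in monthly_returns) / (n - 1)
    annualized_vol = math.sqrt(variance * 12.0)

    # Ratio of annualized return to annualized volatility (assumed IR definition).
    ratio = annualized_return / annualized_vol if annualized_vol > 0 else float("nan")

    # Share of months with a strictly positive return.
    win_rate = sum(1 for r in monthly_returns if r > 0) / n

    # Maximum drawdown: largest peak-to-trough decline of cumulative net value,
    # reported as a positive fraction.
    nav, peak, max_drawdown = 1.0, 1.0, 0.0
    for r in monthly_returns:
        nav *= 1.0 + r
        peak = max(peak, nav)
        max_drawdown = max(max_drawdown, 1.0 - nav / peak)

    return {
        "annualized_return": annualized_return,
        "annualized_volatility": annualized_vol,
        "ir": ratio,
        "win_rate": win_rate,
        "max_drawdown": max_drawdown,
    }


if __name__ == "__main__":
    # Hypothetical monthly returns, not data from the report.
    print(performance_metrics([0.10, -0.05, 0.02, 0.03]))
```

Passing the monthly difference between strategy and benchmark returns instead of the raw strategy returns would give the hedged version of the same statistics, under the assumption that hedging means a simple return difference.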

Monthly Tracking of Style Rotation from a Micro Perspective - 20251013

Soochow Securities · 2025-10-13 15:39

- The style rotation model is built on the Soochow Securities (Dongwu) quantitative multi-factor system, starting from micro-level stock factors. It selects 80 underlying factors as original features, covering valuation, market capitalization, volatility, and momentum, and constructs 640 micro features from them. Instead of dividing stocks by absolute proportion on each style factor, the model uses common indices as style stock pools and takes the resulting style returns as labels. A random forest model is trained on a rolling basis to avoid overfitting risk, select features, and produce style recommendations. The framework runs from style timing to scoring to actual investment[9][4]
- Over the backtesting period (2017/01/01-2025/09/30), the style rotation model shows an annualized return of 16.41%, annualized volatility of 20.43%, IR of 0.80, monthly win rate of 58.49%, and a maximum drawdown of 25.54%. Hedged against the market benchmark, the annualized return is 10.54%, annualized volatility is 10.85%, IR is 0.97, monthly win rate is 55.66%, and the maximum drawdown is 8.79%[10][11]
- The model's latest timing directions for October 2025 are value, large market capitalization, momentum, and low volatility[2][19]
- The model's latest holdings for October 2025 include the following indices[3][19]:
  - CSI Central Enterprise Dividend (ETF code: 561580.SH)
  - CSI Bank (ETF code: 512700.SH)
  - CSI Film and Television (ETF code: 159855.SZ)
  - CS Battery (ETF code: 159796.SZ)
  - CSI All Real Estate (ETF code: 512200.SH)
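Rolling training, as described in the summary above, means the model is refit on a moving window of past months and only ever predicts a month it has not seen. The sketch below generates such window indices; the function name `rolling_windows` and the window length are hypothetical, since the report summary does not state the training window it uses.

```python
def rolling_windows(n_periods, train_size, step=1):
    """Yield (train_indices, test_index) pairs for walk-forward training.

    Each model is fit on `train_size` consecutive periods and evaluated on the
    single period that immediately follows, so no future data leaks into the fit.
    """
    if train_size < 1 or step < 1:
        raise ValueError("train_size and step must be positive")
    start = 0
    while start + train_size < n_periods:
        train_indices = list(range(start, start + train_size))
        test_index = start + train_size
        yield train_indices, test_index
        start += step


if __name__ == "__main__":
    # Hypothetical setup: 6 months of data, 3-month training window.
    for train, test in rolling_windows(n_periods=6, train_size=3):
        print("fit on months", train, "-> predict month", test)
```

In a real pipeline each `train` slice would be used to fit the random forest and each `test` index would receive that month's style forecast.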

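"Rising Winds and Surging Clouds" Style Rotation Research Series (1): Style Rotation from a Micro Perspective: Finding the Leading Features of Style Switching

Soochow Securities · 2025-08-20 12:31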

Group 1

- The report constructs a style timing and rotation model from a micro perspective, using micro data to strengthen the strategy system[6][62]
- The model is based on four style factors (valuation, market capitalization, volatility, and momentum) and uses 80 micro indicators to build a scoring system[7][62]
- Over the backtesting period from January 1, 2014, to July 31, 2025, the model shows an annualized return of 20.90%, volatility of 26.12%, and a maximum drawdown of -40.57%[57][62]

Group 2

- Out-of-sample performance has been stable since the model was developed in March 2024, with a return of 55.36% for the full year 2024, outperforming the market benchmark by 35.72%[57][62]
- Style labels are constructed from specific broad indices to overcome the limitations of using the entire A-share market for style timing[18][20]
- A random forest model is selected to predict the direction of each style factor, which improves the performance of the timing strategy[23][25]

Group 3

- Before timing, the valuation factor shows an annualized return of 7.90% against the benchmark's 6.85%, with a maximum drawdown of -60.33%[30][32]
- After applying the timing model, the valuation factor's annualized return improves to 15.19%, significantly outperforming the benchmark[38][40]
- The momentum factor's annualized return rises from 10.11% before timing to 15.73% after timing[42][47]

Group 4

- The volatility factor's annualized return rises from 10.93% before timing to 15.73% after timing[48][53]
- The equal-weighted composite factor, derived from the four style factors, achieves an annualized return of 20.05% with a maximum drawdown of -41.97%[52][55]
- The scoring system for the style factors is based on historical prediction accuracy and further refines the composite factor's performance[56][59]
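The step from per-factor timing to a composite factor can be sketched as follows. This is an illustration under stated assumptions, not the report's method: a direction forecast is treated as +1 (hold the factor) or -1 (hold its opposite), "historical prediction accuracy" is read as the share of months where the forecast sign matched the realized sign, and all function names and sample numbers are hypothetical.

```python
def timed_returns(factor_returns, predicted_directions):
    """Apply a +1/-1 direction forecast to each period's factor return."""
    return [d * r for d, r in zip(predicted_directions, factor_returns)]


def hit_rate(factor_returns, predicted_directions):
    """Share of periods where the forecast direction matched the realized sign."""
    hits = sum(1 for d, r in zip(predicted_directions, factor_returns) if d * r > 0)
    return hits / len(factor_returns)


def composite(timed_by_style, weights=None):
    """Weighted average of timed style returns per period.

    With weights=None every style gets the same weight (equal-weighted
    composite). Weights are normalized to sum to one.
    """
    styles = list(timed_by_style)
    n = len(timed_by_style[styles[0]])
    if weights is None:
        weights = {s: 1.0 for s in styles}
    total = sum(weights[s] for s in styles)
    return [
        sum(weights[s] * timed_by_style[s][t] for s in styles) / total
        for t in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical factor returns and forecasts, not data from the report.
    realized = {"valuation": [0.02, -0.01, 0.03], "momentum": [-0.01, 0.02, 0.01]}
    forecast = {"valuation": [1, -1, -1], "momentum": [-1, 1, 1]}

    timed = {s: timed_returns(realized[s], forecast[s]) for s in realized}
    scores = {s: hit_rate(realized[s], forecast[s]) for s in realized}

    print("equal-weighted:", composite(timed))
    print("accuracy-weighted:", composite(timed, weights=scores))
```

In practice the accuracy scores would be estimated on earlier months than the ones they are used to weight; the example reuses the same months only to keep the sketch short.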

Minsheng Securities · 2025-08-06 08:45

Quantitative Models and Construction Methods

- **Model Name**: Random Forest Model
- **Model Construction Idea**: A random forest builds many decision trees and aggregates their outputs into one prediction. Selecting a random subset of features at each split reduces the correlation between trees, which improves generalization, and the ensemble improves accuracy and robustness[7][15][30].
- **Model Construction Process**:
  1. Use bootstrap sampling to draw multiple subsets from the original dataset[30].
  2. Build a decision tree on each subset, randomly selecting a portion of the features at each node as split candidates[31][32].
  3. Repeat the above steps until the specified number of decision trees has been generated[33].
- **Formula**:
  - Entropy of the dataset $D$, where $p_{i}$ is the proportion of samples in class $i$ among $m$ classes: $ H(D)=-\sum_{i=1}^{m}p_{i}\log_{2}(p_{i}) $
  - Conditional entropy given a feature $A$ (for example, "Market Sentiment"), where $D_{v}$ is the subset of $D$ on which $A$ takes the value $v$: $ H(D|A)=\sum_{v\in Values(A)}\frac{|D_{v}|}{|D|}H(D_{v}) $
  - Information gain: $ Gain(D,A)=H(D)-H(D|A) $[22][24][25]
- **Model Evaluation**: The Random Forest model generalizes well, handles high-dimensional and missing data effectively, and provides feature importance analysis. However, it has high computational complexity and is sensitive to parameter changes[12][66][70].

Model Backtesting Results

- **Random Forest Model**:
  - **Annualized Return**: 39.76%[11][65]
  - **Excess Return**: 40.01%[11][65]
  - **Sharpe Ratio**: 2.82[11][65]
  - **Year-to-Date Return (2025)**: 73.81%[11][65]
  - **Year-to-Date Excess Return (2025)**: 60.49%[11][65]
  - **Maximum Drawdown**: -17.14%[65]
  - **Excess Maximum Drawdown**: -6.59%[65]
  - **Information Ratio (IR)**: 4.78[65]

Quantitative Factors and Construction Methods

- **Factor Selection**: Factors with an absolute IC greater than 2.5% were selected for model fitting[8][44].
- **Factor List**:
  - **Positive IC Factors**: Turnover rate (6.44%), P/NAV (17.21%), Float market value (22.02%), Previous week's return (15.38%), etc.
  - **Negative IC Factors**: Previous closing price (-9.95%), Opening price (-10.81%), Valuation of CSI REITs (-8.41%), etc.[45][46]

Factor Backtesting Results

- **Selected Factors**:
  - **IC Range**: From -23.87% to 22.02%[45][46]

Model Construction Details

- **Parameter Sensitivity Analysis**:
  - **Number of Trees (n_estimators)**: Tested over the range 1 to 200. The value 100 was selected by minimizing RMSE[49][51].
  - **Feature Count (max_features)**: No restriction applied, because the number of features is small (27 factors)[52][53].
  - **Tree Depth (max_depth)**: Optimal depth determined as 15 through grid search[55][58].
  - **Minimum Samples per Leaf (min_samples_leaf)**: Optimal value determined as 15 through grid search[55][58].

Model Performance Metrics

- **In-Sample Results**:
  - **Mean Squared Error (MSE)**: 0.00044[59]
  - **Root Mean Squared Error (RMSE)**: 0.021[60]
  - **R² (Coefficient of Determination)**: 0.501[61]
- **Out-of-Sample Results**:
  - **R²**: 0.51983 (max_depth=15, min_samples_leaf=10)[56][58]

Model Evaluation

- **Advantages**:
  - Captures nonlinear relationships and complex interactions[68].
  - Robust against noise and overfitting[68].
  - Provides feature importance evaluation for better interpretability[68][69].
- **Disadvantages**:
  - High computational cost and complexity[70].
  - Requires extensive hyperparameter tuning[70].
  - Limited interpretability compared to linear models[70].
  - Potential overfitting with deep trees[70].
  - Challenges in handling imbalanced datasets[70].
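The entropy, conditional entropy, and information gain formulas in the Random Forest summary above translate directly into a few lines of code. The sketch below is illustrative only: the function names are hypothetical, and the "Market Sentiment" values and up/down labels are made-up toy data, not data from the report.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """H(D) = -sum_i p_i * log2(p_i), with p_i the share of class i in labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(feature_values, labels):
    """H(D|A) = sum_v |D_v|/|D| * H(D_v), grouping labels by feature value v."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    n = len(labels)
    return sum(len(group) / n * entropy(group) for group in groups.values())


def information_gain(feature_values, labels):
    """Gain(D, A) = H(D) - H(D|A)."""
    return entropy(labels) - conditional_entropy(feature_values, labels)


if __name__ == "__main__":
    # Toy data: four observations of next-period direction.
    direction = ["up", "up", "down", "down"]

    # A feature that separates the classes perfectly removes all uncertainty.
    informative_sentiment = ["high", "high", "low", "low"]
    print(information_gain(informative_sentiment, direction))

    # A feature unrelated to the label leaves the entropy unchanged.
    uninformative_sentiment = ["high", "low", "high", "low"]
    print(information_gain(uninformative_sentiment, direction))
```

A decision tree that splits on information gain would pick the first feature here, since it yields the larger gain. Regression trees, which match the RMSE-based tuning reported in this summary, typically use variance reduction as the split criterion instead.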

Monthly Tracking of Style Rotation from a Micro Perspective - 20250801

Soochow Securities · 2025-08-01 03:34

Quantitative Models and Construction Methods

- **Model Name**: Style Rotation Model
- **Model Construction Idea**: The model is built from micro-level stock characteristics, using valuation, market capitalization, volatility, and momentum factors to construct a style timing and scoring system. It combines micro-level indicators with machine learning to optimize style rotation strategies[4][9]
- **Model Construction Process**:
  1. Select 80 base factors as original features from the Soochow Securities (Dongwu) multi-factor system[9]
  2. Construct 640 micro-level features from these base factors[4][9]
  3. Use common indices as style stock pools, instead of dividing stocks by absolute proportion on each style factor, and use the resulting style returns as labels[4][9]
  4. Train a Random Forest model on a rolling basis to avoid overfitting risk, select features, and generate style recommendations[4][9]
  5. Build a framework that runs from style timing to scoring, and from scoring to actual investment decisions[9]
- **Model Evaluation**: Rolling training is used to avoid overfitting risk, and the model provides a complete framework for style rotation strategies[9]

Model Backtesting Results

- **Style Rotation Model**:
  - Annualized Return: 16.66%[10][11]
  - Annualized Volatility: 19.57%[10][11]
  - Information Ratio (IR): 0.85[10][11]
  - Monthly Win Rate: 56.31%[10][11]
  - Maximum Drawdown: -29.34%[11]
  - Excess Return (vs Benchmark): 11.40%[10][11]
  - Excess Volatility (vs Benchmark): 13.04%[10][11]
  - Excess IR (vs Benchmark): 0.87[10][11]
  - Excess Monthly Win Rate (vs Benchmark): 57.28%[10][11]
  - Excess Maximum Drawdown (vs Benchmark): -9.73%[11]

Quantitative Factors and Construction Methods

- **Factor Name**: Valuation, Market Capitalization, Volatility, Momentum
- **Factor Construction Idea**: These factors are derived from micro-level stock characteristics and are used to construct the style timing and scoring systems[4][9]
- **Factor Construction Process**:
  1. Extract micro-level features from the base factors[4][9]
  2. Use style returns as labels and the micro-level features as inputs for the machine learning models[4][9]
  3. Apply Random Forest models to select features and time each factor[4][9]
- **Factor Evaluation**: These factors are the foundation of the style rotation model and drive its timing and scoring[4][9]

Factor Backtesting Results

- [Figure: monthly returns of the valuation, market capitalization, volatility, and momentum factors, 2025/01-2025/05][13][20]
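The excess (versus benchmark) statistics in the backtest above can be derived from two monthly return series. The sketch below assumes that "excess" means the simple monthly difference between strategy and benchmark returns and that annualization is arithmetic (mean times 12); the report summary states neither, and the function names and sample numbers are hypothetical. The reported excess IR of 0.87 is at least consistent with 11.40% / 13.04%.

```python
import math


def excess_returns(strategy, benchmark):
    """Monthly strategy return minus monthly benchmark return (assumed definition)."""
    if len(strategy) != len(benchmark):
        raise ValueError("series must have the same length")
    return [s - b for s, b in zip(strategy, benchmark)]


def excess_summary(strategy, benchmark):
    """Annualized excess return, excess volatility, excess IR, and excess win rate."""
    excess = excess_returns(strategy, benchmark)
    n = len(excess)
    if n < 2:
        raise ValueError("need at least two months")
    mean = sum(excess) / n
    variance = sum((x - mean) ** 2 for x in excess) / (n - 1)

    annual_excess = mean * 12.0                 # arithmetic annualization (assumption)
    annual_vol = math.sqrt(variance * 12.0)     # volatility of the excess series
    return {
        "excess_return": annual_excess,
        "excess_volatility": annual_vol,
        "excess_ir": annual_excess / annual_vol if annual_vol > 0 else float("nan"),
        # Share of months in which the strategy beat the benchmark.
        "excess_win_rate": sum(1 for x in excess if x > 0) / n,
    }


if __name__ == "__main__":
    # Hypothetical monthly returns, not data from the report.
    strategy = [0.03, -0.02, 0.01, 0.04]
    benchmark = [0.01, 0.01, 0.02, 0.01]
    print(excess_summary(strategy, benchmark))
```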