Quantile Random Forest
ETF Strategy Series: A Technology ETF Rotation Strategy Based on QRF Distribution Forecasts
Yin He Zheng Quan· 2025-12-22 09:36
Quantitative Models and Construction Methods

1. Model Name: Quantile Regression Forest (QRF)
   - **Model Construction Idea**: QRF extends Random Forest to estimate the full conditional distribution of the response variable. It predicts the quantiles of that distribution rather than only the conditional mean, which makes it suitable for short-term risk control and tail-risk identification in volatile markets such as technology indices [28][40][41]
   - **Model Construction Process**:
     1. Random Forest builds a collection of decision trees, each on a subset of the data. Each tree predicts the conditional mean of the response variable [36][37]
     2. QRF extends this by retaining all observed values in each node, which enables estimation of conditional quantiles. The conditional distribution is expressed as: $$F(y|X=x)=P(Y\leq y|X=x)=E(1_{\{Y\leq y\}}|X=x)$$ [40]
     3. The conditional distribution is estimated as a weighted sum of indicators over the \(n\) training observations, where \(w_i(x)\) is the weight the forest assigns to observation \(i\) at the query point \(x\); quantiles are then read off this estimate: $$\hat{F}(y|X=x)=\sum_{i=1}^{n}w_{i}(x)\,1_{\{Y_{i}\leq y\}}$$ [41]
     4. In practice, the procedure selects a dense grid of quantile points, grows the trees, computes the weights, and approximates the distribution by interpolating between the predicted quantiles [43]
   - **Model Evaluation**: QRF captures the short-term distribution of asset returns, tail risks in particular, and its predictions are used for risk control and asset selection [28][40][41]

2. Model Name: Fama-French Five-Factor Model
   - **Model Construction Idea**: This model evaluates and attributes the returns of risky assets along five systematic risk dimensions: market, size, value, profitability, and investment [44][45]
   - **Model Construction Process**:
     1. The model extends the CAPM formula: $$E(R_i) = R_f + \beta_1(R_m - R_f) + \beta_2SMB + \beta_3HML + \beta_4RMW + \beta_5CMA$$ [45]
     2. Factor definitions:
        - **MKT**: Market factor (the \(R_m - R_f\) term), calculated as the weighted average return of all stocks minus the risk-free rate [49]
        - **SMB**: Size factor, the return difference between small-cap and large-cap stocks [49]
        - **HML**: Value factor, the return difference between high book-to-market and low book-to-market stocks [49]
        - **RMW**: Profitability factor, the return difference between high-profitability and low-profitability stocks [49]
        - **CMA**: Investment factor, the return difference between firms that invest conservatively and firms that invest aggressively [49]
     3. Weekly factor data are used as the input variables for QRF to predict the weekly return quantiles of technology indices [52]
   - **Model Evaluation**: The model provides a comprehensive framework for explaining asset returns and supplies the inputs for the QRF predictions [44][45][49]

---

Model Backtesting Results

1. QRF Model
   - **Annualized Return**: 24.19% (2020-2025), 87.17% (2025) [86]
   - **Sharpe Ratio**: 1.16 (2020-2025), 2.91 (2025) [86]
   - **Calmar Ratio**: 0.91 (2020-2025), 8.73 (2025) [86]
   - **Maximum Drawdown**: -26.70% (2020-2025), -9.99% (2025) [86]
   - **Cumulative Return**: 245.45% (2020-2025), with an excess return of 156.10% over the Sci-Tech Innovation 50 Index [86]

---

Quantitative Factors and Construction Methods

1. Factor Name: Quantile-Based Return Metrics
   - **Factor Construction Idea**: Quantile-based metrics (e.g., the 50% and 75% quantiles) represent the central tendency and the upper tail of the predicted return distribution [61]
   - **Factor Construction Process**:
     1. Use QRF to predict the 50% and 75% quantiles of the return distribution [61]
     2. Calculate the average return as: $$E(X) = \int_{-\infty}^{+\infty} x\,f(x)\,dx$$ where \(f(x)\) is the probability density function of the predicted distribution [61]
   - **Factor Evaluation**: The Spearman IC values are 0.0642 (50% quantile), 0.0582 (75% quantile), and 0.0719 (average return), indicating predictive effectiveness [62]

2. Factor Name: Risk-Adjusted Return Metrics
   - **Factor Construction Idea**: These metrics measure return per unit of risk, using the Sharpe, Sortino, and Omega ratios [63]
   - **Factor Construction Process**:
     1. **Sharpe Ratio**: $$Sharpe = \frac{E(R) - R_f}{\sigma}$$
     2. **Sortino Ratio**: $$Sortino = \frac{E(R) - R_f}{\sigma_{down}}$$
     3. **Omega Ratio**: $$Omega = \frac{E(R_{up}) \cdot P_{up}}{E(R_{down}) \cdot P_{down}}$$ where \(P_{up}\) and \(P_{down}\) are the probabilities of positive and negative returns, respectively [63]
   - **Factor Evaluation**: The Spearman IC values are 0.0616 (Sharpe), 0.0581 (Sortino), and 0.0602 (Omega), confirming their effectiveness [64]

3. Factor Name: Win Rate
   - **Factor Construction Idea**: Win rate measures the probability of achieving a positive return [64]
   - **Factor Construction Process**: $$WinRate = \frac{\text{Number of positive return samples}}{\text{Total number of samples}}$$ [64]
   - **Factor Evaluation**: The Spearman IC value is 0.0586, indicating predictive validity [64]

---

Factor Backtesting Results

1. Quantile-Based Return Metrics
   - **50% Quantile IC**: 0.0642 [62]
   - **75% Quantile IC**: 0.0582 [62]
   - **Average Return IC**: 0.0719 [62]

2. Risk-Adjusted Return Metrics
   - **Sharpe Ratio IC**: 0.0616 [64]
   - **Sortino Ratio IC**: 0.0581 [64]
   - **Omega Ratio IC**: 0.0602 [64]

3. Win Rate
   - **Win Rate IC**: 0.0586 [64]
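
The weighting step in the QRF construction process (the weighted sum of indicators over training observations) can be illustrated with a small NumPy sketch. It is not the report's implementation. It assumes the standard QRF weighting, in which a training observation's weight in each tree is 1 / (leaf size) if it shares the query point's leaf and 0 otherwise, averaged over trees. The leaf ids below are hand-written toy values; in practice they might come from a fitted forest, for example scikit-learn's `RandomForestRegressor.apply`. The function names `qrf_weights` and `weighted_quantile` are hypothetical.

```python
import numpy as np

def qrf_weights(train_leaves, query_leaves):
    """Weights w_i(x): per tree, 1/(leaf size) for training points that share
    the query's leaf, 0 otherwise; then averaged over trees (assumed scheme)."""
    train_leaves = np.asarray(train_leaves)            # shape (n_train, n_trees)
    query_leaves = np.asarray(query_leaves)            # shape (n_trees,)
    same_leaf = train_leaves == query_leaves[None, :]  # boolean co-membership
    leaf_sizes = same_leaf.sum(axis=0)                 # training points per query leaf
    return (same_leaf / leaf_sizes).mean(axis=1)       # non-negative, sums to 1

def weighted_quantile(y, weights, alpha):
    """Smallest observed y whose weighted empirical CDF reaches alpha."""
    y = np.asarray(y, dtype=float)
    order = np.argsort(y)
    cdf = np.cumsum(np.asarray(weights)[order])        # F_hat at each sorted y
    idx = np.searchsorted(cdf, alpha, side="left")
    return y[order][min(idx, len(y) - 1)]

# Toy data: five weekly returns and their leaf ids in two trees (illustrative only).
y_train = np.array([-0.04, -0.01, 0.00, 0.02, 0.05])
train_leaves = np.array([[0, 0], [0, 0], [1, 0], [1, 1], [1, 1]])
query_leaves = np.array([1, 1])                        # leaves reached by the query x

w = qrf_weights(train_leaves, query_leaves)
q50 = weighted_quantile(y_train, w, 0.50)
q75 = weighted_quantile(y_train, w, 0.75)
print(w, q50, q75)
```

Only observations that land in the same leaves as the query receive weight, so the predicted quantiles are always observed training responses. Interpolating between neighbouring quantile points, as the source describes, would smooth this step function.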
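
The distribution-based factors (average return, Sharpe, Sortino, Omega, win rate) can be computed from a predicted quantile grid roughly as follows. This is a sketch under several assumptions that the source does not state: the quantile levels are evenly spaced, so the predicted quantiles are treated as equally likely draws; \(R_f\) defaults to 0; downside deviation is the root mean square of returns below \(R_f\); and the downside mean in the Omega ratio is taken in absolute value so that the ratio is positive. The function name `distribution_factors` and the toy quantile values are hypothetical.

```python
import numpy as np

def distribution_factors(quantile_values, risk_free=0.0):
    """Factors from predicted quantiles at evenly spaced levels, treated as
    equally likely draws from the predicted distribution (an assumption)."""
    q = np.asarray(quantile_values, dtype=float)
    mean = q.mean()                                   # approximates E(X)
    sigma = q.std()
    shortfall = np.minimum(q - risk_free, 0.0)        # zero for returns above R_f
    sigma_down = np.sqrt(np.mean(shortfall ** 2))     # assumed downside deviation
    up, down = q[q > 0], q[q < 0]
    p_up, p_down = up.size / q.size, down.size / q.size
    if up.size and down.size:
        # abs() on the downside mean is an assumption, not taken from the source
        omega = (up.mean() * p_up) / (abs(down.mean()) * p_down)
    else:
        omega = float("nan")
    return {
        "mean": mean,
        "sharpe": (mean - risk_free) / sigma if sigma > 0 else float("nan"),
        "sortino": (mean - risk_free) / sigma_down if sigma_down > 0 else float("nan"),
        "omega": omega,
        "win_rate": p_up,
    }

# Illustrative predicted weekly-return quantiles at levels 5%, 15%, ..., 95%.
predicted = [-0.04, -0.03, -0.02, -0.01, 0.00, 0.01, 0.02, 0.03, 0.05, 0.09]
factors = distribution_factors(predicted)
print(factors)
```

A finer quantile grid makes the equal-weight average a closer approximation to the integral that defines the average return.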
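
The Spearman IC values reported for each factor are rank correlations between factor values and subsequent realized returns. The sketch below computes one such rank correlation for a single cross-section. The source does not say how ICs are aggregated across weeks or assets, so a single cross-section is assumed here; the data are made up, and the helper names `ordinal_ranks` and `spearman_ic` are hypothetical. Ties are not handled.

```python
import numpy as np

def ordinal_ranks(values):
    """Ranks 0..n-1 by ascending value (assumes no ties)."""
    values = np.asarray(values, dtype=float)
    ranks = np.empty(len(values), dtype=float)
    ranks[np.argsort(values)] = np.arange(len(values))
    return ranks

def spearman_ic(factor_values, realized_returns):
    """Spearman IC: Pearson correlation of the two rank vectors."""
    return np.corrcoef(ordinal_ranks(factor_values),
                       ordinal_ranks(realized_returns))[0, 1]

# One illustrative cross-section: factor scores for four ETFs and their
# next-period returns (toy numbers, not from the report).
factor_scores = [0.8, 0.1, 0.5, 0.3]
next_returns = [0.02, -0.01, 0.03, 0.00]
ic = spearman_ic(factor_scores, next_returns)
print(ic)
```

Because only ranks enter the calculation, the IC is unaffected by the scale of the factor, which makes values for quantiles, ratios, and win rate directly comparable.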