Machine Learning Series, Part 1: Improving Barra Machine Learning Factors with mHC

Quantitative Models and Construction Methods

Model Name: mHC-MLP

- Model Construction Idea: The mHC-MLP model introduces manifold-constrained hyper-connections (mHC) into the traditional MLP framework to address issues such as low signal-to-noise ratio, non-stationarity, and extreme tail behavior in financial data. It achieves this by incorporating multi-stream residual channels, gated fan-in/fan-out mappings, and doubly stochastic manifold projections (via Sinkhorn-Knopp) to enhance numerical stability and extrapolation resistance[1][16][22].
- Model Construction Process:
  1. Multi-Stream Residual Channels: The model expands the single residual channel in traditional ResNet to multiple parallel sub-streams, allowing independent feature representations and dynamic routing between streams[19][20].
  2. Manifold Constraints:
     - Residual mixing matrices are constrained to the Birkhoff polytope (doubly stochastic matrices), ensuring non-negativity, row sums of 1, and column sums of 1. This is achieved using the Sinkhorn-Knopp algorithm during training[22][23][54].
     - Fan-in and fan-out mappings are constrained to non-negative values using sigmoid functions, ensuring that output features remain within the convex hull of input features[24].
  3. Dynamic Routing Mechanism: The model uses a combination of linear mixing (via residual matrices) and non-linear transformations (via MLP blocks) to balance feature interaction and noise suppression[49][50][51].
  4. Deep Stacking: The mHC-MLP extends the network depth to six layers, leveraging the numerical stability provided by manifold constraints to capture higher-order interactions[56][57].
  5. Initialization and Regularization: Parameters are initialized with minimal values (e.g., alpha = 0.01) to ensure stable gradient flow during early training stages. Regularization is achieved through manifold constraints rather than traditional dropout or L2 regularization[25][55].
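The manifold constraints and multi-stream mixing in the construction process above can be illustrated with a minimal NumPy sketch. It is not the report's implementation: the function names (`sinkhorn_knopp`, `mhc_layer`), the exponential parameterization of the mixing matrix, the per-stream scalar gates, the iteration count, and the `tanh` block standing in for the MLP block are all assumptions chosen for illustration.

```python
import numpy as np

def sinkhorn_knopp(logits, n_iters=50, eps=1e-12):
    """Map an unconstrained square matrix to an approximately doubly
    stochastic one by alternating row and column normalization."""
    # Exponentiate so every entry is strictly positive (assumed parameterization).
    m = np.exp(logits - logits.max())
    for _ in range(n_iters):
        m = m / (m.sum(axis=1, keepdims=True) + eps)  # normalize rows
        m = m / (m.sum(axis=0, keepdims=True) + eps)  # normalize columns
    return m

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mhc_layer(streams, mix_logits, fan_in_logits, fan_out_logits, block):
    """One hypothetical mHC-style residual update.

    streams: (n_streams, d) array of parallel residual sub-streams.
    block:   callable mapping a (d,) vector to a (d,) vector.
    """
    h_res = sinkhorn_knopp(mix_logits)   # (n, n) mixing matrix, doubly stochastic
    h_in = sigmoid(fan_in_logits)        # (n,) non-negative fan-in gates
    h_out = sigmoid(fan_out_logits)      # (n,) non-negative fan-out gates
    x = h_in @ streams                   # gated combination of streams fed to the block
    y = block(x)                         # non-linear transformation
    # Linear mixing of the residual streams plus the gated block output.
    return h_res @ streams + np.outer(h_out, y)

# Toy example with 3 streams of width 8 (all values are synthetic).
rng = np.random.default_rng(0)
streams = rng.normal(size=(3, 8))
mix_logits = np.array([[0.0, 0.5, -0.5],
                       [0.2, 0.0, 0.3],
                       [-0.1, 0.4, 0.0]])
W = 0.1 * rng.normal(size=(8, 8))

P = sinkhorn_knopp(mix_logits)
out = mhc_layer(streams, mix_logits, np.zeros(3), np.zeros(3),
                block=lambda v: np.tanh(v @ W))
print("row sums:", P.sum(axis=1))
print("column sums:", P.sum(axis=0))
print("output shape:", out.shape)
```

Because every row of the projected matrix is non-negative and sums to 1, each mixed stream is a convex combination of the input streams, and the unit column sums mean the residual path leaves the sum across streams unchanged.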
- Model Evaluation: The mHC-MLP model demonstrates improved numerical stability, reduced overfitting, and enhanced robustness against noise. However, it may underperform in short-term, high-volatility scenarios due to its conservative nature[2][75][86].

---

Model Backtesting Results

mHC-MLP Model
- Cumulative Return: 49% (compared to 56% for the unconstrained MLP model)[75]
- t-Statistic: Not explicitly mentioned for mHC-MLP
- IC_IR: Not explicitly mentioned for mHC-MLP
- Turnover: Lower than the unconstrained MLP model, indicating better stability[2][75]
- Maximum Drawdown: Lower than the unconstrained MLP model, reflecting reduced risk exposure[2][75]

---

Quantitative Factors and Construction Methods

Factor Name: Barra MLP Factor

- Factor Construction Idea: The Barra MLP factor leverages neural networks to capture non-linear interactions and complex relationships between Barra style factors and residual stock returns, overcoming the limitations of traditional linear factor models[30][31].
- Factor Construction Process:
  1. Baseline Risk Model: A long-term risk model is constructed using the Barra CNE6 framework, incorporating one country factor, 31 industry factors, and 15 style factors (e.g., size, beta, momentum, value)[36][37][38].
  2. Residual Return Extraction: Stock returns are decomposed into common factor contributions and residual returns via cross-sectional regression. The residual returns serve as the prediction target for the MLP model[40].
  3. Rolling Training: The MLP model is trained using rolling windows of 24, 36, and 72 months to balance bias and variance. Features include the 15 style factors, and the target is the next-period residual return[41].
  4. Multi-Period Signal Synthesis: Predictions from the three training windows are standardized (Z-score) and combined using equal weighting or IC-based weighting to generate a composite factor[42][43].
  5. Orthogonalization: The composite factor is regressed against the 15 style factors to remove linear correlations, ensuring it provides incremental information[44].
  6. Pure Factor Return Calculation: The orthogonalized factor is incorporated into an enhanced Barra risk model, and its pure factor return is estimated via cross-sectional regression[45].
- Factor Evaluation: The Barra MLP factor effectively captures non-linear alpha signals and demonstrates significant cumulative returns and IC_IR values, validating its utility in quantitative strategies[46].

---

Factor Backtesting Results

Barra MLP Factor
- Cumulative Return: Over 15%[46]
- t-Statistic: 2.8[46]
- IC_IR: 0.45[46]
- Turnover: Not explicitly mentioned
- Maximum Drawdown: Not explicitly mentioned

---

Composite Model: mHC-Enhanced Barra MLP Factor

- Model Construction Idea: The mHC-enhanced Barra MLP factor integrates the mHC architecture into the Barra MLP framework to improve robustness and stability while retaining the ability to capture non-linear interactions[48].
- Model Construction Process: The MLP core in the Barra MLP factor is replaced with the mHC-MLP architecture, maintaining the same input features, target variables, and training framework. This modification introduces manifold constraints and dynamic routing to enhance numerical stability and reduce overfitting[48][49][50].
- Model Evaluation: While the mHC-enhanced factor demonstrates superior stability and robustness, it may lag in short-term, high-volatility markets due to its conservative design[75][86].

---

Composite Model Backtesting Results

mHC-Enhanced Barra MLP Factor
- Cumulative Return: Not explicitly mentioned
- t-Statistic: Not explicitly mentioned
- IC_IR: Not explicitly mentioned
- Turnover: Lower than the original Barra MLP factor[2][75]
- Maximum Drawdown: Lower than the original Barra MLP factor[2][75]
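
The multi-period signal synthesis and orthogonalization steps of the Barra MLP factor described above can be sketched for a single cross-section as follows. This is a minimal illustration on synthetic data, not the report's code: the helper names (`zscore`, `composite_signal`, `orthogonalize`), the use of plain unweighted OLS with an intercept, and the random inputs are assumptions. The optional `weights` argument is where IC-based weights would be passed if they were available.

```python
import numpy as np

def zscore(x):
    """Cross-sectional standardization to zero mean and unit variance."""
    return (x - x.mean()) / x.std()

def composite_signal(preds, weights=None):
    """Combine per-window predictions for one cross-section.

    preds:   dict mapping training-window length (months) to an
             (n_stocks,) array of model predictions.
    weights: optional array of combination weights (e.g. IC-based);
             equal weighting is used when omitted.
    """
    z = np.column_stack([zscore(p) for p in preds.values()])
    if weights is None:
        weights = np.full(z.shape[1], 1.0 / z.shape[1])
    return z @ weights

def orthogonalize(factor, styles):
    """Residual of a cross-sectional OLS of the factor on the style
    exposures (with intercept), removing linear overlap with them."""
    X = np.column_stack([np.ones(len(factor)), styles])
    beta, *_ = np.linalg.lstsq(X, factor, rcond=None)
    return factor - X @ beta

# Synthetic cross-section: 300 stocks, 15 style exposures.
rng = np.random.default_rng(42)
n_stocks, n_styles = 300, 15
styles = rng.normal(size=(n_stocks, n_styles))

# Stand-in predictions from the 24-, 36- and 72-month windows.
preds = {
    window: styles @ (0.1 * rng.normal(size=n_styles)) + rng.normal(size=n_stocks)
    for window in (24, 36, 72)
}

composite = composite_signal(preds)
ortho = orthogonalize(composite, styles)
print("max |style' * residual|:", np.abs(styles.T @ ortho).max())
```

After orthogonalization the factor has zero sample covariance with each style exposure, which is what lets it enter the enhanced risk model as incremental information.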