Temporal Convolutional Network (TCN)
Machine Learning Applications Series: A Decoupled Temporal Contrastive Stock Selection Model Driven by Reinforcement Learning
Southwest Securities · 2025-12-25 11:40

Quantitative Models and Construction

Model Name: DTLC_RL (Decoupled Temporal Contrastive Learning with Reinforcement Learning)

- **Model Construction Idea**: The model aims to combine the nonlinear predictive power of deep learning with interpretability. It does so by decoupling the feature spaces, strengthening representations through contrastive learning, enforcing independence via orthogonal constraints, and dynamically fusing the spaces with reinforcement learning[2][11][12]
- **Model Construction Process**:
  - **Feature Space Decoupling**: Three orthogonal latent spaces are constructed to capture market systemic risk (β space), stock-specific signals (α space), and fundamental information (θ space). Each space has a specialized encoder: a TCN for the β space, a Transformer for the α space, and a gated residual MLP for the θ space[11][12][92]
  - **Contrastive Learning**: Applied within each space to improve robustness, with positive and negative sample pairs constructed from return similarity. Minimizing the InfoNCE loss maximizes the similarity of positive pairs while minimizing that of negative pairs:
    $$L_{\mathrm{InfoNCE}}=-\mathbb{E}\left[\log\frac{\exp\left(f(x)^{\top}f(x^{+})/\tau\right)}{\exp\left(f(x)^{\top}f(x^{+})/\tau\right)+\sum_{i=1}^{N-1}\exp\left(f(x)^{\top}f(x_{i}^{-})/\tau\right)}\right]$$
    where \(f(x)\) is the feature representation, \(x^+\) is the positive sample, \(x_i^-\) are the \(N-1\) negative samples, and \(\tau\) is the temperature parameter[55][56]
  - **Orthogonal Constraints**: A loss term is added to ensure the outputs of the three spaces are statistically independent, reducing multicollinearity and enhancing interpretability[12][104]
  - **Reinforcement Learning Fusion**: A PPO-based reinforcement learning mechanism dynamically adjusts the weights of the three spaces based on market conditions.
    The reward function includes components for return correlation, weight stability, and weight diversification:
    $$r_{t}=R_{t}^{\mathrm{IC}}\big(\hat{y}_{t},y_{t}\big)+\lambda_{s}R_{t}^{\mathrm{stable}}+\lambda_{d}R_{t}^{\mathrm{div}}$$
    The PPO optimization process includes GAE advantage estimation and a clipped policy loss:
    $$L^{\mathrm{CLIP}}=\mathbb{E}\left[\min\left(r\hat{A},\ \mathrm{clip}(r,1-\varepsilon,1+\varepsilon)\hat{A}\right)\right]$$
    where \(r\) here denotes the probability ratio between the new and old policies (distinct from the reward \(r_t\)), \(\hat{A}\) is the advantage estimate, and \(\varepsilon\) is the clipping range[58][120][121]
- **Model Evaluation**: The DTLC_RL model demonstrates strong predictive power and interpretability, and adapts dynamically to market conditions[2][12][122]

Model Name: DTLC_Linear

- **Model Construction Idea**: A baseline model for comparison that uses a linear layer to fuse the three feature spaces[98][100]
- **Model Construction Process**:
  - The encoded information from the three spaces is concatenated and passed through a linear layer with a Softmax activation to generate fusion weights. The model is trained with a multi-task loss function that includes IC maximization, contrastive learning loss, and orthogonal constraints[98][104]
- **Model Evaluation**: Provides a benchmark for evaluating the contribution of reinforcement learning in DTLC_RL[98][103]

Model Name: DTLC_Equal

- **Model Construction Idea**: A simpler baseline model that weights the three feature spaces equally, without dynamic adjustment[98]
- **Model Construction Process**: The outputs of the three spaces are directly averaged to generate predictions[98]
- **Model Evaluation**: Serves as a control group to assess the benefits of dynamic weighting in DTLC_RL[98][103]

---

Model Backtesting Results

DTLC_RL

- **IC**: 0.1250[123]
- **ICIR**: 4.38[123]
- **Top 10% Portfolio Annualized Return**: 34.77%[123]
- **Annualized Volatility**: 25.41%[123]
- **IR**: 1.37[123]
- **Maximum Drawdown**: 40.65%[123]
- **Monthly Turnover**: 0.71X[123]

DTLC_Linear

- **IC**: 0.1239[105]
- **ICIR**: 4.25[105]
- **Top 10% Portfolio Annualized Return**: 32.95%[105]
- **Annualized Volatility**: 24.39%[105]
- **IR**: 1.35[105]
- **Maximum Drawdown**: 35.94%[105]
- **Monthly Turnover**: 0.76X[105]

DTLC_Equal

- **IC**: 0.1202[105]
- **ICIR**: 4.06[105]
- **Top 10% Portfolio Annualized Return**: 32.46%[105]
- **Annualized Volatility**: 25.29%[105]
- **IR**: 1.28[105]
- **Maximum Drawdown**: 40.65%[105]
- **Monthly Turnover**: 0.71X[105]

---

Quantitative Factors and Construction

Factor Name: Beta_TCN

- **Factor Construction Idea**: Captures market systemic risk by quantifying stock sensitivity to common risk factors like macroeconomic fluctuations and market sentiment[67]
- **Factor Construction Process**:
  - Five market-related features are selected, including beta to market returns, volatility sensitivity, liquidity beta, size exposure, and market sentiment sensitivity[72]
  - A TCN encoder processes 60-day time-series data, using dilated causal convolutions to capture short- and medium-term trends. The output is a 32-dimensional vector representing systemic risk features[68]
- **Factor Evaluation**: Demonstrates moderate stock selection ability and effectively captures market-related information[73]

Factor Name: Alpha_Transformer

- **Factor Construction Idea**: Extracts stock-specific alpha signals from price-volume time-series data[76]
- **Factor Construction Process**:
  - Thirteen price-volume features are encoded using a multi-scale Transformer model, with separate layers for short-, medium-, and long-term information.
    Outputs are fused using a gated mechanism and passed through a fully connected layer for return prediction[77][78]
- **Factor Evaluation**: Exhibits strong predictive power and stock selection ability, with relatively low correlation to market benchmarks[81][82]

Factor Name: Theta-ResMLP

- **Factor Construction Idea**: Focuses on fundamental information to assess financial safety margins and risk resistance[88]
- **Factor Construction Process**:
  - Eight core financial indicators, including PE, PB, ROE, and dividend yield, are encoded using a gated residual MLP. The architecture includes input projection, gated residual blocks, and a final output layer[92]
- **Factor Evaluation**: Provides stable stock selection performance with lower turnover and drawdown compared to other spaces[95][96]

---

Factor Backtesting Results

Beta_TCN

- **IC**: 0.0969[73]
- **ICIR**: 3.73[73]
- **Top 10% Portfolio Annualized Return**: 27.73%[73]
- **Annualized Volatility**: 27.19%[73]
- **IR**: 1.02[73]
- **Maximum Drawdown**: 45.80%[73]
- **Monthly Turnover**: 0.79X[73]

Alpha_Transformer

- **IC**: 0.1137[81]
- **ICIR**: 4.19[81]
- **Top 10% Portfolio Annualized Return**: 32.66%[81]
- **Annualized Volatility**: 23.04%[81]
- **IR**: 1.42[81]
- **Maximum Drawdown**: 27.59%[81]
- **Monthly Turnover**: 0.83X[81]

Theta-ResMLP

- **IC**: 0.0485[95]
- **ICIR**: 1.87[95]
- **Top 10% Portfolio Annualized Return**: 23.88%[95]
- **Annualized Volatility**: 23.96%[95]
- **IR**: 0.99[95]
- **Maximum Drawdown**: 37.41%[95]
- **Monthly Turnover**: 0.41X[95]
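
The InfoNCE loss described under Contrastive Learning can be sketched for a single anchor as follows. This is a minimal illustration in plain Python, not the report's implementation: the function names `dot` and `info_nce`, the list-of-floats embeddings, and the default temperature are all assumptions made for the example.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE loss for one anchor embedding f(x).

    anchor, positive: embeddings f(x) and f(x+) as lists of floats.
    negatives: list of the N-1 negative embeddings f(x_i^-).
    tau: temperature (the default here is an illustrative assumption).
    """
    pos_logit = dot(anchor, positive) / tau
    logits = [pos_logit] + [dot(anchor, n) / tau for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
    # -log( exp(pos) / sum(exp(all)) )
    return log_denom - pos_logit

# A positive aligned with the anchor and negatives pointing elsewhere
# give a loss close to zero.
easy = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
```

When every candidate is equally similar to the anchor, the loss equals log N, which is a convenient sanity check for any implementation.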
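
The GAE advantage estimate and clipped policy objective mentioned under Reinforcement Learning Fusion can be sketched as below. This follows the standard PPO formulation rather than anything specific to the report; the function names and the default values of `gamma`, `lam`, and `eps` are assumptions for illustration.

```python
def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one trajectory.

    rewards: list of T rewards r_t.
    values: list of T+1 value estimates (the last one is the bootstrap value).
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages

def clip(x, lo, hi):
    """Clamp x to the interval [lo, hi]."""
    return max(lo, min(hi, x))

def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A) over samples.

    ratios: probability ratios new_policy / old_policy.
    The objective is maximized; a training loss would be its negative.
    """
    terms = [min(r * a, clip(r, 1 - eps, 1 + eps) * a)
             for r, a in zip(ratios, advantages)]
    return sum(terms) / len(terms)
```

The clipping removes the incentive to move the ratio beyond 1 ± ε in the direction that would improve the objective, which is what keeps each policy update small.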
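
The weighted fusion of the three spaces and the composite reward \(r_t\) can be illustrated with the sketch below. The summary does not define the stability and diversification terms, so the forms used here (negative L1 change in weights, and entropy of the weights) are assumptions chosen only to make the example concrete; the function names and the λ defaults are likewise hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax, as used for fusion weights in DTLC_Linear."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(space_preds, weights):
    """Weighted sum of per-space scores.

    space_preds[k][i] is the score of stock i from space k (beta, alpha, theta).
    Equal weights of 1/3 reproduce the plain average used by DTLC_Equal.
    """
    n = len(space_preds[0])
    return [sum(w * preds[i] for w, preds in zip(weights, space_preds))
            for i in range(n)]

def reward(ic, weights, prev_weights, lam_s=0.1, lam_d=0.1):
    """r_t = IC + lam_s * R_stable + lam_d * R_div.

    ASSUMPTION: R_stable is the negative L1 change in weights.
    ASSUMPTION: R_div is the entropy of the weight vector.
    Neither form is given in the source summary.
    """
    stable = -sum(abs(w - p) for w, p in zip(weights, prev_weights))
    div = -sum(w * math.log(w) for w in weights if w > 0)
    return ic + lam_s * stable + lam_d * div

weights = softmax([0.0, 0.0, 0.0])  # equal logits give equal weights
fused = fuse([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], weights)
```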
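
The IC and ICIR figures in the backtesting results are not defined in this summary. A common reading, assumed here, is that IC is the cross-sectional correlation between predicted and realized returns in each period, and ICIR is the mean of the IC series divided by its standard deviation. The sketch uses Pearson correlation and no annualization; the report may use rank correlation or a scaling factor, so its numbers need not match this convention.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(vx * vy)

def icir(ic_series):
    """Mean of per-period ICs over their sample standard deviation."""
    n = len(ic_series)
    mean = sum(ic_series) / n
    var = sum((v - mean) ** 2 for v in ic_series) / (n - 1)
    return mean / math.sqrt(var)

# One period: predicted scores vs. realized returns for four stocks
# (made-up numbers for illustration only).
ic_one_period = pearson([0.2, 0.1, -0.1, 0.4], [0.03, 0.01, -0.02, 0.05])
```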
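
The dilated causal convolution used by the TCN encoder in Beta_TCN can be illustrated on a single channel as follows. The summary only states the 60-day window and the use of dilated causal convolutions; the kernel size, the dilation schedule, and the one-convolution-per-level stacking assumed in `receptive_field` are illustrative choices, not details from the report.

```python
def causal_dilated_conv1d(x, kernel, dilation=1):
    """y[t] = sum_j kernel[j] * x[t - j * dilation].

    Positions before the start of the series are treated as zeros, so
    y[t] never depends on x[s] for s > t (causality).
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t - j * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y

def receptive_field(kernel_size, dilations):
    """Receptive field of a stack with ONE causal convolution per dilation level.

    ASSUMPTION: TCN residual blocks that use two convolutions per level
    have a larger receptive field than this formula gives.
    """
    return 1 + (kernel_size - 1) * sum(dilations)

out = causal_dilated_conv1d([1.0, 2.0, 3.0, 4.0], [1.0, 1.0], dilation=2)
# With kernel size 3, doubling dilations up to 16 span 63 steps,
# enough to cover a 60-day window.
span = receptive_field(3, [1, 2, 4, 8, 16])
```

Doubling the dilation at each level grows the receptive field exponentially with depth, which is why a shallow stack can cover a window of this length.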