Stock Return Prediction Model

DeepTiming: Timing Driven by Intraday Information and Similarity Learning
Minsheng Securities · 2025-07-31 09:02
Quantitative Models and Construction Methods

1. Model Name: Deep Learning Stock Return Prediction Model
   - **Model Construction Idea**: A deep learning framework tailored to the current market environment. It combines daily and minute-frequency inputs to predict stock returns, then generates trading signals from rolling historical thresholds[1][10][22]
   - **Model Construction Process**:
     - **Input Layer**: Combines 51 daily technical/sentiment features, 7 basic daily price-volume indicators, 10 enhanced style factors, and 52 minute-frequency features aggregated to daily frequency[22]
     - **Training Layer**: Uses meta-learning to adapt dynamically to new market data and avoid overfitting to historical data[14]
     - **Output Layer**: Uses LinSAT neural networks to impose constraints on the output, for example to control style and industry exposures[18]
     - **Loss Function**: Multi-period mean squared error (MSE), used to stabilize predictions for timing strategies[22]
     - **Formula**: The multi-period return prediction \( y \) has shape \( (n, 1) \), where \( n \) is the number of stocks[22]
   - **Model Evaluation**: Adapts robustly to market changes and controls exposures, with significant predictive power for timing strategies[10][22]

2. Model Name: SimStock
   - **Model Construction Idea**: SimStock uses self-supervised learning to predict stock similarities, incorporating both static and dynamic correlations. It uses contrastive learning to capture time-series information dynamically, going beyond traditional industry and style classifications[2][47][48]
   - **Model Construction Process**:
     - **Input**: Past 40 days of price-volume data, Barra style factors, and capital flow indicators[52]
     - **Positive and Negative Sample Construction**: Positive samples are generated as \( X_{pos} = X + (1-\alpha)X_{rand} \), where \( \alpha = 0.75 \) and \( X_{rand} \) is a random feature sample[52]
     - **Embedding**: An LSTM initializes the dynamic attention weights, and CLS tokens aggregate sequence information into stock attribute vectors[52]
     - **Similarity Calculation**: Stock similarity is measured as the cosine similarity between attribute vectors[52]
   - **Model Evaluation**: Effectively identifies highly similar stocks, which fall mostly within the same industry but show no clear pattern in market capitalization or sub-industry[56]

3. Model Name: Improved GRU Model with SimStock Integration
   - **Model Construction Idea**: Enhances the GRU-based stock return prediction model by initializing hidden states with SimStock-generated stock attribute vectors, improving stability across different stock types[57][59]
   - **Model Construction Process**:
     - **Initialization**: SimStock attribute vectors replace the GRU model's initial hidden state[57]
     - **Training**: Keeps the same training setup as the baseline GRU model, adjusted to incorporate the new initialization[59]
   - **Model Evaluation**: Improves predictive performance and stability, particularly for timing strategies across diverse stocks[60][63]

4. Model Name: Index Timing Model
   - **Model Construction Idea**: Aggregates individual stock predictions into index predictions using market-capitalization weights, then generates signals from thresholds[77]
   - **Model Construction Process**:
     - **Aggregation**: Combines stock return predictions into index return predictions using market-cap weights[77]
     - **Signal Generation**: Uses the 60th percentile of past-year predictions as the buy threshold and the 40th percentile as the sell threshold[77]
     - **Holding Period**: Maintains positions for at least 5 trading days to reduce turnover[77]
   - **Model Evaluation**: Effective in generating excess returns, particularly in high-volatility sectors[79][82][84]

---

Model Backtest Results

1. Deep Learning Stock Return Prediction Model
   - **Cumulative Excess Return**: 77% over 5 years[33]
   - **Annualized Return**: 27%[33]
   - **Excess Return vs. Stocks**: 11.3% (pre-cost)[33]

2. SimStock
   - **Cumulative Excess Return**: 109% over 5 years[60]
   - **Annualized Return**: 30%[60]
   - **Excess Return vs. Stocks**: 14.8% (pre-cost)[60]
   - **Daily Win Rate**: 57.4%[60]
   - **Holding Probability**: 45.7%[60]

3. Index Timing Model
   - **HS300**: Annualized Return 5.1%, Excess Return 5.6%, Max Drawdown 7.7%[79]
   - **CSI500**: Annualized Return 12.4%, Excess Return 12.2%, Max Drawdown 7.1%[82]
   - **CSI1000**: Annualized Return 15.1%, Excess Return 14.9%, Max Drawdown 11.3%[84]

4. Sector Timing
   - **Best Sector**: Electric Power Equipment & New Energy, Annualized Return 36%, Excess Return 31.1%[101]

---

Quantitative Factors and Construction Methods

1. Factor Name: Reinforced Style Factor (PPO Model)
   - **Factor Construction Idea**: Uses PPO reinforcement learning to predict market style preferences, generating risk factors that are more interpretable and robust than those from traditional deep learning[12]
   - **Factor Construction Process**:
     - **Input**: Traditional style factors and recent stock price-volume data[12]
     - **Reward Function**: Stability-penalized goodness-of-fit to market returns[12]
     - **Output**: Enhanced style factor representing AI market preferences[12]
   - **Factor Evaluation**: Provides a stable and interpretable representation of market style dynamics[12]

---

Factor Backtest Results

1. Reinforced Style Factor
   - **RankIC**: Weekly average of 4.5% since 2019[36]
   - **Annualized Return**: 23.2% for long-only portfolios, Excess Return 18.3% vs. CSI800[36]
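
The SimStock similarity step (cosine similarity between stock attribute vectors) can be illustrated with a minimal sketch. The tickers and two-dimensional vectors are made up, and the helper names `cosine_similarity` and `most_similar` are hypothetical; real attribute vectors would come from the trained embedding model, which is not reproduced here.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two attribute vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(
    target: str, embeddings: Dict[str, Sequence[float]], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Rank all other stocks by cosine similarity to `target`, highest first."""
    scores = [
        (name, cosine_similarity(embeddings[target], vec))
        for name, vec in embeddings.items()
        if name != target
    ]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_k]


# Hypothetical attribute vectors, for illustration only.
example_embeddings = {
    "A": [1.0, 0.0],
    "B": [2.0, 0.1],
    "C": [0.0, 1.0],
    "D": [-1.0, 0.0],
}
nearest_to_a = most_similar("A", example_embeddings, top_k=2)
```

Cosine similarity ignores vector length, so stock "B" ranks as closest to "A" even though its vector is about twice as long.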
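
The Index Timing Model's three steps (market-cap-weighted aggregation, 60th/40th percentile thresholds, and a minimum 5-day holding period) can be sketched as follows. The summary does not give the implementation, so several details are assumptions: the past year is taken as 250 trading days, thresholds use only predictions before the current day, percentiles use linear interpolation, and the position is long-or-flat. All function names are hypothetical.

```python
from typing import List, Sequence


def cap_weighted_prediction(preds: Sequence[float], caps: Sequence[float]) -> float:
    """Aggregate per-stock return predictions into one index prediction."""
    total_cap = sum(caps)
    return sum(p * c for p, c in zip(preds, caps)) / total_cap


def percentile(values: Sequence[float], q: float) -> float:
    """q-th quantile (q in [0, 1]) with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def timing_positions(
    index_preds: Sequence[float],
    lookback: int = 250,   # assumption: "past year" ~ 250 trading days
    buy_q: float = 0.60,
    sell_q: float = 0.40,
    min_hold: int = 5,
) -> List[int]:
    """Return a 0/1 (flat/long) position for each day."""
    positions: List[int] = []
    position, days_held = 0, 0
    for t, pred in enumerate(index_preds):
        if t < lookback:
            positions.append(0)  # not enough history to form thresholds yet
            continue
        history = index_preds[t - lookback:t]  # excludes today's prediction
        if position == 0:
            if pred > percentile(history, buy_q):
                position, days_held = 1, 1  # entry day counts as day 1
        elif days_held >= min_hold and pred < percentile(history, sell_q):
            position = 0  # exit allowed only after the minimum holding period
        else:
            days_held += 1
        positions.append(position)
    return positions


# Toy run with a short lookback so the logic is easy to trace by hand.
toy_preds = [0.0, 1.0, 2.0, 3.0, 10.0, -5.0, -5.0]
toy_positions = timing_positions(toy_preds, lookback=4, min_hold=2)
```

In the toy run, the first low prediction on day 5 does not trigger an exit because the position has been held for only one day. The exit happens on day 6, once the minimum holding period has passed.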