Multi-modal Multi-scale Stock Price Prediction Model

Deep Learning Research Report: Multi-modal, Multi-scale Stock Price Prediction
GF SECURITIES · 2025-03-07 09:20

Quantitative Models and Factor Analysis Summary

Quantitative Models and Construction

- **Model Name**: Multi-modal Multi-scale Stock Price Prediction Model
- **Model Construction Idea**: The model integrates multi-modal (chart data and time-series data) and multi-scale (different frequency data) features to enhance stock price prediction accuracy. It employs four independent deep time-series models and convolutional models for feature extraction, using both regression and classification losses for end-to-end training[14][17][18].
- **Model Construction Process**:
  1. **Multi-modal Features**: Combines time-series price-volume data and standardized price-volume charts. Time-series models capture abstract numerical relationships, while convolutional models identify chart patterns[17].
  2. **Multi-scale Features**: Incorporates 1-minute high-frequency data, daily data, and weekly data. High-frequency data is factorized into 55 features, which are then input into time-series models[18].
  3. **Lightweight Design**: Reduces the parameter size of each sub-model to 1/4 of the initial version, reducing overfitting and the dependence on computational resources[18].
  4. **Multi-head Output**: Outputs include absolute future returns and categorical predictions (up, flat, down), using mean squared error and cross-entropy as loss functions[19].
- **Model Evaluation**: The model demonstrates significant improvements in prediction accuracy and excess returns compared to the initial version[14][17][19].
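
The multi-head output in step 4 pairs a regression head (mean squared error) with a three-class head (cross-entropy). The following is a minimal pure-Python sketch of how such a combined loss could be computed. The summary does not state how the two losses are weighted or how the up/flat/down labels are defined, so `cls_weight`, `band`, and the helper `label_from_return` are hypothetical names and values chosen for illustration only.

```python
import math


def mse_loss(pred_returns, true_returns):
    """Mean squared error between predicted and realized returns."""
    n = len(true_returns)
    return sum((p - t) ** 2 for p, t in zip(pred_returns, true_returns)) / n


def softmax(logits):
    """Numerically stable softmax over one row of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_loss(logits_batch, labels):
    """Mean cross-entropy; labels are class indices (0=down, 1=flat, 2=up)."""
    losses = [-math.log(softmax(row)[y]) for row, y in zip(logits_batch, labels)]
    return sum(losses) / len(losses)


def label_from_return(r, band=0.01):
    """Hypothetical labelling rule: returns within +/- band count as 'flat'.

    The report summary does not give the actual thresholds; band is an assumption.
    """
    if r > band:
        return 2  # up
    if r < -band:
        return 0  # down
    return 1      # flat


def multi_head_loss(pred_returns, logits_batch, true_returns,
                    cls_weight=1.0, band=0.01):
    """Regression loss plus weighted classification loss.

    cls_weight is an assumed hyperparameter, not taken from the report.
    """
    labels = [label_from_return(r, band) for r in true_returns]
    reg = mse_loss(pred_returns, true_returns)
    cls = cross_entropy_loss(logits_batch, labels)
    return reg + cls_weight * cls


# Toy batch of two stocks: predicted returns, class logits, realized returns.
loss = multi_head_loss(
    pred_returns=[0.03, -0.01],
    logits_batch=[[0.1, 0.2, 1.5], [1.0, 0.3, 0.1]],
    true_returns=[0.04, -0.02],
)
```

In an end-to-end setup, both heads would share the features produced by the time-series and convolutional sub-models, so gradients from both loss terms flow back into the same feature extractors.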

Model Backtesting Results

- **RankIC Mean**[21][116]:
  - All Market: 8.7%
  - CSI 300: 7.9%
  - CSI 500: 6.6%
  - CSI 800: 6.9%
  - CSI 1000: 8.2%
  - CNI 2000: 8.7%
  - ChiNext: 10.4%
- **RankIC Win Rate**[21][116]:
  - All Market: 86.7%
  - CSI 300: 69.0%
  - CSI 500: 73.5%
  - CSI 800: 75.2%
  - CSI 1000: 84.8%
  - CNI 2000: 86.1%
  - ChiNext: 89.2%
- **Excess Annualized Returns**[21][117]:
  - All Market: 12.97%
  - CSI 300: 9.17%
  - CSI 500: 5.30%
  - CSI 800: 8.38%
  - CSI 1000: 7.47%
  - CNI 2000: 7.47%
  - ChiNext: 11.52%

Quantitative Factors and Construction

- **Factor Name**: Model-derived Factor
- **Factor Construction Idea**: Derived from the model's predictions, the factor captures both numerical relationships and chart patterns, leveraging multi-modal and multi-scale data[14][17][18].
- **Factor Construction Process**:
  1. Predictions from time-series models and convolutional models are combined.
  2. Multi-frequency data (1-minute, daily, weekly) is processed to extract features.
  3. Factor values are generated based on the model's outputs, including both regression and classification results[14][17][18].
- **Factor Evaluation**: The factor shows low correlation with traditional Barra style factors, indicating its uniqueness[22][23].

Factor Backtesting Results

- **Correlation with Barra Factors**[22][23]:
  - Liquidity: -18%
  - Volatility: -16%
  - Size: -8%
- **RankIC Mean**, **RankIC Win Rate**, and **Excess Annualized Returns**: identical to the figures listed under Model Backtesting Results[21][116][117].
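
For readers unfamiliar with the RankIC statistics reported here, the sketch below shows one common way to compute them: RankIC as the Spearman rank correlation between factor values and subsequent returns across stocks in one period, and win rate as the share of periods with a positive RankIC. The summary does not spell out the report's exact definitions, so both conventions are assumptions, and the function names and toy data are purely illustrative.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def rank_ic(factor_values, forward_returns):
    """Spearman rank correlation across stocks for a single period."""
    return pearson(average_ranks(factor_values), average_ranks(forward_returns))


def summarize_rank_ic(periods):
    """Return (mean RankIC, win rate) over a list of (factor, returns) pairs.

    Win rate is taken here as the fraction of periods with RankIC > 0 (an assumption).
    """
    ics = [rank_ic(f, r) for f, r in periods]
    mean_ic = sum(ics) / len(ics)
    win_rate = sum(1 for ic in ics if ic > 0) / len(ics)
    return mean_ic, win_rate


# Toy example: three periods, four stocks each (made-up numbers).
periods = [
    ([0.9, 0.1, 0.5, 0.3], [0.04, -0.02, 0.01, 0.00]),
    ([0.2, 0.8, 0.4, 0.6], [0.03, -0.01, 0.02, 0.00]),
    ([0.7, 0.3, 0.6, 0.1], [0.02, 0.00, 0.01, -0.03]),
]
mean_ic, win_rate = summarize_rank_ic(periods)
```

The toy inputs are not drawn from the report; they only show the shape of the calculation.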