GRU Model

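Guotai Haitong | Financial Engineering: A Study of Deep Learning Factor Stock Selection Based on GRU and TCN Models
Guotai Haitong Securities Research · 2025-07-30 14:37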
Core Viewpoint: The report shows that the deep learning models GRU and TCN are effective for stock selection, with GRU performing slightly better than TCN+GRU and TCN. The 10-day return prediction model outperforms the 5-day model. The deep learning factors are highly correlated with low-volatility and low-liquidity factors, which points to potential investment strategies [1][2].

Group 1: Model Performance
- The GRU model is confirmed to be effective; its advantages in prediction accuracy and training speed have made it widely used in the industry [1].
- The TCN model, built on a CNN architecture, captures long-term dependencies in time series data through causal convolution and residual connections [1].
- Annualized excess returns since 2017 by index, with the current year's excess return in parentheses [1][3]:
  - CSI 300: 11.8% (-0.4%)
  - CSI 500: 13.6% (2.7%)
  - CSI 1000: 21.7% (9.9%)
  - CSI 2000: 27.1% (9.3%)

Group 2: Single Factor Stock Selection
- Single-factor stock selection performs better in the small and mid-cap stock pools (CSI 1000, CSI 2000), where market capitalization and industry neutralization have minimal impact [2].
- In CSI 300, the raw factor values outperform the market-capitalization- and industry-neutralized factor values, indicating that the deep learning factors capture style and industry rotation patterns [2].

Group 3: Composite Factor Stock Selection
- Equally weighted composite factors outperform single factors, and the report outlines the construction of index-enhanced strategies with specific constraints on stock turnover and market exposure [3].
- The maximum drawdown of the CSI 300 index-enhanced strategy since January 2017 is -6.0%, with a current year excess return of -0.4% [3].
- Allowing slight market and industry exposure results in annualized excess returns of 8.8% for CSI 300 and 14.6% for CSI 500, with current year excess returns of -1.7% and 5.2% respectively [3].
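The causal convolution and residual connection that the summary credits for TCN's handling of long-term dependencies can be illustrated with a minimal pure-Python sketch. The function names, kernel weights, and dilation values are illustrative assumptions, not taken from the report; a real TCN stacks many such layers with learned weights.

```python
def causal_conv1d(x, kernel, dilation=1):
    """Causal 1-D convolution: out[t] uses only x[t], x[t-d], x[t-2d], ...

    Positions before the start of the series are treated as zeros,
    which is equivalent to left-padding the input.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation  # look back only, never forward
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def residual_block(x, kernel, dilation=1):
    """Add the convolution output back onto the input (residual connection)."""
    conv = causal_conv1d(x, kernel, dilation)
    return [xi + ci for xi, ci in zip(x, conv)]


if __name__ == "__main__":
    series = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical input series
    print(causal_conv1d(series, kernel=[0.5, 0.5], dilation=1))
    print(residual_block(series, kernel=[0.5, 0.5], dilation=2))
```

Because each output looks only backwards, changing a later observation never alters an earlier output, which is the property that prevents look-ahead leakage in return prediction. Increasing `dilation` widens the look-back window without adding kernel weights.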
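Industry Rotation Weekly: ETF Funds Flow into Dividend Stocks and out of High-Priced Pharmaceuticals; Index and Large-Cap Financials Show Clear Support on Pullbacks - 20250721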
China Post Securities· 2025-07-21 10:13
Quantitative Models and Construction Methods

- **Model Name**: Diffusion Index Model
  - **Construction Idea**: The model is based on price momentum principles and aims to capture upward trends in industry performance[25][37]
  - **Construction Process**:
    1. Calculate the diffusion index for each industry based on price momentum
    2. Rank industries by their diffusion index values
    3. Select industries with the highest diffusion index values for portfolio allocation
  - **Formula**: Not explicitly provided in the report
  - **Evaluation**: The model performs well during upward trends but struggles during reversals, as seen in historical performance[25][37]
- **Model Name**: GRU Factor Model
  - **Construction Idea**: The model uses GRU (Gated Recurrent Unit) deep learning networks to analyze minute-level volume and price data for industry rotation[38][33]
  - **Construction Process**:
    1. Input minute-level volume and price data into the GRU network
    2. Train the model using historical data to identify industry rotation signals
    3. Generate GRU factor scores for each industry and rank them
    4. Allocate portfolio weights based on GRU factor rankings
  - **Formula**: Not explicitly provided in the report
  - **Evaluation**: The model performs well in short cycles but faces challenges in long cycles and extreme market conditions[38][33]

Model Backtesting Results

- **Diffusion Index Model**:
  - Monthly average return: -0.81%
  - Excess return over equal-weighted industry benchmark: -1.61% (July 2025)[29]
  - Year-to-date excess return: 1.48%[24][29]
- **GRU Factor Model**:
  - Weekly average return: -0.46%
  - Excess return over equal-weighted industry benchmark: -1.27% (July 2025)[36]
  - Year-to-date excess return: -5.75%[33][36]

Quantitative Factors and Construction Methods

- **Factor Name**: Diffusion Index
  - **Construction Idea**: Measures industry momentum based on price trends[25][26]
  - **Construction Process**:
    1. Calculate the diffusion index for each industry using price data
    2. Rank industries by diffusion index values
    3. Select industries with the highest diffusion index values for portfolio allocation
  - **Formula**: Not explicitly provided in the report
  - **Evaluation**: Effective in capturing upward trends but vulnerable to reversals[25][26]
- **Factor Name**: GRU Factor
  - **Construction Idea**: Uses GRU deep learning networks to analyze minute-level volume and price data for industry rotation[38][33]
  - **Construction Process**:
    1. Input minute-level volume and price data into the GRU network
    2. Train the model using historical data to identify industry rotation signals
    3. Generate GRU factor scores for each industry and rank them
    4. Allocate portfolio weights based on GRU factor rankings
  - **Formula**: Not explicitly provided in the report
  - **Evaluation**: Performs well in short cycles but struggles in long cycles and extreme market conditions[38][33]

Factor Backtesting Results

- **Diffusion Index Factor**:
  - Top-ranked industries (July 18, 2025): Comprehensive Finance (1.0), Comprehensive (0.998), Non-Banking Finance (0.996), Steel (0.995), Nonferrous Metals (0.994), Communication (0.993)[26][27]
  - Weekly changes in rankings: Consumer Services (+0.224), Food & Beverage (+0.208), National Defense (+0.091)[28]
- **GRU Factor**:
  - Top-ranked industries (July 18, 2025): Banking (2.68), Transportation (2.42), Nonferrous Metals (-0.87), Steel (-1.92), Construction (-2.19), Coal (-2.36)[34]
  - Weekly changes in rankings: Building Materials (+), Banking (+), Comprehensive Finance (+)[34]
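The rank-and-select step shared by both models can be sketched as follows. The scores are the diffusion index values quoted for July 18, 2025; the helper name `select_top_industries`, the choice of `top_n`, and the equal weighting are assumptions for illustration, since the report does not spell out its allocation rule.

```python
def select_top_industries(scores, top_n=3):
    """Rank industries by factor score (descending) and equal-weight the top_n.

    Equal weighting is an assumption; the report gives no weighting formula.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    chosen = [name for name, _ in ranked[:top_n]]
    weight = 1.0 / len(chosen) if chosen else 0.0
    return {name: weight for name in chosen}


# Diffusion index values quoted in the summary for July 18, 2025.
diffusion_index = {
    "Comprehensive Finance": 1.0,
    "Comprehensive": 0.998,
    "Non-Banking Finance": 0.996,
    "Steel": 0.995,
    "Nonferrous Metals": 0.994,
    "Communication": 0.993,
}

if __name__ == "__main__":
    for industry, weight in select_top_industries(diffusion_index, top_n=3).items():
        print(f"{industry}: {weight:.2%}")
```

The same helper applies to GRU factor scores, because only the ordering of the scores matters and negative values are handled like any other.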
CICC: A GRU Model Combined with a Self-Attention Mechanism
中金点睛 (CICC) · 2025-07-14 23:39
Core Viewpoint: The article discusses the evolution and optimization of time series models, focusing on the GRU and Transformer architectures, and introduces a new model called AttentionGRU(Res) that combines the strengths of both [1][6][49].

Group 1: Time Series Models Overview
- Time series models such as LSTM, GRU, and Transformer are designed to analyze and predict sequential data; the recurrent models (LSTM, GRU) address long-term dependencies through specialized gating mechanisms [1][8].
- GRU, as an optimized variant, improves computational efficiency while retaining long-term memory, making it suitable for real-time prediction scenarios [2][4].
- The Transformer model reshapes sequence modeling through self-attention and position encoding, showing significant advantages in analyzing multi-dimensional time series data [2][4].

Group 2: Performance Comparison of Factors
- A systematic test of 159 cross-sectional factors and 158 time series factors found that, while cross-sectional factors generally outperform time series factors, the latter showed better out-of-sample performance when used in RNN, LSTM, and GRU models [4][21].
- The average ICIR (the information ratio of the Information Coefficient) of time series factors was higher than that of cross-sectional factors, indicating better predictive performance despite a more dispersed distribution [4][20].
- In terms of returns, cross-sectional factors yielded a long-short excess return of 11%, compared with only 1% for time series factors, which highlights how the two factor types differ across performance metrics [4][20].

Group 3: Model Optimization Strategies
- The article explores several optimization strategies for time series models: adjusting the propagation direction of the time series, optimizing the gating structure, and combining overall structures [5][27].
- In testing, the BiGRU and GLU models showed limited improvement over the standard GRU model, while the Transformer model exhibited strong in-sample performance but overfit in out-of-sample tests [5][28].
- The proposed AttentionGRU(Res) model combines a simplified self-attention mechanism with GRU, balancing performance and stability and achieving an annualized excess return of over 30% in the full market [6][40][41].

Group 4: AttentionGRU(Res) Model Performance
- The AttentionGRU(Res) model achieved an annualized excess return of nearly 12.6% over the past five years in rolling samples, indicating robustness across market conditions [6][49].
- The model's generalization ability was validated within the CSI 1000 stock range, where it yielded an annualized excess return of 10.8%, outperforming the traditional GRU and Transformer structures [6][46][49].
- Integrating residual connections and a simplified self-attention structure into the AttentionGRU(Res) model significantly improved training stability and predictive performance [35][40].
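To make the two ingredients named for AttentionGRU(Res) concrete, the sketch below applies scaled dot-product self-attention across the time steps of a sequence and then adds the input back as a residual connection. It is only an illustration under stated assumptions: the article does not describe its "simplified" attention in detail, so dropping the learned query/key/value projections is a guess at one possible simplification, the function names are hypothetical, and the GRU layer that would consume the output is omitted.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Scaled dot-product self-attention with queries = keys = values = seq.

    seq is a list of time steps, each a list of feature values.
    No learned projection matrices are used (an assumed simplification).
    """
    dim = len(seq[0])
    out = []
    for query in seq:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in seq
        ]
        weights = softmax(scores)
        mixed = [
            sum(w * value[j] for w, value in zip(weights, seq))
            for j in range(dim)
        ]
        out.append(mixed)
    return out


def attention_with_residual(seq):
    """Residual connection: add the attention output back onto the input."""
    attended = self_attention(seq)
    return [[x + a for x, a in zip(row, att)] for row, att in zip(seq, attended)]


if __name__ == "__main__":
    # Hypothetical 3-step sequence with 2 features per step.
    sequence = [[0.1, 0.2], [0.0, -0.1], [0.3, 0.1]]
    for row in attention_with_residual(sequence):
        print([round(v, 4) for v in row])
```

The residual path means the downstream layer still sees the raw features even when the attention weights are uninformative, which is one common reason residual connections help training stability.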
Industry Rotation Weekly: Margin Financing Continues Large Net Inflows into Pharmaceuticals; GRU Industry Rotation Drops Banking - 20250616
China Post Securities· 2025-06-16 09:37
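Securities Research Report: Financial Engineering Report
Release date: 2025-06-16
Analyst: Xiao Chengzhi (肖承志), SAC registration no. S1340524090001, Email: xiaochengzhi@cnpsec.com
Research assistant: Li Zikai (李子凯), SAC registration no. S1340124100014, Email: lizikai@cnpsec.com
Recent research reports:
- "Google Updates Gemini 2.5 Pro, Alibaba Open-Sources New Qwen3 Models: AI Developments Roundup 20250609 [China Post Financial Engineering]" - 2025.06.09
- "Funds Betting on a Suspended Stock Flow Heavily into Xinchuang (IT Innovation) ETFs; Concept Rotation Is Fast: Industry Rotation Weekly 20250608" - 2025.06.09
- "Comprehensive Finance Outperforms on Stablecoin Tailwinds; ETF Funds Take Profits with Net Outflows from Pharmaceuticals and Consumer: Industry Rotation Weekly 20250601" - 2025.06.02
- "Sentiment in the Ebbing Cycle Still Awaits Recovery; ETF Net Inflows into National Defense: Industry Rotation Weekly 20250525" - 2025.05.26
- "Large ETF Outflows from Dividend Stocks; GRU Industry Factor Scores Rise Sharply for Growth Sectors: Industry Rotation Weekly 20250518" - 2025.05.19
- "Major Broad-Based Indices Fill Their Gaps; Margin Financing Sharply ...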
Machine Learning Factor Stock Selection Monthly Report (May 2025) - 20250430
Southwest Securities· 2025-04-30 08:14
Quantitative Models and Construction Methods

GAN_GRU Model
- **Model Name**: GAN_GRU
- **Model Construction Idea**: The GAN_GRU model uses a Generative Adversarial Network (GAN) to process volume-price time series features and then uses a GRU model to encode those features into the stock selection factor[2][9].
- **Model Construction Process**:
  1. **GRU Model**:
     - **Basic Assumptions**: The GRU+MLP neural network stock return prediction model uses 18 volume-price features such as closing price, opening price, trading volume, and turnover rate[10][13][15].
     - **Training Data and Input Features**: All stocks' past 400 days of 18 volume-price features, sampled every 5 trading days. The feature sampling shape is 40 × 18, using the past 40 days of volume-price features to predict the cumulative return of the next 20 trading days[14].
     - **Training and Validation Set Ratio**: 80%:20%[14].
     - **Data Processing**: Extreme value removal and standardization along the time series for each feature within the 40 days, plus cross-sectional standardization at the stock level[14].
     - **Model Training Method**: Semi-annual rolling training: the model is retrained every six months and used to predict returns for the following six months. Training dates are June 30 and December 31 each year[14].
     - **Stock Selection Method**: All stocks in the cross-section, excluding ST stocks and stocks listed for less than six months[14].
     - **Training Sample Selection Method**: Samples with empty labels are excluded[14].
     - **Hyperparameters**: batch_size equal to the number of stocks in the cross-section, Adam optimizer, learning rate 1e-4, IC as the loss function, 10 early stopping rounds, at most 50 training rounds[14].
     - **Model Structure**: Two GRU layers (GRU(128, 128)) followed by MLP layers (256, 64, 64). The final predicted return pRet is used as the stock selection factor[18].
  2. **GAN Model**:
     - **Introduction**: A GAN consists of a generator and a discriminator. The generator aims to generate realistic data, while the discriminator aims to distinguish real data from generated data[19].
     - **Generator**:
       - **Loss Function**: $$L_G = -\mathbb{E}_{z \sim P_z(z)}[\log D(G(z))]$$ where \(z\) is random noise (usually Gaussian distributed), \(G(z)\) is the data generated by the generator, and \(D(G(z))\) is the probability that the discriminator judges the generated data to be real[20][21].
       - **Training Process**: Generate noise data, convert it to generated data with the generator, calculate the generator loss, and update the generator parameters through backpropagation[21][22].
     - **Discriminator**:
       - **Loss Function**: $$L_D = -\mathbb{E}_{x \sim P_{data}(x)}[\log D(x)] - \mathbb{E}_{z \sim P_z(z)}[\log(1 - D(G(z)))]$$ where \(x\) is real data, \(D(x)\) is the probability that the discriminator judges the real data to be real, and \(D(G(z))\) is the probability that the discriminator judges the generated data to be real[23].
       - **Training Process**: Sample real data, generate fake data, calculate the discriminator loss, and update the discriminator parameters through backpropagation[24][25].
     - **GAN Training Process**: Alternately train the generator and the discriminator until convergence[25][26].
  3. **GAN Feature Generation Model Construction**:
     - **LSTM Generator + CNN Discriminator**: To retain the time series nature of the input features, an LSTM model is used as the generator. A CNN model is used as the discriminator to match the two-dimensional volume-price time series features[29][30][33].
     - **Feature Generation Process**: The input is the original volume-price time series features (Input_Shape=(40,18)); the output is the volume-price time series features processed by the LSTM, with the same (40,18) shape[33].

Model Evaluation
- **Evaluation**: The GAN_GRU model combines GAN and GRU to process and encode volume-price time series features, providing a robust stock selection factor[2][9].
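The generator and discriminator losses defined above can be evaluated numerically once the discriminator's output probabilities are known. The sketch below replaces each expectation with a sample mean over a mini-batch; the function names and the example probabilities are made up for illustration and do not come from the report.

```python
import math


def generator_loss(d_fake):
    """L_G = -mean(log D(G(z))) over discriminator outputs on generated data."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


def discriminator_loss(d_real, d_fake):
    """L_D = -mean(log D(x)) - mean(log(1 - D(G(z))))."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -real_term - fake_term


if __name__ == "__main__":
    # Hypothetical discriminator outputs (probability of "real") for one batch.
    d_on_real = [0.9, 0.8, 0.95]
    d_on_fake = [0.2, 0.1, 0.3]
    print("generator loss:", generator_loss(d_on_fake))
    print("discriminator loss:", discriminator_loss(d_on_real, d_on_fake))
```

When the discriminator cannot tell the two sources apart and outputs 0.5 everywhere, the generator loss equals log 2 and the discriminator loss equals 2 log 2, which is a convenient sanity check during alternating training.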
Model Backtest Results
- **GAN_GRU Model**:
  - **IC Mean**: 11.73%[37][38]
  - **Annualized Excess Return**: 24.89%[37][38]
  - **Latest IC**: 0.22% (as of April 28, 2025)[37][38]
  - **IC Mean in the Past Year**: 11.44%[37][38]
  - **Annualized Return**: 36.06%[38]
  - **Annualized Volatility**: 23.80%[38]
  - **Information Ratio (IR)**: 1.66[38]
  - **Maximum Drawdown**: 27.29%[38]
  - **Turnover Rate**: 0.83[38]
  - **ICIR**: 0.90[38]

Quantitative Factors and Construction Methods

GAN_GRU Factor
- **Factor Name**: GAN_GRU
- **Factor Construction Idea**: The GAN_GRU factor is derived from the GAN_GRU model, which processes volume-price time series features with a GAN and encodes them with a GRU[2][9].
- **Factor Construction Process**: The factor is the output of the GAN_GRU model, following the GAN feature processing and GRU encoding steps described in the model construction process[2][9][33].
- **Factor Evaluation**: The GAN_GRU factor shows strong stock selection performance, with high IC values and significant excess returns[2][9].

Factor Backtest Results
- **GAN_GRU Factor**: The reported figures are identical to the GAN_GRU model backtest results listed above[37][38].
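The IC and ICIR figures quoted in the backtest can be related to each other with a short sketch. It assumes that IC is the cross-sectional Pearson correlation between factor values and subsequent returns and that ICIR is the mean IC divided by its sample standard deviation; the report does not state whether it uses Pearson or rank correlation or whether ICIR is annualized, so both choices are assumptions, and the numbers in the example are hypothetical.

```python
import math
import statistics


def pearson_ic(factor, returns):
    """Pearson correlation between factor values and next-period returns."""
    mean_f = statistics.fmean(factor)
    mean_r = statistics.fmean(returns)
    cov = sum((f - mean_f) * (r - mean_r) for f, r in zip(factor, returns))
    var_f = sum((f - mean_f) ** 2 for f in factor)
    var_r = sum((r - mean_r) ** 2 for r in returns)
    return cov / math.sqrt(var_f * var_r)


def icir(ic_series):
    """Mean IC divided by the sample standard deviation of IC (not annualized)."""
    return statistics.fmean(ic_series) / statistics.stdev(ic_series)


if __name__ == "__main__":
    # Hypothetical cross-section: factor scores and realized returns for 5 stocks.
    factor_scores = [0.8, 0.1, -0.3, 0.5, -0.9]
    next_returns = [0.04, 0.01, -0.02, 0.00, -0.03]
    print("IC for this period:", pearson_ic(factor_scores, next_returns))

    # Hypothetical IC values from several rebalancing periods.
    print("ICIR:", icir([0.12, 0.08, 0.15, 0.05, 0.11]))
```

A high mean IC with a low ICIR points to a factor that predicts well on average but inconsistently, which is why the two numbers are usually read together.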