GRU Model - filings, earnings calls, financial reports, news

GRU Model


SINOLINK SECURITIES· 2025-12-30 08:53

Group 1
- The core viewpoint of the report highlights a significant shift in the A-share market style from "value/low volatility" to "small-cap/momentum" in 2024, and further converging to "consensus growth" in 2025, leading to a pronounced mean reversion effect due to overcrowding in market capitalization factors [2][13]
- During the extreme market conditions from August to September 2025, mainstream AI strategies failed to adapt to the rapid style shift, resulting in significant net value drawdowns that were highly correlated with small-cap factor reversals [2][17]
- The report identifies that both traditional linear multi-factor models and advanced AI strategies experienced a notable decline in excess returns during extreme market conditions, with AI strategies suffering more than traditional ones due to their reliance on historical data paths [2][17]

Group 2
- The report discusses the issue of strategy homogeneity within the industry, where the widespread use of models like GRU and LightGBM has led to a high correlation between factors generated by different institutions, increasing systemic risk during market reversals [3][24]
- It emphasizes that the mismatch between training sample distributions and extreme market conditions is a critical factor in AI model failures, as these models struggle to capture asset linkage patterns during rare events [3][35]

Group 3
- An external risk control system has been developed, independent of stock selection models, to address the challenges of traditional timing strategies, utilizing a standardized three-layer processing workflow to generate clear long/short signals [4][40]
- The empirical backtesting of this timing framework shows significant improvements in annualized returns and drawdown control, with the annualized return for the composite strategy on the CSI A500 index reaching 10.61% and maximum drawdown reduced to 11.82% [4][45]

Group 4
- The report outlines targeted optimizations for core AI models, including enhancements to the LightGBM model through a "high-quality sample weighting" mechanism and the use of Huber Loss to reduce sensitivity to outliers, resulting in a significant reduction in maximum drawdown [5][61]
- For the GRU model, the introduction of Attention Pooling and a memory module with CVaR Loss has improved the model's ability to utilize historical information effectively, leading to a substantial increase in excess returns and a decrease in maximum drawdown [5][67]
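The summary above names Huber Loss, CVaR Loss, and Attention Pooling without giving their definitions. The sketch below uses the standard textbook forms of each, which may differ from the report's exact variants; the function names, `delta`, `alpha`, and the scoring vector `w` are illustrative assumptions, not taken from the report.

```python
import numpy as np

def huber_loss(residual, delta=1.0):
    """Textbook Huber loss: quadratic for |r| <= delta, linear beyond it."""
    r = np.abs(residual)
    return np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta))

def cvar_loss(per_sample_loss, alpha=0.1):
    """Average of the worst alpha-fraction of per-sample losses (tail mean)."""
    losses = np.sort(np.asarray(per_sample_loss, dtype=float))[::-1]
    k = max(1, int(np.ceil(alpha * losses.size)))
    return losses[:k].mean()

def attention_pool(hidden, w):
    """Softmax-weighted sum over time. hidden: (T, d); w: (d,) scoring vector."""
    scores = hidden @ w
    weights = np.exp(scores - scores.max())  # subtract max for numerical stability
    weights /= weights.sum()
    return weights @ hidden

# Hypothetical residuals; the last one plays the role of an outlier.
residuals = np.array([0.1, -0.2, 0.3, 5.0])
squared = 0.5 * residuals**2
huber = huber_loss(residuals, delta=1.0)
```

For the outlier, the Huber penalty grows linearly rather than quadratically, which is the mechanism behind the reduced sensitivity to outliers described above. The CVaR form focuses training on the worst-performing samples instead of the average one.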

AI+HI Series: DecompGRNv1: A First Look at an End-to-End Model Based on Linear RNNs

Huachuang Securities· 2025-09-05 08:12

Quantitative Models and Construction Methods

1. Model Name: RNN-LIN
- **Model Construction Idea**: Simplify the traditional GRU model by using a linear RNN structure, reducing parameter complexity while maintaining competitive performance[2][17][20]
- **Model Construction Process**:
  - The model uses a linear RNN structure with only a forget gate and an output gate. The hidden state is updated without non-linear activation functions
  - Equations:
    $ h_{t} = f_{t} \otimes h_{t-1} + (1 - f_{t}) \otimes c_{t} $
    $ y_{t} = o_{t} \otimes h_{t} $
    $ f_{t} = \mathrm{Sigmoid}(x_{t}W_{f}) $
    $ o_{t} = \mathrm{Sigmoid}(x_{t}W_{o}) $
    $ c_{t} = \mathrm{SiLU}(x_{t}W_{c}) $
    - $f_{t}$: Forget gate
    - $o_{t}$: Output gate
    - $c_{t}$: Candidate state[20][21]
  - The model reduces parameters by approximately 50% compared to GRU[21]
- **Evaluation**: The linear RNN model shows slightly weaker performance than GRU but remains competitive. Adding GLU modules improves its performance significantly[22][53]

2. Model Name: DecompGRN
- **Model Construction Idea**: Extend the linear RNN by integrating cross-sectional information directly into the RNN gating mechanism, enabling simultaneous modeling of temporal and cross-sectional data[2][50]
- **Model Construction Process**:
  - The first RNN layer outputs individual stock representations at each time step
  - Cross-sectional information is incorporated by grouping stocks based on market capitalization and calculating group de-meaned values
  - The second RNN layer combines temporal and cross-sectional information in the forget and output gates
  - Equations:
    $ h_{t} = f_{t} \otimes h_{t-1} + (1 - f_{t}) \otimes c_{t} $
    $ y_{t} = o_{t} \otimes h_{t} $
    $ f_{t} = \mathrm{Sigmoid}(x_{t}W_{f}) $
    $ o_{t} = \mathrm{Sigmoid}(x_{t}W_{o}) $
    $ c_{t} = \mathrm{SiLU}(x_{t}W_{c}) $
    - $f_{t}$: Forget gate
    - $o_{t}$: Output gate
    - $c_{t}$: Candidate state[50][55]
- **Evaluation**: DecompGRN outperforms the GRU baseline in terms of RankIC and RankICIR while maintaining only 43% of the GRU's parameter count[74][53]

---

Model Backtest Results

1. RNN-LIN
- **RankIC**: CSI All Share: 0.13; CSI 300: 0.10; CSI 500: 0.09; CSI 1000: 0.12[36][37]
- **RankICIR**: CSI All Share: 1.08; CSI 300: 0.62; CSI 500: 0.71; CSI 1000: 0.96[36][37]
- **IC Win Rate**: CSI All Share: 0.88; CSI 300: 0.74; CSI 500: 0.78; CSI 1000: 0.86[36][37]
- **Annualized Return (Top Group)**: CSI All Share: 42.59%; CSI 300: 28.59%; CSI 500: 23.68%; CSI 1000: 32.81%[42]

2. DecompGRN
- **RankIC**: CSI All Share: 0.141; CSI 300: 0.099; CSI 500: 0.098; CSI 1000: 0.127[55][58]
- **RankICIR**: CSI All Share: 1.26; CSI 300: 0.65; CSI 500: 0.77; CSI 1000: 1.08[55][58]
- **IC Win Rate**: CSI All Share: 0.89; CSI 300: 0.74; CSI 500: 0.78; CSI 1000: 0.88[55][58]
- **Annualized Return (Top Group)**: CSI All Share: 57.68%; CSI 300: 31.69%; CSI 500: 26.9%; CSI 1000: 40.35%[57][58]

---

Index Enhancement Test Results (DecompGRN)
- **Annualized Excess Return**: CSI 300: 10.24%; CSI 500: 10.05%; CSI 1000: 19.58%[75][85]
- **Tracking Error**: CSI 300: 5.07; CSI 500: 6.1; CSI 1000: 6.75[75][85]
- **Cumulative Excess Return (as of 2025-08-27)**: CSI 300: 3.93%; CSI 500: 6.72%; CSI 1000: 18.26%[75][85]
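The RNN-LIN equations listed above can be sketched as a plain NumPy forward pass. This is a minimal illustration under stated assumptions, not the report's implementation: bias terms are omitted, the initial hidden state is zero, and the function name `rnn_lin_forward` and all shapes are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def silu(z):
    return z * sigmoid(z)

def rnn_lin_forward(x, W_f, W_o, W_c, h0=None):
    """Run the linear recurrence over one sequence.

    x: (T, d_in) inputs; W_f, W_o, W_c: (d_in, d_hidden) weight matrices.
    Returns y: (T, d_hidden).
    """
    T = x.shape[0]
    d_hidden = W_f.shape[1]
    h = np.zeros(d_hidden) if h0 is None else h0
    y = np.empty((T, d_hidden))
    for t in range(T):
        f = sigmoid(x[t] @ W_f)    # forget gate, depends on x_t only
        o = sigmoid(x[t] @ W_o)    # output gate, depends on x_t only
        c = silu(x[t] @ W_c)       # candidate state
        h = f * h + (1.0 - f) * c  # no non-linearity applied to h itself
        y[t] = o * h
    return y

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))  # hypothetical: 5 time steps, 3 features
W_f, W_o, W_c = (rng.normal(size=(3, 4)) for _ in range(3))
y = rnn_lin_forward(x, W_f, W_o, W_c)
```

Because the gates and the candidate depend only on $x_{t}$, the cell needs three input-to-hidden matrices and no hidden-to-hidden matrices. A standard GRU carries three of each, so when input and hidden sizes are equal the count is roughly halved, which is in line with the reduction of about 50% quoted above.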

[GF Financial Engineering] Time-Series Data Augmentation Methods for General-Purpose Models

GF Financial Engineering Research (广发金融工程研究)· 2025-07-31 03:11

Core Viewpoint
- Temporal Data Augmentation is increasingly recognized as a technique to enhance the generalization ability and robustness of quantitative models in finance, addressing the challenge of homogeneous data sources among investors [1][4][5].

Group 1: Temporal Data Augmentation
- Temporal Data Augmentation involves various strategies such as shifting, scaling, perturbation, cropping, and synthesis to create a richer training sample space without introducing additional information [1][4].
- This technique is applicable not only to traditional machine learning models but also seamlessly integrates into deep learning architectures and reinforcement learning systems, expanding the expressiveness and adaptability of quantitative strategies [1][4].

Group 2: Application Methodology
- The study uses GRU as a representative deep learning model to explore whether Temporal Data Augmentation can improve performance while keeping the original input data, network, loss function, and hyperparameter settings consistent [1][58].
- Two training modes are discussed: one with a fixed probability p for data augmentation and another with a linearly decaying probability p throughout the training process [2][63].

Group 3: Empirical Analysis
- In the fixed probability p training mode, no significant improvement in factor performance was observed; however, in the linearly decaying probability p mode, various data augmentation factors showed improvements in RankIC and annualized returns [2][67].
- Specifically, the RankIC mean increased by 1.2%, and the annualized returns for long and short positions improved by 2.81% and 7.65%, respectively, when combining data augmentation factors with original data factors [2][75].

Group 4: Data Augmentation Techniques
- The study identifies eight different temporal data augmentation techniques, including jittering, scaling, rotation, permutation, magnitude warping, time warping, window slicing, and window warping, and compares their performance against the original data [58][67].
- Among these techniques, jittering and scaling showed the highest correlation with the original data, indicating minimal disruption to the temporal information [59].

Group 5: Performance Metrics
- The performance metrics for the various data augmentation methods under fixed probability p indicate that jittering and scaling achieved the highest RankIC win rates, while rotation and time warping resulted in significant information loss [68].
- In the linearly decaying probability p mode, jittering demonstrated the most substantial performance improvement, with a RankIC mean of 13.30% and an annualized return of 55.35% [75].

China Post Securities· 2025-06-05 07:20

Quantitative Models and Construction Methods

GRU Model
- **Model Name**: GRU
- **Model Construction Idea**: The GRU model is used to mine price-volume information, and this report explores its ability to incorporate financial information[2][14].
- **Model Construction Process**:
  - **Data Range**: 2013-01-01 to 2025-04-30, all stocks in the market (excluding Beijing Stock Exchange)[16]
  - **Input**: Each stock has one sample at the end of each month, containing price-volume information for the past 240 trading days in 7 fields: opening price, highest price, lowest price, closing price, trading volume, trading amount, and turnover rate. Each field is z-score standardized over its 240 values[16].
  - **Prediction Target**: Next month's return, standardized cross-sectionally (opening price at the beginning of the month to closing price at the end of the month)[16].
  - **Training Set**: Samples from the past 6 years, divided into training and validation sets in a 4:1 ratio in chronological order[16].
  - **Training Method**: Rolling training every month, with early stopping if the loss function does not decrease for 10 consecutive rounds[16].
- **Model Evaluation**: The GRU model can mine price-volume information and financial information simultaneously. Converting financial information to a higher frequency improves the model results to some extent[2][18].
- **Model Testing Results**:
  - **Annualized Excess Return**: 8.75%
  - **IR**: 2.25
  - **Maximum Drawdown**: 4.71%[3][19][23]

GRU Model with Financial Information
- **Model Name**: GRU with Financial Information
- **Model Construction Idea**: Incorporating financial information into the GRU model to improve its performance[4][24].
- **Model Construction Process**:
  - **Simple Splicing of Financial Information**: For each trading day, financial data is calculated as a TTM value from the latest available quarterly report, then spliced in as new columns. The matrix containing price-volume information and fundamental information is standardized and input into the GRU network[25].
  - **Adjusted Financial Information**: Assuming the TTM value of a financial indicator grows steadily at its quarterly growth rate, the growth rate between consecutive report periods $q_0$ and $q_1$ is
    $$ \mathrm{DFTTM}_{q_1} = \frac{\mathrm{FactorTTM}_{q_1} - \mathrm{FactorTTM}_{q_0}}{\mathrm{abs}(\mathrm{FactorTTM}_{q_0})} $$
    and the daily value extrapolates from the latest report at that rate (second formula reconstructed from this assumption, with $t - q$ the days elapsed since $q$ and 90 the nominal quarter length):
    $$ \mathrm{Factor}_{t} = \mathrm{FactorTTM}_{q} + \mathrm{abs}(\mathrm{FactorTTM}_{q}) \times \mathrm{DFTTM}_{q} \times \frac{t - q}{90} $$
    where $t$ is the trading day and $q$ is the financial report period (March 31, June 30, September 30, December 31)[36][38].
- **Model Evaluation**: Incorporating financial information improves the overall performance of the baseline model, especially before 2022. However, after 2023, the improvement is weaker or even negative[4][35][42].
- **Model Testing Results**:
  - **Annualized Excess Return**: 7.76%
  - **IR**: 1.65
  - **Maximum Drawdown**: 5.40%[41][44]

GRU Model with Simplified Financial Information
- **Model Name**: GRU with Simplified Financial Information
- **Model Construction Idea**: Simplifying the financial indicators to only include important ones like net profit TTM and market value[45].
- **Model Construction Process**:
  - **Simplified Financial Information**: Only retaining important indicators like net profit TTM and market value, and incorporating them into the GRU model[45].
- **Model Evaluation**: Simplifying the financial indicators improves the overall performance of the model, especially before 2022. After 2023, the improvement is weaker but still positive[45][55].
- **Model Testing Results**:
  - **Annualized Excess Return**: 9.97%
  - **IR**: 1.93
  - **Maximum Drawdown**: 5.70%[51][52]

Mixed Frequency Model
- **Model Name**: Mixed Frequency Model (barra5d + daily GRU)
- **Model Construction Idea**: Combining long-term and short-term prediction capabilities by integrating barra5d and daily GRU models[56][65].
- **Model Construction Process**:
  - **Input**: Combining the daily GRU model with the barra5d model, which is trained on 240-minute intraday data to predict the next 1-5 days' returns[56][65].
- **Model Evaluation**: The mixed frequency model significantly improves the performance of the barra5d model, especially after October 2024. Adding fundamental information further stabilizes the annual excess performance[65][72][80].
- **Model Testing Results**:
  - **Annualized Excess Return**: 11.82%
  - **IR**: 2.39
  - **Maximum Drawdown**: 5.70%[77][78]
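The daily TTM adjustment described above can be sketched in a few lines. This assumes the daily value extrapolates linearly from the latest reported TTM value at the quarterly growth rate DFTTM over a nominal 90-day quarter; that reading of the second formula is an assumption, and the function names and the zero-base guard are hypothetical.

```python
def quarterly_growth(ttm_q1, ttm_q0):
    """DFTTM: change in TTM value between two report periods, relative to |previous|."""
    if ttm_q0 == 0:
        return 0.0  # assumption: treat a zero base as no measurable growth
    return (ttm_q1 - ttm_q0) / abs(ttm_q0)

def daily_adjusted_ttm(ttm_q, growth_q, days_since_report, quarter_days=90):
    """Extrapolate the latest TTM value forward by the elapsed share of a quarter."""
    return ttm_q + abs(ttm_q) * growth_q * days_since_report / quarter_days

# Hypothetical figures: TTM net profit rose from 100 to 110 over the last quarter.
growth = quarterly_growth(110.0, 100.0)
mid_quarter = daily_adjusted_ttm(110.0, growth, days_since_report=45)

# A loss narrowing from -100 to -80 gives positive growth because of abs().
growth_loss = quarterly_growth(-80.0, -100.0)
end_quarter = daily_adjusted_ttm(-80.0, growth_loss, days_since_report=90)
```

Dividing and multiplying by the absolute value keeps the sign of the adjustment aligned with the direction of change, so a shrinking loss moves toward zero rather than away from it.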

Financial Engineering Special Report: A GRU Model Combining Fundamental and Price-Volume Features

China Post Securities· 2025-06-05 06:23

Quantitative Models and Construction

GRU Model
- **Model Name**: GRU baseline model [2][3][14]
- **Model Construction Idea**: The GRU model is designed to extract information from historical price and volume data to predict future returns. It serves as a baseline to evaluate the impact of adding financial data [14][15].
- **Model Construction Process**:
  - **Data Range**: All A-share stocks (excluding Beijing Stock Exchange) from 2013-01-01 to 2025-04-30 [16].
  - **Input Features**: Past 240 trading days' price and volume data, including open price, high price, low price, close price, trading volume, turnover, and turnover rate. Each feature is standardized using z-score [16].
  - **Prediction Target**: Next month's standardized return (from the opening price at the beginning of the month to the closing price at the end of the month) [16].
  - **Training**: Rolling training with a 4:1 split between training and validation sets over the past six years. Early stopping is applied if the loss function does not decrease for 10 consecutive iterations [16].
  - **Portfolio Construction**: Enhanced portfolio based on the CSI 1000 index, with constraints on stock weight deviation (1%), style deviation (within 0.1 standard deviation), and industry deviation (1%). Monthly rebalancing with a turnover rate of 50% per side [18].
- **Model Evaluation**: The GRU model demonstrates stable performance in extracting price-volume information, achieving consistent excess returns across years [19].

GRU Model with Financial Data
- **Model Name**: GRU with financial data [4][24][25]
- **Model Construction Idea**: Incorporates financial data into the GRU model to enhance its ability to predict future returns by combining price-volume and fundamental information [14][24].
- **Model Construction Process**:
  - **Financial Data**: Includes 20 fields from income statements, such as revenue, cost of goods sold, management expenses, R&D costs, and net profit. Data is converted to TTM (trailing twelve months) values [24][25].
  - **Integration**: Financial data is appended to the price-volume matrix, standardized, and input into the GRU model [25].
  - **Adjustment**: To address frequency mismatches, financial data is adjusted daily based on the assumption of stable TTM growth rates. Written with the quarterly growth rate $\mathrm{DFTTM}_{q}$ made explicit (a reconstruction from that assumption, with $t - q$ the days elapsed since the report date and 90 the nominal quarter length), the adjustment formula is:
    $$ \mathrm{Factor}_{t} = \mathrm{FactorTTM}_{q} + \mathrm{abs}(\mathrm{FactorTTM}_{q}) \times \mathrm{DFTTM}_{q} \times \frac{t - q}{90} $$
    where $ t $ is the trading day and $ q $ is the financial reporting quarter [36][38].
- **Model Evaluation**: Adding financial data improves performance before 2023 but weakens it afterward. Adjusting financial data enhances overall performance, especially in earlier years [42][45].

Mixed-Frequency GRU Model
- **Model Name**: Mixed-frequency GRU model (barra5d + daily GRU) [5][56][65]
- **Model Construction Idea**: Combines long-term and short-term prediction capabilities by integrating daily and intraday GRU models [56][65].
- **Model Construction Process**:
  - **Daily GRU**: Trained on 240 trading days of daily data to predict monthly returns [16].
  - **Intraday GRU (barra5d)**: Trained on 240 minutes of intraday data to predict 5-day returns, neutralized for Barra style factors [56].
  - **Integration**: The two models are combined to leverage their complementary strengths [65].
- **Model Evaluation**: The mixed-frequency model significantly improves stability and excess returns, addressing weaknesses in individual models [67][68].

Mixed-Frequency GRU with Financial Data
- **Model Name**: Mixed-frequency GRU with financial data (barra5d + daily GRU + financial data) [5][73][74]
- **Model Construction Idea**: Enhances the mixed-frequency model by incorporating selected financial data to improve stability and performance across years [73][74].
- **Model Construction Process**:
  - **Financial Data Selection**: Only key financial indicators, such as net profit TTM and market capitalization, are retained to avoid redundancy [45].
  - **Integration**: Financial data is appended to the mixed-frequency model, following the same adjustment process as the GRU with financial data model [36][38].
- **Model Evaluation**: The addition of financial data further stabilizes annual excess returns and improves overall performance metrics [77][80].

---

Model Backtesting Results

GRU Baseline Model
- **Excess Annualized Return**: 8.75% [19][23]
- **IR**: 2.25 [19][23]
- **Maximum Drawdown**: 4.71% [19][23]

GRU with Financial Data
- **Excess Annualized Return**: 6.86% [32][33]
- **IR**: 1.46 [32][34]
- **Maximum Drawdown**: 6.14% [32][34]

GRU with Adjusted Financial Data
- **Excess Annualized Return**: 7.76% [41][44]
- **IR**: 1.65 [41][44]
- **Maximum Drawdown**: 5.40% [41][44]

GRU with Selected Financial Data
- **Excess Annualized Return**: 9.97% [51][52]
- **IR**: 1.93 [51][52]
- **Maximum Drawdown**: 5.70% [51][52]

Mixed-Frequency GRU Model
- **Excess Annualized Return**: 11.32% [68][69]
- **IR**: 2.42 [68][69]
- **Maximum Drawdown**: 8.19% [68][69]

Mixed-Frequency GRU with Financial Data
- **Excess Annualized Return**: 11.82% [77][78]
- **IR**: 2.39 [77][78]
- **Maximum Drawdown**: 5.70% [77][78]
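The baseline's sample preparation and training controls described above (per-field z-score over a 240-day window, a 4:1 chronological split, and early stopping after 10 rounds without improvement) can be sketched as below. The helper names are hypothetical, and monitoring the validation loss for early stopping is an assumption, since the summary only says "the loss function".

```python
import numpy as np

def zscore_window(window):
    """Standardize each field of a (240, n_fields) window over its own 240 values."""
    mean = window.mean(axis=0, keepdims=True)
    std = window.std(axis=0, keepdims=True)
    std = np.where(std == 0, 1.0, std)  # a constant field maps to all zeros
    return (window - mean) / std

def chronological_split(n_samples, train_ratio=0.8):
    """Time-ordered indices: the first 80% train, the last 20% validate (4:1)."""
    cut = int(n_samples * train_ratio)
    idx = np.arange(n_samples)
    return idx[:cut], idx[cut:]

def should_stop(val_losses, patience=10):
    """True once the best loss so far is at least `patience` rounds old."""
    if len(val_losses) <= patience:
        return False
    best_round = int(np.argmin(val_losses))
    return len(val_losses) - 1 - best_round >= patience

rng = np.random.default_rng(0)
window = rng.normal(loc=10.0, scale=2.0, size=(240, 7))  # hypothetical raw fields
z = zscore_window(window)
train_idx, val_idx = chronological_split(72)  # e.g. 72 monthly cross-sections in 6 years
```

Splitting by time rather than at random keeps the validation months strictly after the training months, which avoids leaking future information into the early-stopping decision.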