Machine Learning Factor Stock Selection Monthly Report (June 2025)

Quantitative Models and Construction Methods

GAN_GRU Model

- Model Name: GAN_GRU
- Model Construction Idea: The GAN_GRU model first processes volume-price time series features with a Generative Adversarial Network (GAN), then encodes the processed features with a GRU model to derive the stock selection factor[2][9].
- Model Construction Process:
  1. GRU Model:
     - Volume-Price Features: 18 volume-price features, including closing price, opening price, trading volume, and turnover rate[10][13][15].
     - Training Data and Input Features: Uses the past 400 days of the 18 volume-price features for all stocks, sampled every 5 trading days. Each feature sample has shape 40×18, and the prediction target is the cumulative return over the next 20 trading days[14].
     - Training and Validation Set Ratio: 80% training set, 20% validation set[14].
     - Data Processing: Each feature has extreme values removed and is standardized along the time series, followed by cross-sectional standardization across stocks[14].
     - Model Training Method: Semi-annual rolling training, with training points on June 30 and December 31 of each year[14].
     - Stock Screening Method: Excludes ST stocks and stocks listed for less than half a year[14].
     - Training Sample Screening Method: Excludes samples with empty labels[14].
     - Hyperparameters: batch_size equal to the number of stocks in the cross-section, Adam optimizer, learning rate 1e-4, IC as the loss function, early stopping after 10 rounds, and at most 50 training rounds[14].
     - Model Structure: Two GRU layers (GRU(128, 128)) followed by MLP layers (256, 64, 64); the final output, pRet, is used as the stock selection factor[18].
  2. GAN Model:
     - Generator: Learns the distribution of the real data and generates samples that resemble real data. The loss function is $L_{G} = -\mathbb{E}_{z\sim P_{z}(z)}[\log(D(G(z)))]$, where $z$ is random noise, $G(z)$ is the data generated by the generator, and $D(G(z))$ is the probability output by the discriminator for the generated data[20][21][22].
     - Discriminator: Distinguishes real data from generated data. The loss function is $L_{D} = -\mathbb{E}_{x\sim P_{data}(x)}[\log D(x)] - \mathbb{E}_{z\sim P_{z}(z)}[\log(1-D(G(z)))]$, where $x$ is real data, $D(x)$ is the probability output by the discriminator for real data, and $D(G(z))$ is the probability output by the discriminator for generated data[23][24][25].
     - Training Process: The generator and discriminator are trained alternately until convergence[25][26].
     - Model Structure: Uses an LSTM as the generator to preserve the time series nature of the input features, and a CNN as the discriminator to match the two-dimensional volume-price time series features[29][30][31].
     - Feature Generation: The generator of the trained GAN is used for feature generation, taking the original volume-price time series features as input and outputting the processed features[33][34].

Model Evaluation

- Evaluation: The GAN_GRU model effectively combines GAN and GRU to process and encode volume-price time series features, showing promising results in stock selection[2][9].

Model Backtest Results

- GAN_GRU Model:
  - IC Mean: 11.57%[37][38]
  - ICIR: 0.89[38]
  - Turnover Rate: 0.83[38]
  - Recent IC: -0.28%[37][38]
  - One-Year IC Mean: 11.54%[37][38]
  - Annualized Return: 36.60%[38]
  - Annualized Volatility: 24.02%[38]
  - IR: 1.66[38]
  - Maximum Drawdown: 27.29%[38]
  - Annualized Excess Return: 24.89%[38]

Quantitative Factors and Construction Methods

GAN_GRU Factor

- Factor Name: GAN_GRU Factor
- Factor Construction Idea: The GAN_GRU factor is derived from the GAN_GRU model, which processes volume-price time series features with a GAN and encodes them with a GRU[2][9].
- Factor Construction Process: Same as the GAN_GRU model construction process described above[2][9][10][14][18][19][20][21][22][23][24][25][26][29][30][31][33][34].
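To make the generator and discriminator loss functions in the GAN construction process concrete, the following is a minimal sketch that evaluates $L_G$ and $L_D$ on a small batch, with the expectations replaced by batch averages. The function names and the probability values are hypothetical and chosen only for illustration; the report does not give implementation details, and an actual training loop would compute these quantities on tensors inside a deep learning framework.

```python
import math


def generator_loss(d_fake):
    """L_G = -mean(log D(G(z))) over discriminator outputs on generated samples."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


def discriminator_loss(d_real, d_fake):
    """L_D = -mean(log D(x)) - mean(log(1 - D(G(z))))."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# Hypothetical discriminator outputs (probabilities strictly between 0 and 1).
d_real = [0.90, 0.80, 0.95]  # D(x) on real volume-price samples
d_fake = [0.10, 0.30, 0.20]  # D(G(z)) on generated samples

print("generator loss:", generator_loss(d_fake))
print("discriminator loss:", discriminator_loss(d_real, d_fake))
```

In this sketch the generator loss falls as the discriminator assigns higher probabilities to generated samples, while the discriminator loss falls as it separates real from generated samples more cleanly, which is the tension that alternating training exploits.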
Factor Backtest Results

- GAN_GRU Factor:
  - IC Mean: 11.57%[37][38]
  - ICIR: 0.89[38]
  - Turnover Rate: 0.83[38]
  - Recent IC: -0.28%[37][38]
  - One-Year IC Mean: 11.54%[37][38]
  - Annualized Return: 36.60%[38]
  - Annualized Volatility: 24.02%[38]
  - IR: 1.66[38]
  - Maximum Drawdown: 27.29%[38]
  - Annualized Excess Return: 24.89%[38]
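As an illustration of how figures such as IC Mean and ICIR are commonly computed, the sketch below calculates a cross-sectional IC for each period and then summarizes the series. It rests on two assumptions not confirmed by the report: that IC is the Pearson correlation between factor values and subsequent returns (a rank correlation is another common choice), and that ICIR is the mean IC divided by the sample standard deviation of IC without annualization. The function names and all data are hypothetical.

```python
import statistics


def pearson_ic(factor_values, forward_returns):
    """Cross-sectional IC: Pearson correlation of factor values with later returns."""
    n = len(factor_values)
    mean_f = sum(factor_values) / n
    mean_r = sum(forward_returns) / n
    cov = sum((f - mean_f) * (r - mean_r)
              for f, r in zip(factor_values, forward_returns))
    var_f = sum((f - mean_f) ** 2 for f in factor_values)
    var_r = sum((r - mean_r) ** 2 for r in forward_returns)
    return cov / (var_f * var_r) ** 0.5


def ic_summary(ic_series):
    """Return (IC mean, ICIR), assuming ICIR = mean(IC) / sample std(IC)."""
    ic_mean = statistics.mean(ic_series)
    return ic_mean, ic_mean / statistics.stdev(ic_series)


# Hypothetical data: one (factor values, forward returns) pair per period,
# each covering the same four stocks.
periods = [
    ([0.2, -0.1, 0.5, 0.0], [0.03, -0.01, 0.04, 0.00]),
    ([0.1, 0.4, -0.3, 0.2], [0.00, 0.02, -0.02, 0.03]),
    ([-0.2, 0.3, 0.1, 0.0], [0.01, 0.00, 0.02, -0.01]),
]

ic_series = [pearson_ic(f, r) for f, r in periods]
ic_mean, icir = ic_summary(ic_series)
print("IC mean:", ic_mean)
print("ICIR:", icir)
```

Under the same Pearson assumption, the negative of `pearson_ic` could serve as a training objective when IC is used as the loss function, since minimizing it maximizes the correlation between predictions and realized returns.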