Quantitative Factors and Construction

Factor Name: GAN_GRU Factor

- Construction Idea: The GAN_GRU factor is built in two stages. A Generative Adversarial Network (GAN) model first processes the volume-price time-series features, and a Gated Recurrent Unit (GRU) model then encodes these time-series features to generate a stock selection factor [4][13][41]
- Construction Process:
  1. Input Features: 18 volume-price features, such as closing price, opening price, turnover, and turnover rate, are used as input data. These features are sampled every 5 trading days over the past 400 days, and each sample is a feature matrix of shape (40, 18), covering a 40-day window of the 18 features [14][17][18]
  2. Data Preprocessing:
     - Outlier removal and standardization are applied to each feature over the 40-day time series
     - Cross-sectional standardization is performed at the stock level [18]
  3. GAN Model:
     - Generator: An LSTM-based generator is used to preserve the sequential nature of the input features. The generator takes random noise (e.g., drawn from a Gaussian distribution) as input and generates data that mimics the real data distribution [23][33][37]
     - Discriminator: A CNN-based discriminator is employed to classify real and generated data. The discriminator uses convolutional layers to extract features from the 2D volume-price time-series "images" [33][35]
     - Loss Functions:
       - Generator Loss:
         $$L_G = -\mathbb{E}_{z \sim p_z(z)}\left[\log D(G(z))\right]$$
         where $z$ represents random noise, $G(z)$ is the generated data, and $D(G(z))$ is the discriminator's output probability for the generated data being real [24]
       - Discriminator Loss:
         $$L_D = -\mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right] - \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]$$
         where $x$ is real data, $D(x)$ is the discriminator's output probability for real data, and $D(G(z))$ is the discriminator's output probability for generated data [27]
  4. GRU Model: Two GRU layers (GRU(128,128)) are used to encode the time-series features, followed by an MLP (256,64,64) to predict future returns [22]
  5. Factor Output: The predicted returns from the GRU+MLP model are used as the stock selection factor.
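To make the two GAN loss terms concrete, here is a minimal sketch. It assumes the losses take the standard GAN cross-entropy form: the generator minimizes the negative log of D(G(z)), and the discriminator minimizes the negative log of D(x) plus the negative log of 1 - D(G(z)), each averaged over a batch. The function names and the sample probabilities are hypothetical and are not taken from the report.

```python
import math

def generator_loss(d_fake):
    """Batch mean of -log D(G(z)).

    d_fake: discriminator probabilities (strictly between 0 and 1)
    that each generated sample is real.
    """
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

def discriminator_loss(d_real, d_fake):
    """Batch mean of -log D(x) plus batch mean of -log(1 - D(G(z)))."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# Hypothetical discriminator outputs, for illustration only.
d_real = [0.90, 0.80, 0.95]  # D(x) on real volume-price samples
d_fake = [0.10, 0.30, 0.20]  # D(G(z)) on generated samples

loss_g = generator_loss(d_fake)
loss_d = discriminator_loss(d_real, d_fake)
print(f"generator loss: {loss_g:.4f}, discriminator loss: {loss_d:.4f}")
```

With these illustrative numbers the discriminator separates real from generated samples well, so its loss is small while the generator's loss is large. As the generator improves, D(G(z)) rises and the two losses move in opposite directions.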
The factor is neutralized for industry and market capitalization effects and standardized [22]

Factor Evaluation

- The GAN_GRU factor effectively captures the sequential and cross-sectional characteristics of volume-price data, leveraging the strengths of GANs for feature generation and GRUs for time-series encoding [4][13][41]

---

Factor Backtesting Results

GAN_GRU Factor Performance Metrics

- IC Mean: 11.43% (2019-2025), 10.97% (last year), 9.27% (latest month) [41][42]
- ICIR: 0.89 [42]
- Turnover Rate: 0.82 [42]
- Annualized Return: 38.52% [42]
- Annualized Volatility: 23.82% [42]
- IR: 1.62 [42]
- Maximum Drawdown: 27.29% [42]
- Annualized Excess Return: 24.86% [41][42]

GAN_GRU Factor Industry Performance

- Top 5 Industries by IC (Latest Month):
  - Home Appliances: 27.00%
  - Non-Bank Financials: 23.08%
  - Retail: 20.01%
  - Steel: 14.83%
  - Textiles & Apparel: 13.64% [41][42]
- Top 5 Industries by IC (Last Year):
  - Utilities: 14.43%
  - Retail: 13.33%
  - Non-Bank Financials: 13.28%
  - Steel: 13.23%
  - Telecommunications: 12.36% [41][42]

GAN_GRU Factor Long Portfolio Performance

- Top 5 Industries by Excess Return (Latest Month):
  - Textiles & Apparel: 5.19%
  - Utilities: 3.62%
  - Automobiles: 3.29%
  - Non-Bank Financials: 2.56%
  - Pharmaceuticals: 1.47% [2][43]
- Top 5 Industries by Average Monthly Excess Return (Last Year):
  - Home Appliances: 5.44%
  - Building Materials: 4.70%
  - Textiles & Apparel: 4.19%
  - Agriculture: 4.09%
  - Utilities: 3.92% [2][43]
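The IC and ICIR figures above can be illustrated with a small sketch. The report does not spell out its formulas, so this assumes two common conventions: IC is the cross-sectional Pearson correlation between factor values and next-period returns, and ICIR is the mean of the per-period IC series divided by its sample standard deviation. The function names and all sample data are hypothetical.

```python
import math
import statistics

def pearson_ic(factor, fwd_ret):
    """Cross-sectional Pearson correlation between factor values and
    forward returns for one period (assumed definition of IC)."""
    mf = statistics.fmean(factor)
    mr = statistics.fmean(fwd_ret)
    cov = sum((f - mf) * (r - mr) for f, r in zip(factor, fwd_ret))
    var_f = sum((f - mf) ** 2 for f in factor)
    var_r = sum((r - mr) ** 2 for r in fwd_ret)
    return cov / math.sqrt(var_f * var_r)

def icir(ic_series):
    """Mean IC divided by the sample standard deviation of IC
    (assumed definition, with no annualization)."""
    return statistics.fmean(ic_series) / statistics.stdev(ic_series)

# Hypothetical one-month cross-section of four stocks.
factor_values = [0.5, -0.2, 0.1, 0.9]
next_month_returns = [0.03, -0.01, 0.00, 0.04]
ic_this_month = pearson_ic(factor_values, next_month_returns)

# Hypothetical monthly IC series.
monthly_ic = [0.12, 0.08, 0.15, -0.02, 0.11]
print(f"IC: {ic_this_month:.3f}, ICIR: {icir(monthly_ic):.3f}")

# Ratio of the reported annualized return to annualized volatility.
return_to_vol = 38.52 / 23.82
```

The reported IR of 1.62 is numerically consistent with the annualized return divided by the annualized volatility (38.52 / 23.82), although the source does not state which formula it uses.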
Machine Learning Factor Stock Selection Monthly Report (August 2025) - 20250730
Southwest Securities·2025-07-30 05:43