Machine Learning Stock Selection

Machine Learning Factor Stock Selection Monthly Report (October 2025) - 20250930
Southwest Securities· 2025-09-30 04:03
- The GAN_GRU factor is derived from the GAN_GRU model, which uses a Generative Adversarial Network (GAN) to process volume-price time series features and then a GRU model to encode those features into the stock selection factor[4][13][14]
- The GAN_GRU model includes two GRU layers (GRU(128, 128)) followed by an MLP (256, 64, 64); the final output, the predicted return (pRet), is used as the stock selection factor[22]
- The GAN model consists of a generator and a discriminator. The generator aims to generate data that appears real, while the discriminator aims to distinguish between real and generated data. The generator's loss function is $L_{G} = -\mathbb{E}_{z\sim P_{z}(z)}[\log(D(G(z)))]$[23][24][25]
- The discriminator's loss function is $L_{D} = -\mathbb{E}_{x\sim P_{data}(x)}[\log D(x)] - \mathbb{E}_{z\sim P_{z}(z)}[\log(1-D(G(z)))]$[27][28][29]
- The GAN_GRU model is trained by alternating between the generator and the discriminator until convergence[30]
- From January 2019 to September 2025, the GAN_GRU factor shows an IC mean of 0.1136 and an annualized excess return of 22.58%; the most recent IC, as of September 28, 2025, is 0.1053[41][42]
- The GAN_GRU factor's IC mean for the past year is 0.0982, with the highest IC values in the coal, building materials, social services, non-bank finance, and food & beverage industries[42][44]
- The top-performing long portfolios in September 2025, based on the GAN_GRU factor, are in building materials, steel, social services, coal, and non-bank finance, with excess returns of 5.78%, 5.13%, 1.91%, 1.55%, and 1.21%, respectively[45]
- Over the past year, the top-performing long portfolios based on the GAN_GRU factor are in home appliances, building materials, food & beverage, utilities, and textiles & apparel, with average monthly excess returns of 5.04%, 4.96%, 3.92%, 3.53%, and 3.10%, respectively[46]
- The top stocks in each industry based on the GAN_GRU factor as of September 28, 2025, include companies like Baolaite, Yutaiwei-U, Cangge Mining, Tuowei Information, Hengtong Co., Angang Co., and others[49][50]
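The two GAN loss functions quoted above can be evaluated numerically once the discriminator's outputs are known. The following is a minimal sketch, not the report's implementation: it assumes the expectations are estimated as plain sample means, and the function names and the example probabilities are hypothetical.

```python
import math

def generator_loss(d_fake):
    """L_G = -E_z[log D(G(z))], estimated as a sample mean."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

def discriminator_loss(d_real, d_fake):
    """L_D = -E_x[log D(x)] - E_z[log(1 - D(G(z)))], estimated as sample means."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# Hypothetical discriminator outputs (probabilities strictly between 0 and 1).
d_real = [0.9, 0.8, 0.95]  # D(x) on real samples
d_fake = [0.2, 0.1, 0.3]   # D(G(z)) on generated samples

print("generator loss:", generator_loss(d_fake))
print("discriminator loss:", discriminator_loss(d_real, d_fake))
```

In alternating training, the discriminator step lowers `discriminator_loss` while the generator step lowers `generator_loss`. When the discriminator cannot tell the two sources apart (every output equal to 0.5), the losses are log 2 and 2 log 2 respectively.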
[GF Financial Engineering] Machine Learning Stock Selection Training Manual
GF Financial Engineering Research· 2025-06-20 06:25
Core Viewpoint
- The article discusses the increasing application of machine learning in quantitative stock selection, particularly focusing on GBDT and neural network models, as traditional factors have become less effective [1][4].

Group 1: Model Selection
- Machine learning has been widely adopted in quantitative stock selection, with GBDT models (including LGBM, XGBoost, and CatBoost) and neural networks (including GRU, TCN, and Transformer) being the primary focus [1].
- GBDT models are effective for handling manually constructed features, while neural networks excel in capturing temporal changes in features [2].

Group 2: Feature Data Preparation
- Different model types require different feature types; tree models handle price and fundamental features well, while neural networks perform better with high-frequency data [22][27].
- Feature selection methods, particularly SHAP, can effectively reduce the number of features while maintaining model performance [2][31].
- Standardization of features before feeding them into models is crucial for improving model performance [2][35].

Group 3: Loss Function Adjustment and Prediction Target Processing
- Besides the common MSE loss function, investors often use IC as a loss function, with various ranking loss functions showing improved performance [2][37].
- Using cross-sectional normalization helps the model focus on differences in cross-sectional returns, enhancing factor performance [3][50].

Group 4: Machine Learning Models
- GBDT is highlighted as a superior algorithm due to its iterative approach of updating target values based on residuals from previous trees [10][11].
- Neural networks, including RNN, LSTM, GRU, CNN, TCN, and Transformer, are discussed for their effectiveness in various domains, particularly in time series prediction [12][19].
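The cross-sectional normalization and IC-based loss mentioned under Group 3 can be illustrated with a small sketch. It assumes the IC is the Pearson correlation between predictions and realized returns on a single date and that the loss is simply the negative IC; the article does not give its exact formulation, and the function names and sample numbers are hypothetical.

```python
import math

def cross_sectional_zscore(values):
    """Standardize one date's values across stocks to mean 0 and unit std."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        return [0.0] * n  # no cross-sectional dispersion on this date
    return [(v - mean) / std for v in values]

def pearson_ic(pred, target):
    """Pearson correlation between predictions and realized returns on one date."""
    zp = cross_sectional_zscore(pred)
    zt = cross_sectional_zscore(target)
    return sum(a * b for a, b in zip(zp, zt)) / len(pred)

def ic_loss(pred, target):
    """Negative IC, so that minimizing the loss maximizes the IC."""
    return -pearson_ic(pred, target)

# Hypothetical predictions and next-period returns for five stocks on one date.
pred = [0.02, -0.01, 0.03, 0.00, -0.02]
realized = [0.015, -0.020, 0.010, 0.005, -0.030]

print("IC:", pearson_ic(pred, realized))
print("IC loss:", ic_loss(pred, realized))
```

Because both inputs are standardized within the date, a market-wide shift in returns leaves the IC unchanged, which is one way to see why cross-sectional normalization directs the model toward relative differences between stocks.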
Group 5: Index Enhancement Strategies
- The article presents the performance of various index enhancement strategies, with the CSI 300 strategy showing an annualized excess return of 10.03% and a maximum drawdown of -5.42% [3].
- The CSI 500 strategy has a lower annualized excess return of 8.41% with a maximum drawdown of -10.78%, while the CSI 1000 strategy shows an annualized excess return of 11.44% and a maximum drawdown of -7.95% [3].
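For readers unfamiliar with the two statistics quoted for each strategy, the sketch below shows one common way to compute them from a series of periodic excess returns. The article does not state its calculation method, so the monthly frequency, the geometric compounding, and the sample data here are all assumptions.

```python
def annualized_excess_return(excess_returns, periods_per_year=12):
    """Geometric annualization of a series of periodic excess returns."""
    nav = 1.0
    for r in excess_returns:
        nav *= 1.0 + r
    return nav ** (periods_per_year / len(excess_returns)) - 1.0

def max_drawdown(excess_returns):
    """Largest peak-to-trough decline of the cumulative excess NAV (<= 0)."""
    nav, peak, worst = 1.0, 1.0, 0.0
    for r in excess_returns:
        nav *= 1.0 + r
        peak = max(peak, nav)
        worst = min(worst, nav / peak - 1.0)
    return worst

# Hypothetical monthly excess returns of a strategy over its benchmark.
monthly_excess = [0.012, -0.008, 0.015, 0.004, -0.020, 0.010]

print("annualized excess return:", annualized_excess_return(monthly_excess))
print("maximum drawdown:", max_drawdown(monthly_excess))
```

The drawdown is returned as a non-positive number, matching the sign convention of the figures above.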