Core Viewpoint
- Temporal data augmentation is increasingly recognized as a technique for improving the generalization and robustness of quantitative models in finance, addressing the problem that investors largely train on the same homogeneous data sources [1][4][5].

Group 1: Temporal Data Augmentation
- Temporal data augmentation spans strategies such as shifting, scaling, perturbation, cropping, and synthesis, enriching the training sample space without introducing any additional information [1][4].
- The technique applies to traditional machine learning models and also integrates directly into deep learning architectures and reinforcement learning systems, broadening the expressiveness and adaptability of quantitative strategies [1][4].

Group 2: Application Methodology
- The study uses a GRU as the representative deep learning model and tests whether temporal data augmentation improves performance while the original input data, network, loss function, and hyperparameter settings are held fixed [1][58].
- Two training modes are compared: applying augmentation with a fixed probability p, and applying it with a probability p that decays linearly over the course of training (see the scheduling sketch after this summary) [2][63].

Group 3: Empirical Analysis
- Under the fixed-probability mode, no significant improvement in factor performance was observed; under the linearly decaying mode, several augmentation factors improved both RankIC and annualized returns [2][67].
- Specifically, combining the augmentation factors with the original-data factors raised the RankIC mean by 1.2% and improved the annualized returns of the long and short portfolios by 2.81% and 7.65%, respectively [2][75].

Group 4: Data Augmentation Techniques
- The study benchmarks eight temporal data augmentation techniques against the original data: jittering, scaling, rotation, permutation, magnitude warping, time warping, window slicing, and window warping (a sketch of representative transforms follows this summary) [58][67].
- Among these techniques, jittering and scaling show the highest correlation with the original data, indicating the least disruption of the temporal information [59].

Group 5: Performance Metrics
- Under fixed probability p, jittering and scaling achieve the highest RankIC win rates, while rotation and time warping incur significant information loss [68].
- Under linearly decaying probability p, jittering delivers the largest improvement, with a RankIC mean of 13.30% and an annualized return of 55.35% [75].
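The report does not publish code, so the following is a minimal NumPy sketch of four of the eight transform families it names (jittering, scaling, permutation, window slicing). The noise levels, segment count, crop ratio, and the assumed array layout of time steps × feature channels are illustrative assumptions, not parameters from the study.

```python
import numpy as np

def jitter(x: np.ndarray, sigma: float = 0.03) -> np.ndarray:
    """Jittering: add i.i.d. Gaussian noise to every time step (sigma is an assumption)."""
    return x + np.random.normal(0.0, sigma, size=x.shape)

def scale(x: np.ndarray, sigma: float = 0.1) -> np.ndarray:
    """Scaling: multiply each feature channel by one random factor."""
    factors = np.random.normal(1.0, sigma, size=(1, x.shape[1]))
    return x * factors

def permute(x: np.ndarray, n_segments: int = 4) -> np.ndarray:
    """Permutation: split the series into segments and shuffle their order."""
    segments = np.array_split(x, n_segments, axis=0)
    order = np.random.permutation(len(segments))
    return np.concatenate([segments[i] for i in order], axis=0)

def window_slice(x: np.ndarray, ratio: float = 0.9) -> np.ndarray:
    """Window slicing: crop a random window, then stretch it back to full length."""
    t = x.shape[0]
    w = max(1, int(t * ratio))
    start = np.random.randint(0, t - w + 1)
    window = x[start:start + w]
    old_idx = np.linspace(0, w - 1, num=w)
    new_idx = np.linspace(0, w - 1, num=t)
    # Linear interpolation back to the original length, channel by channel.
    return np.stack([np.interp(new_idx, old_idx, window[:, c])
                     for c in range(x.shape[1])], axis=1)

# Example: a 40-day window of 6 daily features for one stock (shapes are preserved).
x = np.random.randn(40, 6)
for f in (jitter, scale, permute, window_slice):
    assert f(x).shape == x.shape
```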
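Likewise a sketch rather than the report's implementation: the two training modes it compares, augmenting each batch with a fixed probability p versus a p that decays linearly to zero across epochs. The initial probability p0 = 0.5, the epoch count, and the step/augment callables are hypothetical stand-ins.

```python
import random

def decayed_p(epoch: int, n_epochs: int, p0: float = 0.5) -> float:
    """Linearly decay the augmentation probability from p0 (epoch 0) to 0 (last epoch)."""
    return p0 * (1.0 - epoch / max(1, n_epochs - 1))

def train(step, batches, augment, n_epochs=20, fixed_p=None):
    """Run either mode: pass fixed_p for a constant p, else p decays linearly."""
    for epoch in range(n_epochs):
        p = fixed_p if fixed_p is not None else decayed_p(epoch, n_epochs)
        for x, y in batches:
            if random.random() < p:   # replace this batch with its augmented view
                x = augment(x)
            step(x, y)                # caller-supplied gradient update (e.g. on a GRU)

# Usage with stand-in pieces, purely to show the control flow:
data = [([1.0, 2.0, 3.0], 0.1)] * 8
train(step=lambda x, y: None, batches=data,
      augment=lambda x: [v + 0.01 for v in x], n_epochs=5)
```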
【广发金工】Time-Series Data Augmentation Methods for General-Purpose Models
广发金融工程研究·2025-07-31 03:11