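FISHER, the first multi-modal industrial signal foundation model, open-sources its weights; from Tsinghua, Shanghai Jiao Tong University, and others
36Kr · 2025-07-24 10:36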
Core Viewpoint
- The article introduces FISHER, a unified modeling framework for heterogeneous industrial signals, developed by researchers from Tsinghua University, Shanghai Jiao Tong University, Beijing Huakong Zhijia Technology Co., Ltd., and North China Electric Power University. The model aims to address the challenges posed by the M5 problem in industrial signal analysis [2][5].

Research Background
- As more sensors are installed on industrial equipment, efficiently analyzing the diverse signals they generate has become a significant challenge. The M5 problem encompasses multiple modalities, sampling rates, scales, tasks, and fault occurrences, which complicates the analysis [3][4].
- Existing methods typically focus on narrow ranges of industrial signals and fail to leverage the advantages of large-scale training data and the complementarity between different modalities [3][4].

Research Motivation
- Despite the apparent differences between industrial signals, their underlying features and semantic information are often similar. This similarity makes it possible to use a single model to unify the modeling of heterogeneous industrial signals [5].

FISHER Model Introduction
- FISHER is the first foundation model designed for multi-modal industrial signals. It uses sub-bands as its modeling units and employs a "building block" approach to represent entire signals, so it can handle industrial signals with any sampling rate [7][9].
- The model uses the Short-Time Fourier Transform (STFT) for its input features, which is crucial for capturing high-frequency fault components and for maintaining consistent time-frequency resolution across different sampling rates [8][9].

Model Architecture
- FISHER consists of a ViT Encoder and a CNN Decoder and is pre-trained with a "teacher-student" self-distillation method.
- The model processes 80% of the masked sub-bands and reconstructs the signal representation by combining the masked and unmasked parts [11].

Open Source Availability
- FISHER has been released in three sizes: tiny (5.5M), mini (10M), and small (22M), all pre-trained on a mixed dataset totaling 17,000 hours [12].

RMIS Benchmark Introduction
- The RMIS benchmark includes five anomaly detection datasets and thirteen fault diagnosis datasets across four modalities, and is designed to evaluate model performance on various health management tasks [16][17].

Experimental Results
- FISHER outperformed baseline models on the RMIS benchmark, with improvements of at least 3.91%, 4.34%, and 5.03% across its three versions. It demonstrated strong generalization, excelling particularly in fault diagnosis tasks because it can use the complete frequency band [20][22].

Scaling Effects
- FISHER scaled better than the baseline systems: even the smallest FISHER version surpassed all baseline models. The study suggests that data cleaning and increasing data ratios are critical for scaling up the model effectively [24].

Variable Splitting Ratio
- FISHER achieved the highest area under the curve in variable splitting scenarios, indicating robust performance even when the data split changes [27].
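
To make the sub-band idea from the model introduction more concrete, here is a minimal NumPy sketch of how an STFT with a fixed window duration yields the same time-frequency resolution at any sampling rate, so that a higher sampling rate simply adds more sub-band "building blocks". This is an illustration under stated assumptions, not the authors' implementation: the 25 ms window, 10 ms hop, and 50-bin sub-band width are hypothetical values, and the function names `stft_magnitude`, `split_sub_bands`, and `random_band_mask` are invented for this example. Only the 80% figure comes from the article, and the masking here is a simplified stand-in for the actual pre-training procedure.

```python
import numpy as np

WINDOW_SEC = 0.025  # assumed window duration (not from the article)
HOP_SEC = 0.010     # assumed hop duration (not from the article)
BAND_BINS = 50      # assumed number of frequency bins per sub-band
MASK_RATIO = 0.8    # the 80% figure quoted in the article


def stft_magnitude(signal, sample_rate):
    """Magnitude STFT with window and hop fixed in seconds.

    Because the window lasts WINDOW_SEC at every sampling rate, the bin
    spacing is always 1 / WINDOW_SEC Hz; a higher sampling rate only adds
    more bins at the top of the spectrum.
    """
    win = int(round(WINDOW_SEC * sample_rate))
    hop = int(round(HOP_SEC * sample_rate))
    window = np.hanning(win)
    n_frames = 1 + (len(signal) - win) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + win] * window for i in range(n_frames)]
    )
    # Shape: (frequency bins, time frames)
    return np.abs(np.fft.rfft(frames, axis=1)).T


def split_sub_bands(spec, band_bins=BAND_BINS):
    """Cut the spectrogram into equal-width sub-bands along frequency.

    Leftover bins at the top that do not fill a whole sub-band are dropped.
    Returns an array of shape (n_bands, band_bins, time frames).
    """
    n_bands = spec.shape[0] // band_bins
    return spec[: n_bands * band_bins].reshape(n_bands, band_bins, spec.shape[1])


def random_band_mask(n_bands, ratio=MASK_RATIO, seed=0):
    """Boolean mask marking roughly `ratio` of the sub-bands as masked."""
    rng = np.random.default_rng(seed)
    n_masked = int(round(ratio * n_bands))
    mask = np.zeros(n_bands, dtype=bool)
    mask[rng.choice(n_bands, size=n_masked, replace=False)] = True
    return mask


if __name__ == "__main__":
    for sr in (16000, 48000):
        t = np.arange(sr) / sr                 # one second of signal
        tone = np.sin(2 * np.pi * 1000 * t)    # 1 kHz test tone
        bands = split_sub_bands(stft_magnitude(tone, sr))
        mask = random_band_mask(bands.shape[0])
        print(sr, bands.shape, int(mask.sum()), "sub-bands masked")
```

With these assumed settings, each sub-band spans 2 kHz at either rate; a one-second signal at 16 kHz produces 4 sub-bands and the same signal at 48 kHz produces 12, with an identical number of time frames. That is the sense in which a single model could consume signals of any sampling rate by stacking more or fewer sub-bands.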