RELIABILITY ENGINEERING & SYSTEM SAFETY, v.266, pp.111637
Abstract
Accurate fault diagnosis using deep learning (DL) has become essential for effective quality control, maintenance, and process automation in various industrial processes. However, constructing the large-scale labeled datasets needed to train DL-based predictive models entails considerable cost and labor, so an efficient labeling strategy is required. While active learning (AL) has been a prominent solution for efficient data labeling in fault diagnosis, existing AL approaches are ill-suited to low-budget scenarios, in which there is too little labeled data to train the model stably. To address this, this work proposes a novel method, hybrid deep active learning for low-budget scenarios (HDAL-LB), that tackles the challenges of the label-scarce regime to perform efficient fault diagnosis. First, self-supervised learning is performed with a deep stacked residual variational auto-encoder to efficiently initialize an encoder for latent feature extraction. Second, an evidential learning-based training technique is developed to generate calibrated predictive uncertainty at low computational cost. Third, a hybrid query selection is systematically formulated within a combinatorial optimization framework, exploiting both uncertainty and data diversity for deep AL. The efficacy of HDAL-LB in fault diagnosis is validated through four case studies, using three public benchmark datasets and one private real-world dataset. The experimental results show that HDAL-LB outperforms existing baseline and state-of-the-art (SOTA) AL methods under low-budget scenarios. Furthermore, extensive ablation studies show that HDAL-LB maintains effective fault diagnosis performance across various experimental settings, highlighting its label efficiency and practical applicability.
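As a rough illustration of how uncertainty and diversity can be combined in a single query rule, the sketch below scores each unlabeled sample by an evidential uncertainty plus its distance to the samples already chosen. It is not the HDAL-LB formulation: the abstract does not give the objective, so the greedy rule here is a simple stand-in for the combinatorial optimization, and the uncertainty u = K / sum(alpha) with alpha = evidence + 1 is the common Dirichlet-based definition from the evidential deep learning literature, assumed rather than taken from the paper. The names `dirichlet_uncertainty`, `hybrid_select`, and `trade_off` are hypothetical.

```python
import math


def dirichlet_uncertainty(evidence):
    """Uncertainty mass u = K / sum(alpha), with alpha = evidence + 1.

    Assumption: this is the common evidential deep learning definition,
    not necessarily the one used by HDAL-LB.
    """
    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]
    return num_classes / sum(alpha)


def hybrid_select(features, evidences, budget, trade_off=1.0):
    """Greedily pick `budget` sample indices to send for labeling.

    Score = uncertainty + trade_off * (distance to nearest selected sample).
    Assumption: a greedy stand-in for a combinatorial query objective.
    """
    uncertainty = [dirichlet_uncertainty(e) for e in evidences]
    selected = []
    remaining = set(range(len(features)))

    def score(i):
        if not selected:
            # Nothing chosen yet: fall back to pure uncertainty.
            return uncertainty[i]
        nearest = min(math.dist(features[i], features[j]) for j in selected)
        return uncertainty[i] + trade_off * nearest

    while remaining and len(selected) < budget:
        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    # Toy latent features (two tight clusters) and per-class evidence.
    feats = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
    evid = [[0, 0, 0], [0.5, 0, 0], [10, 0, 0], [0, 9, 0]]
    print(hybrid_select(feats, evid, budget=2))
    print(hybrid_select(feats, evid, budget=2, trade_off=0.0))
```

With `trade_off=0.0` the rule reduces to pure uncertainty sampling, which on the toy data picks two nearly identical points from the same cluster. A positive `trade_off` pushes the second query toward the other cluster, the kind of redundancy a diversity term is meant to avoid when the labeling budget is small.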