In advanced manufacturing, cryogenic machining has emerged as a technique for reducing cutting temperatures and limiting heat buildup by applying liquid nitrogen to the workpiece. However, the intense cutting forces generated during cryogenic machining can cause abrupt tool failure, such as chipping, which becomes a greater concern than gradual tool wear. The characteristics of cryogenic processes make it difficult to inspect tool failure visually or to collect sufficient data, so the available datasets are both small and imbalanced. Furthermore, conventional data-driven models often degrade when process conditions differ from those seen in training, producing erroneous predictions that disrupt the manufacturing workflow. There is therefore a need for data-driven tool-condition prediction models that remain robust across varying conditions, especially in data-deficient cryogenic milling processes. This work presents the Supervised Contrastive Gated Recurrent Unit AutoEncoder (SupCGAE), a model that addresses diverse process conditions in cryogenic milling under data scarcity. SupCGAE performs multi-source domain generalization by integrating a supervised autoencoder architecture with supervised contrastive learning. Using cutting force data, SupCGAE learns generalizable features from multiple source domains, where each domain corresponds to a set of process conditions, and uses them to predict tool failure in target domains that were not seen during training. The model retains its diagnostic ability despite class imbalance in the dataset, where one class (failure or normal) is rare yet critical to detect. In experiments, the model shows high predictive power for tool-failure diagnosis and maintains consistent performance across the process conditions tested. Applying this model to cryogenic milling can improve productivity and cost efficiency by enabling timely tool replacement based on reliable predictions.
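To make the supervised contrastive component concrete, the following is a minimal NumPy sketch of a supervised contrastive loss in its commonly used form: embeddings that share a class label are pulled together, and all other embeddings in the batch are pushed apart. It is an illustration under stated assumptions, not the SupCGAE implementation. The function name `supcon_loss`, the `temperature` value, and the toy embeddings are hypothetical, and in the model the embeddings would presumably come from the GRU encoder applied to cutting force sequences.

```python
import numpy as np


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss averaged over anchors that have a positive.

    embeddings: (n, d) array of latent features (hypothetical encoder outputs).
    labels:     (n,) array of class labels, e.g. 0 = normal, 1 = failure.
    temperature is an assumed hyperparameter, not a value from the paper.
    """
    z = np.asarray(embeddings, dtype=float)
    y = np.asarray(labels)
    n = z.shape[0]

    self_mask = np.eye(n, dtype=bool)
    # Positives: other samples in the batch with the same label as the anchor.
    pos_mask = (y[:, None] == y[None, :]) & ~self_mask
    n_pos = pos_mask.sum(axis=1)
    valid = n_pos > 0
    if not valid.any():
        return 0.0  # no anchor has a positive partner in this batch

    # L2-normalise so that dot products are cosine similarities.
    norms = np.maximum(np.linalg.norm(z, axis=1, keepdims=True), 1e-12)
    z = z / norms
    sim = (z @ z.T) / temperature

    # Log-sum-exp over every sample except the anchor itself.
    sim_no_self = np.where(self_mask, -np.inf, sim)
    row_max = sim_no_self.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(
        np.exp(sim_no_self - row_max).sum(axis=1, keepdims=True)
    )
    log_prob = sim - log_denom

    # Mean log-probability of the positives for each valid anchor.
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_log_prob[valid] / n_pos[valid]
    return float(per_anchor.mean())


# Toy batch: two tight clusters in a 2-D latent space.
toy_embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
loss_aligned = supcon_loss(toy_embeddings, [0, 0, 1, 1])  # labels match clusters
loss_mixed = supcon_loss(toy_embeddings, [0, 1, 0, 1])    # labels cut across clusters
print(loss_aligned < loss_mixed)
```

In a supervised autoencoder setup, a term like this would typically be added to a reconstruction loss and a classification loss. The abstract does not say how those terms are weighted in SupCGAE, so any weighting scheme would be a further assumption.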