Inland water systems are essential resources for human survival but can become vectors for disease transmission when contaminated by harmful algal blooms (HABs) and bacteria such as Escherichia coli (E. coli). Effective monitoring of microbial water quality is critical for public health protection, but traditional methods are costly and resource-intensive. This research integrates remote sensing techniques and machine learning models, including convolutional neural networks (CNNs) and Random Forest (RF), to estimate microbial water quality parameters and evaluate model performance. First, four state-of-the-art data-driven model structures were applied to monitor the vertical distribution of chlorophyll-a (Chl-a), phycocyanin (PC), and turbidity (Turb) using drone-borne hyperspectral imagery, in-situ measurements, and meteorological data (Chapter 3). One of the models performed notably well, and Gradient-weighted Class Activation Mapping (Grad-CAM) identified informative reflectance band ranges for the vertical estimation of pigments. Second, a spatial attention CNN was applied to remote sensing reflectance data to estimate Chl-a and PC concentrations, and thereby assess cyanobacteria, in the Geum, Nakdong, and Yeongsan rivers in South Korea (Chapter 4). The spatial attention CNN model outperformed conventional bio-optical algorithms, showing high potential for broader application across diverse water bodies by accounting for their unique optical properties. Third, E. coli concentrations in an irrigation pond were estimated using demosaiced natural color (red, green, and blue: RGB) imagery in the visible and infrared spectral ranges, along with 14 water quality parameters (Chapter 5). Two data-splitting methods, ordinary and quantile data splitting, were used to compare model performance metrics.
The combination of water quality parameters and RGB imagery data resulted in a higher R² value for the test dataset, demonstrating that demosaiced RGB imagery is a useful predictor of E. coli concentration. Lastly, an RF algorithm was developed to estimate E. coli concentrations in irrigation pond water using (a) 17 water quality variables, (b) reflectance in five spectral bands, and (c) 24 remote sensing indices derived from these reflectance values (Chapter 6). While the accuracy of the RF model with five reflectance values was moderate, the RF model utilizing remote sensing indices achieved the highest testing R². This predictive strength of remote sensing indices can be attributed to their ability to characterize water quality factors critical for E. coli survival. These approaches are expected to enhance the applicability of machine learning models for estimating the microbial quality of inland water systems, providing valuable tools for water quality management and policy-making.
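As an illustration of the two data-splitting methods named for Chapter 5, the sketch below contrasts a plain random split with a quantile-based one. The abstract does not define quantile data splitting, so the reading used here is an assumption: samples are binned by quantiles of the target concentration and each bin contributes the same fraction to the test set, so that training and test sets cover a similar range of values. The function names, the number of bins, the test fraction, and the synthetic log-normal data are all hypothetical and are not taken from the thesis.

```python
import numpy as np

def ordinary_split(n, test_fraction=0.25, seed=0):
    """Plain random split of n sample indices into (train, test)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n)
    n_test = int(round(test_fraction * n))
    return np.sort(order[n_test:]), np.sort(order[:n_test])

def quantile_split(y, n_bins=4, test_fraction=0.25, seed=0):
    """Assumed reading of 'quantile data splitting': stratify by quantile bins of y."""
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    # Interior quantile edges (quartiles when n_bins == 4).
    edges = np.quantile(y, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(y, edges)  # bin label in 0..n_bins-1 for each sample
    train, test = [], []
    for b in np.unique(bins):
        members = rng.permutation(np.flatnonzero(bins == b))
        n_test = max(1, int(round(test_fraction * members.size)))
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return np.sort(np.array(train, dtype=int)), np.sort(np.array(test, dtype=int))

# Synthetic, skewed concentrations for demonstration only (not thesis data).
demo_rng = np.random.default_rng(42)
y = demo_rng.lognormal(mean=3.0, sigma=1.0, size=80)

o_train, o_test = ordinary_split(len(y))
train_idx, test_idx = quantile_split(y)
print("ordinary test range:", y[o_test].min(), y[o_test].max())
print("quantile test range:", y[test_idx].min(), y[test_idx].max())
```

With skewed targets such as bacterial concentrations, a stratified split of this kind guarantees that every quantile bin is represented in the test set, which a purely random split does not.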
Publisher
Ulsan National Institute of Science and Technology
Degree
Doctor
Major
Department of Civil, Urban, Earth, and Environmental Engineering (Environmental Science and Engineering)