This thesis presents Semantic-GSL, a novel framework that integrates semantic descriptions into the probabilistic estimation process for indoor gas source localization (GSL) by leveraging Vision-Language Models (VLMs). In complex indoor environments, conventional GSL approaches face significant challenges due to turbulent airflow, obstacles, and non-stationary gas dispersion, which produce sparse and irregular concentration patterns. Consequently, mobile robots relying solely on chemical sensors often suffer from slow convergence or become trapped in local minima of the information-theoretic objective. While vision-based methods have been introduced to mitigate these issues, they are fundamentally limited by the closed-set assumption of traditional object detectors (e.g., YOLO), which fail to recognize potential sources that are not explicitly defined in their training categories. This limitation restricts the robot’s ability to identify diverse or previously unseen leak sources in realistic scenarios.

To address these limitations, we exploit the real-world prior that emission sources are typically spatially associated with semantically relevant objects (e.g., gas cylinders, valves) and propose a two-stage VLM framework. This framework extracts semantic information about high-probability source objects from general environmental descriptions (e.g., “an object likely to leak gas”) without requiring predefined object names. In the first stage, Grounding DINO serves as a region proposal network, generating a broad set of candidate objects based on the text prompt. In the second stage, CLIP refines these candidates by evaluating the semantic similarity between the visual features and the text description, filtering out irrelevant detections and robustly identifying semantic cues even for unseen objects.

The extracted semantic information is then incorporated into the proposed Semantic-Informed Particle Filter (SIPF). Unlike conventional methods that treat visual data as a separate likelihood update, SIPF uses the Metropolis-Hastings (MH) algorithm to modify the proposal distribution within the particle filter. Specifically, the MH step biases the sampling process to redistribute particles toward semantically relevant regions while maintaining spatial continuity through a distance-based transition kernel. This process reshapes the prior belief before the measurement update, allowing the estimate to converge rapidly toward the true source even when gas measurements are intermittent.

We validate the effectiveness of Semantic-GSL through extensive simulations in four distinct indoor environments with varying layouts and obstacle configurations, as well as real-world experiments using a mobile robot platform. The results demonstrate that the proposed method significantly outperforms existing state-of-the-art approaches, including dual-mode and mode-change strategies, achieving a higher success rate, reduced search time, and lower estimation error, and thereby demonstrating its robustness and practicality in unknown indoor environments.
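
As a rough illustration of the MH redistribution idea described in this abstract, the following minimal sketch moves particles toward semantically relevant regions before any measurement update. It is not the thesis implementation: the form of the semantic density (Gaussian bumps at detected object positions, weighted by a similarity score, plus a uniform floor), the Gaussian random walk standing in for the distance-based transition kernel, and all names and parameter values (`semantic_density`, `mh_redistribute`, `sigma`, `floor`, `step`) are assumptions made for the example.

```python
import numpy as np

def semantic_density(xy, objects, scores, sigma=1.0, floor=0.05):
    """Unnormalized semantic prior (assumed form): one Gaussian bump per
    detected object, weighted by its similarity score, plus a uniform floor
    so that regions without detections keep nonzero probability."""
    d2 = ((xy[:, None, :] - objects[None, :, :]) ** 2).sum(axis=-1)
    return floor + (scores[None, :] * np.exp(-0.5 * d2 / sigma**2)).sum(axis=1)

def mh_redistribute(particles, objects, scores, step=0.3, n_iters=20, rng=None):
    """Run independent Metropolis-Hastings chains, one per particle.
    The symmetric Gaussian random walk keeps each move local (a stand-in for
    a distance-based transition kernel), so the acceptance ratio reduces to
    the ratio of semantic densities. Map bounds and obstacles are ignored."""
    rng = np.random.default_rng() if rng is None else rng
    x = particles.copy()
    p_x = semantic_density(x, objects, scores)
    for _ in range(n_iters):
        cand = x + rng.normal(scale=step, size=x.shape)  # local proposal
        p_c = semantic_density(cand, objects, scores)
        accept = rng.random(len(x)) < p_c / p_x          # MH acceptance test
        x[accept] = cand[accept]
        p_x[accept] = p_c[accept]
    return x

# Hypothetical usage: 500 particles in a 10 m x 10 m room, one detected object.
rng = np.random.default_rng(0)
particles = rng.uniform(0.0, 10.0, size=(500, 2))
objects = np.array([[2.0, 8.0]])   # assumed object position on the map
scores = np.array([0.9])           # assumed text-image similarity score
moved = mh_redistribute(particles, objects, scores, n_iters=50, rng=rng)
```

In this sketch the uniform floor keeps particles far from any detection from being discarded outright, which matters if the true source is not next to a detected object; how the thesis balances semantic and non-semantic regions is not stated in the abstract.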
Publisher
Ulsan National Institute of Science and Technology