With the rapid increase in the volume of data collected from diverse sources, Knowledge Graphs (KGs) have emerged as essential tools for extracting and integrating knowledge from raw data. Initially designed for information retrieval, KGs are now widely used in artificial intelligence applications, including question answering and recommender systems. However, several critical challenges remain in applying KGs effectively to real-world problems. First, constructing a domain-specific KG from scratch is highly complex and often lacks automation, especially when no pre-existing KG is available. Second, the practical use of KGs requires not only their construction but also structural learning to extract and leverage the knowledge embedded within the graph, yet research that integrates KG construction with structural learning in downstream tasks remains limited. Finally, KGs are inherently incomplete, as they cannot capture all facts and are not updated in real time. To address these issues, this dissertation proposes three studies that leverage language models to overcome key limitations in KG research. The first study presents a framework for automated KG construction in the domain of technology opportunity discovery, integrating both structured and unstructured data sources. A document classification model is used to define semantic relationships between technologies and startups within the KG. Based on the constructed KG, a novel index is introduced to identify promising technologies. The second study combines large language models with retrieval-augmented generation to construct a reliable medical KG that captures relationships among medical codes within electronic health records. It further proposes a framework that jointly learns from the structural information of the KG to enhance predictive performance in healthcare applications.
The third study focuses on the sentence-like structure of KG triples and proposes an efficient and effective KG completion model that uses a 2D Discrete Fourier Transform as an alternative to self-attention. This approach balances efficiency and performance, supporting its applicability to practical tasks. By automating KG construction from heterogeneous data, incorporating structural learning, and addressing incompleteness, the proposed methods provide practical solutions to key challenges in KG applications. Together, the frameworks and the model presented in this dissertation leverage language models to address the core limitations of KGs for practical use.
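As a rough illustration of the idea behind the third study, the sketch below applies a parameter-free 2D Discrete Fourier Transform to the token embeddings of a single triple, in place of a self-attention layer. It is a minimal sketch only, not the dissertation's model: the function name `fourier_mix`, the choice to keep only the real part, the three-token triple, and the hidden size of 4 are all assumptions made for illustration.

```python
import numpy as np


def fourier_mix(x: np.ndarray) -> np.ndarray:
    """Mix a (seq_len, hidden) embedding matrix with a 2D DFT.

    Assumption: only the real part of the transform is kept, as is
    common in Fourier-based token mixing. No parameters are learned.
    """
    return np.fft.fft2(x).real


# Hypothetical triple (head, relation, tail) with hidden size 4.
rng = np.random.default_rng(0)
triple_embeddings = rng.normal(size=(3, 4))

# The output has the same shape as the input, so it can stand in
# for the output of an attention sublayer in this sketch.
mixed = fourier_mix(triple_embeddings)
```

Because the transform has no trainable weights and can be computed with a fast Fourier transform, the mixing step is cheaper than the pairwise token comparisons used in self-attention, which is the efficiency trade-off the abstract refers to.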
Publisher
Ulsan National Institute of Science and Technology