This thesis presents the first population-scale characterization of the Korean mitochondrial landscape, based on 10,239 whole-genome sequences. Using an optimized and scalable assembly pipeline, we generated 10,166 complete, circularized mitogenomes within three days, enabling highly accurate mitochondrial reconstruction at national cohort scale. These high-fidelity assemblies made it possible to reconstruct a whole-mitogenome phylogenetic tree directly from de novo sequences rather than relying on predefined haplogroup markers. Unsupervised principal component analysis of the complete mitogenomes revealed four distinct Korean Mitochondrial (KM) clusters, representing fine-scale maternal lineage structure not captured by traditional haplogroup classifications. The KM clusters were defined by 24 population-informative variants and corresponded to separate branches of the M, N, and R macrohaplogroups, indicating unexpectedly deep maternal lineage divergence within Korea. One of the cluster-defining variants was the 9-bp deletion in the COX2–tRNA(Lys) intergenic region (MIC9D), a well-known anthropological marker in East Asian and Pacific populations. Although MIC9D contributed strongly to the Korean clustering patterns, previous reports of its co-transmission with pathogenic mtDNA mutations suggest that its relevance warrants further investigation. Overall, this study demonstrates that whole-mitogenome, assembly-based phylogeny surpasses marker-based haplogrouping by resolving population-specific mitochondrial lineages. The resulting high-resolution reference lays the foundation for future studies of Korean mitochondrial evolution and disease association.
Publisher
Ulsan National Institute of Science and Technology