The importance of theoretical and computational simulations is increasingly recognized. Numerous methods tailored to specific systems, ranging from small molecules to large cluster systems, are under active investigation with the aim of providing practical intuition and validation. Despite the development of these methodologies, the accuracy-cost tradeoff impedes progress in this field. The advent of machine learning (ML)-based methodologies, however, has altered the landscape significantly. ML-based design and prediction offer reasonable results within constrained resources, enabling simulations of larger systems over longer timescales than previously feasible. This study integrates these methodologies to conduct simulations at three different scales. First, a method for discovering new compounds is proposed that uses a genetic-algorithm-based optimization approach and draws on existing or self-constructed databases. This approach optimizes the mutation operator to accelerate performance improvement over successive generations. It aims to maximize the emission wavelength of D-luciferin, a molecule involved in firefly bioluminescence, to make it suitable for use within the human body. Gaussian process regression was employed to make accurate predictions from the limited database, and molecules generated by the genetic algorithm were added to the database to improve its accuracy. Additionally, a graph-based molecular representation and the synthetic accessibility score (SAScore) were utilized to increase the likelihood that the designed molecules can be synthesized. Next, machine learning-based excited-state molecular dynamics are explored, focusing on accurately calculating machine learning potentials (MLPs) and nonadiabatic coupling vectors (NACVs) to simulate the transition from the excited state to the ground state via conical intersections in the penta-2,4-dieniminium cation (PSB3).
Challenges that persist in machine learning-based nonadiabatic dynamics are addressed by applying the (2,2) state-interaction state-averaged spin-restricted ensemble-referenced Kohn–Sham (SI-SA-REKS) methodology. Furthermore, kinetic Monte Carlo (kMC) simulations are performed to track real-time changes in the structure of a porous metal-organic framework (MOF) as it is modified after synthesis via post-synthetic exchange. Various substitution patterns resulting from temperature-dependent phenomena are simulated by inferring transition state energies from the potential energy difference between reactants and products obtained through quantum calculations, and then using the Arrhenius equation to compute reaction rate constants. Through these three simulations at distinct scales, this study offers insights into how the discovery of new molecules can be integrated with simulations at realistic scales, providing guidance for future research in ML-based molecular simulations.
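The kMC step described above (barriers inferred from reactant-product energy differences, rate constants from the Arrhenius equation, k = A exp(-Ea / (kB T))) can be illustrated with a minimal sketch. It is not the implementation used in this study. The linear barrier estimate in `estimate_barrier`, its parameters `intrinsic_barrier_ev` and `alpha`, the prefactor of 1e13 s^-1, and all function names are assumptions chosen only for illustration.

```python
import math
import random

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def estimate_barrier(delta_e_ev, intrinsic_barrier_ev=0.8, alpha=0.5):
    """Hypothetical linear estimate of an activation energy (eV) from the
    reactant-product energy difference. The form and parameters are
    illustrative assumptions, not values from the study."""
    barrier = intrinsic_barrier_ev + alpha * delta_e_ev
    # A barrier cannot be negative or lower than the reaction energy itself.
    return max(barrier, delta_e_ev, 0.0)


def arrhenius_rate(barrier_ev, temperature_k, prefactor=1.0e13):
    """Arrhenius rate constant k = A * exp(-Ea / (kB * T)) in 1/s.
    The default prefactor is an assumed typical attempt frequency."""
    return prefactor * math.exp(-barrier_ev / (K_B_EV * temperature_k))


def kmc_step(rates, rng):
    """One rejection-free kMC step: choose event i with probability
    rates[i] / sum(rates) and draw an exponentially distributed time step."""
    total = sum(rates)
    threshold = rng.random() * total
    cumulative = 0.0
    chosen = len(rates) - 1  # fallback guards against rounding at the upper edge
    for i, rate in enumerate(rates):
        cumulative += rate
        if threshold < cumulative:
            chosen = i
            break
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    dt = -math.log(1.0 - rng.random()) / total
    return chosen, dt


# Example: three candidate exchange events with made-up reaction energies (eV).
rng = random.Random(0)
delta_es = [-0.2, 0.1, 0.3]
rates = [arrhenius_rate(estimate_barrier(de), 300.0) for de in delta_es]
event, dt = kmc_step(rates, rng)
```

Repeating `kmc_step` while updating the list of available events after each exchange would yield a trajectory in simulated time; raising `temperature_k` increases every rate and narrows the ratios between them, which is one way temperature-dependent substitution patterns could arise in such a model.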
Publisher
Ulsan National Institute of Science and Technology