JOURNAL OF PHYSICAL CHEMISTRY C, v.124, no.16, pp.8905 - 8918
Abstract
The band gap is an important parameter that determines the light-harvesting capability of perovskite materials. It governs the performance of various optoelectronic devices such as solar cells, light-emitting diodes, and photodetectors. For perovskites of formula ABX₃ with a nonzero band gap, we study nonlinear mappings between the band gap and properties of the constituent elements (e.g., electronegativities and electron affinities) using alternating conditional expectations (ACE), a machine learning technique suitable for small data sets. We also compare ACE with other machine learning methods: decision trees, kernel ridge regression, extremely randomized trees, AdaBoost, and gradient boosting. The best performance is achieved by kernel ridge regression and extremely randomized trees. However, ACE has the advantage of presenting its results in graphical form, which aids interpretation. The models are trained on data obtained from density functional theory calculations. Different statistical approaches to feature selection are applied and compared: Pearson correlation, Spearman's rank correlation, maximal information coefficient, distance correlation, and ACE. A classification task of separating metallic perovskites from nonmetallic ones is solved using support vector machines with a radial basis function kernel.
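As a rough illustration of why several correlation measures are worth comparing for feature selection, the sketch below scores two toy features against a toy target with Pearson and Spearman correlation. It is a minimal sketch and not the authors' code: the function names (`pearson`, `ranks`, `spearman`), the feature names, and all numbers are hypothetical and are not taken from the paper's data set. The point it shows is that a monotonic but nonlinear dependence gets a perfect Spearman score while its Pearson score stays below 1.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data (NOT from the paper): the target depends on
# feature_a monotonically but nonlinearly; feature_b is a shuffled copy.
feature_a = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
feature_b = [3.0, 1.0, 2.5, 0.5, 2.0, 1.5]
target = [math.exp(v) for v in feature_a]

scores = {
    name: {"pearson": pearson(col, target), "spearman": spearman(col, target)}
    for name, col in {"feature_a": feature_a, "feature_b": feature_b}.items()
}

for name, s in scores.items():
    print(f"{name}: pearson={s['pearson']:.3f}, spearman={s['spearman']:.3f}")
```

Ranking features by the absolute value of either score gives a simple filter. Maximal information coefficient, distance correlation, and ACE, also named in the abstract, are not implemented here.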