Boosting Algorithm to handle Unbalanced Classification of PM2.5 Concentration Levels by Observing Meteorological Parameters in Jakarta-Indonesia using AdaBoost, XGBoost, CatBoost, and LightGBM
- Author(s)
Toharudin, Toni; Caraka, Rezzy Eko; Pratiwi, Indah Reski; Kim, Yunho; Gio, Prana Ugiana; Sakti, Anjar Dimara; Noh, Maengseok; Nugraha, Farid Azhar Lutfi; Pontoh, Resa Septiani; Putri, Tafia Hasna; Azzahra, Thalita Safa; Cerelia, Jessica Jesslyn; Darmawan, Gumgum; Pardamean, Bens
- Issued Date
2023-04
- DOI
10.1109/ACCESS.2023.3265019
- URI
https://scholarworks.unist.ac.kr/handle/201301/62468
- Citation
IEEE ACCESS, v.11, pp.35680 - 35696
- Abstract
Air quality in Jakarta has worsened: the city ranks among the world's eight worst according to the 2022 Air Quality Index (AQI) report. In particular, the latest air quality data for Jakarta and surrounding areas from the Meteorological, Climatological, and Geophysical Agency (BMKG) of the Republic of Indonesia indicate that PM2.5 concentrations have increased, peaking at 148 μg/m³ in 2022. While a classification system for this pollution is necessary and critical, the PM2.5 concentrations observed at the BMKG Kemayoran station, Jakarta, form an unbalanced set of classes. In this work, we therefore apply supervised boosting algorithms to handle this unbalanced classification of PM2.5 concentration levels from meteorological patterns observed in Jakarta between 1 January 2015 and 7 July 2022. The boosting algorithms considered are Adaptive Boosting (AdaBoost), Extreme Gradient Boosting (XGBoost), Categorical Boosting (CatBoost), and Light Gradient Boosting Machine (LightGBM). Our simulations show that boosting classification can significantly reduce bias, in combination with variance reduction, under unbalanced within-class coefficients; the reported PM2.5 class values are good 62%, moderate 34%, and unhealthy 59%.
- Publisher
Institute of Electrical and Electronics Engineers Inc.
- ISSN
2169-3536
- Keyword (Author)
AdaBoost, Boosting, CatBoost, LightGBM, PM2.5, unbalanced classification, XGBoost
- Keyword
BIG DATA, MODEL, MORTALITY, FEATURES, SUPPORT