Interpretable air quality classification for public health using machine learning
Abstract
Air pollution remains a critical public health and environmental challenge, especially in rapidly urbanizing regions. Accurate classification of air quality levels is essential for proactive environmental management and timely health interventions. This study presents an interpretable machine learning approach to multi-class air quality classification using logistic regression. We use a real-world dataset of 23,463 records that combines pollutant concentrations (PM2.5, PM10, NO₂, SO₂, CO), meteorological variables (temperature and humidity), and demographic features (industrial zoning and population distribution). Preprocessing consists of median imputation of missing values, feature normalization, and encoding of categorical variables. Class weighting is applied to address class imbalance, and the model is evaluated with 5-fold stratified cross-validation. The model achieves an overall accuracy of 87% and a macro F1-score of 0.72, with its highest performance on the dominant “Good” and “Moderate” categories. Feature selection with Pearson correlation and recursive feature elimination identifies PM2.5 as the most influential predictor (r = 0.98 with overall AQI). The model’s transparency and computational efficiency make it suitable for real-time deployment and policy decision-making. This work contributes a robust, interpretable baseline for air quality classification and highlights the importance of addressing underrepresented but critical pollution categories. Future directions include real-time data integration and comparative evaluation against more complex machine learning models.
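The gap between the reported accuracy (87%) and macro F1-score (0.72) is typical of imbalanced multi-class problems: accuracy is dominated by the frequent classes, whereas macro F1 weights every class equally. The sketch below illustrates this on a small set of hypothetical labels, not on the study's data. It also shows one common class-weighting heuristic, n_samples / (n_classes * class_count), which is the formula behind scikit-learn's `class_weight="balanced"`. The abstract does not state which weighting scheme the authors used, so this choice is an assumption, as are all function names and the toy labels.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Assumption: the "balanced" heuristic; the paper's exact scheme is not given.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels: a dominant "Good" class and a rare "Unhealthy" class.
y_true = ["Good"] * 8 + ["Moderate"] * 4 + ["Unhealthy"] * 2
y_pred = ["Good"] * 8 + ["Moderate"] * 3 + ["Good"] + ["Moderate", "Unhealthy"]

weights = balanced_class_weights(y_true)
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)

print("class weights:", weights)
print("accuracy:", round(acc, 3))
print("macro F1:", round(f1, 3))
```

In this toy case a single misclassified "Unhealthy" sample barely moves accuracy but lowers macro F1 noticeably, which is why the abstract's emphasis on underrepresented categories depends on the macro-averaged metric rather than on accuracy alone.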
Received: 11 June 2025
Accepted: 25 July 2025
Published: 07 October 2025
DOI: https://dx.doi.org/10.21622/ACE.2025.05.2.1428
Copyright (c) 2025 Godfrey Perfectson Oise, Cyprian C. Konyeha, Chioma Julia Onwuzo, Ejenarhome Prosper Otega, Babalola Eyitemi Akilo, Olayinka Tosin Comfort, Joy Akpowehbve Odimayomi, Unuigbokhai Nkem Belinda
Advances in Computing and Engineering
E-ISSN: 2735-5985
P-ISSN: 2735-5977
Published by:
Academy Publishing Center (APC)
Arab Academy for Science, Technology and Maritime Transport (AASTMT)
Alexandria, Egypt


