Comparing PCA-Based Machine Learning Algorithms for COVID-19 Classification Using Chest X-ray Images

Hussein Ahmed Ali, Microwave Electronics Research Laboratory, Faculty of Sciences of Tunis, University Tunis El-Manar, Tunis El-Manar, Tunisia AND College of Computer Science and Information Technology, University of Kirkuk, Kirkuk, Iraq
Walid Hariri, College of Computer Science and Information Technology, University of Kirkuk, Kirkuk, Iraq
Nadia Smaoui Zghal, Labged Laboratory, Department of Computer Science, Badji Mokhtar Annaba University, Annaba, Algeria
Dalenda Ben Aissa, Control and Energy Management Laboratory, (CEM Lab) ENIS, University of Sfax, Sfax, Tunisia

Abstract

The rapid spread of the COVID-19 pandemic has strained global healthcare systems, making efficient diagnostic methods a necessity. Polymerase Chain Reaction (PCR) and antigen tests are common, but they are limited in speed and precision. Improving the accuracy of imaging techniques, especially Chest X-rays (CXR) and Computed Tomography (CT) scans, is therefore crucial for detecting COVID-19-related lung abnormalities. CXR is cost-effective and accessible, and is preferred over CT scans, but accurate diagnosis often requires technological support. To address this, an extensive dataset of CXR images categorized into five classes, available on Kaggle, is used. The images are processed through grayscale conversion, image intensity adjustment, resizing, and feature extraction using Principal Component Analysis (PCA). Six Machine Learning (ML) techniques are then employed for image classification: Decision Tree (DT), Random Forest (RF), Stochastic Gradient Descent (SGD), Logistic Regression (LR), Gaussian Naive Bayes (GNB), and K-Nearest Neighbors (KNN). DT achieves the highest accuracy at 88%, outperforming GNB (77%), LR (74%), KNN (71%), SGD (70%), and RF (45%). It also leads consistently across the other assessment metrics, namely F1-score, sensitivity, and precision, with the best weighted average of 88%. However, the optimal ML model depends on factors such as dataset characteristics and implementation specifics, so these factors should be considered carefully when choosing an ML model for COVID-19 diagnosis via CXR image classification.
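
The weighted averages of precision, sensitivity, and F1-score mentioned above are commonly computed by weighting each class's score by its share of the true labels. The following is a minimal sketch of that calculation, not the authors' code: the function name `weighted_metrics`, the class labels, and the toy predictions are all assumptions made for illustration, and none of the values come from the study.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Support-weighted precision, sensitivity (recall), and F1 over all classes.

    Hypothetical helper for illustration; each class contributes in
    proportion to how often it appears in y_true.
    """
    support = Counter(y_true)
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, n in support.items():
        # True positives: samples of this class that were predicted as this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / n
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = n / total
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return precision, recall, f1


# Toy labels invented for illustration only (not data from the study).
y_true = ["covid", "covid", "normal", "normal", "normal", "pneumonia"]
y_pred = ["covid", "normal", "normal", "normal", "pneumonia", "pneumonia"]

precision, recall, f1 = weighted_metrics(y_true, y_pred)
print(f"precision={precision:.3f} sensitivity={recall:.3f} f1={f1:.3f}")
```

Under this weighting, the weighted sensitivity always equals overall accuracy, because each class's recall is multiplied by its own share of the samples.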

Keywords

Chest X-ray (CXR), COVID-19, Decision tree, Gaussian Naïve Bayes, Stochastic gradient descent, Machine learning

Subject Area

Computer Science

First Page

687

Last Page

705

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Receive Date

9-12-2023

Revise Date

4-19-2024

Accept Date

4-21-2024

How to Cite this Article

Ali, Hussein Ahmed; Hariri, Walid; Zghal, Nadia Smaoui; and Aissa, Dalenda Ben (2025) "Comparing PCA-Based Machine Learning Algorithms for COVID-19 Classification Using Chest X-ray Images," Baghdad Science Journal: Vol. 22: Iss. 2, Article 27.
DOI: https://doi.org/10.21123/bsj.2024.9422
