Abstract
Improvements in computing facilities and technology support the development and implementation of automatic methods for medical data assessment. This study proposes a framework for efficiently classifying chest radiographs (X-rays) into normal and COVID-19 classes. The proposed framework consists of the following phases: (i) image resizing, (ii) deep feature extraction using a pretrained deep learning method (PDLM), (iii) handcrafted feature extraction, (iv) feature optimization with the Brownian Mayfly-Algorithm (BMA), (v) serial integration of the optimized features, and (vi) binary classification with 10-fold cross-validation. In addition, this work implements two methodologies: (i) performance evaluation of existing PDLMs from the literature and (ii) improvement of the COVID-19 detection performance of the chosen PDLM with this proposal. The experimental investigation confirms that pretrained VGG16 with a SoftMax classifier achieves a classification accuracy of >94%. Further, the proposed framework with BMA-selected features (VGG16 + handcrafted features) achieves a classification accuracy of 99.17% on the chosen X-ray image database. This outcome supports the merit of the implemented framework, and in the future, this proposal can be adopted to inspect clinically collected X-rays.
1. Introduction
For a variety of reasons, the incidence of disease in humankind is gradually rising, and timely screening and treatment are necessary to reduce infection and death rates. In the current era, many advanced life-saving facilities are available in healthcare centers to treat individuals suffering from infectious or acute diseases. However, even though adequate diagnostic and treatment services are readily accessible, the incidence of life-threatening communicable infections continues to rise, which increases the medical burden worldwide [1–3].
Contagious infections caused by viruses or bacteria commonly affect sizeable groups of people, and early recognition and management are the only means of controlling their spread. Recently, the contagious infection named COVID-19 affected many individuals globally and was a prime reason for the increased death rates in 2020 and 2021. Due to its severity and rate of spread, the World Health Organization (WHO) declared it a pandemic in early 2020 [4, 5]. COVID-19 is caused by the SARS-CoV-2 virus, which causes mild to severe pneumonia depending on the individual's immunity. Even when patients are vaccinated and follow the COVID-19 protocol suggested by the WHO, the spread of infection remains difficult to control because the virus mutates rapidly.
The symptoms of COVID-19 can be recognized by the individual or by a doctor, and clinical-level screening of this disease involves (i) collection of samples from individuals and execution of the reverse transcription-polymerase chain reaction (RT-PCR) test using approved clinical practice and (ii) radiological image-based lung screening. Radiology-supported lung imaging is performed in a controlled environment, in which the infection in the lung is diagnosed with chest X-ray or computed tomography (CT) images. The pulmonologist then examines the collected X-ray/CT images to determine the severity of the infection, make decisions, and plan treatment.
In hospitals, X-ray and CT are both widely used to examine lung infection, and compared to CT, X-ray imaging is simple and cost-effective. Hence, most initial-level lung screening relies on X-ray images, from which a radiologist or pulmonologist can detect the disease and its severity. Several computerized screening procedures for X-ray images are discussed in the literature, and these works confirm that X-ray-supported lung infection screening helps achieve a better diagnosis.
Several X-ray image examination methods based on machine learning (ML) and deep learning (DL) have been proposed and implemented in the literature, and these works achieve good detection accuracy. However, the integration of ML and DL approaches has received limited attention, although such integration can improve detection accuracy when clinical-grade X-ray images are assessed. This research aims to develop a DL framework for automatic detection of COVID-19 in chest X-ray images. To achieve better disease detection, this framework employs the following stages: (i) collection and preprocessing of X-ray images; (ii) evaluation of the performance of pretrained deep learning methods (PDLM) to find the appropriate scheme for screening the X-ray database; (iii) mining of deep features from the X-ray images; (iv) mining of handcrafted features (HF) using chosen procedures; (v) feature selection with the Brownian Mayfly-Algorithm (BMA) and serial feature integration; and (vi) classification and validation of the performance of the proposed COVID-19 screening framework.
This research first performs PDLM-supported X-ray evaluation and assesses the infection screening performance based on the attained metrics. The initial study confirms that the COVID-19 detection accuracy achieved by VGG16 is better (>95%) than that of the other PDLMs. Hence, the VGG16-supported framework is adopted, and its performance is then enhanced by serially integrating HF such as the local binary pattern (LBP) and the pyramid histogram of oriented gradients (PHOG). To avoid overfitting, these features are optimized by the BMA. The hybrid feature vector (Deep + HF) is then generated and used to train and validate the binary classifiers with 10-fold cross-validation. This study considers 4800 X-ray images (2400 normal and 2400 COVID-19) for the evaluation, of which 90% are used for training and 10% for validation. The experimental outcome confirms that the proposed technique achieves a classification accuracy of 99.17% with the K-nearest neighbor (KNN) classifier.
The novelty and the merits of this research include
(i) Implementation of Brownian Mayfly-Algorithm (BMA) based deep and handcrafted feature optimization to improve the detection accuracy without overfitting
(ii) Precise COVID-19 detection in X-ray images using hybrid features with 10-fold cross-validation
The remaining sections of this work are organized as follows: Section 2 reviews the literature; Section 3 presents the methodology; and Sections 4 and 5 present the achieved results and the conclusion of the work, respectively.
2. Related Research
Chest X-ray supported lung infection detection is a clinically accepted methodology in which the combined report of the radiologist and pulmonologist is considered to evaluate the disease in the lung and to plan and implement the necessary treatment. Computer algorithm-supported X-ray examination is a widely accepted procedure, and several PDLM schemes have been implemented to examine the severity of COVID-19 infection in patients. These schemes help categorize the available X-ray images into normal and disease classes with better accuracy. This procedure is essential when mass screening is implemented, as it considerably reduces the COVID-19 detection burden when many patients are to be screened. Table 1 lists selected deep-learning-assisted COVID-19 detection procedures found in the recent literature.
Earlier works in the literature confirm that the combination of deep features and HF helps to achieve better disease detection accuracy [19]. Table 1 shows that the maximum detection accuracy reported in the earlier works is 99.02% [13]. This work therefore considers hybrid feature-based X-ray classification to improve the detection accuracy. The classification of X-rays into normal/COVID-19 is implemented using the BMA-optimized VGG16 features together with the optimally selected LBP and PHOG features. The experimental outcome of this research confirms that the presented work achieves better detection accuracy than the works considered in Table 1.
3. Methodology
This section presents the structure developed to examine the selected X-ray database and outlines the procedures implemented to distinguish normal and COVID-19 class X-rays.
3.1. Framework
Figure 1 depicts the proposed framework developed to detect COVID-19 in the chosen test X-ray images. In this work, the necessary images are first collected and resized to a fixed pixel dimension, and these images are then used to extract the deep features (DF) and HF. DF mining is initially performed with the pretrained schemes, each of which yields a one-dimensional (1D) feature vector. This feature vector is used to assess the merit of the SoftMax (SM) classifier on the chosen test images. The initial experiment shows that the COVID-19 detection accuracy of VGG16 with the SM classifier is better than that of the AlexNet, VGG19, ResNet18, ResNet50, and ResNet101 schemes.

Later, HF such as LBP and PHOG are extracted from the test images. The collected DF and HF are then reduced by the BMA, the selected features are serially integrated, and the classification study is repeated. This experimental outcome confirms that the proposed framework achieves a classification accuracy of 99.17% on the chosen X-ray database. The various stages of this framework are depicted in Figure 1, and the outcome of the framework compares favorably with the other results presented in Table 1.
3.2. X-Ray Database
The merit of the planned COVID-19 detection framework is tested and validated with benchmark X-ray images found in the literature. The necessary test images for this study are collected from the sources in [20, 21]. During this study, 4800 test images were considered for the assessment. Table 2 presents the information about the images (total, training, and validation), and Figure 2 presents sample test images from the chosen database. Finally, all the considered PDLMs are tested on this database and the results are analyzed.

3.3. Deep-Features Mining
The performance of the planned framework relies mainly on the deep features obtained from the chosen PDLM [22]. In this work, well-known PDLMs, namely AlexNet, VGG16, VGG19, ResNet18, ResNet50, and ResNet101, are considered for the evaluation. During this task, the following parameter settings are applied to all the chosen PDLMs: initial weights = ImageNet, total epochs = 100, optimizer = Adam, pooling = max/average (AVG), hidden-layer activation = ReLU, classifier activation = sigmoid, training images = 2160, validation images = 240, and classifier validation = 10-fold.
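The 10-fold validation setting above can be illustrated with a minimal sketch. It assumes a stratified split (each class spread evenly over the folds) and reads the 2160/240 training/validation counts as per-class figures; both are assumptions, and the function name `stratified_kfold` is hypothetical rather than taken from the study.

```python
import random

def stratified_kfold(labels, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs with each class spread evenly over k folds."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin into the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    for i in range(k):
        val_idx = sorted(folds[i])
        train_idx = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train_idx, val_idx

# 2400 normal (label 0) and 2400 COVID-19 (label 1) images, as in the study.
labels = [0] * 2400 + [1] * 2400
splits = list(stratified_kfold(labels, k=10))
```

Under these assumptions each fold holds 480 validation images (240 per class) and leaves 4320 for training (2160 per class).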
Before the chosen PDLM is employed to assess the images, image augmentation is applied to increase the number of images available for training. The augmentation is achieved with horizontal flip, vertical flip, rotation, zoom = 0.4, width shift = 0.4, height shift = 0.4, and shear range = 0.3, and it helps the PDLM learn the image information correctly.
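As a rough illustration of the flip and shift operations listed above, the following sketch works on an image stored as a nested list of pixel values. It is not the augmentation pipeline used in the study; the helper names and the zero padding of exposed pixels are assumptions, and rotation, zoom, and shear are left out.

```python
def horizontal_flip(image):
    """Mirror each row (left-right flip)."""
    return [row[::-1] for row in image]

def vertical_flip(image):
    """Reverse the row order (top-bottom flip)."""
    return [row[:] for row in image[::-1]]

def shift(image, dx, dy, fill=0):
    """Translate by dx columns and dy rows, padding exposed pixels with `fill`."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = image[y][x]
    return out

def fractional_shift(image, width_frac=0.0, height_frac=0.0):
    """Shift by a fraction of the image size, e.g. 0.4 as in the text."""
    h, w = len(image), len(image[0])
    return shift(image, int(width_frac * w), int(height_frac * h))

sample = [[1, 2, 3],
          [4, 5, 6]]
flipped = horizontal_flip(sample)
shifted = fractional_shift(sample, width_frac=0.4)
```

In practice a library generator would draw a random shift up to the stated fraction for each image; the fixed shift here only shows the geometry.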
This scheme helps to extract deep features from every PDLM, and the resulting feature vector is referred to as equation (1) in the remainder of this work.
3.4. Handcrafted-Feature Mining
In this work, the HF are mined using the LBP [23, 24] with varied weights (W = 1 to 4) and the PHOG [24, 25] with various bins (Bin = 1 to 3); a detailed discussion of these procedures can be found in earlier research works. The outcome attained with LBP is depicted in Figure 3, in which Figures 3(a)–3(d) present the outcomes for the various weights (W = 1 to 4) on a chosen test X-ray.
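To make the LBP idea concrete, the sketch below computes the basic 3×3 local binary pattern and its 256-bin histogram. It shows the textbook operator only; how the weight W modifies the pattern in this study is not specified here, so no weighting is applied, and the neighbour ordering is an assumption of the sketch.

```python
# Neighbour offsets (dy, dx) in clockwise order starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(image, y, x):
    """8-bit LBP code of the interior pixel (y, x)."""
    center = image[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(OFFSETS):
        # A neighbour at least as bright as the centre sets its bit.
        if image[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels (the feature vector)."""
    hist = [0] * 256
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            hist[lbp_code(image, y, x)] += 1
    return hist

patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 3, 8]]
code = lbp_code(patch, 1, 1)  # neighbours 9, 7 and 8 exceed the centre value 6
```

For this patch, bits 1, 3, and 4 are set, giving the code 2 + 8 + 16 = 26.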

Figure 3: LBP outcomes on a chosen test X-ray for weights W = 1 to 4, panels (a)–(d).
A similar practice is then implemented with the PHOG, and the achieved features for bin1, bin2, and bin3 are presented in Figure 4.
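A simplified sketch of a pyramid histogram of oriented gradients is given below for orientation. It assumes central-difference gradients, unsigned orientations, and a 2^l × 2^l grid at pyramid level l; the edge-detection step of the usual PHOG descriptor is omitted, and the parameters `levels` and `n_bins` are illustrative and not tied to the bin settings of this study.

```python
import math

def gradient_histogram(image, y0, y1, x0, x1, n_bins):
    """Magnitude-weighted histogram of gradient orientations inside one cell."""
    h, w = len(image), len(image[0])
    hist = [0.0] * n_bins
    for y in range(max(y0, 1), min(y1, h - 1)):
        for x in range(max(x0, 1), min(x1, w - 1)):
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.atan2(gy, gx) % math.pi  # unsigned orientation in [0, pi)
            b = min(int(angle / math.pi * n_bins), n_bins - 1)
            hist[b] += mag
    return hist

def phog(image, levels=2, n_bins=8):
    """Concatenate cell histograms over all pyramid levels and normalise to sum 1."""
    h, w = len(image), len(image[0])
    feature = []
    for level in range(levels + 1):
        cells = 2 ** level
        for cy in range(cells):
            for cx in range(cells):
                y0, y1 = cy * h // cells, (cy + 1) * h // cells
                x0, x1 = cx * w // cells, (cx + 1) * w // cells
                feature.extend(gradient_histogram(image, y0, y1, x0, x1, n_bins))
    total = sum(feature)
    return [v / total for v in feature] if total else feature

# Horizontal intensity ramp: every gradient points along the x-axis.
ramp = [[x for x in range(8)] for _ in range(8)]
descriptor = phog(ramp, levels=2, n_bins=8)
```

With two levels and eight orientation bins the descriptor has 8 × (1 + 4 + 16) = 168 entries.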

In this research, equation (11) is considered as the HF, and the optimized HF is then combined with the deep-feature to get the deep + HF, which helps to classify the X-ray with better accuracy.
3.5. Feature Selection Using Brownian Mayfly-Algorithm
Feature selection is a prime task in this work, and as discussed in earlier work, the deep features and HF are optimized using the Mayfly Algorithm (MA) [26]. The MA is a nature-inspired algorithm that combines elements of the firefly algorithm, particle swarm optimization, and the genetic algorithm. The traditional MA is guided by a Lévy-flight search operator, whereas the MA proposed in this work is driven by a Brownian operator. The search process in the Brownian Mayfly-Algorithm (BMA) is smoother than that of the traditional approach [27, 28]. Figure 5 depicts the working of the proposed BMA, in which Figure 5(a) illustrates the Brownian walk search process for a single mayfly. The various stages (Stages 1 to 3) are depicted in Figures 5(b)–5(d).
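The Brownian walk mentioned above can be sketched as a sequence of independent Gaussian steps. This is a generic illustration under assumed parameters (`sigma`, `seed`); it does not reproduce how the Brownian operator is coupled to the mayfly updates in the BMA.

```python
import random

def brownian_walk(start, n_steps, sigma=1.0, seed=0):
    """Return the path of a random walk whose steps are Gaussian in every coordinate."""
    rng = random.Random(seed)
    path = [tuple(start)]
    for _ in range(n_steps):
        prev = path[-1]
        # Each coordinate moves by an independent N(0, sigma^2) increment.
        path.append(tuple(c + rng.gauss(0.0, sigma) for c in prev))
    return path

path = brownian_walk((0.0, 0.0), n_steps=100, sigma=0.5, seed=1)
```

Because every step is drawn from a Gaussian, very long jumps are rare, which is one way to read the smoother search behavior relative to a Lévy flight, whose heavy-tailed steps occasionally produce large jumps.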

Figure 5: Working of the proposed BMA. (a) Brownian walk search for a single mayfly; (b)–(d) Stages 1 to 3.
The description of the MA is as follows.
Let the MA include equal populations of male and female flies, which are randomly distributed in the search space. During the search task, each fly is allowed to converge toward the optimum location (the global best). After reaching the global best, a male fly is permitted to stay there. This process is depicted in Figure 5(b).
This process is expressed in equations (12) and (13), whose terms are the initial and ending positions of a fly, its initial and ending velocities, the local and global learning constants, and the Cartesian distances among flies.
During the relocation, every male fly that reaches the global best executes a velocity update to attract a female fly with the help of the nuptial dance.
The velocity update under this condition is given in equation (14), where the nuptial-dance coefficient is d = 5 and R is a random number in [−1, 1].
When the search by the male flies is finished, every female fly is permitted to find a male fly that has reached the global best; this process is depicted in Figure 5(c).
The female-fly update is given in equations (15) and (16), which depend on the objective value.
As the search continues, each female fly finds the best male fly and offspring generation takes place, as depicted in Figure 5(d). Further information on the MA can be found in the literature.
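For orientation, the update rules of the standard Mayfly Algorithm, as commonly stated in the MA literature, are sketched below. All symbols are assumptions of this sketch: x and y are male and female positions, v is a velocity, a_1 and a_2 are the local and global learning constants, β is a visibility coefficient, r_p, r_g, and r_mf are Cartesian distances (to the personal best, the global best, and between a male and a female), d is the nuptial-dance coefficient, fl is a random-walk coefficient, R is a random number in [−1, 1], and f is the objective function. The line-by-line correspondence with equations (12) to (16) is likewise assumed, and the sketch does not specify where the Brownian operator enters in the BMA.

```latex
\begin{align}
x_i^{t+1} &= x_i^{t} + v_i^{t+1} \\
v_{ij}^{t+1} &= v_{ij}^{t}
  + a_1 e^{-\beta r_p^{2}}\left(pbest_{ij} - x_{ij}^{t}\right)
  + a_2 e^{-\beta r_g^{2}}\left(gbest_{j} - x_{ij}^{t}\right) \\
v_{ij}^{t+1} &= v_{ij}^{t} + d \cdot R \\
y_i^{t+1} &= y_i^{t} + v_i^{t+1} \\
v_{ij}^{t+1} &=
\begin{cases}
v_{ij}^{t} + a_2 e^{-\beta r_{mf}^{2}}\left(x_{ij}^{t} - y_{ij}^{t}\right), & f(y_i) > f(x_i) \\
v_{ij}^{t} + fl \cdot R, & f(y_i) \le f(x_i)
\end{cases}
\end{align}
```

The first two lines move a male fly toward its personal and global best positions, the third is the nuptial-dance update of the best males, and the last two move a female fly toward its paired male when that male has the better objective value (stated here for minimization).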
Figure 6 presents the feature optimization process graphically. During feature reduction, the BMA explores the deep features and HF and reduces them based on the Cartesian distance (CD). This process compares the features of the normal and COVID-19 class images and identifies the features whose CD is large. The features with smaller CD are discarded. This procedure yields the optimal deep features and HF, and the selected features are then combined into a hybrid feature vector, which is used to train and validate the classifiers.
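The selection criterion and the serial integration can be illustrated with a small sketch. It replaces the BMA search with a plain ranking: each feature is scored by the distance between its two class means and the highest-scoring features are kept. This stand-in, the helper names, and the toy data are all assumptions made for illustration.

```python
def class_distance(features, labels):
    """Per-feature absolute distance between the means of class 0 and class 1."""
    n_feat = len(features[0])
    sums = {0: [0.0] * n_feat, 1: [0.0] * n_feat}
    counts = {0: 0, 1: 0}
    for row, label in zip(features, labels):
        counts[label] += 1
        for j, value in enumerate(row):
            sums[label][j] += value
    return [abs(sums[0][j] / counts[0] - sums[1][j] / counts[1]) for j in range(n_feat)]

def select_top(features, labels, keep):
    """Indices of the `keep` features with the largest class distance."""
    dist = class_distance(features, labels)
    ranked = sorted(range(len(dist)), key=lambda j: dist[j], reverse=True)
    return sorted(ranked[:keep])

def take_columns(rows, columns):
    """Keep only the selected feature columns of every sample."""
    return [[row[j] for j in columns] for row in rows]

def serial_fusion(deep_rows, hf_rows):
    """Serially integrate (concatenate) deep and handcrafted features per sample."""
    return [d + h for d, h in zip(deep_rows, hf_rows)]

# Toy data: four samples, labels 0 = normal, 1 = COVID-19 (values are made up).
labels = [0, 0, 1, 1]
deep = [[0.1, 5.0, 1.0], [0.2, 5.1, 1.1], [0.9, 5.0, 3.0], [1.0, 5.1, 3.2]]
hf = [[7.0, 0.0], [7.1, 0.1], [7.0, 1.0], [7.1, 0.9]]

deep_cols = select_top(deep, labels, keep=2)
hf_cols = select_top(hf, labels, keep=1)
hybrid = serial_fusion(take_columns(deep, deep_cols), take_columns(hf, hf_cols))
```

In the toy data the second deep feature and the first handcrafted feature have identical class means, so they are discarded, and each hybrid vector holds the three remaining values.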

3.6. Classification and Validation
In the proposed research, the SoftMax classifier is first employed to assess the classification performance of the selected PDLM. The performance of other binary classifiers, namely decision tree (DT), random forest (RF), Naïve Bayes (NB), K-nearest neighbor (KNN), and support vector machine (SVM) with a linear kernel, is then evaluated, and the attained results are recorded. The merit of the planned practice is measured using the true positive (TP), false negative (FN), true negative (TN), and false positive (FP) counts, from which accuracy (AC), precision (PR), sensitivity (SE), specificity (SP), F1-score (F1S), and negative predictive value (NPV) are obtained.
The mathematical expression for these measures is presented in equations (17) to (22) [29–32].
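The six measures follow directly from the four confusion-matrix counts. The sketch below uses their standard definitions; the function name and the example counts are hypothetical and are not results of this study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard binary-classification measures from confusion-matrix counts."""
    return {
        "AC": (tp + tn) / (tp + tn + fp + fn),   # accuracy
        "PR": tp / (tp + fp),                    # precision
        "SE": tp / (tp + fn),                    # sensitivity (recall)
        "SP": tn / (tn + fp),                    # specificity
        "F1S": 2 * tp / (2 * tp + fp + fn),      # F1-score
        "NPV": tn / (tn + fn),                   # negative predictive value
    }

# Hypothetical counts chosen only to exercise the formulas.
metrics = classification_metrics(tp=90, fn=10, tn=80, fp=20)
```

With these counts the accuracy is 170/200 = 0.85, the sensitivity is 0.9, and the specificity is 0.8.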
4. Results and Discussion
This section presents the experimental results obtained on an Intel i5 2.6 GHz CPU with 18 GB RAM and 4 GB VRAM, using Python. In this work, 4800 images (2400 normal and 2400 COVID-19) are considered for evaluating the merit of the PDLM on the assigned task. The performance of the proposed scheme is verified with max-pooling (MP) and average-pooling (AP) approaches, and the merit of the scheme is confirmed based on the achieved metrics.
Initially, the performance of the PDLM is tested on the considered images with the SoftMax classifier, and the achieved results are shown in Table 3. This table shows that the VGG16 scheme with AP achieves superior classification accuracy (95.21%) compared to the other methods. Table 4 confirms that fold 6 presents a better result than the other folds, and its graphical verification is presented in Figure 7.

The experimental outcome shown in Table 3 indicates that the result achieved with VGG16 is superior to that of the other PDLMs in this study. This study also verifies that average pooling outperforms max-pooling. Hence, VGG16 with average pooling is then used to verify the performance of the classifiers, namely DT, RF, NB, KNN, and SVM, and the results are presented in Table 5.
After the performance of VGG16 with deep features is verified, its performance is then evaluated using the BMA-optimized, serially integrated deep features and HF. During this task, BMA-based feature selection is employed to find the optimal deep features (equation (1)) and HF (equation (11)). The selected deep features and HF are then serially integrated into a hybrid feature vector. These hybrid features are then used to verify the merit of VGG16 in detecting normal/COVID-19 X-ray images with the different classifiers under 10-fold validation, and the attained result is depicted in Table 5. This table shows that KNN performs better (accuracy = 99.17%) than the other methods.
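For reference, the decision rule of a K-nearest neighbor classifier can be sketched as follows. The Euclidean distance, the value k = 3, and the toy data are assumptions of this sketch, since the KNN settings used in the study are not stated here.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest training vectors."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy hybrid feature vectors: label 0 = normal, 1 = COVID-19 (values are made up).
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = [0, 0, 0, 1, 1, 1]
near_normal = knn_predict(train_x, train_y, [0.2, 0.1])
near_covid = knn_predict(train_x, train_y, [5.5, 5.2])
```

An odd k avoids tied votes in a two-class problem such as normal/COVID-19.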
The outcomes of the various convolutional layers (CL) of VGG16 for a sample test image are presented in Figure 8. Figure 8(a) depicts the sample test image, and Figures 8(b)–8(f) show the outcomes of CL1 to CL5, respectively. The overall performance of the binary classifiers is verified using the metrics in Table 5, analyzed with a spider plot, and depicted in Figure 9. Figure 9(a) presents the plot confirming the merit of VGG16 with the traditional deep features, and Figure 9(b) shows the result for deep + HF. The classifier whose spider plot covers the largest area is considered superior, and this plot confirms that DT (with deep features) and KNN (with hybrid features) perform better. The experimental results achieved with VGG16 and KNN for deep + HF are presented in Figure 10. Figures 10(a) and 10(b) show the training/validation accuracy and loss over 100 epochs, and Figures 10(c) and 10(d) show the confusion matrix and ROC curve, respectively. These results confirm that the proposed scheme achieves better classification metrics on the considered image database.

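Figure 8: Convolutional-layer outcomes of VGG16 for a sample test image. (a) Sample test image; (b)–(f) outcomes of CL1 to CL5.
Figure 9: Spider plots of classifier performance with VGG16. (a) Deep features; (b) deep + HF.
Figure 10: Results with VGG16 and KNN for deep + HF. (a), (b) Accuracy and loss over 100 epochs; (c) confusion matrix; (d) ROC curves.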
The performance of this approach is further compared with the classification accuracy of the other methods listed in Table 1, and the comparison is depicted graphically in Figure 11. This comparison shows that the accuracy achieved with the proposed scheme is higher than that of the earlier works. This suggests that the proposal is clinically relevant and that the proposed technique can be considered for inspecting clinical-grade X-ray images in the future. The performance of the proposed methodology may also be enhanced by considering other handcrafted feature methods available in the literature.

5. Conclusion
COVID-19 is a pandemic disease that causes pneumonia in humans, and an unrecognized infection can lead to death. X-ray-supported lung infection detection is a widely implemented medical procedure, and radiologists and pulmonologists typically assess the recorded X-ray to recognize the disease. This research developed a PDLM-based scheme for COVID-19 recognition from X-ray images, in which detection is assisted by different feature types. The scheme uses the serially combined features of VGG16 and HF to classify X-ray images into normal/COVID-19. Furthermore, this work employed the BMA to optimize the deep features and HF in order to reduce overfitting. The investigation is implemented using binary classifiers with 10-fold cross-validation. This study confirms that the BMA-optimized Deep + HF achieves an improved accuracy (99.17%) with the KNN classifier. This accuracy is compared with other results in the literature, and the comparison indicates that the proposed scheme performs better. This scheme can be considered for evaluating clinically collected X-ray images in the future.
Data Availability
The experimental data can be accessed from the following links: (1) https://www.kaggle.com/tawsifurrahman/COVID-19-radiography-database and (2) https://ieee-dataport.org/open-access/covid-19-and-normal-chest-x-ray.
Conflicts of Interest
The authors declare that they have no conflicts of interest.