Abstract

Breast cancer (BrCa) is the most common cancer in women worldwide. Classifying BrCa images is extremely important for finding BrCa at an earlier stage and monitoring it during treatment. Computer-aided detection methods have been used to interpret BrCa and improve its detection during the screening and treatment stages. However, when a new BrCa image is generated during treatment, such methods often fail to classify it correctly. The main objective of this research is to classify newly generated BrCa images. The model performs preprocessing, segmentation, feature extraction, and classification. In preprocessing, hybrid median filtering (HMF) is used to eliminate the noise in the images, and the contrast of the images is enhanced using quadrant dynamic histogram equalization (QDHE). Then, ROI segmentation is performed using the USE-Net deep learning model. The CaffeNet model is used for feature extraction on the segmented images, and finally, classification is made using the improved random forest (IRF) with extreme gradient boosting (XGB). The model obtained 97.87% accuracy, 98.45% sensitivity, 95.24% specificity, 98.96% precision, and 98.70% f1-score for ultrasound images, and 98.31% accuracy, 99.29% sensitivity, 90.20% specificity, 98.82% precision, and 99.05% f1-score for mammogram images.

1. Introduction

Breast cancer (BrCa) is a common disease in women and one of the leading causes of cancer-related deaths worldwide. According to a report released by the National Research Council and the Institute of Medicine, there is now a significant demand for breast imaging professionals [1]. The development of computer-aided detection (CAD) systems for BrCa detection and diagnosis draws on several imaging modalities, such as mammography and ultrasound. Mammography has been shown to reduce the risk of mortality from BrCa. On the other hand, the sensitivity of mammography is not ideal and is lower in women described as having “dense breasts.” As a result, there has been a recent uptake worldwide of breast ultrasonography as an adjunct to mammographic screening. In addition to mammographic screening, it has been demonstrated that screening with automated 3D breast ultrasound systems and handheld ultrasound devices can boost the cancer detection rate in women with dense breast tissue. Automated breast ultrasonography was developed primarily for screening; compared to handheld ultrasonography, it is less dependent on the operator and can obtain complete three-dimensional breast ultrasound volumes that are reproducible over time [2]. Ultrasound is the most sensitive method for detecting invasive cancer in dense breasts. Nevertheless, it is an operator-dependent modality, and the interpretation of its images calls for specialized knowledge on the part of the radiologist.

In order to overcome operator dependency and increase the rate of accurate diagnosis, CAD systems are required for detecting and classifying BrCa [4]. Figure 1 represents the mammogram and ultrasound views of the BrCa image of a 45-year-old woman with an infiltrating ductal carcinoma in the left breast. In digital mammography, the patient’s breast is irradiated with X-rays, which are then detected by a digital X-ray detector to produce a two-dimensional (2D) breast image. The method is both quick and simple. However, it has the significant drawback of tissue superposition. The likelihood of fibroglandular tissue covering up lesions increases when the breast has a high density of fibroglandular tissue; such a breast is known as dense. The mediolateral oblique (MLO) view and the craniocaudal (CC) view are taken during a mammogram to help alleviate some of the difficulties associated with this issue. During an ultrasound, sound waves are passed through the breast, and the backscattered waves are detected and used to generate the ultrasound image. A significant benefit is that ultrasound does not use ionizing radiation [3]. Despite its numerous benefits, however, ultrasound images are difficult to interpret because speckle and low contrast degrade them [5].

CAD systems are designed to minimize costs, improve radiologists’ ability to interpret medical images and differentiate between benign and malignant tissues, and improve patient care: they increase the accuracy and consistency of detection and diagnosis while cutting down the time needed to interpret the images. The CAD system aims to provide radiologists with more objective evidence and boost their diagnostic confidence. CAD methods have been created to improve the detection of BrCa during screening by lowering the count of false-negative interpretations [6]. Although the present performance of CAD systems is promising, more is needed to create CAD models with fully independent identification and clinical diagnosis frameworks. CAD models will continue to be used as a second-opinion clinical procedure unless their performance is significantly improved from its present stage by advancing the conventional approaches, implementing new successful pattern-recognition approaches such as data augmentation in deep learning, and utilizing advances in the computational power of systems. A CAD system includes different stages: dataset collection, preprocessing, segmentation, feature extraction, classification, and evaluation metrics. In this work, an efficient USE-Net deep learning model was designed to detect BrCa using two different imaging modalities: mammogram and ultrasound.

The main contributions of the paper are summarized as follows:
(i) A deep learning-based CAD model is developed for BrCa detection using ultrasound images.
(ii) A hybrid median filter is used to minimize the speckle noise in the ultrasound BrCa images. After filtering, the contrast of the images is enhanced using the quadrant dynamic histogram equalization (QDHE) technique in the preprocessing stage. This process helps the classifier improve its classification performance.
(iii) A combination of U-Net with the SE network model is used for ROI segmentation, and a pretrained deep learning model called CaffeNet is used for feature extraction.
(iv) Finally, the classification is performed using the improved random forest classifier with XGBoost.

The remainder of the paper is organized as follows: related work is discussed in Section 2; the implementation of the proposed research model is presented in Section 3; the experimental results are discussed in Section 4; and Section 5 presents the conclusion and future work.

2. Related Work

A CAD method was presented in [7] based on the terminology scores of screening ultrasound images from the Breast Imaging Reporting and Data System (BI-RADS). Evaluating the BI-RADS category is an important stage in the diagnostic process for BrCa, and the image obtained from breast ultrasonography provides valuable information collected during the examination procedure. To distinguish between cancerous and benign BrCa, the decision tree method was used to assess the BI-RADS information: the CART decision tree algorithm classified BrCa based on the features scored using BI-RADS. A deep learning algorithm could have been used for the automated scoring of ultrasound images, which could improve the performance. Another CAD model was proposed in [8] to detect BrCa using ultrasound images. This CAD model includes preprocessing, segmentation, feature extraction, and classification. During the preprocessing stage, the noise was removed using speckle reducing anisotropic diffusion (SRAD), and the active contour model was utilized for the segmentation process. The grey level co-occurrence matrix (GLCM) was used to extract the texture characteristics, which were then used as input by the classifier. The k-nearest neighbors (KNN) method, the decision tree algorithm, and the RF classifier were applied, and their accuracies were computed to determine the most effective one for identifying BrCa in ultrasound images. The RF classifier ranked first, outperforming the other two classifiers in terms of accuracy. An image fusion approach, several image content representations, and an ensemble of distinct CNN architectures were used in [9] within a CAD framework for diagnosing tumors. ResNet, VGG-Net, and DenseNet are the CNN-based methods incorporated into this system’s implementation. The ensemble method that incorporates a weighted average has the potential both to lessen the variation in the diagnostic results and to produce the most accurate diagnosis. In [10], a medical decision-support framework for the classification and diagnosis of BrCa utilizing ultrasound images, enabled by an ensemble of deep learning models, was proposed. In this model, the preprocessing was carried out with the help of a Wiener filter and contrast enhancement. Segmentation proceeded with the application of the Chaotic Krill Herd approach along with Kapur’s entropy. Feature extraction was done with the help of an ensemble of deep learning techniques: VGG-16, SqueezeNet, and VGG-19. In the end, classification was accomplished through cat swarm optimization in conjunction with a multilayer perceptron technique.

A CAD system was created in [11] to detect and classify breast lesions as benign or malignant. In the preprocessing step, data augmentation and spatial transformations were carried out, both to support the construction of breast lesion identification. Localization error in conjunction with intersection over union was then applied to improve the evaluation of breast lesion detection in ultrasound images. Compared to the Viola–Jones-based approach, the YOLOv3 algorithm’s breast lesion identification was more reliable and reproducible. In the end, an effective radiomics signature for BrCa classification was obtained solely from the detection bounding boxes, with the segmentation task being left entirely out of the work. In [12], a meta-heuristic algorithm was used to tune the parameters of a neural network: combining the wavelet neural network (WNN) with grey wolf optimization (GWO) led to a CAD method that can identify abnormalities in breast ultrasound images. In this study, breast ultrasound images were preprocessed using sigmoid filters, interference-based de-speckling was performed, and finally, anisotropic diffusion was carried out. After selecting the ROI using the automatic segmentation algorithm, morphological and textural features were computed. In the end, the GWO-tuned WNN was utilized for the classification task.

Transfer learning applies the skills and information obtained by resolving one challenge to another challenge of a similar nature. A deep learning model based on transfer learning was presented in [13] to effectively assist in the automated diagnosis and detection of BrCa-infected zones based on two evaluation schemes, namely an 80-20 split and cross-validation. The deep learning architecture was modeled to be problem-specific. Using pretrained CNN models such as Inception-V3, VGG-19, ResNet50, VGG-16, and Inception-V2 ResNet, the characteristics of this model were collected from the MIAS dataset. A deep learning-based breast density classification model was proposed in [14]. A residual CNN was constructed and trained, and the model’s responses to various modifications of the inputs were evaluated; these input adjustments included varying class label distributions in the test and training sets and appropriate image preprocessing. The Grad-CAM approach for CNNs was used to generate saliency maps, and Spearman’s rank correlations between the saliency maps and input images were computed to evaluate the model’s accuracy; the saliency maps correlated highly with the dense pattern. In [15], a framework for the segmentation and classification of BrCa images was developed. This framework makes use of several models, including DenseNet121, InceptionV3, VGG-16, ResNet50, and MobileNetV2. In addition, a trained version of the modified U-Net approach was applied to extract breast areas from mammograms. This method assists radiologists in early detection and improves the effectiveness of the system. The problem of labeled data was approached via transfer learning and data augmentation, and the classification was accomplished with full CNNs.

The effectiveness of various multiscale architectures in locating breast calcifications on a full-field digital mammogram was analyzed in [16]. Both the MLO and CC views were analyzed simultaneously within the architectures used and later merged to give a prediction score in an end-to-end manner. The networks were then trained and tested using high-resolution digital mammogram images that contained only breast calcifications, with no masses, analyzed locally. As a result, the multiscale attention-residual architecture with DLA achieved a large area under the ROC curve. A BrCa classification model was presented in [17] for identifying benign or malignant BrCa based on mammography. ROI extraction was performed using a machine learning algorithm and hybrid thresholding. The multifractal dimension model was used to extract the features of the denoised blocks, and the feature dimensions were reduced using a genetic algorithm. However, this method does not extract all the features required for classifying BrCa. In [18], a model was developed using the transferable texture convolutional neural network (TTCNN) for classifying benign or malignant cases at an early stage. Instead of a pooling layer, the suggested method uses three convolutional layers and one energy layer. TTCNN performed classification in the third stage using deep features from the examined convolutional neural network models; to improve classification accuracy, the best layers were chosen, from which the deep features were retrieved. A BrCa classification model was proposed in [19] by combining the best available features with deep learning: the DarkNet-53 deep learning model was retrained, augmented breast cancer images were input into it, features were extracted from the input, and an optimization algorithm was used to select the features from the input images. Meraj et al. [20] used U-Net and Independent Component Analysis (ICA) for breast cancer classification; the developed model was evaluated on the breast ultrasound images dataset (BUSI). Table 1 presents the merits and demerits of existing works.

3. Methodology

In this research work, a system is developed to detect BrCa using two different imaging modalities: mammogram and ultrasound. The proposed model takes the mammogram and ultrasound images as input separately. The mammogram dataset [21] and the breast ultrasound images dataset [22] are used for checking the proposed model's performance. In the preprocessing stage, hybrid median filtering eliminates the noise in the input images, and the contrast of the images is then enhanced using quadrant dynamic histogram equalization. After enhancement, ROI segmentation is performed using the USE-Net deep learning model. The CaffeNet model is used for feature extraction on the segmented images, and finally, classification is made using the IRF-XGB method. Experiments analyze the classification performance by comparing results across the two image types. Figure 2 displays the pipelined architecture of the proposed model.

3.1. Image Preprocessing

Speckle noise and poor contrast in breast ultrasound images degrade the overall quality of the images, which in turn hurts the efficiency of the proposed algorithms. Preprocessing is therefore an essential step to overcome these limitations. As shown in Figure 2, two kinds of preprocessing procedures are utilized: filtering methods and enhancement-based approaches. In the filtering approach, speckle noise is reduced by applying HMF; in the enhancement method, contrast is increased by applying QDHE.

3.2. Hybrid Median Filter (HMF)

The HMF is a nonlinear windowed filter that removes noise while preserving the boundaries of the image. The HMF offers corner-preservation qualities superior to those of the basic median filter. An HMF computes its output by first determining the median value of the pixels within each of two neighborhood shapes and then taking the median of those two medians together with the initial pixel value. The neighborhood shapes are a “+” shape and an “x” shape, taken along the straight and diagonal lines around the center pixel, respectively. Compared to a traditional median filter, the HMF is superior at maintaining edge characteristics because it is a ranking process carried out in three steps [23]:
(i) A window is chosen according to the size of the images; within it, an “x”-shaped subwindow and a “+”-shaped subwindow are picked out.
(ii) The diagonal median, denoted by Mx, is determined by sorting the pixels of the “x”-shaped subwindow in descending or ascending order.
(iii) The horizontal-vertical median, denoted by M+, is determined by sorting the pixels of the “+”-shaped subwindow in descending or ascending order.
(iv) The output is the median of Mx, M+, and the center pixel value.

An HMF is advantageous for several reasons, one being its lower computational complexity: it operates only on the pixels within its subwindows rather than on every pixel within a square mask of equivalent size [24]. Figure 3(a) presents the original images, and Figure 3(b) presents the corresponding filtered images.
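To make the three-step ranking concrete, the following is a minimal NumPy sketch of an HMF. The window size, the handling of borders (left unfiltered), and the inclusion of the centre pixel in both subwindow arms are illustrative simplifications, not details taken from the paper.

```python
import numpy as np

def hybrid_median_filter(img, k=5):
    """Hybrid median filter sketch for a 2-D greyscale image.

    For each interior pixel: median of the "+"-shaped neighbours (M+),
    median of the "x"-shaped neighbours (Mx), then the median of
    (M+, Mx, centre pixel). k is the (odd) window size.
    """
    r = k // 2
    out = img.astype(np.float64).copy()
    idx = np.arange(-r, r + 1)
    for i in range(r, img.shape[0] - r):
        for j in range(r, img.shape[1] - r):
            # "+" subwindow: vertical and horizontal arms through the centre.
            plus = np.concatenate((img[i + idx, j], img[i, j + idx]))
            # "x" subwindow: main diagonal and anti-diagonal arms.
            cross = np.concatenate((img[i + idx, j + idx], img[i + idx, j - idx]))
            m_plus = np.median(plus)
            m_x = np.median(cross)
            # Final rank: median of the two medians and the centre pixel.
            out[i, j] = np.median([m_plus, m_x, img[i, j]])
    return out.astype(img.dtype)
```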

3.3. Quadrants Dynamic Histogram Equalization (QDHE)

Image enhancement is one of the primary processes involved in image analysis. The purpose of contrast enhancement is to increase the quality of an image so that it is better suited for a specific use. Accordingly, the QDHE enhancement technique is utilized in this research to improve the contrast of the medical images; QDHE is a reliable way to prepare low-contrast images for feature extraction. QDHE proceeds through histogram partitioning, clipping, allocation of the grey-level range, and histogram equalization. Figure 4 depicts the workflow of QDHE.

When splitting the histogram, the proposed QDHE uses the intensity values located at the medians of the input image's histogram. At the outset, the original image's histogram is split at its median to create two subhistograms. Likewise, the medians of the two subhistograms are applied as split points to divide each of them into two smaller subhistograms. Consequently, four subhistograms are obtained. The minimum and maximum intensity values of the input histogram are used as the starting and ending separation points [25]. The partitioning strategy utilized in the QDHE algorithm is comparable to recursive subimage HE. Median-based partitioning tends to spread the total pixel count evenly across all subhistograms. Therefore, the location of each dividing point $m_i$ is determined from the cumulative histogram:

$$\sum_{k=0}^{m_i} h(k) = c_i \times M \times N, \qquad i = 1, 2, 3,$$

where $h(k)$ is the number of pixels at intensity $k$, and the fractions $c_1$, $c_2$, and $c_3$ are fixed to 0.25, 0.50, and 0.75 of the overall count of pixels that make up the histogram of the input image. The height and width of the input image are denoted by $M$ and $N$, respectively. The purpose of clipping is to control the HE enhancement rate so that the processed image does not appear unnatural or excessively enhanced. Clipping adjusts the shape of the input histogram by lowering the histogram bins that exceed the threshold $T_c$, which equals the average histogram value:

$$h_{clip}(k) = \min\left(h(k),\, T_c\right), \qquad T_c = \frac{1}{L}\sum_{k=0}^{L-1} h(k),$$

where $L$ is the number of grey levels. QDHE then allots a grey-level dynamic range to each subhistogram in proportion to its grey-level span so that the enhancement space of every subhistogram is fair. The span used by the $i$th subhistogram of the input image is

$$span_i = m_i - m_{i-1},$$

where $m_i$ denotes the $i$th separation point, and the dynamic range allotted to the $i$th subhistogram in the output image is

$$range_i = \frac{span_i \times n_i^{\beta}}{\sum_{k=1}^{4} span_k \times n_k^{\beta}} \times (L - 1),$$

where $n_i$ represents the total pixel count in the $i$th subhistogram and $\beta$ denotes the degree of emphasis placed on $n_i$. In principle, $\beta$ must be tuned so that the spans of all the subhistograms in the output histogram are determined accurately. However, because the QDHE approach places a virtually identical pixel count in every subhistogram, the $n_i$ term does not significantly impact the newly created dynamic range. The range allocation can therefore be rewritten as follows to simplify QDHE and remove the parameter $\beta$:

$$range_i = \frac{span_i}{\sum_{k=1}^{4} span_k} \times (L - 1).$$

The new dynamic range for the $i$th subhistogram is allotted from $[start_i, end_i]$, specified by

$$end_i = \sum_{k=1}^{i} range_k, \qquad start_i = end_{i-1} + 1,$$

where $start_1$ is initialized to the minimum intensity value of the new dynamic range. The last phase of QDHE is to equalize all the subhistograms independently once the new dynamic range of every quadrant subhistogram has been determined. If the $i$th subhistogram is assigned the grey levels from $[start_i, end_i]$, then its HE output $y(x)$ is obtained with the following transfer mapping function, in which $c(x)$ denotes the cumulative density function of this particular subhistogram:

$$y(x) = start_i + \left(end_i - start_i\right) \times c(x).$$
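The pipeline above can be summarized in a short Python sketch for an 8-bit image: median-based quartering, clipping at the mean bin height, proportional range allocation, and per-segment equalization. The exact boundary and rounding conventions here are assumptions, not the paper's implementation.

```python
import numpy as np

def qdhe(img, L=256):
    """Illustrative QDHE sketch for a uint8 image (see Section 3.3)."""
    hist, _ = np.histogram(img, bins=L, range=(0, L))
    cdf = np.cumsum(hist)
    total = cdf[-1]

    # Separating points where the cumulative count reaches 1/4, 1/2, 3/4.
    m = [int(np.searchsorted(cdf, c * total)) for c in (0.25, 0.50, 0.75)]
    bounds = [int(img.min())] + m + [int(img.max())]

    # Clip each bin at the mean bin height to limit the enhancement rate.
    hist = np.minimum(hist, hist.mean())

    # Allocate output dynamic range proportionally to each segment's span.
    spans = np.array([bounds[i + 1] - bounds[i] for i in range(4)], float)
    ranges = spans / spans.sum() * (L - 1)
    starts = np.concatenate(([0.0], np.cumsum(ranges)[:-1]))

    # Equalize each quadrant subhistogram independently.
    mapping = np.zeros(L)
    for i in range(4):
        lo, hi = bounds[i], bounds[i + 1]
        seg = hist[lo:hi + 1]
        c = np.cumsum(seg) / max(seg.sum(), 1)   # per-segment CDF
        mapping[lo:hi + 1] = starts[i] + ranges[i] * c
    return mapping[img.astype(int)].astype(np.uint8)
```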

3.4. Segmentation Using USE-Net

The U-Net segmentation model was used with Squeeze-and-Excitation (SE) layers to segment the BrCa images: SE blocks were incorporated into the U-Net model after each encoder and decoder. Figure 5 depicts the architecture of USE-Net. In the encoder stage of the USE-Net model, the network extracts the features of the input images using sequential convolution and pooling layers; in the decoder section, it methodically maps the extracted features back onto the raw image resolution using sequential upsampling layers and eventually generates the predicted masks. Specifically, the SE layers were inserted before all the encoder's pooling layers and after all the decoder's upsampling layers [26]. The SE blocks were more effective along the encoding and decoding paths than after the classifier because they influence the low-level features in the U-Net design and, as a result, considerably boost the overall performance of the network. Therefore, rather than integrating just one SE block after the initial encoder/decoder, SE blocks were placed after every encoder/decoder to achieve the best possible segmentation performance. This allows the network to detect coarse-grained contexts in the last layers and fine-grained localizations in the deep layers. The skip-connection technique was utilized to concatenate two sequential convolution layers and an activation layer into a block, renamed the CONV block.

Let $U = [u_1, u_2, \ldots, u_C]$ be an input feature map, where $u_d \in \mathbb{R}^{H \times W}$ is a single channel of size $H \times W$. A global average pooling layer over the spatial dimensions $H \times W$ produces channel-wise statistics $z \in \mathbb{R}^{C}$, the $d$th element of which is

$$z_d = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_d(i, j).$$

To reduce the model's complexity and increase its generalizability, two fully connected (FC) layers in conjunction with the rectified linear unit (ReLU) function $\delta$ transform $z$, followed by a sigmoid activation function $\sigma$:

$$s = \sigma\left(W_2 \, \delta\left(W_1 z\right)\right).$$

Here, $W_1 \in \mathbb{R}^{(C/r) \times C}$, $W_2 \in \mathbb{R}^{C \times (C/r)}$, and $r$ represents the reduction ratio that controls the capacity as well as the computational cost of the SE blocks. The SE blocks could still overfit the training set's channel interdependencies despite the reduced weight count relative to the actual structure. To produce the adaptive recalibration that suppresses less significant channels and accentuates essential ones, $U$ is rescaled into $\tilde{U} = [\tilde{u}_1, \ldots, \tilde{u}_C]$ by applying

$$\tilde{u}_d = s_d \cdot u_d,$$

where $s_d \cdot u_d$ reflects the channel-wise multiplication between the feature map $u_d$ and the scalar $s_d$. The SE layer is a helpful technique to boost the model's learning capability by reinforcing the more significant features [27].
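As an illustration of the squeeze, excitation, and recalibration steps above, here is a minimal PyTorch sketch of an SE block of the kind inserted after each USE-Net encoder/decoder. The reduction ratio r = 16 is an assumed value, since the paper does not report it.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation layer: global average pooling (squeeze),
    two FC layers with ReLU then sigmoid (excitation), and channel-wise
    rescaling of the input feature map (recalibration)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc1 = nn.Linear(channels, channels // reduction)  # W1
        self.fc2 = nn.Linear(channels // reduction, channels)  # W2

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        b, c, h, w = u.shape
        z = u.mean(dim=(2, 3))                                  # squeeze: (B, C)
        s = torch.sigmoid(self.fc2(torch.relu(self.fc1(z))))    # excitation
        return u * s.view(b, c, 1, 1)                           # recalibration
```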

3.5. Feature Extraction Using CaffeNet

CaffeNet, a CNN with improved performance derived from AlexNet, was the neural network used in this study to extract the features. CaffeNet includes five convolutional layers and three FC layers. Overfitting was avoided by employing dropout at the first two FC layers, with the dropout probability set at 0.5. CaffeNet utilized local response normalization to normalize the feature maps, increasing the signal of activated neurons while decreasing the signal of surrounding neurons, which improved the model's capacity for generalization. All the ROIs in this work were scaled using bilinear interpolation to match the input layer size (227 × 227). Following standard practice in deep learning experiments, the mean of the training set was subtracted from the data [28]. Table 2 shows the CaffeNet architecture configuration.

In this research, features were extracted from the CNN model to obtain higher performance than classifying directly with the CNN, and the IRF-XGB classifier was trained on these extracted features. By taking the activations from a single layer of the network, the fine-tuned CaffeNet can be interpreted as a feature extractor. In general, the higher layers provide discriminative characteristics, while the final FC layer only produces the class prediction scores. As a result, the output of FC7 was utilized as the feature representation of the BrCa image. The features extracted from FC7 form a 4096-dimensional vector, and the feature values are scaled by the vector's maximum absolute value so that they fall within the range [−1, 1] [29].
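A sketch of this feature-extraction strategy is shown below. Since torchvision ships no CaffeNet weights, AlexNet (the architecture CaffeNet derives from) stands in for it, its penultimate FC layer plays the role of FC7, and ImageNet statistics stand in for the training-set mean; all of these substitutions are assumptions for illustration.

```python
import torch
from torchvision import models, transforms

# AlexNet stands in for CaffeNet; dropping the last classifier layer
# leaves a 4096-d activation analogous to FC7.
alexnet = models.alexnet(weights=models.AlexNet_Weights.DEFAULT).eval()
extractor = torch.nn.Sequential(
    alexnet.features, alexnet.avgpool, torch.nn.Flatten(),
    *list(alexnet.classifier.children())[:-1],  # stop before class scores
)

preprocess = transforms.Compose([
    transforms.Resize((227, 227)),  # bilinear interpolation to input size
    transforms.ToTensor(),
    # Mean subtraction; ImageNet means stand in for the training-set mean.
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[1.0, 1.0, 1.0]),
])

def fc7_features(pil_image):
    with torch.no_grad():
        f = extractor(preprocess(pil_image).unsqueeze(0)).squeeze(0)
    return f / f.abs().max()  # max-abs scaling into [-1, 1]
```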

3.6. Classification Using IRF-XGB

RF is an ensemble machine learning algorithm. It produces a powerful learner by combining several weak learners in the form of decision trees. The term "random" in RF refers to two kinds of randomness: random samples and random features. Initially, RF uses bootstrap aggregating (bagging) to partition the initial dataset into a series of random samples. The training portion of the bagging method employs two-thirds of the original dataset, whereas the testing portion uses the remaining one-third. To generate random samples, instances are selected uniformly from the training dataset with replacement. After the samples are obtained, an unpruned decision tree is constructed on every sampled dataset. At each tree node, the optimal split is searched over a random subset of features rather than over all the features. The trees in the forest carry out their tasks independently and produce their outputs simultaneously. The outcome of the forest is determined by a majority vote over the outcomes of the individual decision trees. Since RF employs the bagging approach, it does not need an additional validation process [30]. A regression model sequence $\{h(x, \theta_t)\}_{t=1}^{T}$ is obtained through T rounds of model training; this sequence forms a multiregression model system. After collecting the predictions of the N estimators' regression trees, a simple average is used to calculate the values for new samples. The regression decision is expressed by the following equation:

$$\bar{h}(x) = \frac{1}{T} \sum_{t=1}^{T} h(x, \theta_t).$$

The integrated regression model is denoted by $\bar{h}(x)$, while $h(x, \theta_t)$ denotes the regression model of a single decision tree, and $T$ indicates the total count of regression trees (N estimators). IRF considers the decision made by each tree, which helps to enhance accuracy. Bagging, short for "bootstrap aggregating," supplies the randomness and enhances both the accuracy and stability of the algorithm.

For an unseen sample $x'$, the bagged prediction averages the individual trees:

$$\hat{f}(x') = \frac{1}{X} \sum_{x=1}^{X} f_x(x'),$$

where $\hat{f}(x')$ represents the prediction for unseen samples, $x = 1, 2, 3, \ldots, X$ indexes the trees, and $f_x$ represents a decision tree trained on the bootstrap sample $B_x$ with labels $L_x$. To improve the functionality of RF while simultaneously addressing the problem of imbalanced classification, the RF is improved through sampling, which results in a balanced RF. The bootstrap sampling method used in the initial RF selects samples from the dataset randomly and uniformly with replacement and does not take class membership into account. As a result, a bootstrap sample drawn for the initial RF can contain only a few, or even no, instances of the minority class, so the prediction accuracy of decision trees grown on such a sample suffers when classifying minority-class data. The proposed technique makes up for this shortage by sampling with consideration of both the majority and the minority classes. In this improved RF, "n" samples are first drawn from the minority class using a random number generator; the same quantity of samples is then extracted from the majority class with replacement. These random samples are pooled into balanced random sample sets that contain an equal number of data samples from the minority and the majority classes, and an unpruned decision tree is grown on each evenly distributed sample set. Growing and integrating the balanced decision trees in this manner yields a balanced RF [31].
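A minimal sketch of this balanced bootstrap idea is given below, assuming a binary benign/malignant labelling; the helper names and the use of scikit-learn trees are illustrative choices, not the paper's implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def balanced_bootstrap(X, y, rng):
    """One balanced draw: n minority samples with replacement,
    then the same number from the majority class."""
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    majority = classes[np.argmax(counts)]
    n = counts.min()
    idx_min = rng.choice(np.flatnonzero(y == minority), size=n, replace=True)
    idx_maj = rng.choice(np.flatnonzero(y == majority), size=n, replace=True)
    idx = np.concatenate((idx_min, idx_maj))
    return X[idx], y[idx]

def balanced_forest(X, y, n_trees=100, seed=0):
    """Grow unpruned trees, each on its own balanced bootstrap sample;
    the forest prediction is a majority vote over the trees."""
    rng = np.random.default_rng(seed)
    trees = []
    for _ in range(n_trees):
        Xb, yb = balanced_bootstrap(X, y, rng)
        trees.append(DecisionTreeClassifier(max_features="sqrt").fit(Xb, yb))
    return trees
```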

XGB is a scalable and adaptable paradigm for boosting tree structures. It can manage sparse data, increase algorithm speed, and decrease the computation time and memory required for large-scale data. More technically, the XGB algorithm can be stated as follows [32]. Given a training dataset containing n samples, the objective function is defined as

$$\mathcal{L} = \sum_{i=1}^{n} l\left(\hat{y}_i, y_i\right) + \sum_{k} \Omega\left(f_k\right).$$

Here, $l(\hat{y}_i, y_i)$ indicates the distance that separates the target $y_i$ from the prediction $\hat{y}_i$, and $\Omega(f_k)$ is the score assigned to tree $f_k$ to regularize its prediction ability. An approximate loss function can be computed using the second-order Taylor expansion of the objective function at boosting step $t$:

$$\mathcal{L}^{(t)} \approx \sum_{i=1}^{n} \left[ l\left(y_i, \hat{y}_i^{(t-1)}\right) + g_i f_t(x_i) + \frac{1}{2} h_i f_t^{2}(x_i) \right] + \Omega\left(f_t\right).$$

In this equation, the first derivative of the loss for each sample is denoted by $g_i$, and the second derivative is denoted by $h_i$; the loss function only needs the first and second derivatives of each data element. XGB has recently gained popularity as a method for making model predictions. Speed is the most significant downside of RF [https://bit.ly/3WlQzfs]; therefore, XGB is paired with it to obtain better results and circumvent the speed problem. XGB is used in regression and classification approaches to produce more accurate predictions, and it additionally helps to reduce errors caused by bias.
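One plausible way to pair the balanced forest with XGB is sketched below: a class-balanced random forest and an XGBoost classifier are trained on the same CaffeNet features and fused by averaging their class probabilities. The paper does not spell out the fusion rule, so soft voting and all hyperparameter values here are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

# Class-balanced RF (approximating the balanced bootstrap) plus XGBoost.
rf = RandomForestClassifier(n_estimators=200,
                            class_weight="balanced_subsample",
                            max_features="sqrt")
xgb = XGBClassifier(n_estimators=200, learning_rate=0.1, max_depth=6)

def fit_predict(X_train, y_train, X_test):
    """Train both models, then soft-vote: average class probabilities
    and take the argmax as the predicted label (0 = benign, 1 = malignant)."""
    rf.fit(X_train, y_train)
    xgb.fit(X_train, y_train)
    proba = (rf.predict_proba(X_test) + xgb.predict_proba(X_test)) / 2.0
    return np.argmax(proba, axis=1)
```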

4. Experimental Results and Discussion

The proposed research model was used to analyze the ultrasound and mammogram images, and all the classified images belonged to two categories: malignant and benign. The research model was implemented in MATLAB R2019 on a Windows 10 64-bit OS with an Intel i7 CPU @ 2.60 GHz, 16 GB RAM, and a 1 TB hard disk. The classification model was applied to the images of the breast ultrasound dataset and the KAU-BCMD dataset, which contain ultrasound and mammogram images, respectively.

4.1. Breast Ultrasound Images Dataset

This dataset contains medical images of BrCa obtained from ultrasound scans. When combined with machine learning, breast ultrasound images can yield excellent outcomes in the classification, detection, and segmentation of BrCa. The data comprise breast ultrasound scans of participants aged 25 to 75, drawn from a total of 600 female patients. The collection contains 830 images with an average size of 500 × 500 pixels, stored in PNG format, and the ground-truth images are provided alongside the original images. The images are classified into three categories: normal, benign, and malignant [22].

4.2. KAU-BCMD Dataset

One of the main contributions of this research is that it uses a new digital mammography dataset for BrCa from King Abdulaziz University, Saudi Arabia. The dataset comprises digital mammogram and ultrasound images collected between 2019 and 2020 from the Sheikh Mohammed Hussein Al-Amoudi Center of Excellence in BrCa at King Abdulaziz University. It is the first dataset in Saudi Arabia to manage a considerable volume of mammography images. This research used the mammogram images from the dataset, comprising 1416 mammogram cases with both MLO and CC views for the left and right breasts, totaling 5662 images in DICOM and JPG formats. Of the 5662 images, only 3778 were used for the experimental analysis; the remaining 1884 images, which belonged to the normal category, were excluded [21].

4.3. Performance Computation

The proposed classification model's performance was calculated using the result parameters often used for evaluating classifier performance in image processing: accuracy, sensitivity, specificity, precision, and f1-score [33]. Based on these parameters' computed scores, the research model's performance was compared and analyzed against conventional models for validation.
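For reference, these metrics have their standard definitions in terms of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN):

```latex
\begin{align*}
\text{Accuracy}    &= \frac{TP + TN}{TP + TN + FP + FN}, &
\text{Sensitivity} &= \frac{TP}{TP + FN}, \\
\text{Specificity} &= \frac{TN}{TN + FP}, &
\text{Precision}   &= \frac{TP}{TP + FP}, \\
\text{F1-score}    &= \frac{2 \times \text{Precision} \times \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}}.
\end{align*}
```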

4.4. Discussion on Results

In this section, the results obtained from the performance analysis are discussed in two parts: the first covers the results obtained for the ultrasound images using the research model and their comparison, and the second covers the results obtained for the mammogram images and their comparison. To reduce issues such as overfitting and selection bias, cross-validation assesses the model's performance on data that were not used to train it. We performed K-fold cross-validation with K set to 10 to evaluate the proposed model's performance; the 10-fold cross-validation technique tests the model on unseen data by dividing the entire dataset into ten groups and computing the accuracy for each of the ten folds. The proposed model's performance for ultrasound images using stratified 10-fold cross-validation is presented in Table 3, and Table 4 displays the proposed model's performance for mammograms using stratified 10-fold cross-validation. From Tables 3 and 4, we can see that the model is balanced.
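The stratified 10-fold protocol can be sketched as follows; `make_classifier` is a hypothetical factory that returns a fresh IRF-XGB model for each fold, and reporting only accuracy is a simplification.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import accuracy_score

def cross_validate(X, y, make_classifier, k=10, seed=0):
    """Stratified k-fold: each fold preserves the benign/malignant ratio;
    the model is trained on k-1 folds and tested on the held-out fold."""
    skf = StratifiedKFold(n_splits=k, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in skf.split(X, y):
        clf = make_classifier()                      # fresh model per fold
        clf.fit(X[train_idx], y[train_idx])
        scores.append(accuracy_score(y[test_idx], clf.predict(X[test_idx])))
    return np.mean(scores), np.std(scores)
```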

Figure 6 depicts the model's training and test accuracy for the ultrasound images, and Figure 7 depicts the training and test measures for the mammogram images. The performance analysis is divided into training and testing results based on the classification parameters discussed above. Using ultrasound images, the research model performed better on the training set than on the test set. The research model obtained 99.15% accuracy in training and 97.87% in testing; the training result is 1.28% higher than the test result. The sensitivity of the research model was 99.50% in training and 98.45% in testing, a difference of 1.05%. The research model obtained 97.22% specificity in training and 95.24% in testing; the training result is 1.98% higher than the test result. The precision scores in training and testing were 99.50% and 98.96%, a difference of 0.54%. The f1-scores in training and testing were 99.50% and 98.70%, the training result being 0.8% higher than the test result.

Table 5 compares the results of the research model using ultrasound images with existing classification models from the literature survey: CNN-DenseNet-161, CNN-VGG-16, CART, R-Boost, and M-Tree. The research model's test results are used for this comparison. According to this comparison, the research model obtained the best accuracy using ultrasound images. The accuracy of the research model was 97.87%, which is 3.29% to 9.15% higher than the compared models; the lowest accuracy was obtained by CNN-VGG-16 [9]. The research model obtained 98.45% sensitivity, which is 7.65% to 14.67% higher than the other models in this work. The specificity score of the research model was 95.24%, which is 2.65% to 3.71% higher than the CNN-DenseNet and VGG models; the CART model obtained the highest specificity score of 98.84%. The research model obtained 98.96% precision, which is 0.05% to 9.7% higher than the compared models; the CNN-DenseNet model obtained the lowest precision score of 89.26%. The f1-score of the research model was 98.70%, which is 9.14% to 11.99% higher than the CNN-DenseNet and VGG models. Figure 8 presents this performance comparison in a graphical plot. For mammogram images, the performance analysis was likewise divided into training and testing results based on the classification parameters, and the research model again performed better on the training set than on the test set.

The research model obtained 99.58% accuracy in training and 98.31% accuracy in testing; the training result is 1.27% higher than the test result. The sensitivity of the research model was 99.65% in training and 99.29% in testing, a difference of 0.36%. The research model obtained 98.90% specificity in training and 90.20% in testing; the training result is 8.7% higher than the test result. The precision scores in training and testing were 99.88% and 98.82%, a difference of 1.06%. The f1-scores in training and testing were 99.77% and 99.05%, the training result being 0.72% higher than the test result. Figure 9 presents this performance analysis in a graphical plot. The research model's results on mammogram images were compared with existing classification models from the literature survey, namely LBP-ANN, VGG-19-SVM, TTCNN, ResNet50-SVM, and Inception-v2 ResNet-SVM, as shown in Table 6. The research model's test results are used for the comparison. Based on this comparison, the research model obtained the best accuracy using mammogram images: 98.31%, which is 1.44% to 3.55% higher than the compared models.

Additionally, the research model obtained higher accuracy in mammogram classification than in ultrasound classification. The lowest accuracy was obtained by Inception-v2 ResNet-SVM [13]. The research model obtained 99.29% sensitivity, which is 5.05% to 10.43% higher than the other models in this work. The specificity score of the research model was 90.20%, whereas the other compared models achieved higher specificity scores; the ResNet50-SVM model obtained the highest specificity score of 96.99%. The research model obtained 98.82% precision, which is 3.37% to 10.68% higher than the compared models. The f1-score of the research model was 99.05%, which is 4.21% to 10.56% higher than the other models. The complexity of the USE-Net architecture is measured in FLOPS, and the computational complexity is reported in Table 7. This work has some limitations. The performance comparison was made against models evaluated on datasets different from ours, since each work has its own objectives and purposes for performing the classification of BrCa images. The proposed QDHE preprocessing could have been more effective, which impacted the segmentation and classification results, and a limited dataset was used for the ultrasound images, which degraded the model's performance.

5. Conclusion

The proposed model used both the ultrasound and mammogram images as input individually. The research model comprised preprocessing, segmentation, feature extraction, and classification stages. Initially, in the preprocessing stage, hybrid median filtering was used to eliminate the noise present in the input images, and the contrast of the images was then enhanced using QDHE. After noise removal and enhancement, ROI segmentation was performed using the USE-Net deep learning model. Feature extraction was performed on the segmented images with the CaffeNet model, and finally, classification was made using the IRF-XGB technique. The research model was evaluated by comparing the classification results on the ultrasound and mammogram image types, with the performance analysis carried out separately for each. For ultrasound images, the research model obtained 97.87% accuracy, 98.45% sensitivity, 95.24% specificity, 98.96% precision, and 98.70% f1-score. For mammogram images, the model obtained 98.31% accuracy, 99.29% sensitivity, 90.20% specificity, 98.82% precision, and 99.05% f1-score. The proposed model achieved better performance in classifying mammogram images than ultrasound images. In the future, the performance of this model can be improved by adding more images for training, and a feature selection model can be integrated to select the best features for classification.

Data Availability

The data supporting the findings of this research are available within the article and for public download.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

All the authors contributed equally to the manuscript.

Acknowledgments

The authors extend their appreciation to the Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia for funding this research work through project number 0075-1442-S. In addition, the authors thank the Vice President for Post Graduate Studies and Scientific Research, Faculty of Computers and Information Technology and Industrial Innovation & Robotics Center, University of Tabuk for their immense support and encouragement of this research through project number 0075-1442-S.