Abstract

The use of multimodal magnetic resonance imaging (MRI) to automatically segment brain tumors and their subregions is critical for accurate and consistent tumor measurement, which supports detection, care planning, and treatment evaluation. This work contributes to neuroscience research. In the present work, we provide a fully automated brain tumor segmentation method based on a mathematical model and deep neural networks (DNNs). Each slice of the 3D image is enhanced by the proposed mathematical model and then passed through a 3D attention U-Net to produce the segmented tumor output. The study includes a detailed mathematical model for tumor pixel enhancement as well as a 3D attention U-Net to accurately separate tumor pixels. The proposed system is tested and validated on the BraTS 2019 dataset and is intended to support the treatment of brain tumor patients. The pixel-level accuracy for tumor pixel segmentation is 98.90%. The proposed system architecture's outcomes are compared with those of existing system designs, and its time complexity is examined on various processing units.

1. Introduction

Tumors are described as abnormal growths of tissue in the brain, which can be benign or malignant [1, 2], and they are among the deadliest diseases [3-5]. Excess growth of tissue at certain sites increases the chance of tumor formation. People with a brain tumor may encounter the following signs or symptoms. A symptom, such as weariness, nausea, or pain, can only be recognized and described by the individual experiencing it; a sign is anything that can be observed and quantified by others, such as a fever, rash, or elevated pulse rate [6, 7]. Taken together, signs and symptoms can help characterize a medical condition. A person with a brain tumor may not display any of these signs and symptoms, and a medical condition other than a brain tumor might be their source [8, 9]. Depending on where they originate and how rapidly they grow, astrocytomas can be fatal. As a result, a neurosurgeon's most important initial goal is to segment a brain tumor correctly so that suitable therapy can be decided [10]. Although benign brain tumors can cause a number of issues, they are not cancerous: they grow slowly, seldom spread to other regions of the body, have more well-defined borders that make surgical excision easier, and seldom recur after removal [11, 12]. Malignant brain tumors, on the other hand, are cancerous, grow rapidly, and can spread to other parts of the brain or central nervous system, posing a life-threatening risk [13, 14]. An improved matched filter also helps reduce the likelihood of misclassification. Progress in diagnostic imaging technology and analytical modelling has resulted in improved diagnosis and treatment [1].

Many imaging methods are utilized to produce thorough scans of the brain. MRI scans produce pictures with varied contrast and brightness that highlight distinct sections of the brain, allowing tumor subregions to be distinguished; each MRI sequence is thought to be crucial in identifying different tumor subregions. The ability to segment brain tumors is crucial for statistical tumor characterization, which leads to more accurate diagnosis and better treatment methods and strategies [15]. An MRI of the head is a non-invasive, painless procedure that produces detailed images of the brain and brain stem; in an MRI machine, a magnetic field and radio waves are used to produce the images. This test is also known as a cranial MRI or a brain MRI [16, 17]. CT scans employ X-rays, which emit radiation, whereas MRI does not use ionizing radiation; for this reason, pregnant women should avoid CT scans, while people with certain implants may not undergo MRI scans. Deep learning advances have resulted in more precise and reliable segmentation algorithms, and CNNs have attained state-of-the-art outcomes in a wide range of computer vision applications [18]. Isensee et al. [19] performed a brain tumor segmentation investigation by varying the 2D U-Net to 3D [15]. Havaei et al. offered a feedforward CNN architecture that integrates local and global information to perform brain tumor segmentation [20]. To achieve brain tumor segmentation, Pereira et al. employed small convolution kernels and a gray-level normalization technique [21].

To improve the validity of the segmentation network, Kamnitsas et al. [22] employed a conditional random field (CRF) as a postprocessing method. Although many deep learning algorithms have produced highly precise results in medical image processing tasks, an interpretation is required to understand how those results are reached. Such an explanation serves as a link between experts and the algorithm, offering justifications alongside correct predictions [23]. The terms "explainability," "interpretability," and "transparency" have all been used to characterize how well a model is understood [24-26].

Furthermore, the added dimension of 3D structures makes it harder to understand how the model arrives at its predictions. Medical specialists are hesitant to trust CNN predictions because of their lack of explainability and black-box nature [27]. The degree to which a black-box model's predictions can be trusted is a hotly debated issue [28]. Understanding the reasons behind the model's predictions is crucial to avoid poor treatment outcomes and to have confidence in those predictions [29]. Furthermore, interpretability allows users to examine the patterns the model has learned and check that they are consistent with medical professionals' domain expertise, increasing end-user trust in the model's judgments [30].

Two primary types of interpretability techniques are presented in [31]. Post hoc techniques are applied after the model has been trained, without significantly altering it, whereas ante hoc (or trainable attention) techniques build interpretability into the model architecture from the start. The most common trend in interpretability at the intermediate levels of a model is to visualize and understand the information-gathering process [32]. In visual interpretability, saliency maps are constructed to investigate the model's hidden knowledge. Several visual interpretability methodologies have been developed [33-35], and thorough surveys of explainable AI have been conducted [36-38].

Furthermore, interpretability is critical in the healthcare field, since it ensures that medical practitioners can understand and accept the estimates of a neural network (NN) [31]. Incorporating visual explainability into deep learning models for medical image interpretation is a popular approach, as demonstrated by LIME [36], GB [39], Grad-CAM [40], and CAM [41]. Despite being visually appealing, these methods rely on gradients and image manipulations as inputs; as a result, they are time-consuming to compute and produce less clear visual explanations of the predicted class. In addition, the lack of assessment criteria makes it difficult to evaluate the quality of the explanations provided.

According to the literature, brain tumor (BT) segmentation requires further attention in order to improve its performance metrics. Classification is performed based on the result of segmentation, and the proposed approach is particularly useful for extracting the tumor volume. The rest of the paper is organized as follows. Section 2 examines the proposed scheme's mathematical model in further depth. The suggested technique has been tested on many types of data and has produced good results. The suggested system design is compared against a variety of algorithms, and the findings are presented in Section 3; following Ladkat et al. [42], the proposed system is also tested on various processors to assess its time complexity. Section 4 concludes the paper. The authors in [43-50] described their experiments with machine learning applications in the medical field. The LRA-DNN approach was developed by Shelke et al. [51]. Wankhede et al. [52, 53] performed deep learning experiments on brain tumors.

2. Proposed Methodology

The image dataset is passed through equation (1), which extracts the low- and high-frequency components of the image. The input $f(x)$ is passed through the wavelet transform to obtain

$$W_{\varphi}(j_{0},k)=\frac{1}{\sqrt{M}}\sum_{x}f(x)\,\varphi_{j_{0},k}(x),\qquad W_{\psi}(j,k)=\frac{1}{\sqrt{M}}\sum_{x}f(x)\,\psi_{j,k}(x),\quad j\ge j_{0},$$

where $j_{0}$ is an arbitrary starting scale, the coefficients $W_{\varphi}(j_{0},k)$ define an approximation of $f(x)$ at scale $j_{0}$, and the parameters for the scales are $j$ and $k$.
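As an illustrative sketch of this decomposition step, the fragment below applies a slice-wise 2D discrete wavelet transform with PyWavelets; the 'haar' mother wavelet and the single decomposition level are our assumptions, since the paper does not specify them.

```python
# Minimal sketch of the slice-wise wavelet decomposition (assumptions noted above).
import numpy as np
import pywt

def decompose_slice(slice_2d: np.ndarray):
    """Split one MRI slice into low- and high-frequency sub-bands."""
    # cA: approximation (low-frequency); cH/cV/cD: details (high-frequency)
    cA, (cH, cV, cD) = pywt.dwt2(slice_2d, "haar")
    return cA, (cH, cV, cD)

volume = np.random.rand(155, 240, 240)  # placeholder for one MRI modality
subbands = [decompose_slice(s) for s in volume]
```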

Now, to obtain reduced-dimensionality features, these decomposed values of the uploaded picture are processed via the following equations. The mean feature vector is

$$\bar{\mathbf{x}}=\frac{1}{M}\sum_{i=1}^{M}\mathbf{x}_{i}.$$

Equations (5) and (6) give the feature set; to get the $i$th component of the feature, we need to subtract the mean first:

$$\boldsymbol{\Phi}_{i}=\mathbf{x}_{i}-\bar{\mathbf{x}}.$$

The covariance of the feature set is calculated as

$$\mathbf{C}=\frac{1}{M}\sum_{i=1}^{M}\boldsymbol{\Phi}_{i}\boldsymbol{\Phi}_{i}^{T}.$$
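The following hedged NumPy sketch shows one standard realization of this dimensionality-reduction step under the PCA reading above (mean subtraction, covariance estimation, projection onto leading eigenvectors); the function name and number of retained components are illustrative, not taken from the paper.

```python
# Hedged PCA-style reduction sketch; the mapping to equations (5)-(8) is our reading.
import numpy as np

def pca_reduce(X: np.ndarray, n_components: int) -> np.ndarray:
    """X: (num_samples, num_features) wavelet-derived feature matrix."""
    mean = X.mean(axis=0)                  # mean feature vector
    Phi = X - mean                         # subtract the mean from each sample
    C = (Phi.T @ Phi) / X.shape[0]         # covariance matrix of the feature set
    eigvals, eigvecs = np.linalg.eigh(C)   # eigh: C is symmetric
    top = eigvecs[:, np.argsort(eigvals)[::-1][:n_components]]
    return Phi @ top                       # reduced features fed to the network
```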

Now, equation (8) is supplied as input to the neural network, and equation (9) gives the features that are fed to the network for classification. All the expressions in which the weighted input $z^{l}$ occurs can be given as

$$z^{l}=W^{l}a^{l-1}+b^{l},\qquad a^{l}=\sigma\!\left(z^{l}\right),$$

which is used as the forward propagation equation.

To compute the gradient, we need the values of $\delta^{l}=\partial E/\partial z^{l}$, where the chain rule can be given as

$$\delta^{l}=\frac{\partial E}{\partial z^{l}}=\left(\left(W^{l+1}\right)^{T}\delta^{l+1}\right)\odot\sigma'\!\left(z^{l}\right).$$

So, it is clear that the error at the current layer can be computed from the deltas at the next layer simply by using the derivative of the activation function, $\sigma'$. Since we know the errors at the current layer, we now have everything we need to compute the gradient with respect to the weights employed by this convolutional layer. The errors must then be transmitted back to the preceding layer so that its weights can be updated in turn.
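A minimal sketch of this backward pass is given below, in fully connected form for brevity (the convolutional case replaces the matrix product with a convolution); the sigmoid activation is an assumption.

```python
# Hedged sketch of error backpropagation: delta_l = (W_{l+1}^T delta_{l+1}) * sigma'(z_l).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def backprop_delta(W_next, delta_next, z_current):
    """Error at the current layer from the next layer's deltas."""
    return (W_next.T @ delta_next) * sigmoid_prime(z_current)

def weight_gradient(delta_current, a_previous):
    """dE/dW_l = delta_l a_{l-1}^T, the gradient used to update this layer."""
    return np.outer(delta_current, a_previous)
```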

The enhanced picture is then entered straight into the mathematical model below, which produces enhanced tumor pixels in the three-dimensional plane. Let $\mathbf{\Sigma}$ be the covariance matrix used to retrieve all potential values of the tumor pixels inside the threshold limit. The estimated tumor pixels' resulting values are then given by equation (18).
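Since the exact form of equation (18) is not reproduced here, the sketch below illustrates one plausible covariance-based reading of this enhancement step: keeping voxels whose Mahalanobis distance from the estimated tumor statistics falls inside the threshold. The threshold value, the tumor mean, and the function names are hypothetical.

```python
# Hedged sketch of a covariance-based threshold; not the paper's exact equation (18).
import numpy as np

def enhance_tumor_pixels(voxels, mu, sigma, tau=3.0):
    """voxels: (N, C) multimodal intensities; mu, sigma: assumed tumor mean/covariance."""
    diff = voxels - mu
    # Per-voxel quadratic form diff^T sigma^{-1} diff (squared Mahalanobis distance)
    d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(sigma), diff)
    mask = d2 <= tau ** 2                  # voxels inside the threshold limit
    return np.where(mask[:, None], voxels, 0.0)
```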

After being processed by the above equations, the image is passed through the following neural network for segmentation.

The three-dimensional attention module is merged with the decoding blocks, and the U-Net architecture is extended to 3D. A 3D attention model with decoder blocks is also used to improve the segmentation estimate. The attention module we propose consists of a combined channel and spatial attention mechanism together with a bypass (skip) connection. Combining concurrent excitations, however, may lead to inconsistency during training; when skip connections are employed, the network's redundancy and sparsity are lowered. Figure 1 depicts the proposed architecture for brain tumor segmentation.

Here, the parallel and serial connections of the encoder and decoder blocks are presented. The encoder block encodes the result of the mathematical model (from equation (18)), and the left side of the U-shaped structure processes it so that the size of the 3D block converges. The right side of the U-shaped structure is a combination of encoder, decoder, and excitation modules. Finally, a convolution block convolves the result and produces the segmented output.

Spatial and channel attention improves encoding quality across the feature hierarchy. Consequently, we create 3D attention units that provide combined 3D spatial and channel attention through cross-feature interactions. To create the 3D attention map, we first combine all three-dimensional feature connections along the H × W × 1 dimension with a 1 × 1 × C convolution. In parallel, we perform average pooling and forward the result to the network to obtain the 1 × 1 × C channel correlation. Rich spatial and channel attention is stored in the resulting 3D attention map. We also employ a skip connection to decrease the sparsity and singularity that these parallel excitations create; furthermore, the skip connection broadens learning and improves segmentation prediction.
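A hedged PyTorch sketch of such a 3D attention module is given below: a 1 × 1 × 1 convolution produces the spatial map, global average pooling followed by a pointwise convolution produces the channel correlation, and a skip connection wraps the combined excitation. Layer sizes and the exact fusion of the two branches are our assumptions, not the paper's released code.

```python
# Hedged sketch of a 3D spatial-plus-channel attention block with a skip connection.
import torch
import torch.nn as nn

class Attention3D(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Spatial branch: collapse C channels into one D x H x W attention map
        self.spatial = nn.Conv3d(channels, 1, kernel_size=1)
        # Channel branch: global average pooling gives the 1 x 1 x 1 x C correlation
        self.pool = nn.AdaptiveAvgPool3d(1)
        self.channel = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=1), nn.Sigmoid()
        )
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        s = self.sigmoid(self.spatial(x))  # (N, 1, D, H, W) spatial attention
        c = self.channel(self.pool(x))     # (N, C, 1, 1, 1) channel attention
        attn = s * c                       # broadcast into a full 3D attention map
        return x + x * attn                # skip connection reduces sparsity
```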

3. Results and Discussion

All of the experiments in this paper are carried out on BraTS 2019. BraTS 2019 contains 335 training cases, comprising 259 instances of high-grade glioma and 76 cases of low-grade glioma. The validation and test sets contain 125 and 166 cases, respectively. Each modality has a volume size of 240 × 240 × 155 voxels. The training set also includes segmentation annotations that mark three regions with pixel values of 1, 2, and 4.
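For reference, the snippet below shows the standard BraTS grouping of these annotation labels into the evaluated regions (whole tumor, tumor core, enhancing tumor); this follows the public BraTS convention rather than code released with this paper.

```python
# Standard BraTS label grouping into evaluation regions (WT, TC, ET).
import numpy as np

def brats_regions(seg: np.ndarray):
    """seg: (240, 240, 155) label volume with values 0, 1, 2, 4."""
    wt = np.isin(seg, [1, 2, 4])  # whole tumor: all tumor labels
    tc = np.isin(seg, [1, 4])     # tumor core: necrotic core + enhancing tumor
    et = seg == 4                 # enhancing tumor
    return wt, tc, et
```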

We upload our predictions to the BraTS 2019 evaluation server and obtain a variety of metrics to evaluate the model, including Dice, Hausdorff distance, sensitivity, and specificity. Table 1 presents the statistical parameters calculated from the BraTS 2019 validation set's performance. A visual depiction of the validation set predictions is shown in Figure 2: necrosis is shown in red, enhancing tumor in yellow, and edema in green. The performance graphs of the suggested three-dimensional attention U-Net versus the original 3D U-Net are shown in Figure 3.
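As a local illustration of the Dice metric that the evaluation server reports, the following sketch computes it on binary masks; it is not the server's implementation.

```python
# Dice = 2|P intersect T| / (|P| + |T|), computed on boolean masks.
import numpy as np

def dice(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
```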

Figure 3 shows that the suggested 3D attention U-Net beats the original 3D U-Net across all regions, including ET, WT, and TC.

To develop an effective model, we chose the 14 most important features and trained the models on them. The Bland–Altman plots in Figure 4 depict the distribution of regression outputs for all extracted features and for the 14 selected features (Figures 4(a) and 4(b)). Compared to using all features, the average gap between actual and predicted survival for the selected features is nearly halved (5.72 days).

Figure 5 clearly indicates that the suggested system architecture performs considerably better than other existing system architectures. The accuracy, precision, recall, and F1 score of the recommended system architecture for segmentation are 99.90%, 99.90%, 98.50%, and 98.50%, respectively, which are much higher than those of existing systems.

When the results are compared on the basis of pixel values, the resulting confusion matrix is presented in Figure 6. The dominant diagonal clearly has higher values than the other cells of the confusion matrix. MRI images of width 809 pixels and height 974 pixels are considered in the present study; the average result over 100 images is presented in the confusion matrix.

Averaged over 100 tumor images, there are 853 tumor pixels and 787,113 non-tumor pixels per image.
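The sketch below illustrates how such a pixel-level confusion matrix, together with the derived accuracy, precision, recall, and F1 score, can be computed from binary masks; the function is illustrative and not the authors' evaluation code.

```python
# Pixel-level 2 x 2 confusion matrix and derived metrics for one slice.
import numpy as np

def pixel_metrics(pred: np.ndarray, truth: np.ndarray):
    """pred, truth: boolean tumor masks of one 809 x 974 image."""
    tp = np.sum(pred & truth)
    tn = np.sum(~pred & ~truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return np.array([[tp, fn], [fp, tn]]), accuracy, precision, recall, f1
```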

The time complexity is tested on several processing units. The average time required to obtain a result on several hardware platforms is shown in Table 2.
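A minimal sketch of one way to measure the average per-volume inference time on a given device is shown below; the model and input are placeholders, and the synchronization call matters only for GPU runs.

```python
# Average wall-clock inference time per volume; placeholders for model/input.
import time
import torch

def average_inference_time(model, volume: torch.Tensor, runs: int = 10) -> float:
    model.eval()
    with torch.no_grad():
        start = time.perf_counter()
        for _ in range(runs):
            model(volume)
            if volume.is_cuda:
                torch.cuda.synchronize()  # avoid under-reporting GPU timings
    return (time.perf_counter() - start) / runs
```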

When using a CPU such as an i5 or i7, the time complexity is approximately identical; however, when the system is evaluated on a GPU, the time required to obtain the results differs substantially.

4. Conclusion

BraTS 2019 was used for all of the experiments in this article; it contains 335 training cases, with 259 instances of high-grade glioma and 76 cases of low-grade glioma. To test the model's predictions, we upload them to the BraTS 2019 evaluation server and obtain a number of metrics, including Dice, Hausdorff distance, sensitivity, and specificity. The suggested system architecture produces accurate tumor pixel segmentation results from 3D brain images. The system is evaluated at several levels: the type of feature extraction is the first criterion, the time complexity of the systems is compared, and the suggested system is also evaluated against existing classifiers. The proposed architecture achieves accuracy, precision, recall, and F1 score of 99.90%, 99.90%, 98.50%, and 98.50%, respectively, which are significantly higher than those of existing systems. From this, we conclude that the proposed system architecture is reliable and can be used in the medical field for effective diagnosis.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest.