Ensemble 3D CNN and U-Net-based brain tumour classification with MKKMC segmentation

Abstract

Advanced brain cancer is among the deadliest cancers, with a survival time of only a few months. The limitations of existing technologies hinder reliable prediction of the disease. This work addresses the pressing need for timely and precise identification of advanced-stage brain tumours, which are associated with markedly reduced life expectancy. It presents a hybrid approach for predicting brain tumours that improves diagnostic capability. The Multiple Kernel K-Means Cluster Algorithm (MKKMC) is used to segment brain MRI images, differentiating healthy from tumorous tissue. After segmentation, a hybrid of a 3D Convolutional Neural Network (CNN) and U-Net is used for classification, with the objective of accurately distinguishing normal from pathological brain images. To enhance efficiency, the Improved Whale Optimization Algorithm (IWOA) is included, which supports accurate and dependable performance via location updates. The methodology achieves 98.5% accuracy, 98.56% specificity, 91% sensitivity, 87.45% precision, 96% recall and an F-measure of 96.02%. These findings, obtained using MATLAB, demonstrate a substantial performance improvement over current approaches. This development is a significant addition to diagnostic imaging and can play a crucial role in the prediction and treatment of brain cancers.

1. Introduction

The brain is among the most essential organs in the human body, as it contains the nerve cells and tissues that govern the body's most important processes, such as breathing, muscular movement and the senses [Citation1]. Every cell has its own capacities; some cells gain functionality as they mature, while others lose capability, resist normal regulation and become aberrant. A group of such aberrant cells forms a mass of tissue known as a tumour. Cancerous brain tumours are uncontrolled, abnormal growths of brain cells that severely impair the neurological system and can lead to the patient's death [Citation2,Citation3]. Although cancer is not a single disease, it is one of the most life-threatening groups of diseases. One of the major brain tumour types is glioma, which is divided into two types: malignant high-grade glioma (HGG) and benign low-grade glioma (LGG).

For a physician who uses computer-aided diagnosis (CAD) as a supporting tool for medical operations, the major problems in brain tumour assessment are recognition, analysis and categorization [Citation4]. Glioma, meningioma and pituitary tumours are the three notable kinds of brain tumour. Brain cancer requires a precise and prompt diagnosis in order to receive effective treatment. Treatment options are determined by the tumour type, its stage at the time of evaluation and its grade [Citation5,Citation6]. CAD systems have aided neurologists in a variety of ways, including the grading, categorization and detection of tumours.

Magnetic resonance imaging (MRI) is a painless, non-invasive diagnostic imaging technique that creates high-quality 2D and 3D images of human body components. It is extensively used and recognized as one of the most efficient methods for cancer identification and categorization because of its high-resolution images of brain tissue [Citation7–9]. Recognizing cancer types from MRI images, on the other hand, is a difficult, error-prone and highly technical process that depends on the ability of the physician and is time-consuming. Additionally, because tumours can have a range of shapes, there might not be enough obvious markers in the image to support a proper diagnosis [Citation10]. As a result, human analysis is often unreliable. Misidentifying the type of brain tumour is also a serious issue, since it prevents patients from responding adequately to surgical intervention and lowers their chances of survival [Citation11,Citation12].

Medical image processing for diagnostic purposes has received a lot of interest. The recent introduction of sophisticated machine learning algorithms, and their demonstrated efficacy in solving numerous problems in artificial intelligence, has sparked a surge of interest in health-related applications [Citation13]. Many studies have applied various approaches to categorize malignancies using MRI, particularly MR brain imaging, including artificial neural networks and evolutionary algorithms [Citation14]. In earlier studies, normal and abnormal brain MRI scans were differentiated using shallow machine learning algorithms such as probabilistic neural networks, neural networks, SVMs and hybrid intelligent techniques [Citation15].

Deep convolutional neural networks (CNNs) have seen a lot of success in computer vision in recent years. CNNs are artificial neural networks with numerous hidden convolutional layers between the input and output layers, inspired by the biological organization of the visual cortex [Citation16–18]. They are non-linear and can extract highly relevant features. Deep learning algorithms based on CNNs have demonstrated good results in a range of other medical imaging applications, including skin cancer classification, brain tumour segmentation and diabetic retinopathy identification [Citation19,Citation20].

In this research, we propose a prediction technique for brain tumours based on an ensemble of a 3D CNN and U-Net, optimized with the Improved Whale Optimization Algorithm (IWOA). Brain tumour segmentation and classification, which differentiate benign from malignant cells in order to predict the tumour stage, form an active area of investigation. Existing systems lack computerized categorization, making decision-making time-consuming and inefficient. This article proposes a computerized categorization method that circumvents the drawbacks of recent techniques by building a 3D CNN and U-Net-based classifier optimized with the IWOA.

The major contributions of this research work are as follows:

To tackle the major problems in MR images, such as low contrast and data heterogeneity, a preprocessing method is used to increase performance.
Data augmentation is then used to compensate for the lack of data and to deal with the wide heterogeneity of brain tumours.
A hybrid of the 3D CNN and U-Net with IWOA is applied for classification. For segmentation, MKKMC is used.
The proposed system classifies the cancerous regions in the brain image more efficiently and accurately than existing approaches, significantly improving the accuracy of tumour categorization.

2. Literature review

Swati et al. [Citation21] introduced a method for classifying brain tumour images built on transfer learning and fine-tuning. Using a pre-trained CNN as an off-the-shelf feature extractor was less effective than transfer learning with block-level fine-tuning. The proposed method might also be used to classify MR images of other body organs and other medical imaging modalities such as PET, CT scans and X-rays. The technique was more versatile because it required only minimal preprocessing of 2D MR images and did not rely on handcrafted features. On the same CE-MRI dataset, the suggested method outperformed state-of-the-art classical machine learning techniques as well as state-of-the-art CNN approaches, according to the experimental results.

Amin et al. [Citation22] suggested using the discrete wavelet transform (DWT) with the Daubechies wavelet kernel for fusion, resulting in a more informative tumour region than in an unfused MRI image. A partial differential diffusion filter (PDDF) was used to remove noise after the fusion process. A global thresholding approach was applied to segment tumour regions, which were then fed into a proposed convolutional neural network (CNN) model to discriminate tumour from non-tumour regions. The method achieves better results on fused images, with accuracies of 0.97 on BRATS 2012 Image, 0.98 on BRATS 2013, 0.96 on BRATS 2013 Leader board, 1.00 on BRATS 2015 Challenge and 0.97 on BRATS 2018 datasets.

El-Mahelawi et al. [Citation23] employed an artificial neural network (ANN) model to classify tumour types, trained with the feed-forward back-propagation algorithm. Sex, histologic type, liver, lung, degree of differentiation, pleura, bone state, age, bone marrow, brain, peritoneum, skin, neck, supraclavicular, axillar, abdomen and mediastinum are some of the key factors in the classification of tumours, and they were used as input variables for the ANN model. The "primary tumour" dataset was used to build the classifier. In testing, the best score was 79.65%, showing that an artificial neural network could be used to classify tumour types.

Sajid et al. [Citation24] suggested a new CNN architecture that employs a patch-based method and takes both local and contextual information into account when predicting the output label. The proposed network uses a two-phase training method to handle data imbalance and combines dropout regularization with batch normalization to address overfitting. The method includes a pre-processing stage in which images are normalized and the bias field is corrected, a feed-forward pass through a CNN and a post-processing stage in which minor false positives around the skull area are eliminated. On the BRATS 2013 dataset, the suggested method obtains values of 0.86, 0.86 and 0.91 in terms of specificity, dice score and sensitivity over the entire tumour region, outperforming state-of-the-art methodologies.

Saouli et al. [Citation25] introduced several end-to-end iterative deep CNN approaches for fully automated brain tumour segmentation. In addition, they use ensemble learning to create a more effective strategy. They also provide a new training method for CNN architectures that takes the most important hyper-parameters into account by constraining them, in order to ensure an effective training process. The reported deep learning models are efficient at segmenting brain tumours and provide very high accuracy. Furthermore, the suggested models may help medical experts reduce diagnosis time.

Naz et al. [Citation26] suggested different SegNet encoder-decoder architectures based on pixel-wise classification. SegNet achieved better segmentation performance and more precise label prediction. Compared with patch-based classification, it showed high efficiency with promising reliability. It performs well without a post-processing CRF, which would have made the approach more time-consuming. Compared with previous segmentation models, SegNet reduces computation time and memory usage, owing to its small number of parameters. The method achieves an accuracy above 99%.

Noreen et al. [Citation27] described the implementation of deep learning models for the prediction of brain tumours. An ensemble method based on the concatenation of dense blocks from the pre-trained DenseNet201 framework outperformed existing methods for brain tumour classification. The extracted features were concatenated and passed to a softmax classifier to classify the brain tumour. With 99.51% accuracy on the testing samples, the suggested method achieved the best performance in the classification of brain tumours.

The advanced approaches and high performance reported in the studies above point to a research gap concerning the real-time implementation and adaptability of these techniques. Most studies concentrate on attaining high precision and specificity on controlled datasets. Research on how well such approaches adapt to broad, real-world settings is scarce. Relevant considerations include processing speed, interoperability with different medical imaging technologies and robustness to varying data and patient information quality. Resolving these factors could significantly improve the practical effectiveness of deep learning techniques for diagnosing disease.

3. Methodology

The proposed brain tumour categorization system takes a comprehensive, multi-stage approach. Data preparation is the initial step, in which non-tissue components and noise are removed from the MRI scans so that the data used for evaluation is of high quality; accurate diagnosis relies heavily on this. Data augmentation follows and enlarges the dataset, enabling the model to generalize more effectively and provide dependable outcomes across many scenarios, a factor that is often absent in other approaches. The Multiple Kernel K-Means Cluster Algorithm (MKKMC) then offers a reliable method for detecting tumour areas during segmentation. The essential advantage of the approach is a hybrid methodology that combines a 3D Convolutional Neural Network and U-Net for tumour classification. This combination capitalizes on the benefits of both methods, supporting detailed feature extraction and effective learning from intricate image data. The Improved Whale Optimization Algorithm (IWOA) tackles optimization difficulties and enhances the classification procedure. Each phase is tailored to maximize accuracy and efficacy in medical imaging and tumour categorization. Figure 1 depicts the overall procedure of the suggested technique.

Figure 1. Overall architecture of the proposed approach.

The suggested strategy has several advantages over current methods, including effective preprocessing, data augmentation, generalizability, scalability and an innovative segmentation method. Together, these benefits enhance the efficiency, precision and adaptability of the brain tumour categorization system.

3.1. Pre-processing

The initial step is to crop the areas of the MRI and the corresponding tumour mask images that pertain to non-tissue regions, for all images of each patient. Cropping is done in three dimensions for each patient's image and removes superfluous areas from the input. Cropped images are padded with zeros to keep their aspect ratio and then scaled back to 256 × 256 pixels. In addition, noise in the acquired MRI scans may smear small details, distort tumour borders and even reduce the spatial resolution of the images. By complicating feature extraction, it can substantially degrade the performance of CNN-based approaches. MR image contrast enhancement and denoising have therefore attracted a lot of interest and have been thoroughly examined. The contrast enhancement is calculated using the following equation:

$g(x, y) = \frac{f(x, y) - f_{min}}{f_{max} - f_{min}} \times 2^{bpp}$ (1)

where $f$ is the input image, $f_{min}$ and $f_{max}$ are its minimum and maximum intensities, and $bpp$ is the number of bits per pixel.
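A minimal Python sketch of the min-max stretch in Equation (1) may make the scaling concrete. The function name `contrast_stretch`, the list-of-rows image layout and the clipping of the top value to 2^bpp − 1 are assumptions made for illustration; the study's own MATLAB implementation is not shown in the article.

```python
def contrast_stretch(image, bpp=8):
    """Min-max contrast stretch of a 2D image (list of rows), after Equation (1)."""
    f_min = min(min(row) for row in image)
    f_max = max(max(row) for row in image)
    if f_max == f_min:
        # A flat image has no contrast to stretch; return zeros (assumption).
        return [[0 for _ in row] for row in image]
    scale = 2 ** bpp
    top = scale - 1  # assumption: clip so the result fits in bpp bits
    return [
        [min(top, int((f - f_min) / (f_max - f_min) * scale)) for f in row]
        for row in image
    ]


# Example: a low-contrast 2 x 2 patch stretched to the 8-bit range.
print(contrast_stretch([[10, 20], [30, 50]]))  # [[0, 64], [128, 255]]
```

Without the clipping step the brightest pixel would map to 2^bpp, one above the largest value representable in bpp bits, which is why the sketch caps it.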

3.1.1. Denoising

Noise is reduced using a non-local means filter, in which each denoised pixel is a weighted mean of the pixels in the image. Each weight depends only on the distance between the grey-level vectors of the neighbourhoods of the target pixel and the candidate pixel. The denoised value of each pixel is computed as:

$M(i) = \sum_{j \in D} w(i, j) D(j)$ (2)

where $D$ denotes the noisy MRI and $M$ the denoised image. The weights satisfy $0 \leq w(i, j) \leq 1$ and $\sum_{j} w(i, j) = 1$. The similarity between the neighbourhoods of pixels $i$ and $j$ determines the weight of pixel $j$ in the weighted average.
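The weighted average in Equation (2) can be sketched in a few lines of plain Python. This is an illustrative toy, not the authors' implementation: the exponential weighting, the smoothing parameter `h`, the 3 × 3 neighbourhood (`patch_radius=1`), the edge replication and the use of the whole image as the search window are all assumptions borrowed from the commonly described non-local means filter.

```python
import math


def nlm_denoise(img, patch_radius=1, h=10.0):
    """Toy non-local means: each output pixel is a weighted mean of all pixels."""
    rows, cols = len(img), len(img[0])

    def patch(r, c):
        # Grey-level vector of the neighbourhood, replicating edge pixels.
        return [
            img[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
            for dr in range(-patch_radius, patch_radius + 1)
            for dc in range(-patch_radius, patch_radius + 1)
        ]

    patches = [[patch(r, c) for c in range(cols)] for r in range(rows)]
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            p_i = patches[r][c]
            num = den = 0.0
            for rr in range(rows):
                for cc in range(cols):
                    # Squared distance between the two neighbourhood vectors.
                    d2 = sum((a - b) ** 2 for a, b in zip(p_i, patches[rr][cc]))
                    w = math.exp(-d2 / (h * h))  # unnormalized weight in (0, 1]
                    num += w * img[rr][cc]
                    den += w
            out[r][c] = num / den  # dividing by den makes the weights sum to 1
    return out
```

The quadratic cost in the number of pixels makes this form impractical for full MRI volumes; practical filters restrict the sum to a local search window.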

3.2. Data augmentation

Data augmentation is a crucial aspect of training deep learning-based systems. Using common data augmentation techniques, many image variations were created to enlarge the dataset and lessen over-fitting during training. Data augmentation also addresses class imbalance by generating additional minority-class samples, resulting in a more balanced training phase. Numerous augmentation approaches were used in our research, including horizontal and vertical flips, rotation, brightness alteration, zooming, ZCA whitening, shifting and shearing.
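A few of the listed transformations can be illustrated on a small 2D slice stored as a list of rows. The helper names below (`hflip`, `vflip`, `rot90`, `adjust_brightness`, `shift_right`, `augment`) and the fixed parameter values are hypothetical and chosen only for this sketch; zooming, shearing and ZCA whitening are omitted for brevity.

```python
def hflip(img):
    """Horizontal flip: reverse each row."""
    return [row[::-1] for row in img]


def vflip(img):
    """Vertical flip: reverse the order of the rows."""
    return [row[:] for row in img[::-1]]


def rot90(img):
    """Rotate by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def adjust_brightness(img, delta, lo=0, hi=255):
    """Add a constant offset and clip to the valid intensity range."""
    return [[min(hi, max(lo, v + delta)) for v in row] for row in img]


def shift_right(img, k, fill=0):
    """Shift the image k pixels to the right, padding with `fill`."""
    return [[fill] * k + row[: len(row) - k] for row in img]


def augment(img):
    """Return the original slice together with several simple variants."""
    return [
        img,
        hflip(img),
        vflip(img),
        rot90(img),
        adjust_brightness(img, 20),
        shift_right(img, 1),
    ]
```

Applying `augment` to every minority-class slice is one simple way to rebalance a training set, since each input yields six samples.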

3.3. MKKMC-based segmentation

To segment the tumour region in an abnormal image, the suggested method uses a clustering algorithm, MKKMC, in which multiple kernel learning is used to modify the classic K-means clustering algorithm. A variety of clustering algorithms can be used for segmentation; one of the most widely used is K-means, which identifies clusters by minimizing the clustering error. The suggested method combines two kernels in the multi-kernel process: the linear and the quadratic kernel.

We use the hybrid kernel K-means clustering approach to increase segmentation accuracy. Many kernels are in current use, including the quadratic kernel, the radial basis function (RBF) kernel and the linear kernel; in the suggested approach we hybridize the linear and quadratic kernels.

Let there be two kernels, $k_1$ and $k_2$. Under kernelization of the metric, the hybrid kernel K-means algorithm is an iterative two-step technique that minimizes the objective $W$ over partitions $P = \{P_1, \dots, P_K\}$ of $X$ into $K$ clusters and their associated cluster centroids $y_k \in R^{p}$ $(k = 1, \dots, K)$:

$W = \sum_{k=1}^{K} \sum_{x_i \in P_k} \| \Phi(x_i) - \Phi(y_k) \|^{2} = \sum_{k=1}^{K} \sum_{x_i \in P_k} \left[ K_{MK}(x_i, x_i) - 2 K_{MK}(x_i, y_k) + K_{MK}(y_k, y_k) \right]$ (3)

where $\Phi$ is the feature map induced by the kernel. Taking the two kernels to be the linear kernel $LK_1$ and the quadratic kernel $QK_2$, three hybrid kernels can be created:

$K(a, b) = LK_1(a, b) + QK_2(a, b)$ is a kernel (4)

$K(a, b) = \alpha \, LK_1(a, b)$ is a kernel when $\alpha > 0$ (5)

$K(a, b) = LK_1(a, b) \, QK_2(a, b)$ is a kernel (6)

The K-means method seeks to minimize the objective function (OF) given by:

$OF = \sum_{i=1}^{N} \sum_{j=1}^{C} Z_{ij} \| I_i - \mu_j \|^{2}$ (7)

The basic objective function is changed as follows, where $Z_{ij}$ is the cluster assignment indicator:

$OF = \sum_{i=1}^{N} \sum_{j=1}^{C} Z_{ij} \| I_i - \mu_j \|^{2} = \sum_{i=1}^{N} \sum_{j=1}^{C} \left( 1 - K_{mk}(I_i, \mu_j) \right)$ (9)

where $N$ is the number of data points, $C$ is the number of clusters, $K_{mk}$ is the multiple kernel function, $\mu$ denotes the cluster centre and $I$ represents the input image.

Here $K_{mk}$ denotes the multiple kernel function, as used in multiple kernel fuzzy c-means. The linear and quadratic kernels are presented in this section.

Using the sum kernel of Equation (4) in the kernel-based objective function gives

$K_{mk}(a, b) = LK_1(a, b) + QK_2(a, b)$ (10)

The linear kernel $LK_1$ is given by

$LK_1(a, b) = a^{T} b + c$ (11)

The quadratic kernel $QK_2$ is given by

$QK_2(a, b) = 1 - \frac{\| a - b \|^{2}}{\| a - b \|^{2} + c}$ (12)

where $c$ is a constant. Using the kernel of Equation (5) in the kernel-based objective function gives

$K_{mk}(a, b) = \alpha \, LK_1(a, b)$ (13)

where $\alpha$ is a randomly chosen positive value. Using the kernel of Equation (6) gives

$K_{mk}(a, b) = LK_1(a, b) \, QK_2(a, b)$ (14)

The modified centre update is given by:

$y_k = \frac{\sum_{x_i \in P_k} K_{MK}(x_i, y_k) \, x_i}{\sum_{x_i \in P_k} K_{MK}(x_i, y_k)}$ (15)

After each cluster's centroid has been updated, the distance between each centroid and each data point must be calculated. Every data point is assigned to the cluster centre at the shortest distance. This procedure is repeated until the updated centroid of every cluster is identical in successive iterations.
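The clustering loop described above can be sketched for scalar pixel intensities as follows. This is a hedged illustration rather than the authors' code: it uses the sum kernel of Equation (10) with the linear and quadratic kernels of Equations (11) and (12), the feature-space distance of Equation (3) and the centre update of Equation (15). The function names, the constant `c = 1`, the normalized intensities in [0, 1] and the stopping tolerance are assumptions.

```python
def linear_kernel(a, b, c=1.0):
    return a * b + c  # Equation (11) for scalar inputs


def quadratic_kernel(a, b, c=1.0):
    d2 = (a - b) ** 2
    return 1.0 - d2 / (d2 + c)  # Equation (12)


def hybrid_kernel(a, b):
    return linear_kernel(a, b) + quadratic_kernel(a, b)  # Equation (10)


def feature_distance(x, y, kernel):
    # Squared distance in feature space, the bracketed term of Equation (3).
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)


def mkkmc(values, centroids, kernel=hybrid_kernel, max_iter=100, tol=1e-9):
    """Cluster scalar intensities; returns (labels, centroids)."""
    centroids = list(centroids)
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Step 1: assign each point to the nearest centroid in feature space.
        labels = [
            min(range(len(centroids)),
                key=lambda j: feature_distance(x, centroids[j], kernel))
            for x in values
        ]
        # Step 2: kernel-weighted centre update, Equation (15).
        updated = []
        for j, y in enumerate(centroids):
            members = [x for x, lab in zip(values, labels) if lab == j]
            if not members:
                updated.append(y)  # keep an empty cluster's centre unchanged
                continue
            weights = [kernel(x, y) for x in members]
            updated.append(
                sum(w * x for w, x in zip(weights, members)) / sum(weights)
            )
        converged = all(abs(u - y) < tol for u, y in zip(updated, centroids))
        centroids = updated
        if converged:
            break
    return labels, centroids
```

For a segmentation use case, `values` would hold the flattened pixel intensities of a slice and the returned labels would be reshaped into a mask; with non-negative intensities and `c > 0` the kernel weights stay positive, so each updated centre remains inside the range of its members.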

3.4. Tumour classification

The 3D CNN is essential for extracting features from, processing and classifying 3D data, and it enhances the accuracy of diagnosing brain cancers from MRI images. Its deep learning capabilities enable thorough examination and interpretation of intricate medical imagery. U-Net, in turn, is crucial in image segmentation, mainly when data is scarce. Its design is well suited to precisely delineating complex brain tumour structures by capturing both high-level and fine-grained features.

3.4.1. 3D convolutional neural network

The proposed 3D CNN consists of eight convolutional layers and three fully connected (FC) layers. The approach rests on two major elements for reliable brain tumour categorization. First, unlike a 2D CNN architecture, which works on two-dimensional slices and does not fully exploit the volumetric information in MR images, we use 3D convolutional layers to generate a detailed feature map that incorporates both global and local contextual information. Second, the deep network design leads to higher-quality local optima, and the added non-linearity can yield great discriminative power.

Convolutional neural networks (CNNs) are a type of supervised deep learning technique that has driven great progress in image analysis. The three major layer types of a CNN are convolutional, pooling and fully connected. In convolutional layers, the network creates several feature maps of the MRI by applying various kernels to the input image. This layer strategy greatly reduces the number of parameters in the network and lets the network learn the relationships between neighbouring pixels.

Deeper architectures have already demonstrated their usefulness for natural image categorization because deeper models capture higher-level structures. Their influence on 3D networks could be even more pronounced.

To adapt CNN models to 3D data, all layers must be capable of 3D operations. 3D convolutional layers, the crucial constituent of 3D CNNs, compute the following function:

$h_{i}^{l+1} = \sigma \left( \sum_{j} u_{ji}^{l+1} + b_{i}^{l+1} \right)$ (16)

$u_{ji}^{l+1}(x, y, z) = \sum_{m, n, t} h_{j}^{l}(x + m, y + n, z + t) \cdot W_{ji}^{l+1}(m, n, t)$ (17)

where $u_{ji}^{l+1}(x, y, z)$ is a 3D convolution with the flipped kernel $W_{ji}^{l+1}$, $h_{i}^{l}$ is the $i$-th channel of the $l$-th layer, $b_{i}^{l+1}$ is a bias term and $\sigma$ is the activation function. By stacking 3D convolutional layers and down-sampling layers hierarchically, deep 3D CNNs can extract higher-level 3D features, which are required for tackling complex problems involving volumetric data.
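Equations (16) and (17) can be written out directly with nested loops. The sketch below is for illustration only: the function names are hypothetical, volumes are nested lists indexed as `h[x][y][z]`, no padding or stride is applied ("valid" output size), and ReLU is assumed for the activation $\sigma$.

```python
def conv3d_valid(h, w):
    """Equation (17): u(x, y, z) = sum over (m, n, t) of h(x+m, y+n, z+t) * w(m, n, t)."""
    X, Y, Z = len(h), len(h[0]), len(h[0][0])
    M, N, T = len(w), len(w[0]), len(w[0][0])
    return [
        [
            [
                sum(
                    h[x + m][y + n][z + t] * w[m][n][t]
                    for m in range(M)
                    for n in range(N)
                    for t in range(T)
                )
                for z in range(Z - T + 1)
            ]
            for y in range(Y - N + 1)
        ]
        for x in range(X - M + 1)
    ]


def conv3d_output_channel(inputs, kernels, bias):
    """Equation (16): sum the per-input-channel responses, add the bias, apply ReLU."""
    maps = [conv3d_valid(h_j, w_ji) for h_j, w_ji in zip(inputs, kernels)]
    first = maps[0]
    return [
        [
            [
                max(0.0, sum(u[x][y][z] for u in maps) + bias)
                for z in range(len(first[0][0]))
            ]
            for y in range(len(first[0]))
        ]
        for x in range(len(first))
    ]
```

A 3 × 3 × 3 input convolved with a 2 × 2 × 2 kernel yields a 2 × 2 × 2 output, which shows how each unpadded layer shrinks the volume and why the number of stacked layers is bounded by the input size unless padding is used.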

Moreover, compared with its 2D counterpart, a 3D CNN is computationally and memory intensive because of the large number of learnable parameters. We therefore use only 3 × 3 kernels in each convolutional layer, which are faster to convolve and allow more layers to be stacked with fewer weights. Pooling layers are used to reduce the size of the intermediate layers. Another technique for dealing with memory limits is to use fewer filters per layer, particularly in the first two layers of the network, where the feature maps are largest.

3.4.1.1. Activation function

The activation function is the source of non-linearity in the data transformation. In the suggested model, the rectified linear unit (ReLU) is employed as the activation function, as stated in Equation (18), where $f(i)$ is a neuron's output for an input $i$.

$f(i) = \max(0, i)$ (18)

The derivative of the ReLU can be written as

$f'(z) = \begin{cases} 1, & \text{for } z \geq 0 \\ 0, & \text{for } z < 0 \end{cases}$ (19)

We use ReLU because it trains deep CNNs faster than the traditional sigmoid or hyperbolic tangent functions, the latter given by Equation (20).

$F(i) = \tanh(i)$ (20)
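A short comparison of the two activations and their gradients suggests why ReLU tends to train faster: its gradient stays at 1 for any positive input, whereas the gradient of tanh shrinks towards zero as the input grows. The function names are illustrative; `relu_grad` follows Equation (19) and `tanh_grad` uses the standard identity 1 − tanh²(i).

```python
import math


def relu(i):
    return max(0.0, i)  # Equation (18)


def relu_grad(z):
    return 1.0 if z >= 0 else 0.0  # Equation (19)


def tanh_grad(i):
    return 1.0 - math.tanh(i) ** 2  # derivative of Equation (20)


for value in (-2.0, 0.5, 5.0):
    print(value, relu(value), relu_grad(value), round(tanh_grad(value), 6))
```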

3.4.1.2. Pooling

In a CNN, pooling is a down-sampling operation. Average pooling and max pooling are the two most common types. Average pooling takes into account all elements in a pooling region, including those of low magnitude. Because the mean then includes many zero elements, combining the activation function with average pooling down-weights strong activations. Max pooling was therefore used in this study, because it isolates the most important information for classification, such as tumour edges. Max pooling applies a max filter to non-overlapping sub-regions of the input representation.
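The down-weighting effect of average pooling can be seen on a small feature map. The sketch below is a 2D illustration with hypothetical names (`pool2d`, `mode`) and a 2 × 2 non-overlapping window chosen for readability; the same idea extends to 3D volumes.

```python
def pool2d(fmap, size=2, mode="max"):
    """Pool non-overlapping size x size windows of a 2D feature map."""
    rows, cols = len(fmap), len(fmap[0])
    out = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            window = [
                fmap[r + dr][c + dc]
                for dr in range(size)
                for dc in range(size)
            ]
            if mode == "max":
                out_row.append(max(window))
            else:
                out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


# A sparse post-ReLU map: one strong activation surrounded by zeros.
fmap = [
    [0, 0, 1, 0],
    [0, 9, 0, 0],
    [0, 0, 0, 0],
    [4, 0, 0, 2],
]
print(pool2d(fmap, mode="max"))  # [[9, 1], [4, 2]]
print(pool2d(fmap, mode="avg"))  # [[2.25, 0.25], [1.0, 0.5]]
```

The activation of 9 survives max pooling unchanged but is diluted to 2.25 by averaging over the three zeros around it.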

3.4.1.3. Regularization

To improve generalization and prevent overfitting, we used dropout as a regularizer for the fully connected (FC) layers. Dropout randomly removes nodes from the network at each training step. As a result, nodes across all FC layers must learn better data representations, and co-adaptation is discouraged.
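As a rough illustration, dropout on one layer's activations can be sketched as below. The article does not state the dropout rate or the variant used; the sketch assumes the common "inverted" form, in which surviving activations are rescaled by 1 / (1 − rate) during training so that nothing needs to change at inference time. The function name and signature are hypothetical.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each activation with probability `rate` while training."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)  # inference: pass activations through unchanged
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # seeded only to make the example repeatable
print(dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, rng=rng))
```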

3.4.1.4. Loss function

Categorical cross-entropy is used as the loss function in this study. This function compares the real and predicted distributions using Equation (21), where $y$ and $\hat{y}$ are the target and predicted results, respectively.

$L(y, \hat{y}) = - \sum_{j=0}^{M} \sum_{i=0}^{N} y_{ij} \log(\hat{y}_{ij})$ (21)
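Equation (21) can be evaluated directly for a small batch of one-hot targets. In this sketch the function name and the small `eps` floor, which guards against taking the logarithm of zero, are assumptions added for illustration.

```python
import math


def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Equation (21): minus the sum of y_ij * log(y_hat_ij) over samples and classes."""
    return -sum(
        t * math.log(max(p, eps))
        for row_t, row_p in zip(y_true, y_pred)
        for t, p in zip(row_t, row_p)
    )


targets = [[1, 0], [0, 1]]              # one-hot labels: normal, tumour
confident = [[0.9, 0.1], [0.2, 0.8]]    # mostly correct predictions
uncertain = [[0.5, 0.5], [0.5, 0.5]]    # uninformative predictions
print(categorical_cross_entropy(targets, confident))
print(categorical_cross_entropy(targets, uncertain))
```

The loss is lower for the confident, correct predictions than for the uninformative ones, and it reaches zero only when every target class is predicted with probability 1.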

3.4.2. U-Net architecture

U-Net is a well-known network for medical image analysis. It comprises a contracting path for capturing context and a symmetric expanding path for precise localization. Each path has three convolutional layers, with dropout and pooling. In addition, skip connections link the contracting and expanding paths. Every layer uses 3 × 3 × 3 convolutional kernels. The first convolutional layer has 32 filters, and each subsequent layer has twice as many filters as the preceding one.

We built two 3D networks, one in which the outputs of early layers are concatenated to those of later layers and one without concatenation. The variant with concatenation outperformed the one without by a small margin. The combination of down-sampling and up-sampling convolutional layers, rather than the connection of the former to the latter, appears to be the most important aspect of the U-Net structure.

In each convolution $i$ at level $n$, a local 3 × 3 convolution is computed over the input patches or over the outputs of the previous stage, followed by a ReLU, to obtain the output described in Equation (22).

$l_{i}^{n} = \max \left( 0, \sum_{j} l_{j}^{n-1} * w_{ij}^{n} + b_{i}^{n} \right)$ (22)

The trainable parameters in this case are the weights of the 3D convolution kernels $w_{ij}^{n}$ and the biases $b_{i}^{n}$. Every convolutional layer is followed by a max pooling layer, which calculates the maximum over a 3 × 3 kernel, reducing the dimension of the feature map by a factor of 2.

To obtain class probabilities at the output, we used the widely accepted exponential function, softmax, presented in Equation (23) for pixel-wise decisions.

$p(x)_{i} = \frac{e^{x_i}}{\sum_{j=1}^{k} e^{x_j}}$ (23)

where $k$ is a positive integer, the number of output classes. After applying softmax, every element lies in the range (0, 1) and the elements sum to 1, so that they may be read as probabilities. Softmax approximates the maximum function: $p(x)_{i} \cong 1$ for the index $i$ with the maximum activation and $p(x)_{i} \cong 0$ for the other indices.
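A small numerical example shows both properties of Equation (23): the outputs sum to 1, and the largest activation dominates as the gap between activations grows. Subtracting the maximum before exponentiating is a standard numerical safeguard added here as an assumption; it does not change the result.

```python
import math


def softmax(x):
    """Equation (23), computed in a numerically stable way."""
    m = max(x)  # shifting by the maximum avoids overflow in exp
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


print(softmax([1.0, 1.0]))    # equal activations give equal probabilities
print(softmax([0.0, 20.0]))   # a large gap pushes one probability close to 1
```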

The softmax classifier thus resolves the output vector into bright or dark pixels. The cross-entropy function was used to calculate the cost of the decisions and to sum the overall error:

$E = - \frac{1}{N} \sum_{i=1}^{N} c_{i} \cdot \log(p_{i})$ (24)

As the network became better trained at producing the intended output for the given samples, the cost function approached zero. As specified in Equation (24), the cross-entropy ($E$) measures the discrepancy between the targets ($c$) and the predicted values ($p$).

3.5. WOA

The humpback whale is among the largest of the rorqual whales; an adult may grow to the size of a bus. Krill and schools of small fish are its preferred prey. Humpback whales tend to feed near the surface of the ocean, catching prey with a mechanism known as bubble-net feeding. The Whale Optimization Algorithm (WOA) is inspired by this bubble-net feeding method. It is an adaptive optimization technique that can be applied to a variety of optimization problems. In this method, the hunting behaviour of humpback whales is modelled as a search for the global minimum or maximum. The major goal of the technique is to simulate locating and capturing prey using two basic mechanisms: encircling the prey and attacking it with a bubble net.

The Whale Optimization Algorithm starts from a random initial population of candidate solutions, which is then searched for the best solution. The mathematical model of the humpback whale's bubble-net foraging mechanism is as follows:

$Z(t+1) = \begin{cases} Z^{*}(t) - A D, & p < 0.5 \\ D' e^{bl} \cos(2 \pi l) + Z^{*}(t), & p \geq 0.5 \end{cases}$ (25)

$D' = | C Z^{*}(t) - Z(t) |$ (26)

$A = 2ar - a$ (28)

$C = 2r$ (29)

where $p$ and $r$ are random numbers in the interval [0, 1], $l$ is a random number in the interval [−1, 1], and $a$ decreases from 2 to 0 over the iterations. Here $t$ is the current iteration, $Z^{*}$ is the best solution found so far, $b$ defines the shape of the spiral movement and $D$ is the distance between the $i$-th whale and the best solution.

It is worth noting that when $|A| > 1$ the algorithm explores the search space globally. For this case the update rules are modified to use a randomly selected whale $Z_{rand}$:

$D' = | C Z_{rand}(t) - Z(t) |$ (30)

$Z(t+1) = \begin{cases} Z_{rand}(t) - A D, & p < 0.5 \\ D' e^{bl} \cos(2 \pi l) + Z_{rand}(t), & p \geq 0.5 \end{cases}$ (31)
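The update rules can be put together in a compact sketch of a basic WOA loop. Several details are assumptions taken from the commonly published formulation rather than from this article, because the text leaves them open: the encircling distance is computed as |C·Z_ref − Z|, the spiral distance as |Z* − Z| and always measured from the best whale, $a$ decreases linearly, $r$ is drawn independently for $A$ and $C$, $A$ and $C$ are scalars per whale, and positions are clipped to the search bounds. The names `woa_minimize` and `sphere` are hypothetical.

```python
import math
import random


def sphere(z):
    """Simple test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in z)


def woa_minimize(objective, dim, bounds, n_whales=20, n_iter=100, b=1.0, seed=0):
    """Minimise `objective` with a basic WOA; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    whales = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_whales)]
    best = list(min(whales, key=objective))
    best_val = objective(best)

    for t in range(n_iter):
        a = 2.0 * (1.0 - t / n_iter)  # assumed linear decrease from 2 towards 0
        for i in range(n_whales):
            z = whales[i]
            p = rng.random()
            if p < 0.5:
                A = 2.0 * a * rng.random() - a  # Equation (28)
                C = 2.0 * rng.random()          # Equation (29)
                # |A| < 1: move relative to the best whale, Equation (25);
                # otherwise relative to a random whale, Equation (31).
                ref = best if abs(A) < 1.0 else rng.choice(whales)
                new = [ref[d] - A * abs(C * ref[d] - z[d]) for d in range(dim)]
            else:
                l = rng.uniform(-1.0, 1.0)
                spiral = math.exp(b * l) * math.cos(2.0 * math.pi * l)
                new = [abs(best[d] - z[d]) * spiral + best[d] for d in range(dim)]
            new = [min(hi, max(lo, v)) for v in new]  # stay inside the search box
            whales[i] = new
            val = objective(new)
            if val < best_val:
                best, best_val = list(new), val
    return best, best_val


best, value = woa_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(best, value)
```

In a classifier-tuning setting, `objective` would be replaced by a validation loss evaluated for a candidate set of hyper-parameters or weights, which is far more expensive than the toy function used here.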

3.5.1. Improved whale optimization algorithm

The WOA technique can be enhanced with the help of chaos theory. Chaotic processes have two key characteristics that can be used to improve solution quality and convergence speed: their sensitivity to initial conditions and their random-like behaviour. These traits increase population diversity, making it easier to escape from local optima. This research proposes using a logistic map to enhance the standard whale optimization algorithm. The proposed algorithm is called the improved whale optimization algorithm (IWOA).

There are a variety of approaches for generating chaotic values. This study uses the logistic map:

$p_{k + 1} = \delta p_{k} (1 - p_{k})$ (32)

Here k denotes the iteration number, p0 ∈ [0, 1] is the initial random number, excluding the values 0.25, 0.5 and 0.75, and δ is a control parameter. It can be shown that the map behaves chaotically when δ = 4.
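The logistic map of Equation (32) is simple to iterate, as the following Python sketch shows. The function names `logistic_sequence` and `chaotic_index`, and the idea of scaling a chaotic value to a whale index, are assumptions made for illustration; the text does not specify how the chaotic value is mapped to a whale. The endpoints 0 and 1 are also rejected here because they send the sequence to 0.

```python
def logistic_sequence(p0, n, delta=4.0):
    """Return n successive values of the logistic map started from p0."""
    # 0.25, 0.5 and 0.75 collapse onto fixed points when delta = 4.
    if not 0.0 < p0 < 1.0 or p0 in (0.25, 0.5, 0.75):
        raise ValueError("p0 must lie in (0, 1) and avoid 0.25, 0.5 and 0.75")
    values = []
    p = p0
    for _ in range(n):
        p = delta * p * (1.0 - p)  # Equation (32)
        values.append(p)
    return values


def chaotic_index(p, n_whales):
    """Assumed mapping from a chaotic value in [0, 1] to a whale index."""
    return min(int(p * n_whales), n_whales - 1)


sequence = logistic_sequence(0.3, 5)
indices = [chaotic_index(p, 20) for p in sequence]
```

With δ = 4 the starting value 0.5 maps to 1 and then to 0, while 0.25 maps to the fixed point 0.75, which is why such starting values are excluded.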

This determines which whale should be used in the iterative model for location updates. As a result, the IWOA-based RBNN divides brain abnormalities into four categories, including neoplastic (brain tumour), cerebrovascular and inflammatory illness. Algorithm 1 depicts the overall computational process of the proposed model.

Segmentation, mainly through U-Net, is essential for isolating the region of interest (ROI) within the complex anatomy of the brain, thereby enabling more targeted and effective treatment plans. Accurate segmentation ensures that oncologists and radiologists can distinctly identify the size, shape and location of tumours, which is critical for surgical planning, radiation therapy and monitoring disease progression. On the other hand, optimization strategies like the Improved Whale Optimization Algorithm (IWOA) play a pivotal role in enhancing the performance of these neural networks. Optimization directly impacts the accuracy, speed and efficiency of the algorithms, ensuring that the models perform well on current datasets and are robust enough to handle new and varied data. This combination of precise segmentation and effective optimization improves diagnostic accuracy and contributes significantly to personalized medicine and patient care advancements.

Moreover, the use of 3D CNN and U-Net architectures has been increasingly recognized in recent research for their effectiveness in handling complex medical imaging data. Studies have demonstrated that 3D CNNs, with their ability to process volumetric data, provide a more nuanced and detailed analysis of MRI scans compared to traditional 2D methods. This depth of analysis is crucial in identifying subtle patterns indicative of brain tumours. Furthermore, U-Net's proficiency in image segmentation, as evidenced by numerous medical imaging studies, allows for the precise delineation of tumour boundaries, which is vital for accurate diagnosis and treatment planning. Incorporating these technologies signifies a substantial advancement in medical imaging, offering more reliable and precise tools for healthcare professionals.

4. Experimental evaluation

4.1. Dataset description

The MRI dataset used here is publicly available online and open-source, so anyone can access it for research purposes. It has two labels, yes and no, where yes indicates the presence of a tumour and no indicates its absence. The dataset was acquired from and labelled by radiologists and physicians and has been shared with various researchers. It is composed of 253 brain MR images of different sizes and shapes: 98 normal (no tumour) images and 155 tumour images. The image dimensions are heterogeneous and non-uniform. The MR images are in JPEG and PNG file formats. Figure 2 depicts sample images from the two classes, “no” and “yes.”

Figure 2. Sample image for MRI dataset.

4.2. Performance metrics

The proposed medical image classification takes a variety of medical images as its source. Multiple kernel K-means clustering performs the segmentation, and a hybrid of 3D CNN and U-Net, two well-known network architectures, performs the final classification into healthy and unhealthy images. In this case, the proposed methodology categorizes the images as “no” (non-tumour images) or “yes” (tumour images). The simulation is carried out in MATLAB. Specificity, precision, F-measure, FPR, sensitivity, FNR, accuracy and NPV are used to evaluate the classifier's efficiency.

False positive rate:

It indicates the proportion of non-tumour images that are mistakenly labelled as tumour images. It is assessed using Equation (33): $FPR = \frac{FP}{FP + TN}$ (33)

False negative rate:

It indicates the proportion of tumour images that are mistakenly labelled as non-tumour images. It is calculated using Equation (34): $FNR = \frac{FN}{FN + TP}$ (34)

Positive Predictive Value (PPV):

The PPV is the probability that an individual with a positive test result actually has the condition of interest (B+ | T+). In other words, PPV indicates the proportion of true positives among all positive test results, TP/(TP + FP). $PPV = \frac{TP}{(TP + FP)}$ (35)

Negative Predictive Value (NPV):

The NPV is the probability that an individual with a negative test result does not have the disease (B− | T−). It is expressed as the proportion of true negatives among all negative test results, TN/(TN + FN). $NPV = \frac{TN}{(TN + FN)}$ (36)

Sensitivity:

Sensitivity is the percentage of actual positives that are correctly detected. It reflects a test's ability to detect positive cases. $Sensitivity = \frac{TP}{TP + FN}$ (37) Here TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives, respectively.

Accuracy:

The accuracy of the suggested approach is the ratio of the number of true positives and true negatives to the total number of samples. $Accuracy = \frac{TN + TP}{(TN + TP + FN + FP)}$ (38)

F-measure: The harmonic mean of precision and recall (sensitivity). $F\text{-}Measure = 2 \times \frac{Precision \times Recall}{Precision + Recall}$ (39)

Specificity: The proportion of actual negatives that are correctly recognized, also referred to as the true negative rate. $Specificity = \frac{TN}{(TN + FP)}$ (40)

Precision: The percentage of true positives among predicted positives. $Precision = \frac{TP}{TP + FP}$ (41)
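As a brief illustration, Equations (33) to (41) can all be computed from the four confusion-matrix counts. The function name `classification_metrics` and the example counts below are hypothetical and are not taken from the reported experiments.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the metrics of Equations (33)-(41) from confusion-matrix counts."""
    precision = tp / (tp + fp)   # Equation (41), identical to PPV in (35)
    recall = tp / (tp + fn)      # Equation (37), sensitivity
    return {
        "FPR": fp / (fp + tn),                         # Equation (33)
        "FNR": fn / (fn + tp),                         # Equation (34)
        "PPV": precision,                              # Equation (35)
        "NPV": tn / (tn + fn),                         # Equation (36)
        "sensitivity": recall,                         # Equation (37)
        "accuracy": (tn + tp) / (tn + tp + fn + fp),   # Equation (38)
        "F-measure": 2 * precision * recall / (precision + recall),  # Equation (39)
        "specificity": tn / (tn + fp),                 # Equation (40)
        "precision": precision,                        # Equation (41)
    }


# Hypothetical counts for a binary tumour / no-tumour classifier.
metrics = classification_metrics(tp=90, tn=80, fp=20, fn=10)
```

Note that FPR and specificity always sum to 1, as do FNR and sensitivity, which gives a quick consistency check on any reported set of values.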

Dice similarity coefficient: The DSC is used to assess the similarity between the predicted tumour region and the actual tumour region. $DSC = \frac{2TP}{FP + 2TP + FN}$ (42)

The findings shown in Table 1 and Figure 3 demonstrate the efficacy of the proposed hybrid strategy for classifying brain tumours, particularly when compared to conventional techniques such as FL-SNM, ELM and 2D CNN. The system's effectiveness is measured using numerous performance metrics, notably Accuracy, Sensitivity, F-Measure, G-mean and Dice Similarity Coefficient (DSC).
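The Dice similarity coefficient of Equation (42) can also be evaluated directly on a pair of binary segmentation masks by counting pixels. The sketch below assumes NumPy arrays in which non-zero entries mark tumour pixels; the function name `dice_coefficient`, the toy masks and the convention of returning 1 for two empty masks are assumptions for illustration.

```python
import numpy as np


def dice_coefficient(pred_mask, true_mask):
    """Dice similarity coefficient between two binary masks, Equation (42)."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    tp = int(np.logical_and(pred, true).sum())    # tumour pixels found
    fp = int(np.logical_and(pred, ~true).sum())   # pixels wrongly marked tumour
    fn = int(np.logical_and(~pred, true).sum())   # tumour pixels missed
    denominator = 2 * tp + fp + fn
    # Assumption: two empty masks are treated as a perfect match.
    return 1.0 if denominator == 0 else 2.0 * tp / denominator


# Toy 2 x 3 masks: two overlapping pixels, one false positive, one false negative.
predicted = [[1, 1, 0], [0, 1, 0]]
ground_truth = [[1, 0, 0], [0, 1, 1]]
dsc = dice_coefficient(predicted, ground_truth)
```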

Figure 3. MRI outputs of brain: (a) input image, (b) pre-processed image, (c) data-augmented brain image, (d) segmented output of brain MRI and (e) predicted MRI.

Table 1. Computation of performance measures.

There is a noticeable trend in the performance measurements across the input image quantities (40, 60, 80, 100). As the number of input images increases, most measures improve marginally. Accuracy reaches its highest point at 97% and DSC at 91.68% for a dataset of 100 images. This increase indicates that the model benefits from a larger dataset, which improves its capacity to learn and generalize.

The sensitivity, which represents the proportion of actual positive results, typically exceeds 91%, demonstrating the model's capacity to detect the existence of tumours accurately. The model's Accuracy, which remains above 96.88%, indicates its overall usefulness in accurately categorizing both tumour and non-tumour instances.

The F-Measure, a metric that considers both precision and recall, and the G-mean, a metric that evaluates the balance between sensitivity and specificity, both provide robust outcomes. The maximum F-Measure achieved is 96.48%, while the G-mean attains a value of 96.75%, suggesting a well-balanced classification system.
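The G-mean is not among the metrics defined in Equations (33) to (42). A common definition, assumed here rather than taken from the text, is the geometric mean of sensitivity and specificity:

$G\text{-}mean = \sqrt{Sensitivity \times Specificity}$

For example, with a hypothetical sensitivity of 0.90 and specificity of 0.80, $G\text{-}mean = \sqrt{0.90 \times 0.80} = \sqrt{0.72} \approx 0.849$. Because it is a product, the G-mean drops sharply when either of the two rates is low, which is why it is read as a measure of balance.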

The Dice Similarity Coefficient, essential for evaluating segmentation quality, correlates positively with the number of input images. This suggests that the model's segmentation capacity improves as the dataset size increases. The suggested hybrid strategy outperforms existing techniques like FL-SNM, ELM and 2D CNN. This may be ascribed to its end-to-end methodology, including extensive pre-processing to eliminate noise, data augmentation to enhance generalization and advanced segmentation algorithms. The model's use of 3D CNN and U-Net architecture significantly improves its capacity to extract intricate features and learn efficiently from complex image input.

The findings, shown in Table 2 and visualized in Figures 4 and 5, demonstrate that the proposed brain tumour classification technique is more effective than current approaches such as FL-SNM, ELM and 2D CNN. This comparison is essential for understanding the progress our technique brings to the area.

Figure 4. Performance comparison of proposed versus existing.

Figure 5. Differentiation of precision, recall and f-measure.

Table 2. Comparison of overall performance for proposed vs existing approach.

Our method attains a remarkable accuracy of 98.5%, surpassing FL-SNM by 3.64%, ELM by 6.9% and 2D CNN by 1.48%. The high accuracy rate demonstrates the method's overall effectiveness, primarily because of the incorporation of advanced techniques like 3D CNN and U-Net along with the Improved Whale Optimization Algorithm. These techniques improve the model's ability to identify positive and negative cases accurately.

Moreover, the proposed technique has a sensitivity of 91%, which is lower than FL-SNM and 2D CNN but higher than ELM. By comparison, the specificity is 98.56%, exceeding that of all three compared approaches. The balance between sensitivity and specificity plays a crucial role in medical diagnostics, as it supports both accurate tumour identification and the reduction of false positives, which is vital in a clinical environment.

The precision of our technique is 87.45%, surpassing ELM but falling short of FL-SNM and 2D CNN. However, recall is notably higher than in all three compared approaches, reaching 96%. This high recall rate shows that the approach detects a large proportion of actual positive tumour cases.

The F-Measure, which stands at 96.02%, reinforces the method's robustness, surpassing all three compared approaches by a significant margin. The measure, which is the harmonic mean of precision and recall, highlights the balanced efficacy of our technique in both dimensions.

The investigation of the results, as shown in Table 3 and Figure 6, highlights the effectiveness of our suggested brain tumour classification approach. This system surpasses others in several crucial criteria, demonstrating its comprehensive approach.

Figure 6. Comparison of predicted rates with existing approach.

Table 3. Differentiation of PPV, NPV, FPR, FNR.

PPV: Our approach attains a PPV of 94.89%, signifying a notable level of precision in predicting actual positive instances. This is substantially higher than the ELM approach, which has a lower PPV of 88.23%. The high PPV of our technique may be due to the rigorous data preparation procedure, which helps ensure that high-quality, noise-free data are used for assessment.

NPV: Our technique has a 78% NPV, indicating its ability to detect negative instances accurately. By comparison, the FL-SNM technique has a superior NPV of 89.40%, suggesting areas where our method might be improved.

Our technique has a much higher FPR of 38.05% compared to ELM's lower rate of 10.01%. While there may be an increased occurrence of incorrect positive results, this is counteracted by extensive data augmentation and sophisticated segmentation methods, such as the MKKMC algorithm, which improve the overall accuracy of predictions. The suggested approach demonstrates a FNR of 6.40%, which is much lower than the FNR of 19.71% seen in FL-SNM. The low FNR of our technique demonstrates its excellent capacity to minimize instances of missed diagnoses, which is essential for ensuring successful medical treatment.

Thus, our technique distinguishes itself not just in specific measurements but also in its comprehensive incorporation of sophisticated preprocessing, data augmentation, MKKMC-based segmentation and a hybrid of 3D CNN and U-Net for classification. The IWOA optimizes all of these components. The result is a resilient system that performs well on specific performance measures and provides a complete solution for classifying brain tumours. The method's versatile architecture enhances its efficacy in medical imaging and tumour classification.

Figure 7, in conjunction with the data in Table 4, presents a comprehensive technical comparison of the AUC (Area Under the Curve) values for several classification methods: FL-SNM, ELM, 2D CNN and the PROPOSED model. The outcome demonstrates that the PROPOSED model surpasses the others, achieving an AUC value of 0.93. The obtained result of 0.90 is notably superior to the scores of FL-SNM (0.85), ELM (0.87) and 2D CNN (0.88), suggesting a more robust and dependable ability to classify brain cancers. With the result bar of the PROPOSED model approaching the ideal value of 1, the visual depiction reinforces its superiority. The PROPOSED model's higher AUC value indicates its superior capacity to reliably differentiate between classes, a critical factor in medical diagnostics for reducing false positives and negatives. The comparatively lower AUC values of FL-SNM, ELM and 2D CNN indicate that these models may not be as effective in distinguishing between tumorous and non-tumorous areas in MRI images, although they still demonstrate relatively high performance. Figure 7 and Table 4 provide a visual depiction and quantitative account of the improved diagnostic precision of the suggested strategy.
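For readers unfamiliar with the metric, the AUC can be read as the probability that a randomly chosen tumour image receives a higher classifier score than a randomly chosen non-tumour image. The following Python sketch computes it by direct pairwise comparison. The function name `auc_from_scores`, the labels and the scores are hypothetical and unrelated to the values reported in the text.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for sp in positives:
        for sn in negatives:
            if sp > sn:
                wins += 1.0   # positive ranked above negative
            elif sp == sn:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = tumour, 0 = no tumour) and classifier scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
auc = auc_from_scores(labels, scores)
```

The pairwise form is quadratic in the number of samples, which is acceptable for a dataset of a few hundred images; a rank-based formulation would be preferable for much larger sets.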

Figure 7. Analysis of AUC in the tumour classification process.

Table 4. Comparative analysis of AUC values for different classification methods.

The suggested brain tumour categorization system has significant and diverse theoretical and practical ramifications. This method expands the limits of the processing of medical images and machine learning by using sophisticated algorithms such as 3D CNNs and U-Net, together with novel optimization and segmentation strategies. Besides adding to what is already known about neural computation along with AI in health care, this also establishes new standards for how quickly and accurately medical diagnoses can be made. In practical terms, the technique has substantial ramifications for therapeutic settings. Improved precision, as well as accuracy in tumour categorization, result in an enhanced diagnosis, hence driving the development of better approaches to therapy. The rigorous preliminary processing and data enhancement procedures strengthen the dependability of outcomes, especially when dealing with diverse image quality, making the approach flexible for various medical imaging modalities and patient demographics.

Moreover, the technique has potential for flexibility and generality. Its applicability to many kinds of medical imaging suggests that it could significantly transform the diagnosis and treatment of various disorders. Ultimately, this may lead to better results for patients and improved effectiveness in medical services.

The primary limitations of our work include the variability of the model's performance across a wide range of real-world medical imaging datasets, as well as the need to evaluate and adapt the technique in various clinical environments. Furthermore, the potential for the method to be applied to other forms of medical imaging and to other medical disorders has yet to be extensively investigated and confirmed.

5. Conclusion and future enhancement

Our objective in this work was to deliver a highly efficient, precise and user-friendly automated method for classifying brain tumours. Our technique first applies Multiple Kernel K-means clustering to segment brain tumours accurately in MRI images. This is followed by a classification system that combines the capabilities of 3D CNN and U-Net. This hybrid technique is designed to improve accuracy and greatly reduce error margins. The system employs a supervised learning framework that categorizes MRI results into two groups: tumour or non-tumour. It uses a sequence of feed-forward layers to facilitate efficient learning and prediction. We selected MATLAB as our execution environment because of its strong capability in managing medical imaging data. The performance metrics of our suggested technique are encouraging, with an accuracy rate of 98.5%, specificity rate of 98.56%, sensitivity rate of 91%, recall rate of 96%, precision rate of 87.45% and an F-Measure of 96.02%. These numbers showcase the technique's efficacy and its advantage over current methodologies in brain tumour classification, making it a substantial addition to medical imaging and diagnostics.

In future work, we intend to investigate more sophisticated algorithms to improve the model's precision and effectiveness, particularly when dealing with more varied and extensive datasets. Additionally, we want to examine the integration of our technique with real-time diagnostic tools and evaluate its versatility in different clinical settings. Furthermore, we want to investigate the scalability of our method to other forms of medical imaging and other disorders, which could expand its practicality significantly. This expansion will address the existing constraints and provide a clear trajectory for future progress in the discipline.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

Seetha J, Raja SS. Brain tumor classification using convolutional neural networks. Biomed Pharmacol J. 2018;11(3):1457.
Deepak S, Ameer PM. Brain tumor classification using deep CNN features via transfer learning. Comput Biol Med. 2019;111:103345.
Muhammad K, Khan S, Del Ser J, et al. Deep learning for multigrade brain tumor classification in smart healthcare systems: A prospective survey. IEEE Trans Neural Networks Learn Syst. 2020;32(2):507–522.
Ghassemi N, Shoeibi A, Rouhani M. Deep neural network with generative adversarial networks pre-training for brain tumor classification based on MR images. Biomed Signal Process Control. 2020;57:101678.
Kaplan K, Kaya Y, Kuncan M, et al. Brain tumor classification using modified local binary patterns (LBP) feature extraction methods. Med Hypotheses. 2020;139:109696.
Alqudah AM, Alquraan H, Qasmieh IA, et al. Brain tumor classification using deep learning technique–a comparison between cropped, uncropped, and segmented lesion images with different sizes. arXiv preprint arXiv:2001.08844. 2020.
Ge C, Gu IYH, Jakola AS, et al. Enlarged training dataset by pairwise gans for molecular-based brain tumor classification. IEEE Access. 2020;8:22560–22570.
Sultan HH, Salem NM, Al-Atabany W. Multi-classification of brain tumor images using deep neural network. IEEE Access. 2019;7:69215–69225.
Kaur T, Saini BS, Gupta S. An optimal spectroscopic feature fusion strategy for MR brain tumor classification using Fisher criteria and parameter-free BAT optimization algorithm. Biocybern Biomed Eng. 2018;38(2):409–424.
Shree NV, Kumar TNR. Identification and classification of brain tumor MRI images with feature extraction using DWT and probabilistic neural network. Brain Inform. 2018;5(1):23–30.
Raju AR, Suresh P, Rao RR. Bayesian HCS-based multi-SVNN: a classification approach for brain tumor segmentation and classification using Bayesian fuzzy clustering. Biocybern Biomed Eng. 2018;38(3):646–660.
Pugalenthi R, Rajakumar MP, Ramya J, et al. Evaluation and classification of the brain tumor MRI using machine learning technique. J Control Eng Appl Inform. 2019;21(4):12–21.
Tahir B, Iqbal S, Usman Ghani Khan M, et al. Feature enhancement framework for brain tumor segmentation and classification. Microsc Res Tech. 2019;82(6):803–811.
Anaraki AK, Ayati M, Kazemi F. Magnetic resonance imaging-based brain tumor grades classification and grading via convolutional neural networks and genetic algorithms. Biocybern Biomed Eng. 2019;39(1):63–74.
Narmatha C, Eljack SM, Tuka AARM, et al. A hybrid fuzzy brain-storm optimization algorithm for the classification of brain tumor MRI images. J Ambient Intell Humaniz Comput. 2020;11(8):1–9.
Sharif M, Tanvir U, Munir EU, et al. Brain tumor segmentation and classification by improved binomial thresholding and multi-features selection. J Ambient Intell Humaniz Comput. 2022;8(4):3161–3183.
Khan MA, Lali IU, Rehman A, et al. Brain tumor detection and classification: a framework of marker-based watershed algorithm and multilevel priority features selection. Microsc Res Tech. 2019;82(6):909–922.
Ajai AR, Gopalan S. Analysis of active contours without edge-based segmentation technique for brain tumor classification using SVM and KNN classifiers. In: Advances in communication systems and networks. Singapore: Springer; 2020;65(6):1–10.
Sahoo L, Sarangi L, Dash BR, et al. Detection and classification of brain tumor using magnetic resonance images. In: Advances in electrical control and signal systems. Singapore: Springer; 2020;665:429–441.
David DS, Jayachandran A. Robust classification of brain tumor in MRI images using salient structure descriptor and RBF kernel-SVM. TAGA J Graphic Technol. 2018;14(64):718–737.
Swati ZNK, Zhao Q, Kabir M, et al. Brain tumor classification for MR images using transfer learning and fine-tuning. Comput Med Imaging Graph. 2019;75:34–46.
Amin J, Sharif M, Gul N, et al. Brain tumor classification based on DWT fusion of MRI sequences using convolutional neural network. Pattern Recognit Lett. 2020;129:115–122.
El-Mahelawi JK, Abu-Daqah JU, Abu-Latifa RI, et al. Tumor classification using artificial neural networks. Int J Acad Eng Res (IJAER). 2020;4(11):8–15.
Sajid S, Hussain S, Sarwar A. Brain tumor detection and segmentation in MR images using deep learning. Arab J Sci Eng. 2019;44(11):9249–9261.
Saouli R, Akil M, Kachouri R. Fully automatic brain tumor segmentation using end-to-end incremental deep neural networks in MRI images. Comput Methods Programs Biomed. 2018;166:39–49.
Naz ARS, Naseem U, Razzak I, et al. Deep autoencoder-decoder framework for semantic segmentation of brain tumor. Aust J Intell Inf Process Syst. 2019;15(4):53–60.
Noreen N, Palaniappan S, Qayyum A, et al. A deep learning model based on concatenation approach for the diagnosis of brain tumor. IEEE Access. 2020;8:55135–55144.
