A regularized volumetric ConvNet based Alzheimer detection using T1-weighted MRI images

Article: 2314872 | Received 16 Oct 2023, Accepted 01 Feb 2024, Published online: 11 Feb 2024

Abstract

Alzheimer’s disease is a gradual neurodegenerative condition affecting the brain, causing a decline in cognitive function by progressively damaging nerve cells over time. While a cure for Alzheimer’s remains elusive, the detection of Alzheimer’s disease (AD) through brain biomarkers is crucial to impede its advancement. High-resolution structural MRI scans, particularly T1-weighted images, are commonly used in Alzheimer’s detection. These images provide detailed information about the brain’s structure, allowing researchers and clinicians to identify abnormalities. Our study employs a deep learning methodology using T1-weighted MRI images for a binary classification task: distinguishing between AD and normal/healthy controls (NC). The volumetric convolutional neural network model is deployed on pre-processed images and validated on the MIRIAD dataset, achieving an accuracy of 97% and surpassing other network models. To address the challenge of limited datasets for deep learning models, we incorporated augmentation techniques such as rotation and rescaling, which further improved model accuracy and yielded effective discrimination between Alzheimer’s disease and normal controls.

1. Introduction

The most dominant type of dementia, Alzheimer’s disease, is caused by the build-up of senile plaques from amyloid beta deposits and neurofibrillary tangles from damaged tau protein. Alzheimer’s disease patients have trouble performing everyday chores and need assistance with daily activities. Speaking to the seriousness of the condition, more than six million Americans currently have Alzheimer’s disease, and by 2050 there will likely be about 14 million cases (Alzheimer’s Association, n.d.). Numerous biomarkers are recognized for diagnosing Alzheimer’s disease, primarily categorized into cerebrospinal fluid proteins, blood and urine tests, and genetic risk profilers. Currently, no data can determine which biomarkers are the most effective at predicting Alzheimer’s disease, but some research indicates that neuroimaging biomarkers offer more predictive power than other biomarkers.

Three different categories of neuroimaging biomarkers have excellent potential for Alzheimer’s disease identification.

  1. Structural imaging (MRI, CT, etc.) traces hippocampal shrinkage or brain volume variations.

  2. Functional imaging (fMRI, PET, etc.) identifies regions with reduced brain cell activity.

  3. Molecular imaging (PET, SPECT, etc.) aids in detecting biological changes.

In this study, T1-weighted MRI images are analyzed for structural brain abnormalities such as ventricle enlargement and shrinkage of the cerebral cortex and hippocampus.

With their inherent ability to handle complex, unstructured, high-dimensional data that previously only human experts could analyze, deep learning approaches have demonstrated exceptional performance in almost all domains, including computer vision, natural language processing, and object classification and detection. Convolutional Neural Networks (ConvNets) have shown breakthrough performance in analyzing numerous images of the same type and capturing the minute differences among them, making them among the best algorithms for image classification and detection tasks. Most researchers have worked either with a 2-dimensional convolutional neural network (2D-ConvNet) or by taking portions of volumetric images in the form of slices, voxels, or patches and applying them to a volumetric convolutional neural network (volumetric-ConvNet).

In this work, we developed a volumetric ConvNet framework applied to complete volumetric 3-D MRI images for Alzheimer’s disease detection. This volumetric ConvNet architecture operates on pre-processed images and extracts high-level features for AD vs. NC detection. Further, augmentation techniques were applied, namely rotation by −5 and +5 degrees and rescaling by factors of 0.9 and 1.1, which improved the accuracy of our model.

2. Literature review

In this section, methodologies and frameworks adopted by different investigators for Alzheimer’s recognition through MRI scans of the brain are discussed. Alzheimer’s detection using biomarkers started with complex conventional methods of handcrafted feature extraction using multiple tools. When Artificial Intelligence (AI) and Machine Learning (ML) gained popularity, this traditional approach was replaced by machine learning classifiers like Lasso feature selection (Zhang & Shen, 2012), Support Vector Machine (SVM), k-Nearest Neighbour (KNN), Naïve-Bayes (Lahmiri & Shmuel, 2019), random forest, multi-kernel SVM (Hao et al., 2019), and others. Though these models achieved good accuracy for Alzheimer’s detection, a lot of manual effort was involved in feature selection and model training. In the last few years, deep learning frameworks like MLP (Qiu et al., 2018), RNN (Lee et al., 2019), LSTM (Li et al., 2020), ConvNet (Punjabi et al., 2019), and others have reduced this complexity and yielded better classification accuracy for Alzheimer’s disease.

Gupta (2013) presented an Alzheimer’s diagnosis model using 4315 MRI scans of 843 subjects. The model employed a sparse autoencoder to learn features from numerous natural images, applied those features in a 2D ConvNet framework with slice-wise feature extraction on the ADNI dataset (ADNI Dataset, n.d.), and attained an accuracy of 94.74% for AD vs. NC using a softmax classifier. The model developed by Jain et al. (2019), using deep learning with ConvNet, converted volumetric T1w-MRI images into 2D, selected the 32 most informative slices, and passed them to a 2D ConvNet. This VGG-16 model was evaluated on 150 subjects, with 4800 slices fed to the architecture, and attained an accuracy of 99.14%. Islam and Zhang (2018) suggested a deep learning architecture, a 2D-ConvNet framework, for 416 subjects collected from the OASIS database (OASIS Dataset, n.d.). To fit volumetric MRI input into this 2D architecture, the framework retrieved the axial, coronal, and sagittal planes in three crops for each image. An ensemble of three deep convolutional networks was then formed, providing multiclass classification with an accuracy of 93.18%.

Hosseini-Asl et al. (2018) proposed a transfer learning approach in which a volumetric convolutional autoencoder was applied to 30 subjects of the CADDementia dataset, and the extracted feature vectors with a volumetric deeply supervised adaptive model were trained on 210 ADNI subjects. Through simulation results, the authors found that their model achieved 99.3% accuracy for AD vs. NC. Korolev et al. (2017) focused on implementing two volumetric-ConvNet architectures: VoxConvNet with the Adam optimizer and a ResNet ConvNet with an entropy momentum optimizer. These architectures accepted full-brain T1w-MRI images of 231 ADNI subjects as voxel intensity values passed as a volumetric tensor of shape 110 × 110 × 110, achieving accuracies of 79% and 80% for Alzheimer’s binary classification.

Oh et al. (2019) suggested a deep learning framework for Alzheimer’s detection using structural MRI scans of 694 ADNI subjects. The authors proposed a hybrid method combining a convolutional autoencoder, an inception module, and transfer learning with a volumetric ConvNet. Through either dimensionality reduction or the inception module, the model was used to classify AD from NC, and this information was then utilized to determine whether mild cognitive impairment (MCI) was stable or progressive. Through simulation results taking multiple voxels as input, the authors achieved 86.60% accuracy for Alzheimer’s detection. Rieke et al. (2018) suggested a model that used a volumetric ConvNet with four convolutional and two dense layers and 969 complete brain structural MRI scans of 344 participants to attain an accuracy of 77% for AD vs. NC. Moreover, the authors applied four visualization methods, namely occlusion, brain area occlusion, sensitivity analysis, and guided backpropagation, to highlight the brain areas affected by Alzheimer’s disease.

Liu et al. (2020) presented an Alzheimer’s recognition model utilizing T1w MRI scans of 449 subjects from the ADNI database. Their multi-task deep convolutional network, a volumetric DenseNet, extracted features from volumetric patches of hippocampus segments along with a multi-task ConvNet and attained 88.9% accuracy for AD vs. NC. Liu et al. (2018) proposed an approach using T1w-MRI and demographic information from the ADNI-1, ADNI-2, MIRIAD (MIRIAD Dataset, n.d.), and AIBL (AIBL Dataset, n.d.) datasets on a total of 1984 subjects, passing 50 anatomical landmarks from pre-processed MRI images to a volumetric ConvNet and classifying disease labels on the MIRIAD dataset with 93.7% accuracy. Related surveys have covered the importance of MRI scans and the role of artificial intelligence, in the form of deep learning and machine learning, in Alzheimer’s detection, asserting that machine learning is taking a backseat because deep learning can automatically extract, select, and classify features. These works provide a thorough analysis of various deep learning frameworks for Alzheimer’s detection employing T1w MRI scans and of the difficulties encountered when creating such systems.

In a recent work, Shukla et al. (2023b) used classic machine learning and ensemble learning models to successfully identify Alzheimer’s disease (AD) and its subtypes. The study also determined the relative effect ratings of distinct cortical and subcortical regions related to Alzheimer’s disease and its subtypes. The experimental inquiry employed two classification methods: binary and multiclass. The ensemble model achieved an outstanding 99% detection accuracy in binary classification, while the random forest model reached 82% accuracy in multiclass categorization. Notably, the research revealed that the right hemisphere’s para-hippocampal and entorhinal regions have the most significant impact during cortical-subcortical analysis.

Similarly, the inferior temporal and isthmus cingulate regions emerged as highly influential in the left hemisphere. In another work, Shukla et al. (2024) fused PET and MRI modalities to improve result visualization. They also used an ensemble model and other machine learning methods to evaluate various subcortical regions, seeking the most important region for Alzheimer’s disease detection relative to its subtypes. The hippocampus, the amygdala in both the left and right hemispheres, and the neuro-region were found to be the most effective in diagnosing Alzheimer’s disease, mild cognitive impairment, and cognitive normality.

3. Materials and methods

In this section, the proposed volumetric convolutional neural network model, its architecture, the dataset, and the different pre-processing methods applied (N4 bias correction, skull-stripping, and rigid registration) are discussed. The ConvNet architecture used for Alzheimer’s prediction is simple and similar to that in Shukla et al. (2023b). The network takes complete T1w MRI images as input, processes them, and produces a binary classification of AD or NC. Besides using convolutional, max-pooling, batch-normalization, fully connected (FC), and dropout layers, the novelty lies in using a global average pooling layer (Lin et al., 2014), thereby replacing one fully connected layer.

The convolution layer is the base of a convolutional neural network. It takes an input volume, which is processed (a volumetric convolution operation is applied) by k filters representing the weights and connections of the network, resulting in a volumetric feature volume. Equation 1 represents the volumetric convolution operation, where $W_{nm}^{l}$ is the volumetric kernel of size $H \times W \times D$ in the $l$-th layer. This volumetric convolution connects the $m$-th input feature volume $F_{m}^{l-1}$ in the previous layer $l-1$ with $F_{n}^{l}$, the $n$-th output feature volume (Zunair et al., 2020).

$$V_{nm}^{l}(h,w,d) = \sum_{i=1}^{H} \sum_{j=1}^{W} \sum_{k=1}^{D} F_{m}^{l-1}(h-i,\, w-j,\, d-k) \times W_{nm}^{l}(i,j,k) \tag{1}$$

where $W_{nm}^{l}(i,j,k)$ is the element-wise value of the volumetric convolution kernel.
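To make the indexing in Equation 1 concrete, the following NumPy fragment computes a valid volumetric convolution of one input feature volume with one kernel. It is an illustrative sketch, not the authors’ implementation; like most deep learning libraries it computes the cross-correlation form, and flipping the kernel recovers Equation 1 exactly.

```python
import numpy as np

def conv3d_single(F_prev: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Valid 3-D convolution of one feature volume with one H x W x D kernel.

    Computes the cross-correlation variant used by deep learning libraries;
    pass K[::-1, ::-1, ::-1] to obtain the flipped-kernel form of Equation 1.
    """
    H, W, D = K.shape
    out = np.zeros((F_prev.shape[0] - H + 1,
                    F_prev.shape[1] - W + 1,
                    F_prev.shape[2] - D + 1))
    for h in range(out.shape[0]):
        for w in range(out.shape[1]):
            for d in range(out.shape[2]):
                # triple sum over i, j, k from Equation 1
                out[h, w, d] = np.sum(F_prev[h:h + H, w:w + W, d:d + D] * K)
    return out
```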

Activation functions are primarily responsible for introducing non-linearity into the convolutional neural network. The activation function $Z^{[l]}$ is applied over the received input feature volume $V^{[l]}$, resulting in the output feature volume $A^{[l]}$ as represented in Equation 2.

$$A^{[l]} = Z^{[l]}(V^{[l]}) \tag{2}$$

The pooling layer lessens the spatial size of the input volume, thereby bringing down the number of parameters in the network and controlling overfitting. The standard pooling functions are average, max, and min. The pooling function $P^{[l]}$ is applied over the output of the activation function $A^{[l]}$, resulting in the output feature volume $T^{[l]}$ as represented in Equation 3.

$$T^{[l]} = P^{[l]}(A^{[l]}) \tag{3}$$

For every mini-batch, the input to a layer is standardized by the batch normalization layer, which subtracts the mini-batch mean and divides by the mini-batch standard deviation. The global average pooling layer performs dimensionality reduction of feature volumes from height × width × depth to 1 × 1 × depth, providing a single-entry vector for each possible object in the classification task. Neurons in FC layers are fully connected to all activations in the preceding layer. The dropout layer randomly drops both hidden and visible units in the network, ensuring that no single node is solely responsible for activating in response to a given pattern. Figure 1 shows the volumetric ConvNet model for Alzheimer’s disease detection. MRI images obtained in MNI152 template space after pre-processing have dimension 91 × 109 × 91, but we resized them to 128 × 128 × 64, as passing these dimensions to the same model yielded a much higher accuracy (100% versus 65.7%). This architecture comprises four convolutional layers, three max-pooling layers, three batch-normalization layers, one global average pooling layer, a single Gaussian dropout layer, and two dense layers, making a fourteen-layer model.

Figure 1. Proposed 3-dimensional convolutional neural network (volumetric-ConvNet) for Alzheimer’s detection.

In the proposed Alzheimer detection model, the first convolutional layer accepts pre-processed T1w-MRI images of size 128 × 128 × 64 and has 32 kernels of size 3 × 3 × 3. The resulting volume is further convolved with 64 kernels of size 3 × 3 × 3 in a convolutional layer with the ReLU activation function, accompanied by a 2 × 2 × 2 max-pooling layer that halves the size of the image, followed by a batch-normalization layer to tackle the internal covariate shift problem. This combination of convolutional layer, max-pooling layer, and batch normalization is repeated twice, with 128 and 256 kernels. Next, the global average pooling layer is applied, the output of which is passed to an FC layer with 512 nodes, the LeakyReLU function (alpha value of 0.01), and an L1 regularizer (value of 0.001). We further used Gaussian dropout with a 30% rate. Lastly, an FC layer with a final sigmoid function performs the classification into the AD or NC class.
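For concreteness, the following Keras sketch assembles the stack described above: 32/64/128/256 kernels, three max-pooling and batch-normalization blocks, global average pooling, an L1-regularized 512-node dense layer with LeakyReLU, Gaussian dropout, and a sigmoid output. It is a minimal reconstruction from the text; padding, strides, and the activation on the first convolution are assumptions, not the authors’ released code.

```python
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

def build_volumetric_convnet(input_shape=(128, 128, 64, 1)):
    inputs = tf.keras.Input(shape=input_shape)
    # first block: 32 then 64 kernels, 2x2x2 max-pooling, batch normalization
    x = layers.Conv3D(32, 3, padding="same", activation="relu")(inputs)
    x = layers.Conv3D(64, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling3D(2)(x)
    x = layers.BatchNormalization()(x)
    # the conv/pool/batch-norm combination repeated with 128 and 256 kernels
    for filters in (128, 256):
        x = layers.Conv3D(filters, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling3D(2)(x)
        x = layers.BatchNormalization()(x)
    x = layers.GlobalAveragePooling3D()(x)               # replaces one FC layer
    x = layers.Dense(512, kernel_regularizer=regularizers.l1(0.001))(x)
    x = layers.LeakyReLU(alpha=0.01)(x)
    x = layers.GaussianDropout(0.3)(x)                   # 30% Gaussian dropout
    outputs = layers.Dense(1, activation="sigmoid")(x)   # AD vs. NC
    return models.Model(inputs, outputs)
```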

T1-weighted MRI images were obtained from the public MIRIAD database (Malone et al., 2013), which contains MRI images of 23 healthy subjects and 46 Alzheimer’s subjects with an average age of 69 years. For each subject, multiple images were collected over periods of 2 weeks to 2 years, and the images of a single subject are treated as separate data points. Hence, a total of 465 MRI images for AD and 243 MRI images for NC are depicted in Table 1.

Table 1. MIRIAD subjects demographic data.

The pre-processing of MRI images helps correct multiple biases that exist in raw images and supports building a better model that can focus on its main binary classification task of Alzheimer’s detection once these biases are removed or corrected. Most prior work, whether based on handcrafted conventional processes, machine learning, deep learning, or other approaches, has employed multiple pre-processing techniques such as skull stripping, MRI bias field correction, rigid or affine registration, AC-PC correction (AC: anterior commissure, PC: posterior commissure), segmentation, spatial normalization, and cerebellum removal. These pre-processing techniques have been performed using different tools and packages such as FreeSurfer, FSL (Jenkinson et al., 2012), the DPARSF toolbox (Chao-Gan & Yu-Feng, 2010), MIPAV software (MIPAV, n.d.), the DARTEL registration method (Ashburner, 2007), the SPM package (Friston et al., 2007), and others. As shown in Figure 2, the pre-processing pipeline for our research on T1w-MRI images includes N4 bias correction, skull stripping, and rigid registration with 6 degrees of freedom.

Figure 2. A pre-processing pipeline is applied on T1w-MRI images to pass these modified images to a proposed volumetric convolutional neural network with multiple dataset approaches for Alzheimer’s detection.

N4 Bias Field Correction: MRI images captured by different scanners suffer from intensity inhomogeneity across the image, which can distort the apparent intensities of white matter and grey matter. To circumvent this issue, bias correction algorithms are applied to MRI images. The N4 bias correction method (Tustison et al., 2010) has been used on the MRI images to remove the bias field, also known as the gain field. This algorithm improves on the nonparametric non-uniform intensity normalization (N3) approach (Sled et al., 1998) by introducing B-spline fitting, which helps remove intensity non-uniformities across the image, thus producing images of homogeneous intensity.

Skull Stripping: The FSL Brain Extraction Tool (BET) (Smith, 2002) extracts brain tissue from MRI images containing the whole head. T1-weighted MRI images usually contain non-brain tissues (neck, eyes, bone, mouth, skin, etc.) along with brain tissue. BET segments MR images into brain and non-brain tissue by applying a clustering technique to separate the voxels. In our pre-processing pipeline, brain extraction is applied with a fractional intensity threshold of 0.285 as the second step, before registration, as BET does not require pre-processed or registered images.

Rigid Registration: FMRIB’s Linear Image Registration Tool (FLIRT) (Jenkinson et al., 2002) registers the T1-weighted MRI images to the MNI152 T1 template (Fonov et al., 2011) by performing small translations and rotations on the skull-stripped image. Accordingly, 6 degrees of freedom and spline interpolation were chosen for our pipeline over the default values.
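A hedged sketch of this three-step pipeline is shown below, using SimpleITK’s N4 implementation and command-line calls to FSL’s bet and flirt. It assumes FSL is installed and on the PATH; the file names, the Otsu mask used for N4, and the template path are placeholders rather than the authors’ exact settings.

```python
import subprocess
import SimpleITK as sitk

def preprocess(t1_path: str) -> None:
    # 1) N4 bias field correction (SimpleITK implementation of N4ITK)
    image = sitk.ReadImage(t1_path, sitk.sitkFloat32)
    mask = sitk.OtsuThreshold(image, 0, 1, 200)   # rough foreground mask
    corrected = sitk.N4BiasFieldCorrection(image, mask)
    sitk.WriteImage(corrected, "t1_n4.nii.gz")

    # 2) Skull stripping with FSL BET, fractional intensity threshold 0.285
    subprocess.run(["bet", "t1_n4.nii.gz", "t1_brain.nii.gz", "-f", "0.285"],
                   check=True)

    # 3) Rigid (6-DOF) registration to the MNI152 T1 template with spline
    #    interpolation; the template file name is an assumption
    subprocess.run(["flirt", "-in", "t1_brain.nii.gz",
                    "-ref", "MNI152_T1_1mm_brain.nii.gz",
                    "-dof", "6", "-interp", "spline",
                    "-out", "t1_mni.nii.gz"], check=True)
```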

4. Results

In this section, the experiments carried out on the different datasets are discussed. AUC-ROC, the confusion matrix, F1-score, accuracy, precision, recall, and other performance metrics are used to evaluate the classification model under four distinct methodologies.

4.1. Augmentation

Limited dataset size is one of the obstacles for deep learning models, as larger datasets generally yield higher model accuracy and help mitigate overfitting. Rotation and scaling (SciPy library) are applied to the original dataset for data augmentation (see the sketch after the list below). Rotation by −5 and +5 degrees produces 1380 additional images (904 Alzheimer’s disease and 476 normal controls). Rescaling by factors of 0.9 and 1.1 adds 1362 images (895 Alzheimer’s disease and 467 normal controls). Thus, these augmentation techniques, together with the original dataset, result in a total of 3432 images (2251 Alzheimer’s disease and 1181 normal controls) (Table 2).

Table 2. MRI image count with and without augmentation.

  1. Original MRI dataset (690 Images)

  2. Original and Scaled Images (2052 Images)

  3. Original and Rotated Images (2070 Images)

  4. Original, Rotated and Scaled Images (3432 Images)
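As referenced above, the following scipy.ndimage sketch illustrates the rotation and rescaling augmentation; the interpolation order, the rotation plane, and the corner-aligned pad/crop used to restore the original shape are assumptions made for illustration, not the authors’ exact settings.

```python
import numpy as np
from scipy import ndimage

def augment(volume: np.ndarray):
    """Yield rotated (-5/+5 degree) and rescaled (0.9x/1.1x) copies of a volume."""
    for angle in (-5, 5):
        # in-plane rotation, keeping the original array shape
        yield ndimage.rotate(volume, angle, axes=(0, 1), reshape=False, order=1)
    for factor in (0.9, 1.1):
        scaled = ndimage.zoom(volume, factor, order=1)
        # pad or crop back to the original shape (corner-aligned for simplicity)
        out = np.zeros_like(volume)
        region = tuple(slice(0, min(s, t))
                       for s, t in zip(scaled.shape, volume.shape))
        out[region] = scaled[region]
        yield out
```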

The experiments are run on a GPU powered by NVIDIA Volta with 640 Tensor Cores, 5120 CUDA cores, 32 GB of memory, and 900 GB/s of memory bandwidth; the NVIDIA Tesla V100 provides the performance of up to 100 CPUs in a single GPU. The deep learning network uses TensorFlow as the backend and Keras as the front end. The Adam optimizer was used for model compilation (0.0001 learning rate). Equation 4 represents the binary cross-entropy loss function for binary classification into AD or NC. An early stopping mechanism is used that tracks validation accuracy with a patience of 100 epochs.

$$\text{Loss} = -\frac{1}{\text{output size}} \sum_{i=1}^{\text{output size}} \left[\, y_i \cdot \log p_i + (1 - y_i) \cdot \log(1 - p_i) \,\right] \tag{4}$$

where $y_i$ is the true label (AD or NC) and $p_i$ is the probability predicted by the volumetric ConvNet model.
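A sketch of this training configuration is given below, reusing the `build_volumetric_convnet` sketch from the architecture section; the dummy tensors and the choice to restore the best weights are illustrative assumptions, not the authors’ exact script.

```python
import tensorflow as tf

model = build_volumetric_convnet()  # architecture sketch from Section 3
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="binary_crossentropy",       # Equation 4
              metrics=["accuracy"])

# early stopping tracking validation accuracy with a patience of 100 epochs
early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_accuracy",
                                              patience=100,
                                              restore_best_weights=True)

# dummy volumes/labels standing in for the pre-processed MIRIAD images
x = tf.random.normal((8, 128, 128, 64, 1))
y = tf.random.uniform((8, 1), maxval=2, dtype=tf.int32)
model.fit(x, y, validation_split=0.25, epochs=100, callbacks=[early_stop])
```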

4.2. Performance evaluation

The effectiveness of the volumetric ConvNet model devised in our work can be evaluated with different performance metrics, including loss, accuracy, precision, recall, F1-score, AUC, the ROC curve, the precision-recall curve, and the micro-average ROC curve. In the medical domain specifically, it is extremely important to identify the wrong predictions on the test dataset and thereby evaluate the model’s performance. By contrasting the actual target values with those predicted by our deep learning model, the confusion matrix provides a comprehensive assessment of the performance of our algorithms. In the confusion matrix (Figure 3), column values are those determined by our deep learning model, and row values reveal the actual AD or NC classes, illustrating the True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN) values.

Figure 3. Formulation of the confusion matrix for Alzheimer’s detection. True Positive: Alzheimer’s patients correctly identified as AD by the model. True Negative: healthy patients correctly identified as NC. False Positive: healthy patients incorrectly identified as AD. False Negative: Alzheimer’s patients incorrectly identified as NC. Accuracy equals the total correct predictions (AD and NC) divided by the total dataset controls. Precision is the ratio of correctly identified Alzheimer’s controls to the total controls (correct or incorrect) classified as Alzheimer’s disease.
$$\text{precision} = \frac{TP}{TP + FP} \tag{5}$$

Sensitivity (recall) measures the model’s performance in correctly identifying Alzheimer’s disease patients.

$$\text{sensitivity} = \frac{TP}{TP + FN} \tag{6}$$

The F-measure is the harmonic mean of precision and sensitivity. Since it considers both FP and FN, it is more informative than accuracy, especially in the medical domain where class imbalance exists.

$$F\text{-measure} = \frac{2 \times \text{precision} \times \text{sensitivity}}{\text{precision} + \text{sensitivity}} \tag{7}$$

The Receiver Operating Characteristic (ROC) curve is the plot of the False Positive Rate (FPR) on the x-axis against the True Positive Rate (TPR) on the y-axis. FPR reflects the model’s ability to avoid classifying negative instances as positive, and TPR measures the model’s ability to capture and correctly identify positive instances. ROC curves summarize performance across possible threshold values: point (0,0) indicates zero true positives and zero false positives, while point (1,1) indicates that all AD cases are recognized but all NC cases are mistakenly classified as AD. Along the diagonal line, equal proportions of AD and NC are correctly classified. Instead of considering the absolute values of the predictions, the Area Under the Curve (AUC) evaluates how well they are ranked; as a result, AUC is scale-invariant, assessing the quality of the model’s predictions irrespective of the chosen classification threshold. An AUC of 0 indicates a classifier performing in exactly the wrong direction, ranking all negatives above all positives, whereas an AUC of 1 shows a classifier that perfectly distinguishes the positive and negative classes.

The precision-sensitivity (PR) curve, drawn with sensitivity on the x-axis and precision on the y-axis, is another effective metric for imbalanced binary classification models. Instead of a single number like accuracy, precision, or sensitivity, it provides a graphical depiction of a classifier’s performance across numerous thresholds. The sketch below shows how these metrics can be computed.
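For reference, the following fragment computes the metrics of Equations 5-7 and the ROC/PR curve points with scikit-learn on toy labels and sigmoid outputs; it illustrates the definitions only and is not the authors’ evaluation code.

```python
import numpy as np
from sklearn.metrics import (auc, confusion_matrix, f1_score,
                             precision_recall_curve, precision_score,
                             recall_score, roc_curve)

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])                    # 1 = AD, 0 = NC
y_prob = np.array([0.9, 0.2, 0.8, 0.6, 0.4, 0.1, 0.7, 0.3])    # sigmoid outputs
y_pred = (y_prob >= 0.5).astype(int)                           # thresholded labels

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()      # Figure 3 entries
precision = precision_score(y_true, y_pred)                    # Equation 5
sensitivity = recall_score(y_true, y_pred)                     # Equation 6
f_measure = f1_score(y_true, y_pred)                           # Equation 7
fpr, tpr, _ = roc_curve(y_true, y_prob)                        # ROC curve points
roc_auc = auc(fpr, tpr)                                        # area under ROC
prec_pts, sens_pts, _ = precision_recall_curve(y_true, y_prob) # PR curve points
```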

a) Original MRI dataset: The volumetric ConvNet model on pre-processed MIRIAD T1w MRI images was trained on 452 AD and 238 NC images, with 482 and 104 samples in training and validation, respectively. The model had 1,313,105 total parameters (1,312,209 trainable and 892 non-trainable), and each epoch took approximately 45 s to execute, for a total execution time of 1 h 15 min. Testing accuracy was 97.12% with a loss of 15.91%, training accuracy was 100% with a loss of 7.5%, and validation accuracy was 96.15% with a loss of 16.6% (Figures 4 and 5; Table 3).

Figure 4. Original MRI dataset model, classification accuracy and loss (epoch count 100).

Figure 5. Confusion matrix for the original dataset model.

Table 3. Precision, sensitivity, and f-measure results for dataset without augmentation.

b) Original and scaled images: The volumetric ConvNet model on pre-processed MIRIAD T1w MRI images was trained on 1347 AD and 705 NC images, with 1443 and 309 samples in training and validation, respectively. The model had 1,313,105 total parameters (1,312,209 trainable and 892 non-trainable), and each epoch took approximately 140 s to execute, for a total execution time of around 4 h. Testing accuracy was 99.67% with a loss of 4.3%, training accuracy was 100% with a loss of 3.4%, and validation accuracy was 99.67% with a loss of 4.2% (Figures 6 and 7; Table 4).

Figure 6. Original and scaled MRI dataset model classification accuracy and loss (epoch count 100).

Figure 7. Confusion matrix for the original and scaled dataset.

Table 4. Precision, sensitivity, and f-measure results for original and scaled MRI dataset.

c) Original and rotated images: The volumetric ConvNet model on pre-processed MIRIAD T1w MRI images was trained on 1356 AD and 714 NC images, with 1450 and 310 samples in training and validation, respectively. The model had 1,313,105 total parameters (1,312,209 trainable and 892 non-trainable), and each epoch took approximately 135 s to execute, for a total execution time of around 3 h 45 min. Testing accuracy was 100% with a loss of 3.5%, training accuracy was 100% with a loss of 2.9%, and validation accuracy was 100% with a loss of 3.34% (Figures 8 and 9; Table 5).

Figure 8. Original and rotated MRI dataset model accuracy and loss (epoch count 100).

Figure 9. Confusion matrix for the original and rotated dataset approach.

Table 5. Precision, sensitivity, and F-measure results for original and rotated MRI dataset.

d) Original, rotated, and scaled images: The volumetric ConvNet model on pre-processed MIRIAD T1w MRI images was trained on 2251 AD and 1181 NC images, with 2403 and 514 samples in training and validation, respectively. The model had 1,313,105 total parameters (1,312,209 trainable and 892 non-trainable), and each epoch took approximately 230 s to execute, for a total execution time of around 6 h 24 min. Testing accuracy was 100% with a loss of 2.4%, training accuracy was 100% with a loss of 2.2%, and validation accuracy was 100% with a loss of 2.4% (Figures 10 and 11; Table 6).

Figure 10. Original, rotated, and scaled dataset model accuracy and loss (epoch count 100).

Figure 11. Confusion matrix for the original, scaled, and rotated dataset approach.

Table 6. Original, rotated, and scaled MRI dataset precision, sensitivity, and f-measure.

Figure 12 presents the receiver operating characteristic curves for the different approaches, and Figure 13 presents the precision-sensitivity curves for the different approaches.

Figure 12. Receiver operating characteristic curves obtained using volumetric ConvNet models with different dataset augmentation approaches for AD identification.

Figure 13. Precision-sensitivity (PR) curves obtained using the volumetric convolutional neural network with different dataset approaches for Alzheimer’s detection.

The micro-average ROC computes the average metric by aggregating the contributions of the AD and NC classes. Figure 14 presents the micro-average ROC curve for the different approaches.

Figure 14. Micro-average ROC curve obtained using the volumetric ConvNet model with different augmentation approaches for Alzheimer’s detection.

5. Discussion

All the networks above, across all experiments with different datasets, were trained independently from scratch using different random weight initializations. Comparing them, the model with original, rotated, and scaled images outperformed the other models, achieving 100% accuracy with a minimum loss of approximately 2.2%. Accuracy and AUC for the different approaches are shown in Table 7.

Table 7. Comparison of classification accuracy and AUC obtained using proposed volumetric convolutional neural network with different augmentation approaches for Alzheimer’s detection.

This work has proposed a volumetric ConvNet learning approach for Alzheimer’s detection. The proposed method takes advantage of multiple augmentation and regularization techniques for enhancing model performance: augmentation techniques like rotation and scaling, and regularization techniques like the global average pooling layer, the L1 regularizer, and Gaussian dropout. To evaluate the benefits of the adopted augmentation and regularization methods, we compared the performance outcomes of four different approaches: (a) the original MRI dataset, (b) the original and scaled dataset, (c) the original and rotated dataset, and (d) the original, scaled, and rotated dataset. As observed in Table 7, the ‘original and rotated’ and ‘original, scaled, and rotated’ datasets show similar performance in terms of accuracy and AUC, with small differences in the loss metrics. Augmentation techniques like rotation and scaling are effective ways of increasing Alzheimer’s detection power; it is therefore natural that combining the original and augmented datasets yields the best performance. Though we had an unbalanced ratio of Alzheimer’s disease to normal controls, our model still showed strong prediction ability, as manifested by multiple performance metrics. In addition, our method takes entire brain MRI scans as input for training, which is one of its biggest advantages in the face of data scarcity. The results are also compared with another work by Shukla et al. (2023a), in which multiple transfer learning-based models were utilized for classification. The best transfer learning model, i.e. VGG16, is used to compare against the results of the proposed method. It is evident from the results that the proposed model is superior to the transfer learning approach.

While these results are satisfactory and a step forward for Alzheimer’s detection using computer-aided diagnostic frameworks, multiple possibilities can still be explored, and the findings can be applied in several directions. First, visualization of the learned features could be developed to increase clinicians’ confidence in the trained model. Second, this model can be applied to the ADNI dataset to test its generalization across datasets. We applied a volumetric subject-level neuroanatomical approach, while other authors have utilized 2D slice, volumetric ROI, and volumetric patch-wise approaches, so extensive analysis is needed on the suitability of these approaches for particular types of models. Next, the progression of AD can be included in the model, resulting in a three-class classification (AD, MCI, or NC) or a four-class classification (AD, pMCI, sMCI, or NC). Multiple modalities can also be fused to examine model performance, such as Amyloid-PET, C-PiB-PET, DTI, APOE4, T2-MRI, and others. Further, model performance can be tuned by passing raw images into the model instead of pre-processed images and by fine-tuning different hyper-parameters.

6. Conclusions

Alzheimer’s disease is a degenerative brain disorder that impairs memory and the ability to perform even routine tasks. Detecting the onset of Alzheimer’s disease is essential because there is currently no cure, and early detection makes it possible to take actions that slow the disease’s progression. This work uses a volumetric ConvNet architecture with pre-processed T1w-MRI images for binary classification of Alzheimer’s disease. We also applied augmentation techniques, namely rotation and scaling, to enlarge the dataset and compared model accuracy across the augmentation settings. The experiments showed that augmentation enhances model accuracy, outperforming the original dataset by achieving 100% accuracy. Although these findings advance computer-assisted diagnosis of Alzheimer’s disease, the research should be extended in other contexts. Further studies may add more modalities to MRI and expand the work to a four-class categorization scheme. The development of visualization techniques for highlighting problematic areas in MRI scans should come next. These advances would be beneficial and would increase doctors’ confidence in computer-assisted disease diagnosis.

Data availability statement

The MIRIAD database provided the data used to prepare this article. The MIRIAD investigators were not involved in the analysis or composition of this report. The MIRIAD dataset is made available through the support of the UK Alzheimer’s Society (Grant RF116). The original data collection was supported by an unrestricted educational grant from GlaxoSmithKline (Grant 6GKC). Data will be made available by the corresponding author upon prior request.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Notes on contributors

Akhilesh Kumar Sharma

Akhilesh Kumar Sharma is the Head of Department, Data Science and Engineering, Manipal University Jaipur, Jaipur, India. He has published over 70 articles in journals and conferences and written books and book chapters. He has six patents and four copyrights to his credit.

References

  • ADNI Dataset. (n.d.). http://adni.loni.usc.edu/.
  • AIBL Dataset. (n.d.). http://adni.loni.usc.edu/category/aibl-study-data/.
  • Alzheimer’s Association. (n.d.). Facts and figures. https://www.alz.org/alzheimers-dementia/facts-figures.
  • Ashburner, J. (2007). A fast diffeomorphic image registration algorithm. Neuroimage, 38(1), 95–113. https://doi.org/10.1016/j.neuroimage.2007.07.007
  • Chao-Gan, Y., & Yu-Feng, Z. (2010). DPARSF: A MATLAB toolbox for ‘pipeline’ data analysis of resting-state fMRI. Frontiers in Systems Neuroscience, 4(May), 13. https://doi.org/10.3389/fnsys.2010.00013
  • Fonov, V., Evans, A. C., Botteron, K., Almli, C. R., McKinstry, R. C., & Collins, D. L. (2011). Unbiased average age-appropriate atlases for pediatric studies. Neuroimage, 54(1), 313–327. https://doi.org/10.1016/j.neuroimage.2010.07.033
  • Friston, K., Ashburner, J., Kiebel, S., Nichols, T., & Penny, W. (2007). Statistical parametric mapping: The analysis of functional brain images. Academic Press.
  • Gupta, A. (2013). Natural image bases to represent neuroimaging data [Paper presentation]. Proceedings of the 30th International Conference on Machine Learning (Vol. 28, pp. 987–994).
  • Hao, X., Bao, Y., Guo, Y., Yu, M., Zhang, D., Risacher, S. L., Saykin, A. J., Yao, X., & Shen, L. (2019). Multi-modal neuroimaging feature selection with consistent metric constraint for diagnosis of Alzheimer’s disease. Medical Image Analysis, 60, 101625. https://doi.org/10.1016/j.media.2019.101625
  • Hosseini-Asl, E., Ghazal, M., Mahmoud, A., Aslantas, A., Shalaby, A. M., Casanova, M. F., Barnes, G. N., Gimel'farb, G., Keynton, R., & El-Baz, A. (2018). Alzheimer’s disease diagnostics by a volumetric deeply supervised adaptable convolutional network. Frontiers in Bioscience, 23(5), 584–596.
  • Islam, J., & Zhang, Y. (2018). Brain MRI analysis for Alzheimer’s disease diagnosis using an ensemble system of deep convolutional neural networks. Brain Informatics, 5(2), 2. https://doi.org/10.1186/s40708-018-0080-3
  • Jain, R., Jain, N., Aggarwal, A., & Hemanth, D. J. (2019). Convolutional neural network based Alzheimer’s disease classification from magnetic resonance brain images. Cognitive Systems Research, 57, 147–159. https://doi.org/10.1016/j.cogsys.2018.12.015
  • Jenkinson, M., Bannister, P., Brady, M., & Smith, S. (2002). Improved optimization for the robust and accurate linear registration and motion correction of brain images. Neuroimage, 17(2), 825–841. https://doi.org/10.1006/nimg.2002.1132
  • Jenkinson, M., Beckmann, C. F., Behrens, T. E. J., Woolrich, M. W., & Smith, S. M. (2012). FSL. Neuroimage, 62(2), 782–790. https://doi.org/10.1016/j.neuroimage.2011.09.015
  • Korolev, S., Safiullin, A., Belyaev, M., & Dodonova, Y. (2017). Residual and plain convolutional neural networks for volumetric brain MRI classification [Paper presentation]. IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017) (pp. 835–838).
  • Lahmiri, S., & Shmuel, A. (2019). Performance of machine learning methods applied to structural MRI and ADAS cognitive scores in diagnosing Alzheimer’s disease. Biomedical Signal Processing and Control, 52, 414–419.
  • Lee, G., Nho, K., Kang, B., Sohn, K.-A., & Kim, D. (2019). Predicting Alzheimer’s disease progression using multi-modal deep learning approach. Scientific Reports, 9, 1952.
  • Li, W., Lin, X., & Chen, X. (2020). Detecting Alzheimer’s disease based on 4D fMRI : An exploration under deep learning framework. Neurocomputing, 388, 280–287. https://doi.org/10.1016/j.neucom.2020.01.053
  • Lin, M., Chen, Q., & Yan, S. (2014). Network in network. ArXiv, 1–10.
  • Liu, M., Li, F., Yan, H., Wang, K., Ma, Y., Shen, L., & Xu, M. (2020). A multi-model deep convolutional neural network for automatic hippocampus segmentation and classification in Alzheimer’s disease. Neuroimage, 208(2019), 116459. https://doi.org/10.1016/j.neuroimage.2019.116459
  • Liu, M., Zhang, J., Adeli, E., & Shen, D. (2018). Joint classification and regression via deep multi-task multi-channel learning for Alzheimer’s disease diagnosis. IEEE Transactions on Biomedical Engineering, 66, 1195–1206.
  • Malone, I. B., Cash, D., Ridgway, G. R., MacManus, D. G., Ourselin, S., Fox, N. C., & Schott, J. M. (2013). NeuroImage MIRIAD—Public Release of a multiple time point Alzheimer’s MR imaging dataset. Neuroimage, 70, 33–36. https://doi.org/10.1016/j.neuroimage.2012.12.044
  • MIPAV. (n.d.). https://mipav.cit.nih.gov/index.php.
  • MIRIAD Dataset. (n.d.). http://miriad.drc.ion.ucl.ac.uk/.
  • OASIS Dataset. (n.d.). https://www.oasis-brains.org/.
  • Oh, K., Chung, Y.-C., Kim, K. W., Kim, W.-S., & Oh, I.-S. (2019). Classification and visualization of Alzheimer’s disease using volumetric convolutional neural network and transfer learning. Scientific Reports, 9(1), 18150. https://doi.org/10.1038/s41598-019-54548-6
  • Punjabi, A., Martersteck, A., Wang, Y., Parrish, T. B., & Katsaggelos, A. K. (2019). Neuroimaging modality fusion in Alzheimer’s classification using convolutional neural networks. PLoS One, 14(12), e0225759. https://doi.org/10.1371/journal.pone.0225759
  • Qiu, S., Chang, G. H., Panagia, M., Gopal, D. M., Au, R., & KolachalAMa, V. B. (2018). Fusion of deep learning models of MRI scans, Mini – Mental State Examination, and logical memory test enhances diagnosis of mild cognitive impairment. Alzheimer’s & Dementia, 10, 737–749. https://doi.org/10.1016/j.dadm.2018.08.013
  • Qu, L., Wu, C., & Zou, L. (2020). Volumetric dense separated convolution module for volumetric medical image analysis. Applied Sciences, 10(2), 485. https://doi.org/10.3390/app10020485
  • Rieke, J., et al. (2018). Visualizing convolutional networks for MRI-based diagnosis of Alzheimer’s disease. Lecture Notes in Computer Science, 2, 24–31.
  • Shukla, A., Tiwari, R., & Tiwari, S. (2023a). Alz-ConvNets for classification of Alzheimer disease using transfer learning approach. SN Computer Science, 4(4), 404. https://doi.org/10.1007/s42979-023-01853-7
  • Shukla, A., Tiwari, R., & Tiwari, S. (2023b). Structural biomarker‐based Alzheimer’s disease detection via ensemble learning techniques. International Journal of Imaging Systems and Technology, 34(1), e22967. https://doi.org/10.1002/ima.22967
  • Shukla, A., Tiwari, R., & Tiwari, S. (2024). Analyzing subcortical structures in Alzheimer’s disease using ensemble learning. Biomedical Signal Processing and Control, 87, 105407. https://doi.org/10.1016/j.bspc.2023.105407
  • Sled, J. G., Zijdenbos, A. P., & Evans, A. C. (1998). A nonparametric method for automatic correction of intensity nonuniformity in MRI data. IEEE Transactions on Medical Imaging, 17(1), 87–97. https://doi.org/10.1109/42.668698
  • Smith, S. M. (2002). Fast robust automated brain extraction. Human Brain Mapping, 17(3), 143–155. https://doi.org/10.1002/hbm.10062
  • Tustison, N. J., Avants, B. B., Cook, P. A., Zheng, Y., Egan, A., Yushkevich, P. A., & Gee, J. C. (2010). N4ITK: Improved N3 bias correction. IEEE Transactions on Medical Imaging, 29(6), 1310–1320. https://doi.org/10.1109/TMI.2010.2046908
  • Zhang, D., & Shen, D. (2012). Multi-modal multi-task learning for joint prediction of multiple regression and classification variables in Alzheimer’s disease. Neuroimage, 59(2), 895–907. https://doi.org/10.1016/j.neuroimage.2011.09.069
  • Zunair, H., Rahman, A., Mohammed, N., & Cohen, J. P. (2020). Uniformizing techniques to process CT scans with volumetric ConvNets for tuberculosis prediction. ArXiv, 1–12.