Deep learning-based automated spine fracture type identification with clinically validated GAN-generated CT images

Article: 2295645 | Received 03 Aug 2023, Accepted 11 Dec 2023, Published online: 16 Jan 2024

Abstract

Automatic type identification of sub-axial spine fractures is of prime importance for orthopaedicians because it reduces image interpretation time and increases patient care time. Identifying fracture types, however, is challenging because of imbalanced datasets. In this work, CT scan images of fractured spines were collected from a tertiary care hospital, and an extended Deep Convolutional Generative Adversarial Network (DCGAN) architecture was developed to generate spine fracture images and thereby overcome the imbalanced dataset problem. The enhanced dataset was clinically evaluated with two Visual Turing Tests (VTTs): the first to identify real and generated images, and the second to determine the type of fracture in the generated images. The first VTT demonstrates that the generated fracture images are realistic, so much so that even spine surgeons have difficulty distinguishing them from real ones. The second VTT demonstrates that fracture lines are clearly visible in the generated images. The VTT results were analyzed with the Fleiss kappa statistic to determine the inter-observer reliability of the spine surgeons' clinical evaluation; the results showed high inter-observer agreement for type identification in the generated images. The clinically evaluated generated images were then provided to the proposed ensemble-based type identification model, which outperformed the other models in type identification.

1. Introduction

Deep learning (DL) algorithms are extensively used for medical image classification tasks. In many medical imaging applications, including radiology (Hamamoto Suvarna et al., Citation2020), dermatology (Liu et al., Citation2020), pathology (Valieris et al., Citation2020), ophthalmology (Nazir et al., Citation2020), etc., DL models have demonstrated excellent performance.

In the human body, the spine protects the spinal cord. The spine comprises 33 bones called vertebrae, stacked one above the other from the head to the pelvis and separated by intervertebral disks. Protecting the spinal cord and supporting the body weight are important functions of the spine. The 33 vertebrae are grouped into five sections: the cervical region (C1–C7), followed by the thoracic region (T1–T12), the lumbar region (L1–L5), the sacrum region (S1–S5) and, at the end, the coccyx region (Co1–Co4). Road accidents, falls from heights, violence and sports cause spine fractures. A break or injury to a vertebra can result in bone fragments damaging the nerves of the spine, causing pain, paralysis and sometimes morbidity.

Numerous spine fracture classification systems have been proposed, but only a few are easily replicable. The most commonly used and reproducible system is that of the Arbeitsgemeinschaft für Osteosynthesefragen (AO). According to the AO system, spine fractures are classified morphologically into Types A, B and C, with sub-classifications (Divi et al., Citation2019). Type A indicates a compression injury, Type B a distraction injury and Type C a translation injury. Type A injuries are subdivided into A0–A4, and distraction injuries are subclassified as B1–B3. Type C, the translation injury, is displacement in any direction and has no subclassification. DL-based automatic classification algorithms need large sets of annotated images, but large datasets are a bottleneck in medical applications, and this bottleneck hampers the deployment of automated classification systems.

The GAN has the potential to solve small-dataset problems. A GAN is a generative model comprising two neural networks, a generator and a discriminator, which are trained simultaneously; the generator focuses on synthesizing realistic images, and the discriminator distinguishes synthetic images from real ones (Goodfellow et al., Citation2020). GAN-synthesized realistic images are used for medical image detection, classification, segmentation and reconstruction (Yi et al., Citation2019). The Deep Convolutional Generative Adversarial Network (DCGAN) is a GAN variant that can be used for high-quality medical image synthesis; it uses convolutional layers in the discriminator and transposed convolutional layers in the generator network (Radford et al., Citation2015). The synthetic images generated by a GAN have both local and global features and can therefore be used for augmentation. Notable works that use DCGAN for image synthesis in medical applications include the generation of images of the lungs (Javaid & Lee, Citation2016), lung nodules (Chuquicusma et al., Citation2017), ECG signals (Zhou et al., Citation2021), computed tomography (CT) images of liver lesions (Frid-Adar, Diamant, et al., Citation2018; Frid-Adar, Klang, et al., Citation2018), MR images of the brain (Han et al., Citation2018), images for automatic glaucoma diagnosis (Saeed et al., Citation2021), chest X-rays (Salehinejad et al., Citation2019), MRI brain tumors (Aamir et al., Citation2022; Ghassemi et al., Citation2020), intertrochanteric hip fractures (Urakawa et al., Citation2018), hip fractures (Cheng et al., Citation2019), plant disease images (Qin et al., Citation2020), etc. According to this literature, DCGAN generates realistic images across imaging modalities, expanding the training dataset and improving the generalization ability of classification models. Hence, DCGAN augmentation can improve the performance of a classification model in a medical application.

Assessing the quality of synthetic images produced by a GAN can be difficult. Evaluation metrics suggested in the literature include the inception score, the Fréchet inception distance and the sliced Wasserstein distance. These are mainly useful for comparing different GAN synthesis algorithms, but they are not suitable for judging visual realism. The perceived realism of GAN-generated images can instead be evaluated clinically using techniques such as the Visual Turing Test (VTT). Statistical analysis confirms the results of the clinical evaluation and can also be used to establish the inter-observer reliability of the VTT data.

Because labeled training data are rare, transfer learning models are frequently used for image classification, and they have performed well on medical images. Instead of developing DL models from scratch, the transfer learning technique reuses deep learning models pretrained on ImageNet with some modifications to the top layers (Zhang et al., Citation2022; Gupta & Pal, Citation2022). Transfer learning models are pretrained on source data and then fine-tuned on target data (Saber et al., Citation2021). According to Raghu et al. (Citation2019), a lightweight model built from scratch for a medical application achieved almost the same performance as a transferred model. Compared to a model trained from scratch, the VGG16 used for chest X-ray classification (Choudhary & Hazra, Citation2019) achieved a very good accuracy of 97.81%. Transfer learning from VGG16 has been used for jaw tumor diagnosis (Poedjiastoeti & Suebnukarn, Citation2018), gastrointestinal image classification (Amina et al., Citation2021), pneumonia image classification (Jiang et al., Citation2021) and the detection of breast cancer in ultrasound images (He et al., Citation2015). Transfer learning with ResNet50 has been used for the classification of malaria cell images (Reddy & Juliet, Citation2019) and brain tumor classification (Kumar et al., Citation2023), and recent research (Apostolopoulos & Mpesiana, Citation2020; Zhao et al., Citation2018) has employed VGG16 for similar purposes.
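As a concrete illustration of this transfer learning workflow, the following is a minimal Keras/TensorFlow sketch, assuming a hypothetical three-class (A/B/C) output head; it is not the exact configuration used in the cited studies or in this work.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

NUM_CLASSES = 3  # assumption: one output unit per fracture type (A, B, C)

# Reuse ImageNet-pretrained VGG16 convolutional features; replace only the top layers.
base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the pretrained feature extractor before fine-tuning

x = layers.GlobalAveragePooling2D()(base.output)
x = layers.Dense(256, activation='relu')(x)
outputs = layers.Dense(NUM_CLASSES, activation='softmax')(x)

model = models.Model(base.input, outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# model.fit(train_images, train_labels, ...)  # fine-tune on the target (medical) dataset
```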

An automatic DL-based spine fracture type identification technique is essential for speeding up patient care by reducing the time orthopedic doctors spend interpreting images. The literature makes it clear that no prior work has automatically identified spine fracture types. Spine fracture type identification systems face a significant problem with imbalanced fracture datasets, which can be overcome with the proposed extended DCGAN augmentation. To the best of our knowledge, there has been no research on the clinical evaluation of DCGAN-synthesized spine fracture CT images. The four objectives of this work are as follows:

  • To develop a deep learning-based automated spine fracture type identification system to aid orthopedic surgeons in the early detection of spine fractures, and to observe the effect of augmentation on type identification.

  • To develop an extended DCGAN that generates realistic spine fracture CT images, thereby increasing the dataset size.

  • To clinically evaluate the generated CT images in order to assess their quality.

  • To statistically analyze the clinical evaluation results in order to address inter-observer variability.

2. Methodology

2.1. Data collection

Computed tomography (CT) produces very good images of bone structure in the body. CT images are widely used for spine fracture detection since they provide more precise details of the fracture. The CT images of spine fractures were collected from a tertiary hospital after obtaining ethical clearance from the Institutional Ethics Committee (IEC No. 503/2020). A total of 564 patients with Type A/B/C spine fractures admitted between January 2017 and June 2020 were considered for the study. From each patient, five CT images that clearly showed the fracture line were considered. A total of 2820 images covering the spine fracture sub-types A0 (minor injuries), A1 (wedge compression injuries), A2 (split injuries), A3 (incomplete burst injuries), A4 (complete burst injuries), B1 (chance fractures), B2 (posterior tension band disruption injuries) and C (translation injuries) were used in the model development. Details of the dataset are given in Table 1. Sample spine fracture CT images of the eight sub-types are shown as a grid in Figure 1.

Figure 1. Sample spine fracture CT images.


Table 1. Data set details.

Figure 2 shows the workflow of the clinical assessment of extended DCGAN-generated sub-axial spine fracture images and automatic type identification. The orthopedicians annotated the collected sub-axial spine fracture CT images. The different sub-types of spine fracture were given to the extended DCGAN model for the synthesis of spine fracture CT images. Afterwards, three spine surgeons performed clinical evaluations, and a statistical analysis of the clinical evaluation was performed to support them. Lastly, the impact of augmentation on the classification accuracy of the type identification model was assessed.

Figure 2. Workflow diagram.


2.2. Data augmentation

DCGAN, a variant of the GAN (Goodfellow et al., Citation2020), was proposed in 2015. The DCGAN algorithm incorporates supervised CNN learning with the unsupervised GAN technique. To learn feature representations from input images, GANs need no specific cost function, but they are unstable to train and generate noisy output images. In the DCGAN (Radford et al., Citation2015), the structure of the CNN is modified to increase the speed of convergence as well as the quality of the generated images. The DCGAN discriminator and generator use convolutional and transposed convolutional layers, respectively, which are strategically inserted into the GAN model. The input to the generator (G) is the noise vector (Z), sampled from a Gaussian distribution. The modified DCGAN generator and discriminator architecture for generating different types of spine fractures is shown in Figure 3.

Figure 3. Extended DCGAN architecture.


In the proposed extended DCGAN, additional transposed convolutional layers and convolutional layers are added to the generator and discriminator, respectively, to generate CT images of size 256×256. Seven transposed convolutional layers are used in the generator, successively upsampling the feature maps to 4×4, 8×8, 16×16, 32×32, 64×64, 128×128 and 256×256. Fractional-strided convolutions are used instead of pooling layers. Batch normalization is used to stabilize the training process, and the fully connected layers are removed from the top of the generator. All layers of the generator network use leaky ReLU activation, and the output layer uses the tanh function. The discriminator uses seven convolutional layers, with strided convolutions instead of pooling layers, and leaky ReLU activation. The discriminator output ranges between 0 and 1 and represents the probability that the image is real. For training, a batch size of 10, a learning rate of 0.0002 and 2000 epochs were used. Details of all parameters of the implemented extended DCGAN are shown in Table 2.
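To make the layer arrangement concrete, a minimal Keras sketch of such a generator and discriminator is given below; the latent dimension, filter counts and kernel sizes are illustrative assumptions rather than the exact implementation.

```python
from tensorflow.keras import layers, models, optimizers

LATENT_DIM = 100  # assumption: length of the noise vector Z

def build_generator():
    z = layers.Input(shape=(LATENT_DIM,))
    x = layers.Dense(2 * 2 * 512)(z)          # project the noise and reshape to 2x2 feature maps
    x = layers.Reshape((2, 2, 512))(x)
    # Seven fractional-strided (transposed) convolutions: 4x4 -> 8x8 -> ... -> 256x256
    for filters in (512, 256, 256, 128, 128, 64, 32):
        x = layers.Conv2DTranspose(filters, 4, strides=2, padding='same')(x)
        x = layers.BatchNormalization()(x)
        x = layers.LeakyReLU(0.2)(x)
    out = layers.Conv2D(1, 3, padding='same', activation='tanh')(x)  # 256x256 grayscale CT slice
    return models.Model(z, out, name='generator')

def build_discriminator():
    img = layers.Input(shape=(256, 256, 1))
    x = img
    # Seven strided convolutions replace pooling: 128x128 -> ... -> 2x2
    for filters in (32, 64, 128, 128, 256, 256, 512):
        x = layers.Conv2D(filters, 4, strides=2, padding='same')(x)
        x = layers.LeakyReLU(0.2)(x)
    x = layers.Flatten()(x)
    prob_real = layers.Dense(1, activation='sigmoid')(x)  # probability that the image is real
    return models.Model(img, prob_real, name='discriminator')

# Training settings reported in the text: batch size 10, learning rate 0.0002, 2000 epochs
optimizer = optimizers.Adam(learning_rate=0.0002, beta_1=0.5)
```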

Table 2. Parameters of extended DCGAN.

2.3. Clinical evaluation of generated spine fracture

A clinical assessment using the VTT (Sreedhara & Mocko, Citation2015) was performed after the synthesis of the spine fracture images. Two VTTs were conducted: the first VTT was used to discriminate between the generated and real spine fracture images in a set of images, and the second VTT was used to identify the type of spine fracture in the generated images. Three spine surgeons were involved in the VTTs. For the first VTT, all spine surgeons in the spine clinic received Type A0, A1, A2, A3, A4, B1, B2 and C fracture images to assess how they perceived naturally present and artificially generated spine fractures. The spine surgeons were unaware of one another's evaluations. The first VTT comprised 240 images in total, of which 128 were high-quality generated and 112 were real spine fracture images belonging to the eight fracture sub-types. The spine surgeons were permitted to alter the image's angle of view or to zoom in and out during the VTT. During the evaluation, patient information was not provided; only real and synthetically produced spine fracture CT images were presented. The spine surgeons were informed that a given image grid could include only real images, only generated images, or a combination of both. The second VTT was used to identify the type of spine fracture in the extended DCGAN-generated images. For this test, all spine surgeons were provided with generated Type A, B and C spine fracture images to see how well they identified the type of spine fracture. The spine surgeons were unaware of one another's evaluations. The second VTT comprised a total of 240 high-quality synthetic spine fracture images belonging to the three fracture types (80 per type). A total of fifteen such square grids were given for type identification. The spine surgeons had the option to alter the image's angle of view or to zoom in and out during the VTT.

2.4. Statistical analysis of clinical evaluation

Statistical analysis was used to validate the results of the clinical evaluation. SPSS (Statistical Package for the Social Sciences), version 25.0, was used to perform the statistical analysis of the VTT results. After being coded, the variables were entered into SPSS. Generally, Cohen's kappa (Cohen, Citation1960), weighted kappa (Vanbelle & Albert, Citation2009), Fleiss kappa (L. Statistics, 2015), etc. are used to analyze inter-observer agreement. In the current study, Fleiss kappa was used to analyze inter-observer agreement for 'identification of real and generated images' and for 'type identification'; an illustrative computation of this statistic is sketched after the list below. The clinical evaluation observations met all of the Fleiss kappa assumptions stated below:

  • The data is categorical and nominal.

  • Variables assessed by the spine surgeons are mutually exclusive.

  • The response variable has the same number of categories for each spine surgeon (real-R and generated-G).

  • Three spine surgeons are independent.

  • Randomly selected generated images were used by the spine surgeons for evaluation.
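The sketch below shows how the same Fleiss kappa statistic could be reproduced in Python with statsmodels (SPSS was used in this study); the ratings matrix is a hypothetical stand-in for the surgeons' actual responses.

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Hypothetical ratings: one row per generated image, one column per spine surgeon;
# entries are the assigned category (0 = "real", 1 = "generated").
ratings = np.array([
    [0, 0, 1],
    [1, 1, 1],
    [0, 1, 0],
    [1, 1, 0],
    # ... one row per image shown in the VTT
])

# aggregate_raters converts (subjects x raters) labels into per-category counts,
# which is the input format expected by fleiss_kappa.
table, _ = aggregate_raters(ratings)
kappa = fleiss_kappa(table, method='fleiss')
print(f"Fleiss' kappa: {kappa:.3f}")
```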

2.5. Spine fracture classification

Several models, including InceptionV3, Xception, ResNet50 (He et al., Citation2015), DenseNet121 and VGG16 (Simonyan & Zisserman, Citation2014), were investigated for classifying the different kinds of spinal fractures. The VGG16 and ResNet50 models yielded better results than the other models. Hence, a VGG16 + ResNet50 ensemble model is proposed for type identification.

As shown in Figure 4, we developed an average ensemble model for spine fracture type identification. The averaging ensemble is the most widely used method in the literature for combining the predictions of base learners: the outputs of the base learners are averaged to obtain the ensemble model's final prediction. Because deep learning architectures have high variance and low bias, simply averaging the base learners improves generalization performance by reducing the variance among the models. Generally, the base learners' outputs are averaged either directly or via the 'softmax' function applied to the predicted class probabilities.

Figure 4. Proposed model for spine fracture classification.


The proposed ensemble model comprises the VGG16 and ResNet50 models as base learners, each of which is trained on identical image data. Upon receiving an input image for classification, each network classifies it separately, and the predictions of the models are averaged to produce the final prediction. Here, the 'softmax' function is used to obtain the predicted class probabilities that are averaged, as shown in Equation (1) (Phung & Rhee, Citation2019). Only real data were given as input to the model for type identification; later, both real and generated images were given as input to the same model to observe the effect of the extended DCGAN augmentation on the type identification model.

(1)  \( p_j^i = \mathrm{Softmax}(O_j^i) = \dfrac{\exp(O_j^i)}{\sum_{k=1}^{K} \exp(O_j^k)} \)

where \(p_j^i\) is the outcome probability of the \(i\)th unit in the \(j\)th base learner, \(O_j^i\) is the outcome of the \(i\)th unit in the \(j\)th base learner, and \(K\) is the number of classes.
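A minimal sketch of this averaging step is given below, assuming two trained Keras base learners (vgg16_model and resnet50_model are hypothetical handles) that each output softmax probabilities over the K fracture classes.

```python
import numpy as np

def ensemble_predict(base_learners, images):
    """Average the per-class softmax probabilities of the base learners
    (Equation 1) and return the arg-max class index for each image."""
    probs = np.mean([model.predict(images) for model in base_learners], axis=0)
    return probs.argmax(axis=1)

# Usage with hypothetical handles:
# predictions = ensemble_predict([vgg16_model, resnet50_model], test_images)
```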

2.6. Experimental setup

The Windows 10 operating system was used for all experiments in this study, with Python (version 3.7.6). Every model was developed using the Keras (version 2.1.5) framework with the TensorFlow GPU (version 2.1.0) backend. An Intel(R) Core i5 9th Gen processor, 8 GB of RAM and a 4 GB NVIDIA GeForce GTX 1650 GPU were used.

The extended DCGAN took around 8 h to generate images for each fracture type. The ensemble model took around 10 h for spine fracture type identification.

3. Result

Sample synthetic spine fracture images generated by the extended DCGAN are shown in Figure 5.

Figure 5. Extended DCGAN generated different spine fracture type images.


3.1. Result of clinical evaluation

The bar chart of the spine surgeons' clinical evaluation in the first VTT, for the identification of real and generated images among the given 128 images, is shown in Figure 6.

Figure 6. Result of first VTT (by spine surgeons for identifying real(R) vs generated(G) spine fracture images). R-R: a real image identified as real; R-G: a real image identified as a generated; G-R: a generated image identified as a real; G-G: a generated image identified as a generated image.


The graph shows that, during the evaluation of the 128 generated images, spine surgeon 1 recognized 50 generated spine fracture CT images as real, spine surgeon 2 recognized 35 as real and spine surgeon 3 recognized 60 as real. Using the values given in the bar chart, the prediction accuracies of spine surgeons 1, 2 and 3 were calculated. Figure 7 illustrates the prediction accuracies of spine surgeons 1, 2 and 3 as 43.33%, 61.25% and 45.83%, respectively. The average accuracy of the three surgeons is 50.13%, which indicates that it is difficult for the spine surgeons to distinguish between the generated synthetic spine fracture images and the real spine fracture images. This is further supported by the statistical analysis and indicates that the images generated by the extended DCGAN are useful for augmentation.
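For clarity, the accuracy calculation can be expressed as follows; the counts are illustrative only, not the exact values read from the bar chart.

```python
# Illustrative counts for one surgeon's first-VTT responses (hypothetical numbers):
rr, rg = 60, 52   # real images judged real / judged generated (112 real images in total)
gr, gg = 50, 78   # generated images judged real / judged generated (128 generated in total)

# Prediction accuracy = correctly identified images / all images shown in the first VTT
accuracy = (rr + gg) / (rr + rg + gr + gg)
print(f"prediction accuracy: {accuracy:.2%}")  # 57.50% for these illustrative counts
```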

Figure 7. Accuracy of prediction.


The result of the second VTT is shown in Figure 8. The second VTT experiment was used to identify the different types of spine fracture (Types A, B and C) in the generated images by the three spine surgeons. The pie chart indicates the 'type identification' percentage of the spine surgeons, and different colors indicate the different types of spine fracture. Among the 80 Type A images, 62, 63 and 59 were correctly identified as Type A by spine surgeons 1, 2 and 3, respectively. Similarly, among the 80 Type B images, 68, 49 and 48 images were correctly identified as Type B, and among the 80 Type C images, 64, 62 and 66 images were correctly identified as Type C by spine surgeons 1, 2 and 3, respectively.

Figure 8. The result of second VTT for identifying different types of spine fracture in generated images.


3.2. Result of statistical analysis

The results of the two VTTs were statistically evaluated with the Fleiss kappa method, which assesses the inter-observer agreement among the three spine surgeons. The inter-observer agreement between the spine surgeons was categorized on the basis of the Fleiss kappa scores: a score below 0.00 indicates poor agreement, 0.00–0.20 slight agreement, 0.21–0.40 fair agreement, 0.41–0.60 moderate agreement, 0.61–0.80 substantial agreement and 0.81–1.00 almost perfect agreement (L. Statistics 2015; Sreedhara & Mocko, Citation2015). It is evident from Table 3 that fair inter-observer agreement was observed for the first VTT (k-value 0.336, p value <.0001). The fair inter-observer agreement for the first VTT implies that the spine surgeons experienced difficulty distinguishing the real and generated images among the given 240 images. This signifies that the extended DCGAN generated realistic images that resemble the real images and make it difficult for the spine surgeons to clearly distinguish generated images from real ones. For the second VTT, Fleiss kappa shows 'substantial' per-category inter-observer agreement (k-value 0.723, k-value 0.612 and k-value 0.769, p value <.0001) for the identification of Types A, B and C in the generated images, as shown in Table 3. The substantial inter-observer agreement for the second VTT implies that the generated images clearly show the fracture lines of the A, B and C types of spine fractures. This statistical analysis proves that the extended DCGAN can be used to generate synthetic images of all fracture types.

Table 3. Interobserver agreement between three spine surgeons.

3.3. Result of spine fracture type identification with and without extended DCGAN augmentation

Various models, namely VGG16, ResNet50, DenseNet121, InceptionV3 and Xception, were examined for classifying the different types of spinal fractures. Compared to the other models, VGG16 and ResNet50 provided the best results. Therefore, the VGG16 + ResNet50 ensemble model was proposed for type identification. The proposed model uses the average ensemble technique, which integrates the predictions of the base learners and improves generalization performance by reducing the variance among the deep learning architectures. Hence, the proposed ensemble model outperformed the single base learners in VCF type identification.

The performance of the ensemble model in determining the VCF type improved significantly with the addition of the extended DCGAN-generated images. Details of the accuracy of spine fracture type identification with and without GAN augmentation are given in Table 4. This work illustrates that the extended DCGAN can generate realistic spine fracture images of all types, which increases the size of the training dataset and enhances the generalization ability of the spine fracture type identification models.

Table 4. Accuracy of VGG16, ResNet50 in spine fracture type identification with/without GAN augmentation.

4. Conclusion

The extended DCGAN effectively generates various types of spine fracture images, enhancing the dataset size and aiding deep learning-based automatic type identification of spine fractures. Fair inter-observer agreement was observed for the first VTT, indicating that the generated images resemble the original images. Substantial inter-observer agreement was noted for the identification of the A, B and C fracture types in the generated images, which signifies that the generated images clearly show the fracture lines. Compared to the single base learners, the proposed VGG16 + ResNet50 ensemble model identifies the types of spine fractures more accurately. GAN augmentation enhanced the performance of all the spine fracture type identification models. In future, other GAN-based data augmentation models could be explored to address dataset size issues and compared with the extended DCGAN augmentation for spine fracture type identification.

Ethical approval

This study was approved by the Institutional Ethics Committee. The authors declare that all procedures performed in this study abide by the ethical standards of the institutional research committee.

Acknowledgments

Our sincere thanks to the hospital for providing the data. We also thank the spine surgeons for participating in the Visual Turing Tests.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

Correction Statement

This article has been corrected with minor changes. These changes do not impact the academic content of the article.

Additional information

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Notes on contributors

Sindhura D. N.

Sindhura D N received the B.E. degree from the Jawaharlal Nehru National College of Engineering, VTU, Belgaum, and the Master’s degree in Computer Science and Engineering from Jawaharlal Nehru National College of Engineering, VTU, Belgaum, India. She is currently pursuing the Ph.D. degree in computer vision and deep learning at Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India. Her research interests include image processing and computer vision in healthcare.

Radhika M. Pai

Dr. Radhika M. Pai (Senior Member, IEEE) received the Ph.D. degree from the National Institute of Technology Karnataka, Surathkal, India. She is currently a Professor and the Head of the Department of Data Science and Computer Applications, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India. She has over 31 years of teaching and research experience. She has published 95 papers in national/international journals and conferences and has guided three Ph.D. and several master's theses. She has two granted patents to her credit. Her research interests include data mining, big data analytics, character recognition, sensor networks and e-learning. She was the Principal Investigator for a research grant project and has executed other projects as a co-investigator. She was a recipient of the National Doctoral Fellowship from AICTE, Government of India.

Shyamasunder N. Bhat

Dr. Shyamasunder Bhat N is Professor and Head of Orthopaedics at Manipal Academy of Higher Education, Manipal, India. Since 1999, he has been a faculty member in the Department of Orthopaedics at Kasturba Medical College, Manipal, affiliated to Manipal Academy of Higher Education. After his basic medical degree from the University of Mysore (1995), he continued his postgraduate training and obtained a master's degree in Orthopaedics at Kasturba Medical College, Manipal (1999). He received the Diplomate of National Board in 1999. He is the recipient of the AOTrauma Fellowship (Singapore), AOSpine Fellowship (Hong Kong), Fellowship in Degenerative Spine Surgery (Japan), Visiting Fellowship in Spinal Deformity (Canada), AOTrauma Visit the Expert Fellowship (Germany) and Visiting Fellowship in Endoscopic Spine Surgery (Japan). He has published around 90 articles in national and international indexed journals, authored and co-authored chapters in three books, and has many national and international conference presentations to his credit. His interests are spine surgery, trauma, imaging in spine surgery, and interprofessional education and practice.

Manohara Pai M. M.

Dr. Manohara Pai M. M. (Senior Member, IEEE) received the Ph.D. degree in Computer Science and Engineering. He is currently a Senior Professor with the Department of Information and Communication Technology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India. He has 31 years of rich experience in teaching and research. He holds nine patents and has published 145 papers in national and international journals and conference proceedings. He has published two books and guided six Ph.D. and 85 master's theses. His research interests include data analytics, cloud computing, the IoT, computer networks, mobile computing, scalable video coding and robot motion planning. He is a Life Member of ISTE and of the Systems Society of India, and a Principal Investigator for multiple industry/government research projects. He has been an Executive Committee Member of the IEEE Bangalore Section, Mangalore Subsection, and is the past Chair of the IEEE Mangalore Subsection. He received the National Technical Teachers' Award (NTTA 2022) from the Ministry of Education, Government of India.

References

  • Aamir, M., Rahman, Z., Dayo, Z. A., Abro, W. A., Uddin, M. I., Khan, I., Imran, A. S., Ali, Z., Ishfaq, M., Guan, Y., & Hu, Z. (2022). A deep learning approach for brain tumor classification using MRI images. Computers and Electrical Engineering, 101, 108105. https://doi.org/10.1016/j.compeleceng.2022.108105
  • Amina, B., Nadjia, B., & Azeddine, B. (2021). Gastrointestinal image classification based on vgg16 and transfer learning. In Mohamed Ridda Laouar (Ed.), 2021 International Conference on Information Systems and Advanced Technologies, (ICISAT) (pp. 1–5). IEEE.
  • Apostolopoulos, I. D., & Mpesiana, T. A. (2020). Covid-19: Automatic detection from X-ray images utilizing transfer learning with convolutional neural networks. Physical and Engineering Sciences in Medicine, 43, 635–640. https://doi.org/10.1007/s13246-020-00865-4
  • Cheng, C.-T., Ho, T. Y., Lee, T. Y., Chang, C. C., Chou, C. C., Chen, C. C., Chung, I., & Liao, C. H. (2019). Application of a deep learning algorithm for detection and visualization of hip fractures on plain pelvic radiographs. European Radiology, 29(10), 5469–5477. https://doi.org/10.1007/s00330-019-06167-y
  • Choudhary, P., & Hazra, A. (2019). Chest disease radiography in twofold: Using convolutional neural networks and transfer learning. Evolving Systems. https://www.researcher-app.com/paper/3982729
  • Chuquicusma, M. J. M., Hussein, S., Burt, J., & Bagci, U. (2017). How to fool radiologists with generative adversarial networks: A visual turing test for lung cancer diagnosis. In Proceedings of 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018) (pp. 240–244). IEEE.
  • Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37–46. https://doi.org/10.1177/001316446002000104
  • Divi, S. N., Schroeder, G. D., Oner, F. C., Kandziora, F., Schnake, K. J., Dvorak, M. F., Benneker, L. M., Chapman, J. R., & Vaccaro, A. R. (2019). AOSpine—Spine trauma classification system: The value of modifiers: A narrative review with commentary on evolving descriptive principles. Global Spine Journal, 9(1), 77S–88S. https://doi.org/10.1177/2192568219827260
  • Frid-Adar, M., Klang, E., Amitai, M., Goldberger, J., & Greenspan, H. (2018). Synthetic data augmentation using GAN for improved liver lesion classification. https://ieeexplore.ieee.org/abstract/document/8363576
  • Frid-Adar, M., Diamant, I., Klang, E., Amitai, M., Goldberger, J., & Greenspan, H. (2018). GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification, Neurocomputing, 321, 321–331. https://doi.org/10.1016/j.neucom.2018.09.013
  • Ghassemi, N., Shoeibi, A., & Rouhani, M. (2020). Deep neural network with generative adversarial networks pre-training for brain tumor classification based on MR Images. Biomedical Signal Processing and Control, 57, 101678. https://doi.org/10.1016/j.bspc.2019.101678
  • Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2020). Generative adversarial networks. Communications of the ACM, 63(11), 139–144. https://doi.org/10.1145/3422622
  • Gupta, D., & Pal, B. (2022). Vulnerability analysis and robust training with additive noise for FGSM attack on transfer learning-based brain tumor detection from MRI. In Mohammad Shamsul Arefin, M. Shamim Kaiser, Anirban Bandyopadhyay, Md. Atiqur Rahman Ahad, Kanad Ray (Eds.), Proceedings of the International Conference on Big Data, IoT, and Machine Learning (pp. 103–114). Singapore: Springer.
  • Hamamoto Suvarna, R., Yamada, M., Kobayashi, K., Shinkai, N., Miyake, M., Takahashi, M., Jinnai, S., Shimoyama, R., Sakai, A., Takasawa, K., Bolatkan A., Shozu, K., Dozen, A., Machino, H., Takahashi, S., Asada, K., Komatsu, M., Sese, J., & Kaneko, S. (2020). Application of artificial intelligence technology in oncology: Towards the establishment of precision medicine. Cancers, 12(12), 3532. https://doi.org/10.3390/cancers12123532
  • Han, C., Hayashi, H., Rundo, L., Araki, R., Shimoda, W., Muramatsu, S., Furukawa, Y., Mauri, G., & Nakayama, H. (2018). GAN-based synthetic brain MR image generation. In IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018) (pp. 734–738). IEEE. https://kyushu-u.pure.elsevier.com/en/publications/gan-basedsynthetic-brain-MR-image-generation
  • He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep residual learning for image recognition. arXiv.org. https://arxiv.org/abs/1512.03385v1
  • Javaid, U., & Lee, J. A. (2016). Capturing variabilities from computed tomography images with generative adversarial networks. arXiv:1805.11504 [cs, stat]. https://arxiv.org/abs/1805.11504
  • Jiang, Z.-P., Liu, Y.-Y., Shao, Z.-E., & Huang, K.-W. (2021). An improved VGG16 model for pneumonia image classification. Applied Sciences, 11(23), 11185. https://doi.org/10.3390/app112311185
  • Kumar, S., Choudhary, S., Jain, A., Singh, K., Ahmadian, A., & Bajuri, M. Y. (2023). Brain tumor classification using deep neural network and transfer learning. Brain Topography, 36(3), 1–14. https://doi.org/10.1007/s10548-023-00953-0
  • Liu, Y., Jain, A., Eng, C., Way, D. H., Lee, K., Bui, P., Kanada, K., de Oliveira Marinho, G., Gallegos, J., Gabriele, S., Gupta, V., Singh, N., Natarajan, V., Hofmann-Wellenhof, R. S., Corrado, G. H., Peng, L. R., Webster, D., Ai, D. J., Huang, S., Liu, Y., Dunn, R., & Coz, D. (2020). A deep learning system for differential diagnosis of skin diseases. Nature Medicine, 26(6), 900–908. https://doi.org/10.1038/s41591-020-0842-3
  • L. Statistics, Statistical tutorials and software guides. (2015). https://statistics.laerd.com/
  • Nazir, T., Irtaza, A., Javed, A., Malik, H., Hussain, D., & Naqvi, R. A. (2020). Retinal image analysis for diabetes-based eye disease detection using deep learning. Applied Sciences, 10(18), 6185. https://doi.org/10.3390/app10186185
  • Phung, V. H., & Rhee, E. J. (2019). A high-accuracy model average ensemble of convolutional neural networks for classification of cloud image patches on small datasets. Applied Sciences, 9, 4500. https://doi.org/10.3390/app9214500
  • Poedjiastoeti, W., & Suebnukarn, S. (2018). Application of convolutional neural network in the diagnosis of jaw tumors. Healthcare Informatics Research, 24(3), 236. https://doi.org/10.4258/hir.2018.24.3.236
  • Qin, X.-H, Wang, Z. Y., Yao, J. P., Zhou, Q., Zhao, P. F., Wang, Z. Y., & Huang, L. (2020). Using a one-dimensional convolutional neural network with a conditional generative adversarial network to classify plant electrical signals. Computers and Electronics in Agriculture, 174, 105464. https://doi.org/10.1016/j.compag.2020.105464
  • Radford, A., Metz, L., & Chintala, S. (2015). Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv.org. https://arxiv.org/abs/1511.06434
  • Raghu, M., Zhang, C., Kleinberg, J., & Bengio, S. (2019). Transfusion: Understanding transfer learning for medical imaging. arXiv:1902.07208 [cs, stat]. https://arxiv.org/abs/1902.0720
  • Reddy, A. S. B., & Juliet, D. S. (2019). Transfer learning with ResNet-50 for malaria cell-image classification. https://ieeexplore.ieee.org/abstract/document/8697909
  • Saber, A., Sakr, M., Abo-Seida, O. M., Keshk, A., & Chen, H. (2021). A novel deep-learning model for automatic detection and classification of breast cancer using the transfer-learning technique. IEEE Access, 9, 71194–71209. https://doi.org/10.1109/access.2021.3079204
  • Saeed, A. Q., Sheikh Abdullah, S. N. H., Che-Hamzah, J., & Abdul Ghani, A. T. (2021). Accuracy of using generative adversarial networks for glaucoma detection: Systematic review and bibliometric analysis. Journal of Medical Internet Research, 23(9), e27414. https://doi.org/10.2196/27414
  • Salehinejad, H., Colak, E., Dowdell, T., Barfett, J., & Valaee, S. (2019). Synthesizing chest X-ray pathology for training deep convolutional neural networks. IEEE Transactions on Medical Imaging, 38(5), 1197–1206. https://doi.org/10.1109/tmi.2018.2881415
  • Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv.org. https://arxiv.org/abs/1409.1556
  • Sreedhara, V. S. M., & Mocko, G. (2015). Control of thermoforming process parameters to increase quality of surfaces using pin-based tooling. In International Design Engineering Technical Conferences and Computers and Information in Engineering Conference (Vol. 57113), (p. V004T05A016). American Society of Mechanical Engineers. https://doi.org/10.1115/DETC2015-47682
  • Urakawa, T., Tanaka, Y., Goto, S., Matsuzawa, H., Watanabe, K., & Endo, N. (2018). Detecting intertrochanteric hip fractures with orthopedist-level accuracy using a deep convolutional neural network. Skeletal Radiology, 48(2), 239–244. https://doi.org/10.1007/s00256-018-3016-3
  • Valieris, R., Amaro, L., Osório, C. A. B. D. T., Bueno, A. P., Rosales Mitrowsky, R. A., Carraro, D. M., Nunes, D. N., Dias-Neto, E., & Silva, I. T. D. (2020). Deep learning predicts underlying features on pathology images with therapeutic relevance for breast and gastric cancer. Cancers, 12(12), 3687. https://doi.org/10.3390/cancers12123687
  • Vanbelle, S., & Albert, A. (2009). A note on the linearly weighted kappa coefficient for ordinal scales. Statistical Methodology, 6(2), 157–163. https://doi.org/10.1016/j.stamet.2008.06.001
  • Yi, X., Walia, E., & Babyn, P. (2019). Generative adversarial network in medical imaging: A review. Medical Image Analysis, 58, 101552. https://doi.org/10.1016/j.media.2019.101552
  • Zhang, C., Lei, T., & Chen, P. (2022). Diabetic retinopathy grading by a source free transfer learning approach. Biomedical Signal Processing and Control, 73, 103423. https://doi.org/10.1016/j.bspc.2021.103423
  • Zhao, D., Zhu, D., Lu, J., Luo, Y., & Zhang, G. (2018). Synthetic medical images using F and BGAN for improved lung nodules classification by multi-scale VGG16. Symmetry, 10(10), 519. https://doi.org/10.3390/sym10100519
  • Zhou, Z., Zhai, X., & Tin, C. (2021). Fully automatic electrocardiogram classification system based on generative adversarial network with auxiliary classifier. Expert Systems with Applications, 174, 114809. https://doi.org/10.1016/j.eswa.2021.114809