COMPUTER SCIENCE

Developing a knowledge-based system for diagnosis and treatment recommendation of neonatal diseases

Article: 2153567 | Received 15 Sep 2022, Accepted 28 Nov 2022, Published online: 22 Dec 2022

Abstract

A newborn, or neonate, is an infant in the first 28 days after birth. In Ethiopia, neonatal mortality is a serious problem that accounts for the lion’s share of under-five mortality. Diagnosing and treating neonatal diseases requires specialized medical resources and considerable expert knowledge and experience. Globally, and particularly in low-income countries, there is a shortage of such professionals, which makes diagnosis and treatment more difficult. The goal of this paper is to design a knowledge-based system for the diagnosis and treatment recommendation of neonatal diseases by combining knowledge obtained from machine learning with knowledge from health experts. The design science research approach was employed as the overall research design, and the hybrid data mining process model was used to extract knowledge from the collected clinical dataset. To this end, three classification algorithms in the WEKA tool, namely J48, PART, and JRip, were considered. The partial decision tree (PART) algorithm under 10-fold cross-validation achieved the highest performance, with an accuracy of 98.06%, and its generated rules were used to develop the knowledge-based system. Evaluation results show that the developed prototype achieved 90.9% accuracy in system performance testing and 89.2% in user acceptance testing. In conclusion, the system can serve as an assistant tool for healthcare experts and could be effective if it is implemented.

1. Introduction

The first 28 days of a baby’s life are very important for adapting to new surroundings. This period is characterized by the transition to life outside the womb and by rapid growth and development (Mekonnen et al., 2018). Many newborns die from common conditions, especially in sub-Saharan countries such as Ethiopia; these include premature birth (27%), sepsis (26%), asphyxia (23%), tetanus (7%), and diarrhea (3%) (Gizaw et al., 2014). According to EPHI and ICF (2019), perinatal asphyxia, respiratory distress syndrome (RDS), and sepsis are the three leading causes of neonatal mortality and morbidity. These diseases account for 93% of deaths in the first month of life. According to Ahuja (2019), 2.5 million children worldwide die in the first month of life, which is around 7,000 neonatal deaths every day. The 2019 Ethiopia Demographic and Health Survey (EDHS) found that neonatal mortality is approximately 30 per 1000 live births (EPHI and ICF, 2019).

Evidence also shows significant differences in neonatal death rates across regions of Ethiopia, ranging from 21 to 62 per 1000 live births. According to Tefera (2015), the ratio of healthcare workers to the population in Ethiopia is approximately 84 per 1000 people. Information technologies are increasingly used in healthcare institutions to support healthcare experts and enhance clinicians’ decision-making. Knowledge-based systems (KBS) are one of the promising and ambitious areas of artificial intelligence for improving healthcare service delivery (Ahuja, 2019). Machine learning algorithms are now used effectively in many areas, including the health sector, to draw insights from large medical datasets and accelerate medical research.

This study designed a knowledge-based system (KBS) that combines automatic knowledge acquisition (using machine learning) with manual knowledge acquisition (interviews with health experts) for the diagnosis and treatment recommendation of newborn diseases. The KBS assists health management in providing high-quality healthcare services and helps to achieve the Sustainable Development Goal of reducing newborn mortality to less than 12 per 1000 live births by 2030 (O, 2021).

The primary contribution of this paper is an innovative knowledge-based system artifact in the form of a model. Moreover, the developed prototype could serve as a model for the development of related systems by future authors who follow design science research. The paper also aims to contribute to healthcare systems by enhancing the effectiveness and efficiency of service delivery, improving access to and quality of care, and supporting healthcare professionals and the country’s existing policies.

The rest of the paper is organized as follows. Section 2 explains the related work carried out by other researchers. Methodology and pre-processing are discussed in Section 3. Experimentation and implementation are discussed in Section 4. The conclusion and scope for future work are stated in Section 5.

2. Related works

This section discusses related work on methods and techniques for the diagnosis and treatment of infant diseases. Many authors around the world have designed knowledge-based systems for different childhood diseases using a variety of techniques and approaches.

Tefera (2015) developed a user-friendly knowledge-based system for the diagnosis and treatment of patients with pneumonia. The study applied both implicit and codified knowledge. The prototype system achieved 83.33% accuracy in performance testing and 90.33% in user acceptance evaluation by healthcare workers. Adugnaw (2019) designed a knowledge-based system to identify the nutritional status of children and to support the diagnosis and treatment of undernourished children. Both structured and unstructured interviews were used to extract the knowledge. The study employed rule-based reasoning with a backward reasoning method, and the prototype was developed in SWI-Prolog. The prototype scored 85% in system performance testing and 86.2% in user acceptance testing.

Safdari et al. (2016) developed a fuzzy expert system to predict the risk of neonatal death in Iran. A questionnaire was distributed to neonatologists to acquire knowledge, and the most significant factors were determined from their responses. The accuracy, sensitivity, and specificity of the proposed model were 90%, 83%, and 97%, respectively. Desalegn (2011) designed a predictive model for low birth weight to support healthcare workers, using machine learning techniques. A dataset of 9861 instances was used for the experiments, with the J48 decision tree and PART rule induction algorithms. J48 achieved the best performance, with 94.7% accuracy.

To the best of the authors’ knowledge, no previous work has designed a knowledge-based system for the diagnosis and treatment recommendation of newborn diseases among neonates admitted to the neonatal intensive care unit. This paper proposes a novel knowledge-based system that combines a predictive model with the knowledge of health experts to assist healthcare workers in the diagnosis and treatment management of neonatal diseases. The predictive model classifies newborn disease as sepsis, perinatal asphyxia, or respiratory distress syndrome. The classifier was trained and tested on a clinical dataset collected from Gandhi Memorial Hospital in Addis Ababa, Ethiopia.

3. Methodology and data preprocessing

In this paper, Design Science Research (DSR) was applied as the general research methodology to design a knowledge-based system. This problem-solving approach has helped engineers and computer scientists improve their work and extend the boundaries of human capabilities by creating new and innovative artifacts, and it draws on both relevance (the environment) and rigor (the knowledge base) (Peffers et al., 2008). According to Peffers et al. (2008), the design science research methodology (DSRM) process model consists of six activities: problem identification and motivation, definition of the objectives for a solution, design and development, demonstration, evaluation, and communication.

3.1 Design and development

This activity is defined as “designing an artifact.” It includes determining the desired functionality of the proposed artifact and then creating the model based on the planned solution. Figure 1 illustrates the framework of the proposed system (Ramírez-Gallego et al., 2015).

Figure 1. Framework of the proposed system.

3.2 Automatic knowledge acquisition

A critical problem in developing expert systems is acquiring the necessary knowledge from health workers. To overcome the challenges of traditional knowledge gathering and to enrich the knowledge-based system, automatic knowledge acquisition is needed. The volume of data stored in the healthcare sector is growing rapidly, and machine learning is a well-suited way to extract relevant knowledge from these large datasets.

3.3 Manual knowledge acquisition

Both structured and unstructured interviews were employed to obtain implicit knowledge from medical experts. The authors interviewed health workers to gain knowledge and recommendations for improving existing practices and addressing existing problems. Purposive sampling was used to select the medical experts, based on the authors’ judgment, to gather the implicit knowledge. The acquired knowledge was then modeled using a decision tree and represented using the rule-based knowledge representation approach.

3.4 Knowledge-based system

After the knowledge was acquired from the machine learning algorithm and the medical experts, the next task was to build the KBS for neonatal disease diagnosis and treatment management. The designed KBS consists of four main components: knowledge base, explanation facility, inference engine, and user interface. Two programming languages were combined to build the prototype: SWI-Prolog version 8.2.3 was used to construct the knowledge base, and Java (NetBeans IDE 8.2) with the JPL package was used to design the user interface.

3.5 Dataset description

To extract the hidden knowledge, a clinical dataset was collected from Gandhi Memorial Hospital (GMH) in Addis Ababa, Ethiopia. A total of 2372 instances were collected, each with 16 features and one of three class labels. Table 1 describes the features of the newborn disease dataset.

Table 1. Description of newborn disease dataset features

3.5 Data preprocessing

Preprocessing the dataset is a key step in machine learning for healthcare applications; even a minor improvement in data quality can significantly increase the validity and quality of the discovered knowledge (Hailemariam, 2012). To improve the performance of the predictive model, several preprocessing techniques were applied: imputation (filling) of missing values, feature selection, and data transformation.

3.6 Missing value imputation

Missing value handling is a data preprocessing technique used here to obtain a newborn clinical disease dataset without missing values. Missing values can degrade the performance of the predictive model and make it difficult to extract useful knowledge from clinical datasets (Mulugeta & Beshah, 2015). In this paper, missing values of each numerical feature were imputed with the feature’s mean, and missing values of nominal features were imputed with the mode (the value with the highest frequency). Table 2 presents the missing value imputation.

Table 2. Missing value imputation
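The mean and mode imputation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `impute_missing`, the feature names `weight_g` and `crp`, and the sample records are hypothetical and are not taken from the study’s dataset, which was processed in WEKA.

```python
from collections import Counter
from statistics import mean


def impute_missing(rows, numeric_cols, nominal_cols, missing=None):
    """Fill missing numeric values with the column mean and nominal ones with the mode."""
    fill = {}
    for col in numeric_cols:
        observed = [r[col] for r in rows if r[col] is not missing]
        fill[col] = mean(observed)  # mean of the observed values only
    for col in nominal_cols:
        observed = [r[col] for r in rows if r[col] is not missing]
        fill[col] = Counter(observed).most_common(1)[0][0]  # most frequent value
    # Return new records; the input rows are left unchanged.
    return [
        {c: (fill[c] if v is missing and c in fill else v) for c, v in r.items()}
        for r in rows
    ]


# Hypothetical records: names and values are illustrative only.
records = [
    {"weight_g": 2500, "crp": "normal"},
    {"weight_g": None, "crp": "high"},
    {"weight_g": 3100, "crp": None},
    {"weight_g": 2800, "crp": "normal"},
]
completed = impute_missing(records, ["weight_g"], ["crp"])
```

In this toy example the missing weight is replaced by the mean of the three observed weights, and the missing CRP value by the most frequent observed category.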

3.7 Data transformation

In data transformation, data are translated or consolidated into forms suitable for mining. In this paper, data discretization techniques and healthcare workers’ advice were used to transform the distinct values of the features. Discretization divides continuous data into categories; it is one of the most powerful preprocessing activities for handling numerical features in machine learning (Ramírez-Gallego et al., 2015). It can improve both the readability of the data for professionals and the predictive performance of the induced model. Table 3 describes the values of the transformed attributes.

Table 3. Discretization of the features
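As a minimal sketch of discretization, the snippet below maps a continuous value to a category using fixed cut points. The helper `discretize` and the cut points are assumptions for illustration: gestational age is binned with the commonly used clinical boundaries of 37 and 42 weeks, which may differ from the bins the authors agreed with the healthcare workers.

```python
from bisect import bisect_right


def discretize(value, cut_points, labels):
    """Map a continuous value to a label.

    cut_points must be sorted ascending; a value equal to a cut point
    falls into the bin that starts at that cut point.
    """
    return labels[bisect_right(cut_points, value)]


# Assumed bins for gestational age in weeks (illustrative, not from Table 3).
GA_CUTS = [37, 42]
GA_LABELS = ["pre-term", "term", "post-term"]

for weeks in (34, 37, 40.5, 42):
    print(weeks, "->", discretize(weeks, GA_CUTS, GA_LABELS))
```

Keeping the cut points and labels in plain lists makes it easy to review the bins with domain experts before applying them to the whole dataset.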

3.8 Feature selection

Feature selection is one of the important phases of preprocessing in machine learning; it selects the best feature subset from the original features. In this paper, the Information Gain Ratio measure was applied to rank the features. The Gain Ratio strategy leads to better generalization (less overfitting) of decision tree models. It attempts to lessen the bias of Information Gain toward highly branched predictors by introducing a normalizing term called Intrinsic Information. Intrinsic Information is the entropy of the distribution of instances into branches, i.e., how much information is needed to tell which branch an instance belongs to (Bhatt, 2012).

$$\text{Intrinsic Information} = -\sum_{j=1}^{n} \frac{N_j}{N} \log_2 \frac{N_j}{N} \tag{1}$$

The Gain Ratio is defined as:

$$\text{Gain Ratio} = \frac{\text{Information Gain}}{\text{Intrinsic Information}} \tag{2}$$
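Equations (1) and (2) can be illustrated with a short Python sketch. It reads N_j as the number of instances falling into branch j of a nominal feature, N as the total number of instances, and n as the number of branches. The function names and the toy data are assumptions for illustration; the study itself used WEKA’s evaluator rather than code like this.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Information Gain of a nominal feature divided by its Intrinsic Information."""
    n = len(labels)
    branches = {}
    for f, y in zip(feature, labels):
        branches.setdefault(f, []).append(y)
    # Expected class entropy after splitting on the feature.
    remainder = sum(len(ys) / n * entropy(ys) for ys in branches.values())
    info_gain = entropy(labels) - remainder
    intrinsic = entropy(feature)  # Equation (1): entropy of the branch sizes
    return info_gain / intrinsic if intrinsic > 0 else 0.0


# Toy data (hypothetical): one feature separates the classes, the other does not.
labels = ["RDS", "RDS", "Sepsis", "Sepsis"]
informative = ["yes", "yes", "no", "no"]
uninformative = ["a", "b", "a", "b"]
print(gain_ratio(informative, labels), gain_ratio(uninformative, labels))
```

A feature with a single value has zero Intrinsic Information, so the sketch returns 0.0 in that case instead of dividing by zero.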

To rank and select the most determinant features for designing the model, the authors used the “GainRatioAttributeEval” feature evaluator with the “Ranker” search method, together with the health workers’ advice. The features ranked by the Gain Ratio evaluator are presented in Table 4.

Table 4. Ranked features based on gain ratio value

Based on the health workers’ advice, the reflexes, temperature, crying, and vomiting features were not considered among the most determinant features. Hence, the authors set a minimum threshold value of 0.09 and retained 12 features as the most determinant factors in predicting common newborn diseases.

3.9 Developing model

To design a novel knowledge-based system for neonatal disease diagnosis, machine learning algorithms were applied to construct a predictive model. Decision tree and rule induction algorithms, namely J48, PART, and JRip, were employed, and their results were compared to select the rules for developing the KBS.

J48 Algorithm: J48 is WEKA’s implementation of the C4.5 decision tree algorithm. It uses a divide-and-conquer approach and builds a decision tree recursively with a greedy strategy. The tree consists of a root node, internal nodes, and leaf nodes. Each internal node tests a feature, and following the outcomes of these tests from the root leads to a leaf that gives the predicted class (Kaur & Chhabra, 2014).

JRip Rule classifier: JRip implements Repeated Incremental Pruning to Produce Error Reduction (RIPPER). It is a rule-based learner that predicts classes using propositional rules. JRip learns “IF-THEN” rules quickly, and its high-level, symbolic knowledge representation makes the discovered knowledge easier to interpret (Lehr et al., 2011).

Partial Decision Tree (PART): PART is a separate-and-conquer rule learning strategy. It produces ordered sets of rules called decision lists. A new instance is compared to each rule in the decision list in order and is assigned the class of the first matching rule. In each iteration, PART builds a pruned partial decision tree using the C4.5 approach and translates the best leaf into a rule (Lehr et al., 2011).

3.10 Performance evaluation metrics

In this paper, the classifier models were evaluated by comparing metrics derived from their confusion matrices: accuracy, True Positive Rate (TPR), False Positive Rate (FPR), Precision, and F-Measure. To estimate the performance of the machine learning models, 10-fold cross-validation (CV) and hold-out validation were used.

True Positive (TP) is the number of positive instances that are correctly classified as positive. False Positive (FP) is the number of instances that are predicted as positive but actually belong to the negative class. True Negative (TN) is the number of negative instances that are correctly classified as negative. False Negative (FN) is the number of instances that are predicted as negative but actually belong to the positive class. Precision measures the proportion of instances classified as positive that are actually positive. Accuracy, TPR, FPR, Precision, and F-Measure are defined in Equations (3)–(7).

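$$\text{Accuracy}\,(\%) = \frac{TP + TN}{TP + TN + FP + FN} \times 100 \tag{3}$$

$$TPR = \frac{TP}{TP + FN} \tag{4}$$

$$FPR = \frac{FP}{FP + TN} \tag{5}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{6}$$

$$F\text{-Measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{7}$$

Here Recall is the TPR of Equation (4).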
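The metrics of Equations (3)–(7) can be computed directly from the four confusion-matrix counts, as in the following sketch. The function name and the counts in the example are hypothetical. FPR is computed with its standard definition, FP / (FP + TN), and Recall is taken to be the same quantity as TPR.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy (%), TPR, FPR, precision and F-measure from counts."""
    tpr = tp / (tp + fn)          # true positive rate, also called recall
    precision = tp / (tp + fp)
    return {
        "accuracy_pct": 100.0 * (tp + tn) / (tp + tn + fp + fn),
        "tpr": tpr,
        "fpr": fp / (fp + tn),    # share of actual negatives predicted positive
        "precision": precision,
        "f_measure": 2 * precision * tpr / (precision + tpr),
    }


# Hypothetical counts for a single class treated as "positive".
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For a three-class problem such as this one, the same function can be applied once per class by treating that class as positive and the other two as negative.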

4. Experimentation and implementation

Model building is the core task in machine learning; the model is developed by providing the processed data to the selected classification algorithms (Micheline et al., 2012). To attain the objectives of this paper, three classification algorithms were selected: J48, JRip, and PART. One experiment was performed per algorithm, and three scenarios were considered in each experiment, using either all attributes or only the selected (most determinant) features. One purpose of this study is to choose the best predictive model for classifying neonatal disease as sepsis, RDS, or perinatal asphyxia. In Table 5, Scenario-I uses all attributes with 10-fold CV, Scenario-II uses the most determinant features with 10-fold CV, and Scenario-III uses the most determinant features with a hold-out split (66% of the dataset for training and 34% for testing). Table 5 summarizes the experimental results of all nine models.

Table 5. Summary of experimental results
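The two validation schemes used in the scenarios, 10-fold cross-validation and a 66%/34% hold-out split, can be sketched as index partitions over the 2372 instances. This is a simplified illustration: the function names and the seed are assumptions, the folds are not stratified by class, and WEKA’s own procedure may differ in details such as shuffling and stratification.

```python
import random


def k_fold_indices(n_instances, k=10, seed=1):
    """Yield (train, test) index lists for plain, non-stratified k-fold CV."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k folds of nearly equal size
    for i, test in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def hold_out_indices(n_instances, train_fraction=0.66, seed=1):
    """Return one (train, test) split with the given training fraction."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    cut = round(n_instances * train_fraction)
    return indices[:cut], indices[cut:]


N = 2372  # number of instances reported for the dataset
splits = list(k_fold_indices(N))
train_idx, test_idx = hold_out_indices(N)
print(len(splits), len(train_idx), len(test_idx))
```

In 10-fold CV every instance is used for testing exactly once, whereas the hold-out split evaluates on a single fixed third of the data, which is why the two scenarios can report different accuracies for the same algorithm.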

Based on this summary, the best classifier model was selected by comparing the performance achieved. The PART algorithm scored the best accuracy, 98.06%, in newborn disease prediction. Table 6 presents the confusion matrix of the PART algorithm under Scenario-II. The model correctly identified 833 of 847 sepsis instances; the remaining 2 and 12 were incorrectly identified as RDS and perinatal asphyxia, respectively. It correctly identified 769 of 788 RDS instances; the remaining 9 and 10 were incorrectly identified as sepsis and perinatal asphyxia, respectively. It correctly identified 724 of 737 perinatal asphyxia instances; the remaining 5 and 8 were incorrectly identified as sepsis and RDS, respectively.

Table 6. Confusion matrix for PART with selected attributes under the 10-fold cross-validation
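The counts reported in the text for the PART model can be arranged as a three-class confusion matrix and checked against the reported accuracy. In the sketch below, rows are actual classes and columns are predicted classes; the function names are illustrative, and the per-class counts are derived in a one-vs-rest fashion, which is an assumption about how the per-class rates were obtained.

```python
CLASSES = ["sepsis", "RDS", "perinatal asphyxia"]
# Rows: actual class; columns: predicted class (counts as reported in the text).
CONFUSION = [
    [833, 2, 12],
    [9, 769, 10],
    [5, 8, 724],
]


def overall_accuracy(matrix):
    """Percentage of instances on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


def one_vs_rest_counts(matrix, k):
    """Return (TP, FP, TN, FN) for class k treated as the positive class."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                   # actual k, predicted otherwise
    fp = sum(row[k] for row in matrix) - tp    # predicted k, actually otherwise
    tn = total - tp - fn - fp
    return tp, fp, tn, fn


for k, name in enumerate(CLASSES):
    tp, fp, tn, fn = one_vs_rest_counts(CONFUSION, k)
    print(f"{name}: TPR={tp / (tp + fn):.3f} FPR={fp / (fp + tn):.3f} "
          f"precision={tp / (tp + fp):.3f}")
print(f"accuracy={overall_accuracy(CONFUSION):.2f}%")
```

The diagonal sums to 2326 of 2372 instances, which reproduces the reported accuracy of 98.06%.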

Table 7 presents the detailed accuracy of the selected predictive model, PART with selected features under the 10-fold cross-validation test option. The performance of the model is assessed in terms of True Positive Rate, False Positive Rate, Precision, F-Measure, and ROC Area.

Table 7. Performance result of PART with selected attributes under the 10-fold cross-validation

Sample rules generated by the PART classifier model are:

Rule 1: IF APGAR score = abnormal AND Resuscitate = yes AND LLVW = no AND CRP = normal: THEN Perinatal asphyxia (583.0/3.0)

Rule 2: IF ICSCR = yes AND GA = Pre-term AND CRP = normal: THEN RDS (651.0/5.0)

Rule 3: IF APGAR score = abnormal AND Blood cultures = negative AND ICSCR = no AND Resuscitate = yes AND WBC = normal: THEN Perinatal asphyxia (56.0)

Rule 4: IF APGAR score = normal AND SpO2 = normal AND Blood cultures = positive AND GA = term: THEN Sepsis (109.0)

Rule 5: IF APGAR score = normal AND WBC = normal AND LLVW = yes: THEN RDS (100.0/1.0)

Rule 6: IF APGAR score = normal AND SpO2 = normal AND WBC = low: THEN Sepsis (44.0/1.0)

Rule 7: IF SpO2 = normal AND WBC = high: THEN Sepsis (21.0)
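A decision list such as the one PART produces is applied by testing the rules in order and returning the class of the first rule whose conditions all hold. The sketch below encodes the seven sample rules above in Python to show this first-match behaviour. It is an illustration only: the study’s knowledge base was written in SWI-Prolog, the full model contains more rules than these samples, and the `classify` function, the `default` argument, and the example cases are hypothetical.

```python
# Each rule is (conditions, diagnosis); order matters in a decision list.
RULES = [
    ({"APGAR score": "abnormal", "Resuscitate": "yes", "LLVW": "no",
      "CRP": "normal"}, "Perinatal asphyxia"),
    ({"ICSCR": "yes", "GA": "Pre-term", "CRP": "normal"}, "RDS"),
    ({"APGAR score": "abnormal", "Blood cultures": "negative", "ICSCR": "no",
      "Resuscitate": "yes", "WBC": "normal"}, "Perinatal asphyxia"),
    ({"APGAR score": "normal", "SpO2": "normal", "Blood cultures": "positive",
      "GA": "term"}, "Sepsis"),
    ({"APGAR score": "normal", "WBC": "normal", "LLVW": "yes"}, "RDS"),
    ({"APGAR score": "normal", "SpO2": "normal", "WBC": "low"}, "Sepsis"),
    ({"SpO2": "normal", "WBC": "high"}, "Sepsis"),
]


def classify(case, rules=RULES, default=None):
    """Return the diagnosis of the first rule whose conditions all match."""
    for conditions, diagnosis in rules:
        if all(case.get(attr) == value for attr, value in conditions.items()):
            return diagnosis
    return default  # no sample rule fired


# Hypothetical cases for illustration.
case_a = {"ICSCR": "yes", "GA": "Pre-term", "CRP": "normal", "SpO2": "low"}
case_b = {"APGAR score": "normal", "SpO2": "normal", "WBC": "high"}
print(classify(case_a), classify(case_b), classify({}))
```

Because the first matching rule wins, reordering the rules can change the diagnosis, so the order produced by the learner should be preserved when the rules are transferred to the knowledge base.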

Knowledge Acquisition: In artificial intelligence, knowledge acquisition and representation are important activities in the development of knowledge-based systems. The knowledge gained during the first stages of development determines the success of the intelligent system (Mohammad & Al Saiyd, 2010).

Knowledge Modeling: Models capture the essential features of real systems by breaking them down into more manageable parts that are easy to understand and manipulate. Figure 2 illustrates the knowledge acquired from the healthcare experts in the form of a decision tree for the diagnosis and treatment recommendation of newborn diseases.

Figure 2. Decision tree for diagnosis and treatment recommendations for newborn disease.

Knowledge Representation: After the knowledge is acquired and modeled, it is represented using rule-based knowledge representation techniques. In this study, the knowledge acquired from healthcare experts was represented in IF-THEN format. Sample rules for the diagnosis of newborn diseases are:

Rule 1: IF newborn patients’ blood culture is Positive, THEN the Disease = Sepsis.

Rule 2: IF newborn patients’ blood culture is Negative AND the intercostal subcostal retraction is Yes, THEN the Disease = RDS.

Rule 3: IF newborn patients’ blood culture is Negative AND the intercostal subcostal retraction is No AND the Resuscitate is Yes, THEN the Disease = Perinatal Asphyxia.

Rule 4: IF newborn patients’ low lung volume with whiteout color is Yes, THEN the Disease = RDS.

4.1. Developing knowledge-based system

After the knowledge acquisition, modeling, and representation tasks are completed, the next activity is designing the expert system. To diagnose the neonatal diseases sepsis, RDS, and perinatal asphyxia, 16 rules were generated from the PART classification algorithm and 8 rules from domain experts. For managing each disease, the knowledge extracted from domain experts was used.

The acquired knowledge is then programmed into the knowledge base as facts about the subject and as knowledge relationships expressed as if-then rules. Beyond the rule-based representation itself, the knowledge engineer should manage the knowledge base and consistently check the represented knowledge. The developed KBS consists of four main components: knowledge base, explanation facility, inference engine, and user interface. Figure 3 shows the prototype for diagnosis and treatment recommendation of neonatal diseases.

Figure 3. User interface of the developed knowledge-based systems.

4.2. Evaluation of the prototype

Testing and evaluation of the developed prototype KBS is the final step; it helps the knowledge engineer measure whether the prototype meets the proposed objective. The prototype was tested with prepared test cases to measure system performance: the same test cases were given to both domain experts and the system, and the results were compared. In addition, a user acceptance evaluation was carried out to check the accessibility and usability of the system against defined criteria, based on users’ interaction with the prototype.

Performance testing determines whether the developed prototype achieves the required level of accuracy. For validation, the authors prepared 22 test instances for system performance testing, considering the number of attributes and the time needed to predict them. A confusion matrix was used to evaluate system performance. Table 8 displays the three-class confusion matrix.

Table 8. Confusion matrix for system performance testing

The system assigned 20 of the 22 test cases to their correct class. The overall diagnosis accuracy for neonatal diseases in this test is therefore 90.9%.

User acceptance testing verifies that the prototype system works for the user. The domain experts evaluated the prototype against the following criteria: efficiency of the system with respect to response time, attractiveness of the prototype, knowledge adequacy regarding disease diagnosis and treatment, ease of use, ability of the prototype to prevent user errors, user control and freedom, and contribution of the developed prototype to the domain area (Bekele et al., 2020).

In addition, to analyze the user evaluations, the researchers assigned a value to each word on the scale: Excellent = 5, Very Good = 4, Good = 3, Fair = 2, and Poor = 1. Table 9 shows the outcomes of the domain experts’ evaluation.

Table 9. User acceptance evaluation criteria and their results

As Table 9 shows, for the first criterion, efficiency of the prototype in time, 75% of the evaluators rated the system as Excellent and 25% as Very Good. For the second criterion, attractiveness of the prototype, 50% of the evaluators rated it as Excellent and the remaining 50% as Very Good. For the third criterion, knowledge adequacy regarding disease diagnosis and treatment, 25% of the evaluators rated it as Excellent, 50% as Very Good, and 25% as Good. For ease of use and interaction, 75% of the evaluators rated it as Excellent and 25% as Very Good.

Regarding the ability of the system to prevent errors, 50% of the respondents rated it as Excellent and 50% as Very Good. For user control and freedom, 50% of the evaluators rated it as Very Good, 25% as Excellent, and 25% as Good. For the final criterion, the contribution of the developed prototype to the domain area, 75% of the evaluators rated it as Excellent and 25% as Very Good. Overall, according to the experts’ evaluation, the prototype’s average score is 4.46 out of 5. This corresponds to an overall average performance of 89.2%, which is above Very Good.
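The reported average can be reproduced from the rating shares given in the text and the 5-point scale. The sketch below assumes that each percentage is the share of evaluators choosing that rating and that all seven criteria carry equal weight; the criterion labels are shortened paraphrases and the variable names are illustrative.

```python
SCALE = {"Excellent": 5, "Very Good": 4, "Good": 3, "Fair": 2, "Poor": 1}

# Share of evaluators (in %) per rating, as reported for each criterion.
RATINGS = {
    "Efficiency in time": {"Excellent": 75, "Very Good": 25},
    "Attractiveness": {"Excellent": 50, "Very Good": 50},
    "Knowledge adequacy": {"Excellent": 25, "Very Good": 50, "Good": 25},
    "Ease of use": {"Excellent": 75, "Very Good": 25},
    "Error prevention": {"Excellent": 50, "Very Good": 50},
    "User control and freedom": {"Excellent": 25, "Very Good": 50, "Good": 25},
    "Contribution to the domain": {"Excellent": 75, "Very Good": 25},
}


def criterion_score(shares):
    """Weighted mean rating for one criterion on the 1-5 scale."""
    return sum(SCALE[label] * pct for label, pct in shares.items()) / 100


scores = {name: criterion_score(shares) for name, shares in RATINGS.items()}
overall = sum(scores.values()) / len(scores)   # unweighted mean over criteria
average = round(overall, 2)                    # rounded as in the text
print(average, round(average / 5 * 100, 1))
```

The seven criterion scores sum to 31.25, giving a mean of about 4.46; dividing the rounded mean by 5 yields the 89.2% figure quoted above.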

5. Conclusion and recommendation

Knowledge-based systems for the diagnosis and treatment recommendation of newborn diseases can contribute to healthcare service goals by improving service outcomes, increasing the efficiency of healthcare institutions, reducing the economic burden of healthcare management, and reducing newborn mortality.

To acquire knowledge through data mining, a dataset of 2372 instances with 16 attributes and 3 class labels was taken from GMH in Addis Ababa and used to extract knowledge in the form of rules. To develop the KBS for diagnosis and treatment recommendation of neonatal diseases, the data mining results and expert knowledge were combined.

In the data preprocessing phase, missing value imputation, feature selection, and data transformation were performed to improve the accuracy of the classifier model. The partial decision tree (PART) classifier with selected attributes under 10-fold cross-validation scored the highest classification accuracy, correctly classifying 2326 of the 2372 instances (98.06%). As a result, the rules extracted by the PART algorithm were used to develop the KBS, together with the knowledge acquired from domain experts. The prototype achieved 90.9% overall accuracy in system performance testing and an 89.2% score in user acceptance testing. We advise other researchers to include other prevalent infant diseases. The current research was limited to a rule-based approach for building the KBS; future work should examine a case-based reasoning approach.

Acknowledgments

We would like to acknowledge the staff of the department of Information Systems, Debre Berhan University, Ethiopia. We also thank Gandhi Memorial Hospital (GMH), Addis Ababa, Ethiopia, and Dr. Getachew Yilma for his professional advice and support which was a constant source of inspiration.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

The authors received no direct funding for this research.

References

  • Adugnaw, B. (2019). Knowledge based system for diagnosis of under nutrition status of children under five [Unpublished master’s thesis]. Bahir Dar University.
  • Ahuja, A. S. (2019). The impact of artificial intelligence in medicine on the future role of the physician. PeerJ, 7(e7702), 1–16. https://doi.org/10.7717/peerj.7702
  • Bekele, W. M., Fana, M. B., & Sime, A. B. (2020). A self-learning knowledge based system for credit evaluation of loan application: the case of commercial bank of Ethiopia. Information and Knowledge Management, 10(6), 11–18. https://doi.org/10.7176/IKM/10-6-03
  • Bhatt, A. S. (2012). Comparative analysis of attribute selection measures used for attribute selection in decision tree induction. International Conference on Radar, Communication and Computing, 230–234.
  • Desalegn, B. (2011). Predicting low birth weight using data mining techniques on Ethiopia Demographic and Health Survey datasets [Unpublished master’s thesis]. Addis Ababa University.
  • Gizaw, M., Molla, M., & Mekonnen, W. (2014). Trends and risk factors for neonatal mortality in Butajira District, South Central Ethiopia, (1987-2008): A prospective cohort study. BMC Pregnancy and Childbirth, 14(64), 1–6. https://doi.org/10.1186/1471-2393-14-64
  • Hailemariam, T. (2012). Application of data mining for predicting adult mortality [Unpublished master’s thesis]. Addis Ababa University.
  • Kaur, G., & Chhabra, A. (2014). Improved J48 classification algorithm for the prediction of diabetes. International Journal of Computer Applications, 98(22), 13–17. https://doi.org/10.5120/17314-7433
  • Lehr, T., Yuan, J., Zeumer, D., Jayadev, S., & Ritchie, M. D. (2011). Rule based classifier for the analysis of gene-gene and gene-environment interactions in genetic association studies. BioData Mining, 4(1), 1–14. https://doi.org/10.1186/1756-0381-4-4
  • Mekonnen, T., Tenu, T., Aklilu, T., & Abera, T. (2018). Assessment of neonatal death and causes among admitted neonates in neonatal intensive care Unit of Mizan Tepi University Teaching Hospital. Clinics in Mother and Child Health, 15(4), 1–5. https://doi.org/10.4172/2090-7214.1000305
  • Micheline, K., Jiawei, H., & Jian, P. (2012). Data mining: Concepts and techniques (3rd ed.). Morgan Kaufmann-Elsevier.
  • Mohammad, A. H., & Al Saiyd, N. A. M. (2010). A framework for expert knowledge acquisition. International Journal of Computer Science and Network Security, 10(11), 145–151.
  • Mulugeta, T. A., & Beshah, T. T. (2015). Integrating data mining results with the knowledge based system for diagnosis and treatment of visceral leishmaniasis. International Journal of Advanced Research in Computer Science and Software Engineering, 5(5), 128–142.
  • Peffers, K., Tuunanen, T., Rothenberger, M. A., & Chatterjee, S. (2008). A design science research methodology for information systems research. Journal of Management Information Systems, 24(3), 45–77. https://doi.org/10.2753/MIS0742-1222240302
  • Ramírez-Gallego, S., García, S., Mouriño-Talín, H., Martínez-Rego, D., Bolón-Canedo, V., Alonso-Betanzos, A., Benítez, J. M., & Herrera, F. (2015). Data discretization: Taxonomy and big data challenge. WIREs Data Mining and Knowledge Discovery, 6(1), 5–21. https://doi.org/10.1002/widm.1173
  • Safdari, R., Kadivar, M., Langarizadeh, M., Nejad, A. F., & Kermani, F. (2016). Developing a fuzzy expert system to predict the risk of neonatal death. Acta Informatica Medica, 24(1), 34–37. https://doi.org/10.5455/aim.2016.24.34-37
  • Tefara, Y. G., & Ayele, A. A. (2021). Newborns and under-5 mortality in Ethiopia: The necessity to revitalize partnership in post-COVID-19 era to meet the SDG targets. Journal of Primary Care & Community Health, 12, 1–5. https://doi.org/10.1177/2150132721996889
  • Tefera, A. (2015). A user friendly knowledge-based system for diagnosis and treatment of pneumonia [Unpublished master’s thesis]. University of Gondar.