Comparative Analysis of Classical Methods with Machine Learning Algorithm on Survival Classification of Heart Failure Patients

Keywords: Feature Selection, Heart Failure, Imbalanced Data, Classification, Machine Learning

Abstract

Cardiovascular disease is a global threat and the leading cause of death worldwide. More than 17.9 million people have died from heart and blood vessel disorders, and about 80% of these deaths occurred in low- and middle-income countries, including Indonesia. This research aims to find the most accurate and efficient model for classifying cardiovascular disease data so that the disease can be detected early.

This research uses heart failure patient data with predictor and response variables. The response variable has two categories: deceased and alive. The predictor variables are obtained from the patient's behavioral risk factors. The data were preprocessed before modeling and then divided into 80% training and 20% testing data. Models were fitted to the training data with several algorithms: Logistic Regression, Decision Tree, Random Forest, K-Nearest Neighbor (KNN), and Support Vector Machine (SVM). Each model was evaluated with Accuracy, Precision, and Recall to identify the best model.
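The evaluation metrics named above can be illustrated with a minimal sketch. The function name `classification_metrics` and the toy labels are hypothetical and are not taken from the study's data. The sketch only shows how accuracy and per-class precision and recall follow from counts of correct and incorrect predictions.

```python
def classification_metrics(y_true, y_pred, positive):
    """Return accuracy, precision, and recall, treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    return {
        "accuracy": correct / len(pairs),
        # Guard against division by zero when a class is never predicted or never present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels for illustration only (not study data).
y_true = ["alive", "alive", "alive", "deceased", "deceased", "deceased", "alive", "deceased"]
y_pred = ["alive", "alive", "deceased", "deceased", "deceased", "alive", "alive", "deceased"]

deceased_metrics = classification_metrics(y_true, y_pred, positive="deceased")
alive_metrics = classification_metrics(y_true, y_pred, positive="alive")
print("deceased:", deceased_metrics)
print("alive:", alive_metrics)
```

Computing precision and recall separately for each category, as the abstract reports them, matters on imbalanced data, where overall accuracy alone can hide poor performance on the minority class.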

This study found that using all research variables in the classification analysis decreases classification performance, so SelectKBest was used to retain the 8 most significant variables. The Random Forest algorithm with optimal parameters (entropy criterion and a maximum depth of 8) performed best. It achieved a precision of 90.51% and a recall of 88.27% for the 'alive' category, a precision of 88.55% and a recall of 90.74% for the 'deceased' category, a training accuracy of 89.51%, an AUC of 0.895, and a testing accuracy of 87.80%, which places it in the category of good classification.
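The configuration described above can be sketched with scikit-learn, assuming that library was used, as the name SelectKBest suggests. Several details are assumptions made for illustration and are not stated in the abstract: the data are synthetic stand-ins generated with `make_classification`, the score function is `f_classif`, and the random seeds and sample sizes are arbitrary. The sketch will not reproduce the reported figures.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (NOT the study's data): 12 predictors, binary outcome.
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=6, random_state=0
)

# 80% training / 20% testing split, stratified to preserve the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0
)

model = Pipeline([
    # Keep the 8 highest-scoring predictors; f_classif is an assumed score function.
    ("select", SelectKBest(score_func=f_classif, k=8)),
    # Random Forest with the parameters reported in the abstract.
    ("forest", RandomForestClassifier(criterion="entropy", max_depth=8, random_state=0)),
])
model.fit(X_train, y_train)

n_selected = int(model.named_steps["select"].get_support().sum())
train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))
print(f"selected features: {n_selected}")
print(f"training accuracy: {train_acc:.3f}, testing accuracy: {test_acc:.3f}")
```

Placing the selector inside the pipeline means the feature scores are computed from the training data only, so the test set does not influence which variables are kept.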

This research is limited to the medical records and behavioral risk factors of heart failure patients as the basis for classifying patient survival. Within that scope, it addressed data imbalance, employed feature selection to improve model efficiency, and compared multiple algorithms to provide insight into their effectiveness for this specific classification task.

Published
2024-06-30
How to Cite
Sa’idah Zahrotul Jannah, Grace Lucyana Koesnadi, & Elly Pusporani. (2024). Comparative Analysis of Classical Methods with Machine Learning Algorithm on Survival Classification of Heart Failure Patients. Jurnal Statistika Dan Aplikasinya, 8(1), 99 - 113. https://doi.org/10.21009/JSA.08109