Application of the SMOTE CHAID Method in Classifying Tuberculosis Relapse

  • Vera Maya Santi Universitas Negeri Jakarta
  • Lina Nafisah Universitas Negeri Jakarta
  • Qorry Meidianingsih
Keywords: CHAID, tuberculosis, relapse, imbalanced class, SMOTE

Abstract

DKI Jakarta is among the provinces with the highest number of tuberculosis cases, and a person's chance of contracting tuberculosis there is higher than in any other province. Tuberculosis can be cured with regular treatment over a fixed period, but some patients relapse after recovering, which creates new problems. This study aims to build a classification model and to determine which factors influence tuberculosis relapse using the CHAID method. Because the two patient categories (relapse and non-relapse) have unbalanced numbers of observations, SMOTE combined with majority-class undersampling is applied. The CHAID classification tree shows that the factors influencing relapse in tuberculosis patients include type of diagnosis, age, gender, and place of residence. Applying SMOTE also improves the performance of the CHAID classification tree: accuracy, sensitivity, and specificity rise to 76.153, 26.667, and 82.608, respectively, compared with CHAID without SMOTE. Based on these results, the SMOTE CHAID classification model performs better than CHAID alone.
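As an illustration of the resampling and evaluation ideas named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes purely numeric features (the study's predictors such as gender or type of diagnosis are categorical and would need a categorical-aware variant), and every name, data point, and count in it is hypothetical.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority observations")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = minority[:i] + minority[i + 1:]
        # k nearest minority neighbours of the chosen point (Euclidean)
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from base to mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


def undersample(majority, n_keep, seed=0):
    """Randomly keep n_keep majority observations, without replacement."""
    return random.Random(seed).sample(majority, n_keep)


def classification_rates(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts,
    with the minority class (relapse) treated as the positive class."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical toy data: 6 relapse and 60 non-relapse patients, 2 features.
rng = random.Random(42)
relapse = [(rng.gauss(2.0, 0.5), rng.gauss(2.0, 0.5)) for _ in range(6)]
non_relapse = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(60)]

synthetic = smote(relapse, n_new=12, k=3)     # oversample the minority class
kept = undersample(non_relapse, n_keep=30)    # undersample the majority class
balanced_minority = relapse + synthetic

# Hypothetical confusion-matrix counts, not taken from the study.
metrics = classification_rates(tp=3, fn=7, tn=80, fp=10)
print(len(balanced_minority), len(kept), metrics)
```

In this sketch the class ratio moves from 6:60 to 18:30 before a classifier would be trained. Synthetic points always fall on a line segment between two real minority observations, which is why the approach does not transfer directly to categorical predictors.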

Published
2022-06-30
How to Cite
Santi, V. M., Nafisah, L., & Meidianingsih, Q. (2022). Penerapan Metode SMOTE CHAID dalam Klasifikasi Tuberkulosis Relapse. Jurnal Statistika Dan Aplikasinya, 6(1), 26-36. https://doi.org/10.21009/JSA.06103