DATA IMPUTATION FOR BIVARIATE GAMMA-GENERATED DATA USING PREDICTIVE MEAN MATCHING AND RANDOM FOREST METHODS

Authors

  • Muhammad Arib Alwansyah Arib, Universitas Negeri Jakarta
  • Jose Rizal Jose, Universitas Bengkulu
  • Ramya Rachmawati Ramya, Universitas Bengkulu

DOI:

https://doi.org/10.21009/JSA.10103

Keywords:

Bivariate Gamma, Mean Absolute Percentage Error, Predictive Mean Matching, Random Forest Imputations, Root Mean Square Error

Abstract

Missing data is a common problem in data analysis and, if not handled properly, can reduce the quality and accuracy of research results. This study compares the Predictive Mean Matching (PMM) and Random Forest (RF) imputation methods on data with 5%, 10%, 15%, and 20% of values missing. The comparison uses the correlation, the p-value of a significance test against the original data, and the Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE), where smaller values indicate better imputation. The results show that the two methods perform differently at each level of missing data. At 5% missing data, both methods produce imputed data that differ significantly from the original data (p-value smaller than α = 0.05), but RF yields smaller MAPE and RMSE values than PMM. At 10% missing data, PMM still differs significantly from the original data, while RF does not. At 15% missing data, PMM produces results that are not significantly different from the original data and has smaller MAPE and RMSE values than RF. At 20% missing data, RF produces a higher correlation value (0.7788) than PMM (0.7638). Overall, the imputation error tends to increase as the proportion of missing data grows. The choice of imputation method should therefore be matched to the characteristics and proportion of the missing data to obtain optimal imputation results.
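The evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions that are not stated in the abstract: the correlated gamma pairs are built from a shared gamma component, values are removed completely at random from one variable only, and PMM is reduced to a single least-squares fit with k = 5 donors (without the parameter draws used in full PMM). The function names, sample size, and seed are hypothetical, and the Random Forest comparison is omitted.

```python
import math
import random


def rmse(actual, imputed):
    """Root mean square error between original and imputed values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, imputed)) / len(actual))


def mape(actual, imputed):
    """Mean absolute percentage error in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, imputed)) / len(actual)


def bivariate_gamma(n, rng, shape_shared=2.0, shape_own=1.0, scale=1.0):
    """Correlated gamma pairs via a shared component (an assumed construction)."""
    pairs = []
    for _ in range(n):
        shared = rng.gammavariate(shape_shared, scale)
        x = shared + rng.gammavariate(shape_own, scale)
        y = shared + rng.gammavariate(shape_own, scale)
        pairs.append((x, y))
    return pairs


def pmm_impute(x, y, missing, rng, k=5):
    """Simplified predictive mean matching for y given fully observed x."""
    missing_set = set(missing)
    observed = [i for i in range(len(y)) if i not in missing_set]
    # Least-squares fit of y on x, using observed cases only.
    mean_x = sum(x[i] for i in observed) / len(observed)
    mean_y = sum(y[i] for i in observed) / len(observed)
    sxx = sum((x[i] - mean_x) ** 2 for i in observed)
    sxy = sum((x[i] - mean_x) * (y[i] - mean_y) for i in observed)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    imputed = {}
    for i in missing:
        target = intercept + slope * x[i]
        # Donors: the k observed cases whose predicted value is closest to the target.
        donors = sorted(
            observed, key=lambda j: abs(intercept + slope * x[j] - target)
        )[:k]
        # The imputed value is the observed y of one randomly chosen donor.
        imputed[i] = y[rng.choice(donors)]
    return imputed


rng = random.Random(42)  # fixed seed, chosen arbitrarily for reproducibility
data = bivariate_gamma(500, rng)
x = [pair[0] for pair in data]
y = [pair[1] for pair in data]

for rate in (0.05, 0.10, 0.15, 0.20):
    # Remove values completely at random (an assumption of this sketch).
    missing = sorted(rng.sample(range(len(y)), round(rate * len(y))))
    filled = pmm_impute(x, y, missing, rng)
    actual = [y[i] for i in missing]
    imputed = [filled[i] for i in missing]
    print(f"{rate:.0%} missing: RMSE={rmse(actual, imputed):.3f}, "
          f"MAPE={mape(actual, imputed):.1f}%")
```

Because PMM imputes with observed donor values rather than model predictions, every imputed value here is a value that actually occurs in the data, which keeps the imputations inside the support of the gamma distribution.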

Published

2026-06-30

How to Cite

Arib, M. A. A., Jose, J. R., & Ramya, R. R. (2026). DATA IMPUTATION FOR BIVARIATE GAMMA-GENERATED DATA USING PREDICTIVE MEAN MATCHING AND RANDOM FOREST METHODS. Jurnal Statistika Dan Aplikasinya, 10(1), 29–38. https://doi.org/10.21009/JSA.10103