Item Analysis of a Situational Judgement Test of Leadership Competency for State-Owned Enterprise (BUMN) Employees Using the Rasch Model
DOI:
https://doi.org/10.21009/JPPP.132.06
Keywords:
item analysis, situational judgement test, leadership competency, competency assessment, state-owned enterprise employee assessment
Abstract
Human resource competency assessment plays a crucial role in organizations, particularly at PT X, a state-owned enterprise (BUMN) in Indonesia's transportation sector. PT X developed a leadership competency test in the form of a Situational Judgement Test (SJT), which is expected to measure competency objectively through realistic work scenarios. This study involved 2,368 PT X employees from the level 1 job group. Item analysis was carried out on 48 items measuring 7 leadership competencies at competency level 1, with the aim of improving test quality by revising or removing unsuitable items. Fifteen items were answered correctly by more than 50% of respondents, while 33 items were answered correctly by fewer than 50% of respondents. Although most items had an acceptable difficulty level (-2 ≤ b ≤ +2), several items, such as DLE 1.3.3, DLE 1.2.2, and DEX 1.2.2, had less acceptable difficulty levels. The most precise measurement was found for items such as SOR 1.2.1, SOR 1.3.1, SOR 1.1.1, and SOR 1.1.2, whereas DEX 1.2.2 showed less precise measurement. Fit evaluation (Infit and Outfit) showed that all items had values within the acceptable range (0.5 to 1.5) and were consistent with the competencies being measured, affirming the reliability of the test. The Wright Map showed that the DEX competency was able to measure the full range of ability; the DLE competency captured respondents with average to high ability; the Strategic Orientation (SOR), Developing Organizational Capabilities (DOC), and Leading Change (LCH) competencies captured average ability; and the Global Business Savvy (GBS) and Managing Diversity (MDI) competencies captured average and below-average ability. The study concludes that this SJT has good item quality and can be relied on for ongoing competency assessment at PT X.
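The screening criteria named in the abstract can be illustrated with a minimal sketch. It assumes the dichotomous Rasch model and the standard mean-square definitions of Infit and Outfit; the abstract does not state which software or estimation routine the study used, so the function names and default ranges below are hypothetical and only mirror the thresholds quoted above (-2 ≤ b ≤ +2 for difficulty, 0.5 to 1.5 for fit).

```python
import math


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct answer under the dichotomous Rasch model.

    theta is the person's ability and b is the item difficulty, both in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def fit_mean_squares(responses, thetas, b):
    """Return (infit, outfit) mean squares for a single item.

    responses: 0/1 scores, one per person.
    thetas: ability estimates for the same persons, in logits.
    b: difficulty estimate for the item, in logits.
    """
    z2_sum = 0.0      # sum of squared standardized residuals
    resid2_sum = 0.0  # sum of squared raw residuals
    var_sum = 0.0     # sum of model variances (information)
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, b)
        var = p * (1.0 - p)
        resid2 = (x - p) ** 2
        z2_sum += resid2 / var
        resid2_sum += resid2
        var_sum += var
    outfit = z2_sum / len(responses)  # unweighted mean square
    infit = resid2_sum / var_sum      # information-weighted mean square
    return infit, outfit


def screen_item(b, infit, outfit, b_range=(-2.0, 2.0), fit_range=(0.5, 1.5)):
    """Flag an item against the ranges quoted in the abstract (assumed defaults)."""
    return {
        "difficulty_ok": b_range[0] <= b <= b_range[1],
        "fit_ok": all(fit_range[0] <= v <= fit_range[1] for v in (infit, outfit)),
    }
```

In practice the ability and difficulty estimates would come from dedicated Rasch software; here they are treated as given inputs. An item whose difficulty falls outside the range but whose fit statistics are acceptable, as the abstract reports for a few items, would come back with `difficulty_ok` false and `fit_ok` true.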