Evaluation of TF-IDF Extraction Techniques in Sentiment Analysis of Indonesian-Language Marketplaces Using SVM, Logistic Regression, and Naive Bayes
DOI:
https://doi.org/10.21009/j-koma.v8i1.05
Keywords:
Sentiment Analysis, Naïve Bayes, SVM, TF-IDF
Abstract
This study evaluates TF-IDF feature extraction for sentiment analysis of Indonesian-language marketplace product reviews using three classifiers: Logistic Regression, Naïve Bayes, and Support Vector Machine (SVM). The dataset, sourced from Kaggle, comprises 831 reviews (385 positive, 446 negative), preprocessed through text cleaning, tokenization, stopword removal, and stemming. The data was split into 80% training and 20% testing sets. Logistic Regression with TF-IDF achieved the highest performance, with 90.4% accuracy, 91.8% precision, 90.4% recall, and 90.9% F1-measure, outperforming both SVM (89.8% accuracy) and Naïve Bayes (87.4% accuracy). Logistic Regression effectively captures linear relationships in TF-IDF features, while Naïve Bayes struggles with emotional context and SVM requires complex parameterization. TF-IDF is efficient for reviews that express sentiment explicitly but is limited in handling complex semantic contexts such as sarcasm. The study concludes that Logistic Regression combined with TF-IDF is the most effective of the evaluated approaches for sentiment analysis of Indonesian marketplace reviews, and recommends that future work explore methods such as word embeddings.
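As an illustration of the TF-IDF weighting the abstract refers to, the following is a minimal sketch only. The abstract does not state which TF-IDF variant or library the authors used, so the sketch assumes the textbook form, term frequency multiplied by log(N / document frequency), with no smoothing or normalization. The function name `tf_idf` and the three tokenized reviews are hypothetical and are not taken from the dataset.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical reviews, already cleaned, tokenized, and stemmed.
reviews = [
    ["barang", "bagus", "kirim", "cepat"],
    ["barang", "rusak", "kecewa"],
    ["barang", "bagus", "puas"],
]
weights = tf_idf(reviews)
```

Under this assumed variant, a term that occurs in every review (here "barang") receives weight zero, while a term confined to one review (such as "rusak") receives the largest weight. That is consistent with the abstract's observation that TF-IDF works well when sentiment is carried by explicit, distinctive words.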
Published
2025-06-22
How to Cite
Budi Lestari, V., & Apriansyah Hutagalung, C. (2025). Evaluation of TF-IDF Extraction Techniques in Sentiment Analysis of Indonesian-Language Marketplaces Using SVM, Logistic Regression, and Naive Bayes. J-KOMA : Jurnal Ilmu Komputer Dan Aplikasi, 8(1), 36–44. https://doi.org/10.21009/j-koma.v8i1.05
Issue
Section
Articles
License
Copyright (c) 2025 J-KOMA : Jurnal Ilmu Komputer dan Aplikasi

This work is licensed under a Creative Commons Attribution 4.0 International License.