COMPARISON OF XGBOOST AND SVM FOR SENTIMENT ANALYSIS MERAH PUTIH COOPERATIVE POLICY
DOI: https://doi.org/10.21009/JSA.10105
Keywords: Extreme Gradient Boosting, Hyperparameter Tuning, Sentiment Analysis, Support Vector Machine, TF-IDF
Abstract
Large volumes of textual data have been generated by the rapid growth of social media, making sentiment analysis an effective approach for understanding public perceptions of government policies. However, text classification still faces challenges such as high feature dimensionality, class imbalance, and non-linear relationships within the data. This study used data from the X social media platform to evaluate the performance of the Support Vector Machine (SVM) and XGBoost algorithms in classifying public sentiment toward the Koperasi Desa Merah Putih policy. The dataset consisted of 1,074 tweets collected through a scraping technique between July 21 and November 4, 2025, comprising 800 positive tweets and 274 negative tweets. The research process included data preprocessing, feature extraction using TF-IDF, data splitting with an 80:20 ratio, and hyperparameter tuning using GridSearchCV with 5-fold cross-validation. The models were evaluated using accuracy, precision, recall, and F1-score. Hyperparameter tuning successfully enhanced the performance of both models, with SVM benefiting the most from the optimisation process. The findings demonstrated that both models achieved strong classification performance; however, SVM outperformed XGBoost. The SVM model achieved an accuracy of 95%, with more balanced precision, recall, and F1-score values across both sentiment classes, whereas XGBoost achieved an accuracy of 91% and showed limitations in detecting negative sentiment as the minority class. The data exploration results also indicated that most users expressed positive sentiment toward the Koperasi Desa Merah Putih policy. Nevertheless, this study has several limitations, particularly the use of TF-IDF-based feature representation, which does not capture semantic relationships or sarcasm in textual data. 
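To illustrate why the abstract reports precision, recall, and F1-score per class rather than accuracy alone, the following minimal sketch computes those metrics for a hypothetical classifier that always predicts the majority class. Only the class counts (800 positive, 274 negative) come from the abstract; the function names and the always-positive baseline are assumptions introduced for illustration, and the calculation uses the full dataset rather than the study's 20% test split.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1) for one class, using 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Class counts reported in the abstract.
N_POS, N_NEG = 800, 274
y_true = ["positive"] * N_POS + ["negative"] * N_NEG

# Hypothetical baseline (not from the study): always predict the majority class.
y_majority = ["positive"] * (N_POS + N_NEG)

baseline_acc = accuracy(y_true, y_majority)
pos_metrics = per_class_metrics(y_true, y_majority, "positive")
neg_metrics = per_class_metrics(y_true, y_majority, "negative")

print(f"baseline accuracy: {baseline_acc:.3f}")
print("positive (precision, recall, f1):", pos_metrics)
print("negative (precision, recall, f1):", neg_metrics)
```

Under this assumed baseline, accuracy is 800/1074 (about 0.745) while every metric for the negative class is zero, which is why minority-class recall is a more informative indicator than accuracy on an imbalanced dataset like this one.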
The novelty of this study lies in the comparison of SVM and XGBoost with hyperparameter tuning using GridSearchCV in the context of sentiment analysis of the Koperasi Desa Merah Putih policy, a topic that has received limited attention in previous studies.
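The workflow summarised in the abstract (TF-IDF features, an 80:20 split, and GridSearchCV with 5-fold cross-validation) could be sketched with scikit-learn roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the toy corpus, the parameter grid, the macro-F1 scoring choice, the stratified split, and the decision to fit TF-IDF inside a pipeline are all hypothetical, since the abstract does not specify them.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Hypothetical placeholder corpus standing in for the preprocessed tweets.
positive = [f"great cooperative program helps village {i}" for i in range(20)]
negative = [f"bad policy wastes village budget {i}" for i in range(10)]
texts = positive + negative
labels = ["positive"] * len(positive) + ["negative"] * len(negative)

# 80:20 split; stratification is an assumption to preserve the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42
)

# Fitting TF-IDF inside the pipeline keeps each validation fold unseen
# when the vocabulary and IDF weights are learned.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("svm", SVC()),
])

# Hypothetical search grid; the study's actual grid is not given in the abstract.
param_grid = {
    "svm__C": [0.1, 1, 10],
    "svm__kernel": ["linear", "rbf"],
}

# 5-fold cross-validated grid search, as named in the abstract.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="f1_macro")
search.fit(X_train, y_train)

y_pred = search.predict(X_test)
report = classification_report(y_test, y_pred, zero_division=0)
print("best parameters:", search.best_params_)
print(report)
```

The same structure would apply to the XGBoost side of the comparison by replacing the `SVC` step with a gradient boosting classifier such as `xgboost.XGBClassifier` and supplying a grid of its own hyperparameters. Recent versions of that library expect integer-encoded class labels, so the string labels used here would need encoding first.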



