PERFORMANCE EVALUATION OF WORD EMBEDDING TECHNIQUES IN TWITTER SENTIMENT ANALYSIS USING LSTM

Authors

  • Faroh Ladayya Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Negeri Jakarta
  • Widyanti Rahayu Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Negeri Jakarta
  • Siti Rohmah Rohimah Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Negeri Jakarta
  • Ferdiansyah Rizki Saputra Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Negeri Jakarta
  • Thoriq Akbar Maulana Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Negeri Jakarta
  • Najwa Nur Madinah Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Negeri Jakarta

DOI:

https://doi.org/10.21009/JSA.09206

Keywords:

Long Short-Term Memory, Sentiment Analysis, Word Embedding

Abstract

Opinions expressed on social media can serve as feedback on products, whether goods or services. Sentiment analysis is used to analyze the opinions that the public shares on social media, where the sentiment of an opinion can be positive, negative, or neutral. This study compares the performance of three word embedding techniques (Word2Vec, GloVe, and FastText) when combined with a Long Short-Term Memory (LSTM) model for sentiment classification of Indonesian Twitter data. LSTM was selected for its ability to model sequential text data and capture the long-term contextual dependencies that are often present in natural language. Because LSTM requires numerical input, each word embedding technique was used to convert the tweet text into numerical vectors, which then served as input to the LSTM. All embeddings were evaluated under the same preprocessing pipeline and LSTM architecture to ensure a fair comparison. Model performance was assessed using accuracy, precision, recall, F1-score, and ROC/AUC metrics. The results indicate that the LSTM model effectively captures sentiment patterns in Indonesian tweets, with Word2Vec achieving the best overall performance, followed by GloVe and FastText. These findings suggest that domain-adapted word embeddings remain highly effective for sentiment analysis in Indonesian social media contexts.
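
The two mechanical steps named in the abstract, turning tokens into vectors for the LSTM and scoring predictions, can be illustrated with a minimal sketch. This is not the authors' implementation. The embedding table, its three dimensions, the sequence length, the zero-vector handling of unknown words, the macro averaging, and the function names `tweet_to_matrix` and `macro_scores` are all assumptions made for illustration; a real table would come from a trained Word2Vec, GloVe, or FastText model.

```python
from typing import Dict, List, Sequence

# Hypothetical toy embedding table (assumption). Real vectors would come from
# a trained Word2Vec, GloVe, or FastText model and have many more dimensions.
EMBEDDINGS: Dict[str, List[float]] = {
    "bagus": [0.9, 0.1, 0.0],
    "buruk": [-0.8, 0.2, 0.1],
    "layanan": [0.0, 0.5, 0.4],
}
DIM = 3      # embedding dimension of the toy table
MAX_LEN = 4  # assumed fixed sequence length fed to the LSTM


def tweet_to_matrix(tokens: Sequence[str]) -> List[List[float]]:
    """Map tokens to vectors, then truncate or zero-pad to MAX_LEN rows.

    Unknown words become zero vectors here (an assumption for this sketch).
    """
    zero = [0.0] * DIM
    rows = [list(EMBEDDINGS.get(tok, zero)) for tok in tokens[:MAX_LEN]]
    rows += [list(zero) for _ in range(MAX_LEN - len(rows))]
    return rows


def macro_scores(y_true: Sequence[str], y_pred: Sequence[str],
                 labels: Sequence[str]) -> Dict[str, float]:
    """Return accuracy and macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    print(tweet_to_matrix(["layanan", "bagus"]))
    print(macro_scores(["pos", "neg", "neu", "pos"],
                       ["pos", "neg", "pos", "pos"],
                       ["pos", "neg", "neu"]))
```

Each tweet becomes a `MAX_LEN` by `DIM` matrix, the shape a sequence model such as an LSTM consumes one row at a time. Keeping this step and the scoring function fixed while swapping only the embedding table mirrors the like-for-like comparison the abstract describes.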

Author Biography

Faroh Ladayya, Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Negeri Jakarta

Department of Statistics

Published

2025-12-31

How to Cite

Ladayya, F., Rahayu, W., Rohimah, S. R., Saputra, F. R., Maulana, T. A., & Madinah, N. N. (2025). PERFORMANCE EVALUATION OF WORD EMBEDDING TECHNIQUES IN TWITTER SENTIMENT ANALYSIS USING LSTM. Jurnal Statistika Dan Aplikasinya, 9(2), 55–68. https://doi.org/10.21009/JSA.09206