PERFORMANCE EVALUATION OF WORD EMBEDDING TECHNIQUES IN TWITTER SENTIMENT ANALYSIS USING LSTM

Faroh Ladayya; Widyanti Rahayu; Siti Rohmah Rohimah; Ferdiansyah Rizki Saputra; Thoriq Akbar Maulana; Najwa Nur Madinah

doi:10.21009/JSA.09206

Authors

Faroh Ladayya, Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Negeri Jakarta
Widyanti Rahayu, Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Negeri Jakarta
Siti Rohmah Rohimah, Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Negeri Jakarta
Ferdiansyah Rizki Saputra, Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Negeri Jakarta
Thoriq Akbar Maulana, Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Negeri Jakarta
Najwa Nur Madinah, Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Negeri Jakarta

DOI:

https://doi.org/10.21009/JSA.09206

Keywords:

Long Short-Term Memory, Sentiment Analysis, Word Embedding

Abstract

Opinions expressed on social media can serve as feedback on a product, whether goods or services. Sentiment analysis is used to analyze the opinions that the public shares on social media. The sentiment contained in an opinion can be positive, negative, or neutral. This study compares the performance of three word embedding techniques (Word2Vec, GloVe, and FastText) when combined with a Long Short-Term Memory (LSTM) model for sentiment classification of Indonesian Twitter data. LSTM was selected for its ability to model sequential text data and capture the long-term contextual dependencies that are often present in natural language. Because LSTM requires numerical input, the text from social media was converted into numerical vectors using a word embedding technique, and the resulting vectors were used as input to the LSTM. All embeddings were evaluated under the same preprocessing pipeline and LSTM architecture to ensure a fair comparison. Model performance was assessed using accuracy, precision, recall, F1-score, and ROC/AUC metrics. The results indicate that the LSTM model effectively captures sentiment patterns in Indonesian tweets, with Word2Vec achieving the best overall performance, followed by GloVe and FastText. These findings suggest that domain-adapted word embeddings remain highly effective for sentiment analysis in Indonesian social media contexts.
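
The two ends of the pipeline described in the abstract, turning tokenized tweets into vector sequences for the LSTM and scoring the predictions, can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the three-dimensional toy embedding table, the zero-vector fallback for unknown words, the fixed padding length, and macro-averaging over the three sentiment classes are all assumptions, since the abstract does not specify them.

```python
from typing import Dict, List, Sequence

# Hypothetical toy embedding table (3 dimensions). Real Word2Vec, GloVe,
# or FastText vectors would be trained or loaded instead.
EMBEDDINGS: Dict[str, List[float]] = {
    "bagus": [0.9, 0.1, 0.0],    # "good"
    "buruk": [-0.8, 0.2, 0.1],   # "bad"
    "produk": [0.0, 0.5, 0.5],   # "product"
}
DIM = 3


def encode(tokens: Sequence[str], max_len: int) -> List[List[float]]:
    """Map tokens to vectors, then truncate or zero-pad to max_len rows."""
    zero = [0.0] * DIM
    # Assumption: out-of-vocabulary words fall back to the zero vector.
    rows = [list(EMBEDDINGS.get(tok, zero)) for tok in tokens[:max_len]]
    rows += [list(zero) for _ in range(max_len - len(rows))]
    return rows


def macro_scores(
    y_true: Sequence[str], y_pred: Sequence[str], labels: Sequence[str]
) -> Dict[str, float]:
    """Return accuracy and macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # A class with no predictions (or no true members) scores 0.0.
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

In practice the toy table would be replaced by trained Word2Vec, GloVe, or FastText vectors, and the padded matrices returned by `encode` would be batched as input to the LSTM. Holding `encode`'s padding length and the scoring function fixed while swapping only the embedding table mirrors the like-for-like comparison the abstract describes.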

Author Biography

Faroh Ladayya, Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Negeri Jakarta

Department of Statistics
