Sentiment Classification of Community towards COVID-19 Issues on Twitter (Case Study: Indonesia, March-May 2020)

Authors

  • Nur Ainun Daulay, Politeknik Statistika STIS
  • Rifqi Ramadhan, Politeknik Statistika STIS
  • Lya Hulliyyatus Suadaa, Politeknik Statistika STIS

DOI:

https://doi.org/10.34123/icdsos.v2023i1.360

Keywords:

COVID-19, Twitter, Lexicon, Machine Learning, Sentiment

Abstract

This study analyzes public sentiment toward COVID-19 in Indonesia (March-May 2020), using the InSet Lexicon to label the training data for supervised machine learning models. The dataset comprises 7,967 tweets, split into 90% training data and 10% testing data. Support Vector Machine (SVM) and Random Forest (RF) are the most effective methods, both exceeding 80% accuracy: SVM reaches 87% and RF 86%. The InSet Lexicon itself attains an accuracy of 75%, a macro average of 69%, and a weighted average of 74%, making it an effective alternative for large-scale data labeling. The study recommends further development of the InSet Lexicon for sentiment classification and its expansion to foreign languages to improve sentiment analysis accuracy in a global context. These findings offer insight into public sentiment on crucial issues such as COVID-19 in Indonesia.
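
The lexicon-based labeling and the two averages mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon, its weights, the whitespace tokenization, the zero threshold, and the function names `label_tweet` and `macro_and_weighted` are all assumptions made for illustration, and the numbers in the usage section are hypothetical rather than the paper's results.

```python
# Toy lexicon with hypothetical weights; these entries are illustrative
# and are NOT copied from the InSet Lexicon.
TOY_LEXICON = {
    "sembuh": 4,      # "recovered"
    "aman": 3,        # "safe"
    "takut": -4,      # "afraid"
    "meninggal": -5,  # "died"
}


def label_tweet(text, lexicon=TOY_LEXICON):
    """Sum the weights of known words and map the total to a label."""
    # Assumption: plain lowercase whitespace tokenization, no stemming.
    score = sum(lexicon.get(token, 0) for token in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def macro_and_weighted(scores, support):
    """Average per-class scores two ways.

    macro: every class counts equally.
    weighted: each class counts in proportion to its number of examples.
    """
    macro = sum(scores.values()) / len(scores)
    total = sum(support.values())
    weighted = sum(scores[c] * support[c] for c in scores) / total
    return macro, weighted


if __name__ == "__main__":
    for tweet in ["pasien sembuh dan aman", "saya takut", "halo dunia"]:
        print(tweet, "->", label_tweet(tweet))
    # Hypothetical per-class scores and class sizes, not the paper's results.
    print(macro_and_weighted({"positive": 0.8, "negative": 0.6},
                             {"positive": 30, "negative": 10}))
```

Labels produced this way could then serve as targets for a supervised classifier such as SVM or RF. In the hypothetical numbers above, the weighted average exceeds the macro average because the larger class also has the higher score.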

Published

2023-12-29

How to Cite

Daulay, N. A., Ramadhan, R., & Suadaa, L. H. (2023). Sentiment Classification of Community towards COVID-19 Issues on Twitter (Case Study: Indonesia, March-May 2020). Proceedings of The International Conference on Data Science and Official Statistics, 2023(1), 201–217. https://doi.org/10.34123/icdsos.v2023i1.360