Comparison of Naive Bayes, K-Nearest Neighbor, and Support Vector Machine Classification Methods in Semi-Supervised Learning for Sentiment Analysis of Kereta Cepat Jakarta Bandung (KCJB)

Muhammad Farhan; Renata De La Rosa Manik; Hana Raihanatul Jannah; Lya Hulliyyatus Suadaa

doi:10.34123/icdsos.v2023i1.332

Authors

Muhammad Farhan, Politeknik Statistika STIS
Renata De La Rosa Manik, Politeknik Statistika STIS
Hana Raihanatul Jannah, Politeknik Statistika STIS
Lya Hulliyyatus Suadaa, Politeknik Statistika STIS

DOI:

https://doi.org/10.34123/icdsos.v2023i1.332

Keywords:

supervised learning, Naive Bayes, K-NN, SVM, Sentiment Analysis, KCJB, Semi-Supervised Learning

Abstract

Transportation technology has developed rapidly in the 21st century, and high-speed rail is one example. The Indonesian government is currently building the Kereta Cepat Jakarta-Bandung (KCJB) high-speed railway in collaboration with China. The project has drawn a wide range of comments and opinions from the public on Twitter and other social media. This research compares the Naïve Bayes, K-Nearest Neighbor (K-NN), and Support Vector Machine (SVM) classification methods for classifying sentiment in tweets about high-speed trains, obtained by scraping Twitter. The comparison was carried out using semi-supervised learning. The semi-supervised SVM model performed best, with an average accuracy of 86%, followed by the semi-supervised Naïve Bayes and K-NN models, with average accuracies of 81% and 58%, respectively. Overall, the predictions of all three models indicate that there are more tweets with negative sentiment than with positive and neutral sentiment.
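The abstract does not say which semi-supervised procedure was used, so the following is only an illustrative sketch of one common approach, self-training (pseudo-labeling), built around a small multinomial Naïve Bayes classifier written in plain Python. The function names, the 0.8 confidence threshold, the round limit, and the toy token lists are all assumptions made for illustration; none of them come from the paper, and the toy data uses only two of the three sentiment classes.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model on tokenized documents."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_proba(model, tokens):
    """Return {label: posterior probability} for one tokenized document."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for tok in tokens:
            if tok in vocab:  # words never seen in training are ignored
                # Laplace (add-one) smoothing
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        log_scores[label] = score
    # Normalize in a numerically stable way
    top = max(log_scores.values())
    exps = {k: math.exp(v - top) for k, v in log_scores.items()}
    z = sum(exps.values())
    return {k: v / z for k, v in exps.items()}


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=5):
    """Self-training: repeatedly add confident pseudo-labels, then refit.

    threshold and max_rounds are hypothetical defaults, not values
    taken from the paper.
    """
    docs = [tokens for tokens, _ in labeled]
    labels = [label for _, label in labeled]
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train_nb(docs, labels)
        remaining = []
        for tokens in pool:
            probs = predict_proba(model, tokens)
            label, p = max(probs.items(), key=lambda kv: kv[1])
            if p >= threshold:
                docs.append(tokens)      # accept the pseudo-label
                labels.append(label)
            else:
                remaining.append(tokens)
        if len(remaining) == len(pool):  # nothing new was labeled
            break
        pool = remaining
    return train_nb(docs, labels), pool


# Invented toy data for illustration only (not from the study's tweets)
labeled = [
    ("kereta cepat bagus nyaman".split(), "positive"),
    ("proyek mahal boros utang".split(), "negative"),
]
unlabeled = [
    "kereta nyaman bagus sekali".split(),
    "utang proyek boros".split(),
]

model, leftover = self_train(labeled, unlabeled)
probs = predict_proba(model, "kereta bagus".split())
predicted = max(probs, key=probs.get)
print(predicted, leftover)
```

The same loop could wrap a K-NN or SVM classifier in place of Naïve Bayes, provided the classifier exposes some confidence score to compare against the threshold.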

Published

2023-12-29

How to Cite

Muhammad Farhan, Renata De La Rosa Manik, Hana Raihanatul Jannah, & Lya Hulliyyatus Suadaa. (2023). Comparison of Naive Bayes, K-Nearest Neighbor, and Support Vector Machine Classification Methods in Semi-Supervised Learning for Sentiment Analysis of Kereta Cepat Jakarta Bandung (KCJB). Proceedings of The International Conference on Data Science and Official Statistics, 2023(1), 109–120. https://doi.org/10.34123/icdsos.v2023i1.332

Issue

Vol. 2023 No. 1 (2023): Proceedings of 2023 International Conference on Data Science and Official Statistics (ICDSOS)

Section

Data Science
