Automated Indonesian Text Augmentation with Web-Based Application Using Flask Framework

Iftitah Athiyyah Rahma; Lya Hulliyyatus Suadaa

doi:10.34123/icdsos.v2023i1.324

Authors

Iftitah Athiyyah Rahma, Politeknik Statistika STIS
Lya Hulliyyatus Suadaa, Politeknik Statistika STIS

DOI:

https://doi.org/10.34123/icdsos.v2023i1.324

Keywords:

text augmentation, text classification, imbalanced data, web application

Abstract

In the real world, the data and resources available for text classification are limited. One common issue with labelled data is class imbalance. Imbalanced data degrades model performance because the model focuses only on the data with the majority label, so the accuracy measure alone cannot describe the true quality of the model. An oversampling approach can be used to overcome this; oversampling applied to text is known as text augmentation. However, NLP resources for Indonesian, especially for text augmentation, are still limited. Therefore, this research develops a web application that augments Indonesian text automatically. The application was built using the prototype method. It was successfully built and lets users perform augmentation automatically for all texts in a dataset. Users select their preferred augmentation technique and must upload a dataset as input. The output is the same dataset file as the input, with an additional column containing the synthetic text generated by the application. This application can contribute to further research on text augmentation for Indonesian.
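
The input/output contract described in the abstract (upload a dataset, choose a technique, receive the same dataset with one extra column of synthetic text) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the function `augment_csv`, the column name `augmented_text`, the CSV format, and the two techniques (random word swap and random word deletion) are not taken from the paper, which does not list its techniques in the abstract. The Flask upload layer is omitted so that the sketch depends only on the Python standard library.

```python
import csv
import io
import random


def random_swap(words, rng):
    """Swap two randomly chosen word positions (no-op for fewer than 2 words)."""
    if len(words) < 2:
        return list(words)
    i, j = rng.sample(range(len(words)), 2)
    swapped = list(words)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return swapped


def random_deletion(words, rng, p=0.2):
    """Drop each word with probability p, always keeping at least one word."""
    if not words:
        return []
    kept = [w for w in words if rng.random() > p]
    return kept or [rng.choice(words)]


# Hypothetical technique registry; the real application may offer others.
TECHNIQUES = {"swap": random_swap, "delete": random_deletion}


def augment_csv(csv_text, text_column, technique, seed=0):
    """Return the input CSV with an extra 'augmented_text' column appended."""
    rng = random.Random(seed)
    augment = TECHNIQUES[technique]
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames) + ["augmented_text"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Whitespace tokenization is a simplification for this sketch.
        words = row[text_column].split()
        row["augmented_text"] = " ".join(augment(words, rng))
        writer.writerow(row)
    return out.getvalue()
```

Calling `augment_csv(uploaded_text, "text", "swap")` on the contents of an uploaded file would return CSV text with the original columns unchanged and the synthetic text in the final column. In a web application, a route handler would read the uploaded file, call a function like this one, and return the result as a download.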
