Geospatial Big Data Approaches to Estimate Granular Level Poverty Distribution in East Java, Indonesia using Machine Learning and Deep Learning Regressions

Authors

  • Rifqi Ramadhan, Politeknik Statistika STIS
  • Arie Wahyu Wijayanto, Department of Statistical Computing, Politeknik Statistika STIS, Jakarta, Indonesia
  • Setia Pramana, Department of Statistical Computing, Politeknik Statistika STIS, Jakarta, Indonesia

DOI:

https://doi.org/10.34123/icdsos.v2023i1.359

Keywords:

big data, remote sensing, poverty

Abstract

Reducing poverty is one of the focuses of the Indonesian government's economic development efforts. In Indonesia, poverty data are collected through a conventional survey, the National Socio-Economic Survey (SUSENAS), which requires considerable cost, time, and effort. To overcome these limitations, additional data sources are needed to provide more detailed poverty data. Recent studies show that geospatial big data can identify poverty at a granular level, at lower cost and with faster updates, because of their unique and unbiased capacity to capture physical and socioeconomic phenomena. This study integrates multi-source satellite imagery data such as the normalized difference vegetation index (NDVI) for detecting rural areas based on vegetation, the built-up index (BUI) for identifying urban areas through building distribution, the normalized difference water index (NDWI) for land cover detection, daytime land surface temperature (LST) for identifying urban regions based on surface temperature, and pollutants such as carbon monoxide (CO), nitrogen dioxide (NO2), and sulfur dioxide (SO2) for evaluating economic activity based on pollution. Additionally, point of interest (POI) density and minimum POI distance are used to measure area accessibility. The contribution of this research is the use of geospatial big data to estimate poverty at a granular level for the 666 sub-districts of East Java Province using machine learning and deep learning regression models. The evaluation of sub-district level poverty estimation shows that, among the machine learning models, Support Vector Regression (SVR) performed best, with root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) values of 0.365, 0.293, and 0.032 and an R-squared of 0.59. Among the deep learning models, the multilayer perceptron (MLP) performed best, with RMSE, MAE, and MAPE values of 0.444, 0.345, and 0.039 and an R-squared of 0.52. In addition, visual identification of the results revealed that lower estimated poverty is typically found in urban areas with high accessibility, rather than in spatially deprived areas with limited accessibility.
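The four evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code. The function name `regression_metrics` and the toy values are hypothetical, and MAPE is returned as a fraction rather than a percentage, which is an assumption about how the reported values (for example 0.032) are scaled.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (as a fraction), and R-squared for two equal-length sequences.

    Hypothetical helper for illustration; it assumes no true value is zero
    (required by MAPE) and that the true values are not all identical
    (required by R-squared).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Root mean squared error: square root of the mean squared residual.
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Mean absolute error: mean of the absolute residuals.
    mae = sum(abs(e) for e in errors) / n

    # Mean absolute percentage error, kept as a fraction (0.05 means 5%).
    mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n

    # R-squared: 1 minus residual sum of squares over total sum of squares.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


# Toy values invented for illustration only; they are not data from the paper.
toy_true = [2.0, 4.0, 6.0, 8.0]
toy_pred = [2.5, 3.5, 6.5, 7.5]
toy_metrics = regression_metrics(toy_true, toy_pred)
print(toy_metrics)
```

Lower RMSE, MAE, and MAPE together with a higher R-squared indicate a closer fit, which is the sense in which the abstract ranks SVR ahead of MLP.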

Published

2023-12-29

How to Cite

Rifqi Ramadhan, Arie Wahyu Wijayanto, & Setia Pramana. (2023). Geospatial Big Data Approaches to Estimate Granular Level Poverty Distribution in East Java, Indonesia using Machine Learning and Deep Learning Regressions. Proceedings of The International Conference on Data Science and Official Statistics, 2023(1), 186–200. https://doi.org/10.34123/icdsos.v2023i1.359