Geospatial Big Data Approaches to Estimate Granular Level Poverty Distribution in East Java, Indonesia using Machine Learning and Deep Learning Regressions
DOI:
https://doi.org/10.34123/icdsos.v2023i1.359
Keywords:
big data, remote sensing, poverty
Abstract
Reducing poverty is one of the focuses of the Indonesian government's economic development efforts. In Indonesia, poverty data are collected conventionally through the National Socio-Economic Survey (SUSENAS), which requires considerable cost, time, and effort. To overcome these limitations, additional data sources are needed to provide more detailed poverty data. Recent studies show that geospatial big data can identify poverty at a granular level, at lower cost and with faster updates, because of their unique and unbiased capacity to capture physical and socioeconomic phenomena. This study integrates multi-source satellite imagery data: the normalized difference vegetation index (NDVI) for detecting rural areas based on vegetation, the built-up index (BUI) for identifying urban areas through building distribution, the normalized difference water index (NDWI) for land cover detection, daytime land surface temperature (LST) for identifying urban regions based on surface temperature, and pollutants such as carbon monoxide (CO), nitrogen dioxide (NO2), and sulfur dioxide (SO2) for evaluating economic activity based on pollution. Additionally, point of interest (POI) density and minimum POI distance are used to measure area accessibility. The contribution of this research is to use geospatial big data to estimate poverty at a granular level for the 666 sub-districts of East Java Province using machine learning and deep learning regression models. The evaluation of the sub-district level poverty estimates shows that, among the machine learning models, Support Vector Regression (SVR) performed best, with root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) values of 0.365, 0.293, and 0.032 and an R-squared of 0.59. Among the deep learning models, the multilayer perceptron (MLP) performed best, with RMSE, MAE, and MAPE values of 0.444, 0.345, and 0.039 and an R-squared of 0.52.
In addition, visual inspection of the estimates shows that low estimated poverty is typically found in urban areas with high accessibility, rather than in spatially deprived areas with limited accessibility.
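The four evaluation measures reported in the abstract (RMSE, MAE, MAPE, and R-squared) can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function name `regression_metrics` and the toy values are hypothetical, and MAPE is assumed to be expressed as a fraction rather than a percentage, which is consistent with the magnitude of the reported values but is not stated in the source.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (as a fraction) and R-squared.

    Hypothetical helper for illustration only. MAPE is undefined when any
    true value is zero, and R-squared is undefined when all true values
    are identical; both cases raise ZeroDivisionError here.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Residual sum of squares, shared by RMSE and R-squared.
    ss_res = sum(e * e for e in errors)

    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    mape = sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    # Total sum of squares around the mean of the observed values.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


# Toy values, not data from the study.
observed = [2.0, 4.0, 5.0, 8.0]
predicted = [2.5, 3.5, 5.0, 7.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Lower RMSE, MAE, and MAPE and a higher R-squared indicate a better fit, which is the sense in which the abstract ranks SVR ahead of MLP.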