Proceedings of The International Conference on Data Science and Official Statistics https://proceedings.stis.ac.id/icdsos Politeknik Statistika STIS en-US Proceedings of The International Conference on Data Science and Official Statistics 2809-9842

Nowcasting of Chili Pepper (Capsicum frutescens L.) Prices in East Java Province Using Multi-Layer Perceptron Method
https://proceedings.stis.ac.id/icdsos/article/view/274
<p><span>This study aims to predict the price of chili pepper at the provincial level in East Java by identifying the best input variable among three types: chili pepper prices at the regency and city levels, natural factors, and the word search index on Google Trends, used as proxies for the causes of chili pepper price fluctuations. The Multi-Layer Perceptron method, accompanied by a search for the best combination of model parameters, is selected to obtain the model with the best nowcasting ability. The results show that the best nowcasting model uses chili pepper prices at the regency and city levels as the input variable, with three hidden layers of 32, 45, and 51 neurons, a maximum of 200 iterations, early stopping after 20 iterations without improvement in performance, the ReLU (Rectified Linear Unit) non-linear activation, and the Adam optimizer. The nowcasting in this study is highly accurate, with a MAPE smaller than 10%.</span></p>
Mohamad Choirul Zamzami Nucke Widowati Kusumo Projo
Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 2 12 10.34123/icdsos.v2023i1.274

RegTech Solutions: Generic Business Process Analysis and Modeling
https://proceedings.stis.ac.id/icdsos/article/view/276
<p><span>Regulatory Technology, often known as RegTech, is an innovative strategy developed in the finance industry to streamline the regulatory compliance process. RegTech takes advantage of new technologies such as artificial intelligence (AI), machine learning (ML), big data analytics (BD), cloud computing (CC), and robotic process automation (RPA), among others. RegTech can also be applied to other industries where compliance with and oversight by rules are necessary. One way to support this is to analyze and model business processes for RegTech solutions in a generic style so that other fields can adopt them when executing these solutions.
In this research, a generic business process analysis of RegTech solutions was carried out through a literature study on the use of RegTech in the financial industry. The analysis results were then modeled using Business Process Model and Notation (BPMN) with the support of the Bizagi application. The modeling results were tested for validity and applied to a logical scenario related to non-financial regulatory compliance. The test results show that the business process models are valid and can be used as a reference in implementing RegTech solutions outside the financial sector.</span></p>
Benny Firmansyah Arry Akhmad Arman Widya Sri Wahyuni
Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 13 25 10.34123/icdsos.v2023i1.276

Deep Learning Approaches for Predicting Intraday Price Movements: An Evaluation of RNN Variants on High-Frequency Stock Data
https://proceedings.stis.ac.id/icdsos/article/view/278
<p><span>This study compares four recurrent neural network (RNN) models, namely Simple RNN, Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), and Bidirectional RNN (BiRNN), in forecasting minute-level stock price time series data. The performance of the four models is evaluated using the Mean Absolute Percentage Error (MAPE) on a stock dataset from Bank Central Asia (BBCA.JK). The experimental results reveal that the GRU model performs best, with an average MAPE of 0.0255%, followed by the LSTM model with an average MAPE of 0.0377%. The BiRNN model also performs well, with an average MAPE of 0.0668%, while the Simple RNN has the highest average MAPE at 0.5118%. This suggests that more complex recurrent architectures like GRU and LSTM are better at capturing patterns in high-frequency time series data. This study can be expanded by exploring other models such as CNN, conducting tests on diverse datasets, and experimenting with a wider range of hyperparameter variations.
Additional variables such as economic indicators, global market data, and social data can also offer a more comprehensive understanding of the factors influencing stock prices.</span></p>
Mochamad Ridwan Kusman Sadik Farit Mochamad Afendi
Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 26 37 10.34123/icdsos.v2023i1.278

Exploration of Resnet Variants in High Spatial Resolution Domain Adaptation
https://proceedings.stis.ac.id/icdsos/article/view/280
<p><span>Land cover is nowadays mapped mostly from airborne and space-borne data. Because the sensors differ, large spectral differences and inconsistent spatial resolution may arise between these two data sources. Consequently, the same object may exhibit completely different features. In this case, models trained on annotated airborne data are ineffective when applied to space-borne data. Cross-Sensor Land-COVER (LoveCS) shows good results in overcoming this problem. LoveCS leverages small-scale aerial image annotations to promote land cover mapping on large-scale space-borne imagery. LoveCS uses ResNet50 as its encoder. In recent years, many studies have developed other variants of ResNet, such as ResNeXt, ResNeSt, Res2Net, and Res2NeXt. These variants have given better results than ResNet in a variety of tasks. Therefore, in this study we modified the LoveCS encoder by replacing ResNet50 with these ResNet variants in an effort to improve LoveCS accuracy. Our evaluation shows that Res2Net50 as an encoder improves the performance of LoveCS. The average F1 increases by 1.38%, OA by 1.96%, and Kappa by 2.75% over the baseline method.</span></p>
Sulisetyo Puji Widodo Nur Rachmawati
Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 38 46 10.34123/icdsos.v2023i1.280

Pojok Statistik Virtual Improvement: Development of Online Consultation and Scientific Articles Modules
https://proceedings.stis.ac.id/icdsos/article/view/284
<p><span>With a commitment to improving statistical literacy in Indonesia, BPS Statistics Indonesia built Pojok Statistik. Pojok Statistik is a collaborative service between BPS Statistics Indonesia and universities, initiated to answer the statistical needs of academics and students. Due to the effects of the pandemic and increasing interest from students, a virtual version of Pojok Statistik (Pojok Statistik Virtual) was built to meet all the needs served by the offline version. However, its features are still limited and do not yet meet the criteria of Pojok Statistik Offline.
Interviews with the Pojok Statistik Team revealed several plans to improve Pojok Statistik Virtual, including adding online consultation and scientific articles modules. Therefore, this study aims to add an online consultation module and a scientific articles module to Pojok Statistik Virtual. The system development process uses the prototyping methodology. The system is evaluated through black-box testing and usability testing using the USE Questionnaire. The evaluation results show that all functions in both modules run properly. The usability test results show that both features are usable for supporting the statistical needs of academics and students.</span></p>
Fahmi Muhammad Sahal Nori Wilantika
Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 47 57 10.34123/icdsos.v2023i1.284

Development of Student’s Uniform Compliance Detection System Using Real Time Image Recognition at Politeknik Statistika STIS
https://proceedings.stis.ac.id/icdsos/article/view/298
<p><span>Regulations at Politeknik Statistika STIS (hereinafter Polstat STIS) aim to produce graduates who are qualified, trusted, and have integrity. To enforce these regulations, Polstat STIS has a student enforcement squad, called Satuan Penegak Disiplin (SPD) in Indonesian, which aims to maintain order, discipline, and student ethics during activities on and off campus. In upholding the regulations, SPD carries out inspections, both unannounced and during the weekly morning assembly, to check the completeness and tidiness of students' uniforms as well as their appearance. However, previous research on students' commitment to campus regulations shows that half of the students have low commitment. This is partly due to the lack of supervision of students. Therefore, student discipline and neatness need to be monitored on an ongoing basis. To make monitoring and inspection more regular, image recognition can be used to assist in overseeing student discipline and neatness.
In this study, we developed a system that detects in real time the completeness of the attributes a student wears. The system uses object detection to check the completeness of student attributes, and it displays and records students whose attributes are incomplete. The system is expected to improve the discipline and neatness of the students.</span></p>
Ardian Fajri Saputra Yunarso Anang
Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 58 72 10.34123/icdsos.v2023i1.298

A Sentiment Analysis and Topic Modelling of The Socio-Economic Registration 2022
https://proceedings.stis.ac.id/icdsos/article/view/301
<p><span>Socio-Economic Registration, or Regsosek, is an activity of Statistics Indonesia (BPS) that aims to collect data on the profile, social and economic conditions, and welfare levels of all residents in 514 regencies/cities in Indonesia. One indicator of the success of Regsosek 2022 is the community's response to and opinion of the activity. These responses and opinions provide an overview of the implementation of Regsosek 2022 that can serve as lessons learned for subsequent population data collection. This study uses several methods to analyze community responses and opinions on Regsosek activities, especially on the social media platform Twitter. Sentiment is classified with four techniques: Naïve Bayes, Nearest Centroid, K-Nearest Neighbors, and Support Vector Machine, and the performance of the four techniques is compared. In addition, topic modeling is performed with two techniques, namely Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). Data is collected using web scraping techniques. The sentiment classification results show that the Nearest Centroid method performs best, with relatively high and balanced F1-scores for positive and negative sentiment of 59% and 66%, respectively.
Moreover, LDA gives better topic modeling results than LSA.</span></p>
Indah Simbolon Nicholas H Manurung Sukma Andini Lya Hulliyyatus Suadaa
Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 73 83 10.34123/icdsos.v2023i1.301

Opportunities and Challenges of Remote Sensing, Geospatial Data, and Machine Learning in Obtaining Accessibility and Location Information for Sustainable Development in Indonesia
https://proceedings.stis.ac.id/icdsos/article/view/309
<p class="Abstract" style="margin-bottom: 1.0cm;"><span>As technology advances, so do data collection methods, creating large, rapid, and diverse streams of data. Statistics Indonesia (BPS) has also been encouraged to make use of this by starting to collect geospatial information on respondents and public facilities. To keep up, processing methods need to change, for example by adopting machine learning, to accommodate massive, high-dimensional, and multiform data. This progression also opens up new opportunities for tackling various statistical data problems, such as accessibility and location data. Remote sensing is one of the big data sources that has undergone many changes, as shown by the availability of satellite imagery with high spatial and temporal resolution; together with BPS geotagging data, it shows great promise for land use classification and geospatial analysis. Even so, there are still some challenges in utilizing remote sensing and other geospatial data.
The goal of this review paper is to study the opportunities and challenges in utilizing remote sensing, geospatial data, and machine learning for accessibility and location information. We explore the possibilities and limitations of applying them to SDG indicators that involve accessibility and location, such as indicators 9.1.1, 11.1.1, 11.2.1, 11.3.1, and 11.7.1, as well as to other variables needed for their calculation, like access to public facilities. Moreover, our experiment using geotagging data shows potential for improving proportion estimation compared to using a simple ratio. Our DEGURBA, which follows the UN definition and uses machine learning LULC for dasymetric mapping, also provides more insight than the existing data. We conclude that there are great opportunities in applying remote sensing and other geospatial data to monitor accessibility and location to further sustainable development in Indonesia.</span></p>
Terry Devara
Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 84 95 10.34123/icdsos.v2023i1.309

Automated Indonesian Text Augmentation with Web-Based Application Using Flask Framework
https://proceedings.stis.ac.id/icdsos/article/view/324
<p><span>In the real world, the data and resources available for text classification are limited. One issue with labelled data is imbalance. Imbalanced data affects the performance and accuracy of a model because the model focuses only on data with the majority label. As a result, the model's accuracy measure cannot describe its true quality. To overcome this, an oversampling approach is used. Text-based oversampling is known as text augmentation. However, NLP resources for Indonesian, especially for text augmentation, are still limited. Therefore, this research develops a web application to augment Indonesian text automatically. The application was built using the prototype method. It was built successfully and lets users perform augmentation automatically for all texts in a dataset. Users can select their preferred augmentation technique and are required to upload a dataset as input. The output is the same dataset file as the input, with an additional column containing the synthetic text generated by the application.
This application can contribute to further research on text augmentation for Indonesian text.</span></p>
Iftitah Athiyyah Rahma Lya Hulliyyatus Suadaa
Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 96 108 10.34123/icdsos.v2023i1.324

Comparison of Naive Bayes, K-Nearest Neighbor, and Support Vector Machine Classification Methods in Semi-Supervised Learning for Sentiment Analysis of Kereta Cepat Jakarta Bandung (KCJB)
https://proceedings.stis.ac.id/icdsos/article/view/332
<p><span>Transportation technology has developed very rapidly in the 21st century; one example is the high-speed train.
Currently, the Indonesian government is carrying out the Kereta Cepat Jakarta-Bandung (KCJB) construction project in collaboration with China. The construction of this high-speed train project has attracted various comments and opinions from the public on Twitter and other social media. This research aims to compare the Naïve Bayes, K-Nearest Neighbor (K-NN), and Support Vector Machine (SVM) classification methods in classifying sentiment in tweets about high-speed trains obtained by scraping Twitter. The comparison was carried out using semi-supervised learning, and the results showed that the semi-supervised SVM model had the best performance with an average accuracy of 86%, followed by the semi-supervised Naïve Bayes and semi-supervised K-NN models with average accuracies of 81% and 58%, respectively. Overall, the predictions of the three models indicate that there are more tweets with negative sentiment than tweets with positive or neutral sentiment.</span></p> Muhammad Farhan Renata De La Rosa Manik Hana Raihanatul Jannah Lya Hulliyyatus Suadaa Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 109 120 10.34123/icdsos.v2023i1.332 GLMM and GLMMTree for Modelling Poverty in Indonesia https://proceedings.stis.ac.id/icdsos/article/view/333 <p><span>GLMMTree is a tree-based algorithm that can detect interactions and find subgroups in the generalized linear mixed model (GLMM) to improve fixed effect estimation. This study applies GLMMTree to real poverty data from Indonesia and confirms that the GLMMTree algorithm has better precision than GLMM. The significant predictors that affect poverty in Indonesia are the unemployment rate and the GRDP at constant prices. 
The GLMMTree algorithm enriches the analysis by finding subgroups of provinces based on the electricity lighting access and clean drinking water source variables.</span></p> Suseno Bayu Khairil Anwar Notodiputro Bagus Sartono Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 121 131 10.34123/icdsos.v2023i1.333 A Land cover change analysis of buffer areas in New Capital City of Nusantara, Indonesia: A cellular automata approach on satellite imageries data https://proceedings.stis.ac.id/icdsos/article/view/338 <p><span>The proposed plan to move Indonesia's capital city to the New Capital City of Nusantara in East Kalimantan Province undoubtedly requires careful efforts to ensure food supply for the population. Population migration to the new capital may pose a food security challenge. To address this fundamental issue, one of the most crucial approaches is to establish buffer areas that can support the food needs of the new capital. The official Area Sampling Frame survey that the government currently conducts to assess food vulnerability faces several limitations, including weather conditions, field terrain variations, and high cost. 
In this study, we propose using remote sensing satellite imagery data in buffer areas to analyze changes and predict future land cover, which can provide valuable data for assessing food availability. We investigate the integration of a Cellular Automata method with the two most popular analytical methods, classical Logistic Regression and data-driven Artificial Neural Networks, known as CA-LR and CA-ANN, to identify and map land cover changes in the new capital buffer zones. Our findings reveal that both combined methods, CA-LR and CA-ANN, yield fairly promising results, with correctness and kappa statistic values exceeding 80%. Prediction results indicate that buffer areas are predominantly covered by trees, while built-up areas are still limited. The flooded vegetation cover, including rice fields, is predicted to decrease by 2024. This should be a matter of concern for stakeholders, considering the construction of the new capital city is still ongoing and the number of migrants is expected to keep rising.</span></p> Maria Shawna Cinnamon Claire Salwa Rizqina Putri Arie Wahyu Wijayanto Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 132 149 10.34123/icdsos.v2023i1.338 Air Pollution in Jakarta, Indonesia Under Spotlight: An AI-Assisted Semi-Supervised Learning Approach https://proceedings.stis.ac.id/icdsos/article/view/348 <p><span>The air quality in the Jakarta area is examined in this study using artificial intelligence (AI) to assist a semi-supervised learning technique. The clustering approach is used in this article to separate air pollution into three main levels: moderate, low, and high. 
This clustering helps identify shared characteristics among measures like particulates (PM10 and PM2.5), sulfur dioxide (SO2), nitrogen dioxide (NO2), carbon monoxide (CO), and ozone (O3), even when air quality labels are not always accessible. Using the Random Forest method, air quality is then categorized in this experiment with an accuracy of 93%. Additionally, the results of the variable significance analysis are examined in this article to identify the variables with the biggest effects on air quality, notably PM10, SO2, and NO2. 
This study demonstrates the enormous potential of applying machine learning techniques, particularly semi-supervised learning approaches, to support sustainable environmental regulations while also monitoring and enhancing Jakarta's air quality. We describe the experimental procedures, the findings, and the implications of our research for comprehending and addressing urban air pollution in this article.</span></p> HARUN AL AZIES Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 150 161 10.34123/icdsos.v2023i1.348 Automatic Detection and Counting of Urban Housing and Settlement in Depok City, Indonesia: An Object-Based Deep Learning Model on Optical Satellite Imageries and Points of Interests https://proceedings.stis.ac.id/icdsos/article/view/349 <p><span>Detecting urban housing and settlements plays a substantial role in decision-making problems such as monitoring housing and development, not to mention the widely required urban mapping application. One of the most important goals in the United Nations Sustainable Development Goals (SDGs) is to improve urban living conditions globally by 2030. We propose automatic detection of urban housing and settlements in remote sensing satellite imagery data using object detection-based deep learning with semantic segmentation that exploits the potential availability of remote sensing datasets at high spatial resolutions, the Open Street Map (OSM) geolocation point of interest dataset, and Sentinel-2 optical satellite imagery data. The detection model using Mask Region-based Convolutional Neural Networks (Mask R-CNN) is implemented in Depok City, Indonesia. This city was chosen because it is the second most populous suburb in Indonesia and the tenth most populous globally, making it challenging to extract building features from satellite imagery. 
This model categorizes dense, moderate, and sparse conditions and has a promising result of an average precision of 100% and an F1-score of 67% with evaluation performance metrics only considering points associated with buildings, not building boundaries or the intersection over union (IoU). The model performance has been compared to ground check results of field surveys, and it performs best in sparse conditions. Our findings offer the potential implementation of the model for fast and accurate monitoring of housing, settlement, and regional planning in urban areas.</span></p> Atut Pindarwati Arie Wahyu Wijayanto Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 162 176 10.34123/icdsos.v2023i1.349 Text Analysis Study on Urban farming News Toward Food Security in Indonesia: Sentiment Analysis, Named Entity Recognition, Topic Modelling, and Social Network Analysis https://proceedings.stis.ac.id/icdsos/article/view/352 <p><span style="font-weight: 400;">Urban farming is an increasingly popular trend in agricultural activities. Urban farming is an attempt to achieve urban sustainability from an environmental, social and economic perspective. News portals are one of the media that can be used to understand the phenomenon of urban farming in society. This research aims to gain an in-depth understanding of community perceptions, social networks and issues related to the urban farming phenomenon. Data was collected using the web-scraping method on three national news portals in Indonesia. Data analysis was carried out using sentiment analysis, named entity recognition (NER), topic modelling and social network analysis methods. Sentiment analysis shows that there is a generally positive sentiment towards urban farming. Government officials and environmental activists are frequently mentioned as supporting and promoting urban agriculture. 
Social network analysis reveals interactions between government agencies, non-governmental organisations and the media. The relationships between these stakeholders form a network that plays a role in building awareness, cooperation and knowledge exchange to strengthen food security through urban agriculture.</span></p> Dewi Krismawati Satria Bagus Panuntun Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 177 185 10.34123/icdsos.v2023i1.352 Geospatial Big Data Approaches to Estimate Granular Level Poverty Distribution in East Java, Indonesia using Machine Learning and Deep Learning Regressions https://proceedings.stis.ac.id/icdsos/article/view/359 <p>One focus of the Indonesian government's economic development efforts is reducing poverty. In Indonesia, poverty data are collected with a conventional method, the National Socio-Economic Survey (SUSENAS), which requires considerable cost, time, and effort. To overcome these limitations, there is a need for additional data to provide more detailed poverty data. Recent studies show that geospatial big data can identify poverty at a granular level, at a lower cost and with faster updates, because of their unique and unbiased capacity to identify physical and socioeconomic phenomena. This study integrates multi-source satellite imagery data such as the normalized difference vegetation index (NDVI) for detecting rural areas based on vegetation, built-up index (BUI) for identifying urban areas through building distribution, normalized difference water index (NDWI) for land cover detection, daytime land surface temperature (LST) for identifying urban regions based on surface temperature, and pollutants such as carbon monoxide (CO), nitrogen dioxide (NO2), and sulfur dioxide (SO2) to evaluate economic activities based on pollution. 
Additionally, point of interest (POI) density and minimum POI distance are used to measure area accessibility. Therefore, the contribution of this research is to use geospatial big data to estimate poverty at a granular level for the 666 sub-districts in East Java Province using machine learning and deep learning regression models. The evaluation of sub-district level poverty estimation shows that, among the machine learning models, Support Vector Regression (SVR) performed best, with root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) values of 0.365, 0.293, and 0.032 and an R<sup>2</sup> of 0.59, while among the deep learning models the multi-layer perceptron (MLP) achieved RMSE, MAE, and MAPE values of 0.444, 0.345, and 0.039 with an R<sup>2</sup> of 0.52. In addition, visual identification revealed that lower poverty estimates are typically found in urban areas with high accessibility, which are not spatially deprived areas with limited accessibility.</p> Rifqi Ramadhan Arie Wahyu Wijayanto Setia Pramana Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 186 200 10.34123/icdsos.v2023i1.359 Sentiment Classification of Community towards COVID-19 Issues on Twitter (Case Study: Indonesia, March-May 2020) https://proceedings.stis.ac.id/icdsos/article/view/360 <p>This study examines sentiment analysis related to COVID-19 in Indonesia (March-May 2020) using InSet Lexicon as training data in supervised machine learning models. The dataset comprises 7,967 tweets, divided into 90% training data and 10% testing data. The results reveal that Support Vector Machine (SVM) and Random Forest (RF) are the most effective methods, achieving accuracy above 80%, with SVM reaching 87% and RF at 86%. 
InSet Lexicon itself attains an accuracy of 75%, a macro average of 69%, and a weighted average of 74%, making it an effective alternative for large-scale data labeling. Research recommendations support further development of InSet Lexicon for sentiment classification and expansion of the lexicon for foreign languages to enhance sentiment analysis accuracy in a global context. This study provides valuable insights into understanding public sentiment regarding crucial issues such as COVID-19 in Indonesia.</p> Nur Ainun Daulay Rifqi Ramadhan Lya Hulliyyatus Suadaa Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 201 217 10.34123/icdsos.v2023i1.360 Design and Implementation of an Interactive Visualization Dashboard for Monitoring the Flood Vulnerability and Mapping https://proceedings.stis.ac.id/icdsos/article/view/362 <p>This study aims to build a web-based interactive visualization dashboard from granular flood vulnerability index estimation maps using data from satellite imagery. The approach used to build this visualization dashboard is a two-dimensional (2D) approach created with the qgis2web python plugin facilitated with a JavaScript leaflet. Raw data from satellite imagery consisting of indicators of the causes of flooding are extracted in comma-separated value (CSV) format. Furthermore, the data is integrated based on its spatial attributes and stored in Geographic JavaScript Object Notation (GeoJSON) format to produce a visualization of the flood vulnerability index map. In web views, dashboards are built by utilizing hypertext markup language (HTML), cascading style sheets (CSS), and JavaScript (JS). This interactive dashboard has several useful features in helping the process of monitoring the flood vulnerability of an area such as zoom, "show me where I am", measure distance, search, legend, and change year. 
Thus, the flood vulnerability estimation map dashboard is expected to assist the government in monitoring areas with extreme flood vulnerability and support the decision-making process related to mitigation of areas that have high flood vulnerability.</p> Windy Rahmatul Azizah Arie Wahyu Wijayanto Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 218 232 10.34123/icdsos.v2023i1.362 A Geovisualization Dashboard of Granular Food Security Index Map using GIS for Monitoring the Provincial Level Food Security Status https://proceedings.stis.ac.id/icdsos/article/view/364 <p>This study aims to build a web-based interactive geovisualization dashboard from a granular food security index map using satellite imagery and other geospatial big data. The map dashboard is built using a two-dimensional (2D) data visualization approach. The two-dimensional map is made with QuantumGIS (QGIS) tools and displayed as a WebGIS using the "Qgis2web" plugin, which is based on the Leaflet JavaScript library. Once included in WebGIS, interactive visualizations are displayed on websites with interfaces based on hypertext markup language (HTML), cascading style sheets (CSS), and JavaScript (JS). The dashboard map is equipped with interactive features such as legend, click grid, zoom, show me where I am, measure distance, and search. 
Therefore, the dashboard map can be used to monitor the food security index, search for food security index areas, and geographically identify food security index areas, all of which are useful for supporting the government's decision-making and policy analysis regarding food security strategies.</p> Dwi Karunia Syaputri Bony Parulian Josaphat Arie Wahyu Wijayanto Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 233 247 10.34123/icdsos.v2023i1.364 Implementation of User-Oriented Geovisualization Web Dashboard for Monitoring Access to Improved Water using Satellite Imageries Data https://proceedings.stis.ac.id/icdsos/article/view/365 <p>This study aims to develop an engaging, web-based visualization dashboard for improved water access in Indonesia. The dashboard map was made using three technologies: the Qgis2web Python plugin for producing two-dimensional (2D) dashboard maps, the Leaflet JavaScript library for map visualization, and Hypertext Markup Language (HTML), Cascading Style Sheets (CSS), and JavaScript for the user interface. The built-in map dashboard has several features, including grid click, legend, zoom, search, and measure distance, which are meant to help users determine the location of the nearest water treatment facilities, identify geographical features, and keep track of areas that have poor access to improved water. Evaluation using the system usability scale (SUS) concludes the dashboard is acceptable with an excellent rating. 
Our results reiterate and enhance support for government institutions and relevant stakeholders in providing sustainable access to public water.</p> Fauzan Faldy Anggita Arie Wahyu Wijayanto Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 248 264 10.34123/icdsos.v2023i1.365 Modeling Coastal Area Change Analysis of Coastal Urban Areas at Semarang City, Indonesia https://proceedings.stis.ac.id/icdsos/article/view/367 <p>A coastal area is defined as the boundary between land and sea. Coastal urban areas are susceptible to various hazards that are becoming more severe, such as flooding, erosion, and subsidence due to a mix of man-made and natural factors, including urbanization and climate change. Despite the high importance of coastal area monitoring, conducting field surveys is expensive, time-consuming, and geographically limited to non-remote regions. Semarang City is one of the cities in Indonesia at risk of coastline changes, which cause various natural problems. This research aims to estimate changes in the coastal land area in Semarang City. To observe changes in the coastal area of Semarang City, remote sensing technology with Sentinel-2 satellite imagery was used. This research implements and compares the Random Forest (RF) and Support Vector Machine (SVM) machine learning methods in building classification models. 
Based on the land areas estimated for 2019, 2021, and 2023 with the best classification model, namely SVM, the coastal area increased by 387.94 ha in 2021 and then decreased by 417.32 ha in 2023.</p> Renata De La Rosa Manik Arie Wahyu Wijayanto Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 265 273 10.34123/icdsos.v2023i1.367 Integrating Satellite Imageries and Multiple Geospatial Big Data for Granular Mapping of Spatial Distribution of Human Development Index in East Java, Indonesia https://proceedings.stis.ac.id/icdsos/article/view/369 <p>The availability of data on the Human Development Index (HDI) is crucial as a gauge of regional performance, particularly in terms of assessing the development of human resources. In Indonesia, HDI data are collected with conventional methods, such as indirect estimation, the National Socio-Economic Survey (SUSENAS), the Ministry of Religion, or an inventory of sectoral data, which require considerable cost, time, and effort. Additional data are required to provide more detailed poverty data at a lower cost and with more recent information to overcome these limitations. According to recent studies, the quality of life for measuring HDI can be identified down to the granular level using geospatial big data. Therefore, the contribution of this research is to implement the use of geospatial big data, such as integrated multi-source satellite imagery data and Point of Interest (POI). Besides that, this study develops the relative spatial human development index (RSHDI) at 11 km x 11 km resolution for the granular mapping of the quality of life to measure the HDI in East Java, Indonesia. The weighted sum models used in this study are equal weight (EWS), Pearson (PCCWS) and Spearman (SCCWS) correlation-based weights, and Principal Component Analysis (PCA)-based weight (PCAWS). 
The best RSHDI for representing the human development index in East Java in 2022 is RSHDI PCCWS, which is determined using a weighted sum model based on Pearson correlation; it has a correlation coefficient of 0.7858 (p-value = 5.078 × 10<sup>-9</sup>) and is highly correlated with official HDI data. When this RSHDI is used as a predictor variable in the estimation of HDI data, the ideal model, using RSHDI PCCWS, has an RMSE of 3.098% and an R<sup>2</sup> of up to 61.75%. According to the descriptive analysis of this map, areas with low RSHDI scores are typically found in some regencies on Madura Island and in the eastern part of East Java, which are geographically depressed, while areas with high RSHDI scores typically have dense populations and better accessibility, such as the urban areas of Surabaya and Kota Malang. As a result, the official human development index data can be supported by the RSHDI's ability to map spatially deprived areas.</p> Rifqi Ramadhan Arie Wahyu Wijayanto Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 274 295 10.34123/icdsos.v2023i1.369 Cost-Sensitive Boosting Algorithm for Classifying Underdeveloped Regions in Indonesia https://proceedings.stis.ac.id/icdsos/article/view/373 <p>Imbalanced classes are indicated by having more instances of some classes than others. The cost-sensitive boosting algorithm is a modification of the AdaBoost algorithm, which aims to solve the problem of imbalanced classes. In this study, we evaluate the cost-sensitive boosting algorithm AdaC2 using data on Indonesia's underdeveloped regions. 
This study confirms that the cost-sensitive boosting algorithm (AdaC2) performs better in classifying the instances in the minority classes than standard classification algorithms.</p> Bayu Suseno Bagus Sartono Khairil Anwar Notodiputro Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 296 308 10.34123/icdsos.v2023i1.373 Comparison Of Kernel Support Vector Machine In Stroke Risk Classification (Case Study:IFLS data) https://proceedings.stis.ac.id/icdsos/article/view/381 <p>Stroke is a leading cause of disability and of lost disability-adjusted life years. Currently, the development of information technology, especially in the field of machine learning, plays an important role in the early warning of various diseases, such as stroke. One of the methods used for stroke classification is the Support Vector Machine (SVM). In this study, we aim to compare several kernel functions in SVM, namely linear, radial basis function (RBF), polynomial, and sigmoid, for classifying stroke risk. We determine the best kernel based on accuracy, sensitivity, and specificity values. The results of this study show that the linear kernel function gives the best classification performance, with an accuracy of 99.0%, a specificity of 100.0%, and a sensitivity of 97.0%. 
Those scores are the highest among the kernels, which means the linear kernel function is the best method for classifying stroke risk.</p> Lensa Rosdiana Safitri Nur Chamidah Toha Saifudin Gaos Tipki Alpandi Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 309 316 10.34123/icdsos.v2023i1.381 Implementation of Machine Learning and Its Interpretation for Mapping Social Welfare Policy in Indonesia https://proceedings.stis.ac.id/icdsos/article/view/383 <p><span>This research leverages data from the 2022 Early Socio-Economic Registration (Regsosek) activity to develop a machine learning model capable of predicting family expenditure levels based on the Proxy Mean Test (PMT) with high accuracy. By integrating the SHAP (SHapley Additive exPlanations) method for model interpretation, we identify the contributions of socio-economic features to expenditure predictions and link them to relevant social assistance programs. We compare two regions, Kulonprogo Regency and Yogyakarta City, representing varying poverty levels, and identify unique characteristics influencing family welfare in each area. The results highlight that effective policy interventions must be tailored to the unique characteristics of each region and family, taking into account dimensions such as housing, education, income, and community expenditures. 
This research provides valuable insights for policymakers, demonstrating that successful poverty alleviation policies are data-driven and adaptable to the diverse socio-economic realities across regions.</span></p> Aldo Leofiro Irfiansyah Ari Rismansyah Novia Permatasari Isnaeni Noviyanti Atqo Mardiyanto Ade Koswara Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 317 336 10.34123/icdsos.v2023i1.383 Comparative Analysis of Retriever and Reader for Open Domain Questions Answering on BPS Knowledge in Indonesian https://proceedings.stis.ac.id/icdsos/article/view/384 <p>Enumerators from Badan Pusat Statistik (BPS) still often encounter problems in finding solutions to cases encountered during censuses or surveys. 
Even though knowledge lists have been created and collected in various systems such as QA and knowledge management systems, enumerators still need to find appropriate answers from long and complex knowledge search results. On the other hand, Open-domain Question Answering (OpenQA) is capable of identifying answers to natural questions based on large-scale documents. OpenQA has two main components, namely the Retriever and the Reader. For Retriever tasks, Dense Retrieval (DR) has been shown to outperform traditional sparse retrieval such as TF-IDF or BM25. However, other research shows that BM25 is superior to DR in terms of accuracy. In this study, we compared DR and BM25 separately and DR+BM25 combined as retrievers. Additionally, we combine and evaluate several enhanced language models as Readers. In this way, a model with the best combination of Retriever and Reader can be obtained to be implemented in search systems such as QA and knowledge management systems.</p> Sulisetyo Puji Widodo Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 337 343 10.34123/icdsos.v2023i1.384 Time-Series Clustering of the Regencies Hotel Room Occupancy Rate in Indonesia after the COVID-19 Pandemic https://proceedings.stis.ac.id/icdsos/article/view/387 <p>After the COVID-19 pandemic, Indonesia is entering a recovery era. The government provides incentives for tourism industry recovery. This policy was created because the impact of the COVID-19 pandemic on the tourism industry differs across regencies/cities. This study investigates the different recovery patterns of regencies/cities across Indonesia. The data of this study consist of the room occupancy rate (ROR) from Badan Pusat Statistik (BPS) Indonesia and monthly data obtained by web scraping the Agoda website from 1 January 2021 to 1 August 2023. The regencies/cities are clustered by ROR category using the dynamic time warping method. 
The results show that tourism industry recovery differs across regencies/cities in Indonesia, with fast, medium, or slow recovery speeds. This could result from differences in the policies each regency/city adopted to respond to the impact of the COVID-19 pandemic on its tourism industry.</p> Ladisa Busaina Setia Pramana Satria Bagus Panuntun Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 344 353 10.34123/icdsos.v2023i1.387 Lean User Experience (Lean UX) Approach in the Redesign of the SOBAT BPS Application https://proceedings.stis.ac.id/icdsos/article/view/398 <p>SOBAT BPS is a service provided by BPS to be used by partners and prospective partners of BPS throughout Indonesia. Alongside the utilization of the SOBAT BPS application, user reviews and assessments become significant elements in measuring the quality and success of this application. Feedback obtained from these assessments indicates that a redesign of the SOBAT BPS application is necessary to provide a better user experience. Prior to redesigning the SOBAT BPS application, a preliminary survey was conducted to understand user perceptions of the current system using heuristic evaluation and the user experience questionnaire (UEQ). Based on the preliminary survey results, there are issues related to the implementation of heuristic principles in the SOBAT BPS application, and only the UEQ stimulation scale received a good ranking. Therefore, the aim of this research is to redesign the SOBAT BPS application using the Lean UX method and to evaluate the redesigned results using heuristic evaluation and UEQ. 
The evaluation results of the redesigned SOBAT BPS application indicate that the redesign is superior to the current SOBAT BPS application.</p> Migunani Puspita Eugenia Lutfi Rahmatuti Maghfiroh Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 354 367 10.34123/icdsos.v2023i1.398 Curating Multimodal Satellite Imagery for Precision Agriculture Datasets with Google Earth Engine https://proceedings.stis.ac.id/icdsos/article/view/399 <p>In the era of modern agriculture, satellite imagery has been widely used to monitor crops, one of which is paddy. This paper tries to describe the vegetation indices, climate, and soil index features related to paddy plants and curates a collection of satellite imagery on the Google Earth Engine (GEE). This paper reveals how GEE can be used to collect and process multimodal satellite imagery to form a precision agriculture dataset. The objective of this study is to establish a comprehensive precision agriculture dataset by leveraging multimodal satellite imagery to monitor paddy crops. The data collected as a dataset originates from 306 locations in Karawang Regency, Indonesia, during the 2019-2020 period. In the first step, we identify the relevant features essential for paddy crop analysis. Subsequently, we carefully select image collections within GEE based on these features. Afterward, we perform data acquisition and necessary preprocessing through the Google Colab environment. The results showed that satellite imagery from Sentinel-2 outperforms Landsat 8 in terms of spatial and temporal resolution. 
Apart from that, the generated dataset successfully captures the growth patterns of paddy plants.</p> Bagus Setyawan Wijaya Rinaldi Munir Nugraha Priya Utama Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 368 381 10.34123/icdsos.v2023i1.399 FORECASTING USING SARIMA AND BAYESIAN STRUCTURAL TIME SERIES METHOD FOR RANGE SEASONAL TIME https://proceedings.stis.ac.id/icdsos/article/view/402 <p>SARIMA and Bayesian Structural Time Series are time series methods that can be used on data containing seasonal patterns. This research aims to set out the steps of the SARIMA and Bayesian Structural Time Series models, apply both models, and obtain their forecasting results, evaluated with MAPE. The research method used is a quantitative method applied to data on the number of PT KAI train passengers in the Java region for 2006-2019. The results of this research show that the best model for forecasting the number of PT KAI train passengers in the Java region in 2006-2019 is SARIMA (2,1,0)(0,1,2)[12] with a MAPE value of 4.77%, compared with 5.25% for the Bayesian Structural Time Series [12] method.</p> MUHAMMAD RIZAL Sri Utami Zuliana Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 382 391 10.34123/icdsos.v2023i1.402 Forest Cover Mapping Using Interactive Dashboards with Google Earth Engine on Sentinel-2 Satellite Imagery https://proceedings.stis.ac.id/icdsos/article/view/409 <p>The study aims to develop an attractive web-based visualization dashboard for mapping forest land cover around the world. The dashboard map was created using the Google Earth Engine application with JavaScript programming language. 
The built-in map dashboard has several interactive features, including legend, zoom, search, composite index view selection, visualization date selection, and wipers. The results of the dashboard black box test show that the dashboard works well and provides good visualization in mapping forest land cover for better monitoring and analysis.</p> Nora Dzulvawan Arie Wahyu Wijayanto Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 392 401 10.34123/icdsos.v2023i1.409 SPATIAL ANALYSIS OF FIRE OCCURRENCE IN JAKARTA, INDONESIA https://proceedings.stis.ac.id/icdsos/article/view/275 <p class="Abstract" style="margin-bottom: 28.35pt;"><span lang="EN-GB" style="font-family: 'Times New Roman',serif;">The occurrence of fire incidents in urban villages of Jakarta Special Capital Region significantly impacted losses, necessitating prevention and handling efforts. Therefore, this study aims to analyze the spatial influence of social and physical variables (independent variables) such as sex ratio, vulnerable age population, number of buildings, and size of slum areas on fires (dependent variable) in Jakarta Special Capital Region. The analysis area includes five municipalities of Jakarta Special Capital Region. Secondary data were obtained from Central Agency of Statistics of Jakarta Special Capital Region, maps from the official site jakartasatu.jakarta.go.id, and publication data from Government of Jakarta Special Capital Region for 2020. Furthermore, the quantitative approach in descriptive and inferential analysis, determined using Microsoft Excel and GeoDa version 1.20.0.10, was used to evaluate the spatial relationships between adjacent sub-districts. 
Although the regression data processing results using GeoDa were significant, the spatial regression results were insignificant at the 0.05 level (Lagrange Multiplier (LM) lag and LM error &gt; 0.05) and significant only at the 0.1 level. This means fire occurrence in Jakarta Special Capital Region does not have a spatial effect, contrary to the clustering observed between dependent and independent variables using Moran's I and scatter plots. The results of this study can aid the Jakarta provincial government in preventing and handling potential fires by restructuring slum areas to minimize the likelihood of such incidents.</span></p> Ika Rosantiningsih Chotib Chotib Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 403 416 10.34123/icdsos.v2023i1.275 Analysis of the Effect of Technology on the Growth of the Information and Communication Sector in the Bali Province 2016-2021 https://proceedings.stis.ac.id/icdsos/article/view/291 <p>Technology continues to develop and drive economic growth. Bali, as a province that is open to foreign tourists in Indonesia, has a great opportunity to adopt technology more quickly. This study aims to analyze the effect of technology development, as well as other variables such as the number of workers in the ICT sector, household consumption for Information and Communication Technology (ICT), and the number of accommodations, on the Gross Regional Domestic Product (GRDP) of the ICT sector in districts/cities of Bali Province. The analysis used is descriptive and inferential analysis with panel data regression from regencies/cities in Bali Province for the period 2016-2021. The model used is the fixed effect model. In general, the GRDP of the ICT Sector continues to increase, but its growth is decreasing every year. Meanwhile, technological developments in Bali Province tend to increase every year. 
With a significance level of 5 percent, the percentage of e-commerce users, the percentage of ownership of cell phones, and the number of accommodations have a significant positive effect on the GRDP of the ICT sector. Technological progress has not been fully utilized, therefore the GRDP growth of the ICT sector tends to decrease every year.</p> Milan Puji Astuti Krismanti Tri Wahyuni Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 417 429 10.34123/icdsos.v2023i1.291 Study of Economic Vulnerability and Its Influence on the Economy in Sumatera Island Using the Household Consumption Expenditure Approach https://proceedings.stis.ac.id/icdsos/article/view/293 <p>The readiness of a region to face shocks and spillover effects from the surrounding area needed to be developed early. Each region had different economic structure so that the policy and strategy that was used to deal with current and future global uncertainties should be different as well. This study aimed to analyze economic vulnerability and the characteristics of its grouping, and analyze the effect of inflation, unemployment rate, foreign investment, and economic vulnerability towards the economy of provinces in Sumatera. The method performed in this study was Cluster Analysis for grouping and creating economic vulnerability variable, Panel Regression Analysis to analyze the effect between variables in general, and GWPR (Geographically Weighted Panel Regression) analysis to analyze spatial effect of regions. 
The result showed that the variable of economic vulnerability had a negative and significant effect on household consumption expenditures, especially in the Province of Lampung and Sumatera Selatan.</p> Adin Nugroho Prientananda Ghina Salsabila Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 430 445 10.34123/icdsos.v2023i1.293 Competitiveness and Factors Influencing Indonesian Clove Exports to Eight Export Destination Countries from 2005-2020 https://proceedings.stis.ac.id/icdsos/article/view/299 <p>Indonesia is the largest clove producer and exporter in the world, but from 2005 to 2020 the average clove export was dominated by Madagascar. As the largest clove producer, Indonesia should be able to dominate the export market, especially cloves. Therefore, this study aims to determine the competitiveness position of Indonesian cloves and analyze the economic factors that affect Indonesian clove exports. In this study, the analysis methods used are Revealed Comparative Advantage (RCA), Export Product Dynamics (EPD), and a Fixed Effect Model (FEM) for panel data of eight export destination countries from 2005-2020. The results show that the competitiveness of Indonesian cloves is above the world average. The competitive position of Indonesia's clove exports in the Netherlands, Pakistan, Saudi Arabia, United Arab Emirates, United States, and Vietnam is a rising star. At the same time, the other two markets (India and Singapore) are falling stars. In addition, the export prices have a significant effect on the volume of Indonesian clove exports. 
Indonesian clove production and destination countries' GDP per capita have a positive effect, while economic distance has a negative effect on the volume of Indonesian clove exports.</p> Siti Ainia Hidayati Ekaria Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 446 457 10.34123/icdsos.v2023i1.299 The Socio-Economic Factors Influencing Sugar-Sweetened Beverages (SSB’s) Consumption in Household of DKI Jakarta Province in 2020 https://proceedings.stis.ac.id/icdsos/article/view/300 <p>Non-communicable diseases (NCDs) are responsible for causing 41 million deaths annually, constituting approximately 74% of all global fatalities. One of the key factors contributing to the elevated risk of NCDs is the excessive consumption of sugary beverages, which encompass a variety of liquid products containing added sugars. This research seeks to identify the socioeconomic factors that play a role in shaping the consumption patterns of sugary beverages within households residing in the Province of DKI Jakarta. This study uses data from the 2020 Susenas survey, comprising a total of 5,456 sampled households. Binary logistic regression is used to model whether or not households had consumed sugary beverages during the preceding week. Variables such as marital status, gender, age, educational attainment, employment status of the household head, as well as internet accessibility, economic status, internet usage motives, and household size, were found to influence the likelihood of consuming sugar-sweetened beverages (SSBs) in DKI Jakarta Province. 
Based on these findings, it is recommended to enhance the use of the internet for promoting healthy lifestyles.</p> Muhammad Gozali Yahya Arya Candra Kusuma Norvan Bagus Ramadhan Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 458 473 10.34123/icdsos.v2023i1.300 Is The Wealth Index Better than The Proxy Means Test in Poverty Targeting? A Study in Brebes and East Jakarta https://proceedings.stis.ac.id/icdsos/article/view/302 <p>The ranking of household welfare status in targeting recipients of social protection programs is important and needs attention. Appropriate welfare status ranking is one of the keys for making the various types of programs designed by the government right on target. The Proxy Means Test method is popular in Indonesia in the 2015 Integrated Database Updating. Based on another popular statistical approach to ranking welfare status, the Wealth Index method is also known. Global surveys, such as Demographic and Health Surveys, Multiple Indicator Cluster Surveys, and World Food Program Surveys, have always used the Wealth Index to rank household welfare. Using Susenas data from March 2017 to March 2022, this study found that the Proxy Means Test method is better than the Wealth Index method in both Brebes Regency and East Jakarta City. The value of the classification error rate in Brebes Regency and East Jakarta City using the Proxy Means Test method is 13.94 percent and 10.37 percent, respectively. In comparison, the Wealth Index method is 25.12 percent and 14.74 percent. 
This research emphasizes that the results of the ranking of household welfare status are not only influenced by the method used but also by the socioeconomic conditions and characteristics of household data in the areas targeted by the program.</p> Nuri Taufiq I Made Giri Suyasa Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 474 493 10.34123/icdsos.v2023i1.302 Trajectory of life expectancy and its relation with socio-economic indicators among developing countries in Southeast Asian https://proceedings.stis.ac.id/icdsos/article/view/307 <p>Life expectancy is one of the key global health indicators and plays an important role in health policy measures. The status of a country indirectly influences the life expectancy of a nation. Developing countries have slower economic progress compared to developed countries, which in turn affects the well-being of the population. Therefore, this study aims to analyze the trend of life expectancy among developing countries in Southeast Asia and assess the influence of socio-economic indicators on life expectancy. A linear mixed effects model is used to model the association between socioeconomic factors and life expectancy. The results indicate that GDP growth rate, GDP per capita, and unemployment rate have a significant impact on life expectancy and that the impacts depend on gender. Life expectancy among females is generally higher than among males. Predicted male life expectancy in 2025 is lowest in Myanmar, with an average of 64.2 years (95% CI: 60.8-77.1), and highest in Thailand, with an average of 76.2 years (95% CI: 60.7-76.9). 
Meanwhile, predicted female life expectancy is lowest in Timor Leste, with an average of 71.1 years (95% CI: 67.8-83.9), and highest in Thailand, with an average of 84.3 years (95% CI: 68.7-84.9).</p> Madona Yunita Wijaya Yanne Irene Iqbal Rachadi Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 494 500 10.34123/icdsos.v2023i1.307 Comparison of Kernel Smoothing and Local Polynomial Smoothing Method in Overcoming Age Heaping https://proceedings.stis.ac.id/icdsos/article/view/312 <p><em>Age data plays an important role in every aspect, yet age misreporting is often found. It involves digit preference, which causes a build-up at certain ages. In demography, digit preference is called age heaping, and it often happens at ages with 0 and 5 as the last digit. Age heaping induces poor data quality and data bias that could influence government policy making. Two indicators used to detect age heaping are the Whipple Index (WI) and the Myers Blended Index (MBI). Methods to cope with age heaping include the nonparametric regression approaches Kernel Smoothing and Local Polynomial Smoothing. The objective of this research is to measure and improve the quality of population age data and population mortality data in Sensus Penduduk (SP) 2020, as well as to compare the Kernel Smoothing and Local Polynomial Smoothing methods. The data used in this paper is from SP2020, with population age, age at death, and total population as the research variables. The results show that the population mortality data is less accurate than the total population data and thus needs a smoothing process to improve the accuracy of the age data. 
The method that has better accuracy is the Local Polynomial Smoothing method</em>.</p> Nadia Arsyta Putri Erni Tri Astuti Lalu Moh Arsal Fadila Salsabil Syadza Hafizhah Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 501 515 10.34123/icdsos.v2023i1.312 High-resolution-gridded rainfall dataset derived from surface observation by adjustment of satellite rainfall product https://proceedings.stis.ac.id/icdsos/article/view/314 <p>A high-resolution-gridded rainfall dataset is essential for many purposes, such as analysis of extreme weather conditions, natural-disaster mitigation, or use as an input to hydrological models. Satellite-based rainfall products (e.g., Global Satellite Mapping of Precipitation-GSMaP) can solve the spatial and temporal issues, although their rainfall intensity is often under- or overestimated. This research aims to provide a high-resolution rainfall dataset by adjusting the 0.1 deg GSMaP rainfall data to the surface rainfall data from several observation points in the greater Jakarta area (Jabodetabek) during January 2020, when several floods occurred in Jakarta. The adjustment process includes calculating the bias of the satellite estimate at the nearest observation point and interpolating the error back to the 0.01 deg grid using a radial basis function (RBF) to obtain the correction factor at every grid point; the GSMaP data are then adjusted by the correction factor. The result reveals a more realistic rainfall spatial distribution than regularly interpolating the observation data. 
The validation of adjusted rainfall estimation at the verification points also shows a reduction in domain-wide RMSE by 30 – 80%.</p> Achmad Rifani Muhammad Rezza Ferdiansyah Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 516 523 10.34123/icdsos.v2023i1.314 Does Farm Size Matter for Food Security Among Agricultural Households? Analysis of Indonesia’s Agricultural Integrated Survey Results https://proceedings.stis.ac.id/icdsos/article/view/318 <p>Most agricultural households in Indonesia are small-scale farmers making them prone to food insecurity. Until recently, no study has assessed the impact of farm size and sociodemographic characteristics on the food insecurity status of agricultural households using a nationwide agricultural household survey in Indonesia. Our study aims to address this gap by utilizing the results of the first Indonesian Agricultural Integrated Survey conducted by BPS in 2021. Applying the Rasch Model, Multinomial Logistic Regression, and Ordinary Least Squares Regression, we found that the farm size has a positive impact in lowering the likelihood of experiencing moderate or severe levels of food insecurity among agricultural households. Our study also found that agricultural households with a higher probability of being food insecure are characterized by having higher members of households, relying only on agricultural activities for their livelihood, lower education attainment of household heads, and being led by female farmers.</p> Kadir Ruslan Octavia Rizky Prasetyo Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 524 536 10.34123/icdsos.v2023i1.318 Identification of factors affecting the cases of under-age female marriage using geographically weighted panel regression approach in south kalimantan province. 
https://proceedings.stis.ac.id/icdsos/article/view/337 <p>The topic of this study was chosen because the percentage of underage female marriages in South Kalimantan Province was the highest in Indonesia over the last five years, from 2018 to 2022. This signifies that there are social issues in the local community that the government must address. One possible answer is to identify the factors that contribute to the creation of these conditions in each region. Using the Geographically Weighted Panel Regression (GWPR) method, this study attempts to determine the factors that influence the rise of underage female marriage instances in South Kalimantan Province. The number of poor individuals, population density, average duration of schooling, adjusted per capita expenditure, and total population were chosen as independent variables. Data were acquired from the periodic releases of South Kalimantan Province's Central Bureau of Statistics. Because there was high spatial heterogeneity between each location in this study, it was quite practical to employ the GWPR approach in developing a conjectural model. Evaluating the GWPR model with adaptive Gaussian kernel weights gives significant results, and the model can explain 55 percent of the variance of the data. Testing the parameters of the GWPR model reveals two regional groupings with distinct influencing variables. 
The first group consists of ten regions that are considerably impacted by both the number of impoverished people and the average length of schooling, whereas the second group consists of three regions that are impacted solely by the average length of schooling.</p> ABDULLAH RIFQI Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 537 545 10.34123/icdsos.v2023i1.337 Construction of Green City Index in Indonesian Metropolitan Districts/Cities https://proceedings.stis.ac.id/icdsos/article/view/342 <p>Urbanization in Indonesia resulted in population density in urban areas, which has the potential for economic growth, marked by increased population income followed by changes in consumption patterns that will cause environmental problems in urban areas. Given the environmental issues that occur in urban areas, city planning based on the green city concept is needed as a sustainable planning solution that does not damage the environment. The measurement of green city achievement has yet to be carried out in Indonesia. This study aims to measure the Green City Index (GCI) in metropolitan districts/cities in Indonesia using Partial Least Squares-Structural Equation Modeling (PLS-SEM). It examines the GCI achievements in Indonesian metropolitan districts/cities. The GCI is formed by a socioeconomic dimension of two indicators and an environmental dimension of eleven indicators. Generally, the highest GCI achievements are in the Bogor District, with a score of 74.3 percent. Bangkalan District achieved the highest socioeconomic dimension index, and Bogor District achieved the highest environmental dimension index. In addition, there is a significant and negative relationship between GCI and the Human Development Index (HDI) and economic growth. 
It is hoped that the government and the community can pay attention to the balance of the environment in their activities.</p> Vina Astriani Risni Julaeni Yuhan Bony Parulian Josaphat Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 546 561 10.34123/icdsos.v2023i1.342 hyper-Poisson Model for Overdispersed and Underdispersed Count Data https://proceedings.stis.ac.id/icdsos/article/view/344 <p>The Poisson model is commonly used for modelling count data. However, it has a limitation, namely the equality between the mean and variance (equidispersion) of the data to be modeled. Unfortunately, overdispersion (variance greater than the mean) and underdispersion (variance smaller than the mean) are more often to be found in real cases. Therefore, different models need to be used to handle data with these cases. The hyper-Poisson model is one model that can be used to handle overdispersion or underdispersion cases flexibly. This paper describes the hyper-Poisson model and its application on overdispersed and underdispersed count data.</p> Venda Damianus Situmorang Siti Nurrohmah Ida Fithriani Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 562 571 10.34123/icdsos.v2023i1.344 Vine Copula Model: Application to Chemical Elements in Water Samples https://proceedings.stis.ac.id/icdsos/article/view/346 <p>Copula can link the bivariate distribution function with marginal distribution functions without requiring specific information about the interdependence among random variables. There are several types of copulas, such as elliptical copulas, Archimedean copulas, and extreme value copulas. However, in multivariate modeling, each type of copula has limitations in modeling complex dependence structures in terms of symmetry and tail dependence properties. 
The class of vine copulas overcomes these limitations by constructing multivariate models using bivariate copulas in a tree-like structure. The bivariate copulas used in this study include the Clayton, Gumbel, Frank, Gaussian, and Student's t copula families. This study discusses the construction of vine copula models, parameter estimation, and their applications. The construction of vine copulas is done through the decomposition of conditional probability density functions and substituting bivariate copula density functions into the decomposition results. The data used in the study is the logarithm of the concentration of chemical elements in water samples in Colorado. The parameter estimation method used is pseudo-maximum likelihood with sequential estimation. Model selection is then performed using the Akaike information criterion (AIC) to determine the most suitable model. The results indicate that Caesium and Titanium have a dependency relationship with Scandium. Moreover, Scandium and Titanium exhibit the strongest dependence compared to other variable pairs.</p> Salsabila Zahra Aminullah Mila Novita Ida Fithriani Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 572 585 10.34123/icdsos.v2023i1.346 Achievement of Creative Economy Dimensions in Regional Development Indonesia in 2021 https://proceedings.stis.ac.id/icdsos/article/view/354 <p>So far, the development of the creative economy in Indonesia has been measured only by GDP and the number of workers, even though many other factors that determine its development are not captured by these two measures. Therefore, this study aims to develop a measurement that can be used as a tool for assessing, analyzing, and comparing the state of the creative economy in 34 provinces in Indonesia. The data used are secondary data sourced from BPS and several agencies. 
The Creative Economy Index (CEI) refers to the Global Innovation Index, which is composed of seven dimensions: institution, human capital and research, infrastructure, market sophistication, business sophistication, knowledge and technology outputs, and creative outputs. The analysis method used is factor analysis to validate the dimensions of the CEI based on their indicators. Results showed that Indonesia's CEI is relatively low. The dimension with the highest achievement is institution, while the lowest is market sophistication. When compared by region, the CEI in the Western Region is higher than in the Eastern Region. There are also similarities with HDI and ICT-DI.</p> Anisa Nur Jannah Ekaria Ekaria Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 586 599 10.34123/icdsos.v2023i1.354 The Effect of Company Performance on Stock Returns in the LQ45 Stock Cluster in 2020-2022 https://proceedings.stis.ac.id/icdsos/article/view/368 <p>In the Indonesian capital market, various securities are traded, but stock investors dominate. The LQ45 index illustrates the Indonesian capital market condition better than the JCI (Jakarta Composite Index). Stocks on the LQ45 index have large market capitalization, high liquidity, and good company fundamentals, but have varying returns. It is therefore necessary to group the stock returns of LQ45 index companies. The method used is time series clustering over the 2020-2022 period. Furthermore, logistic regression analysis is used to determine the effect of the performance of companies that are consistently in the LQ45 index on stock return status. The results showed that the selected clustering algorithm was K-Means, with an optimal number of 2 clusters characterized as lagging stocks and leading stocks. 
Then, company stocks in the LQ45 index for the 2020-2022 period tend to be classified as leading stocks if they have a low Debt to Equity Ratio but have a high Net Profit Margin and Price Earnings Ratio.</p> Auliya' Jami'atus Saufi Ekaria Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 600 612 10.34123/icdsos.v2023i1.368 Analysis of Spotify's Audio Features Trends using Time Series Decomposition and Vector Autoregressive (VAR) Model https://proceedings.stis.ac.id/icdsos/article/view/375 <div> <p class="Abstract"><span lang="EN-US">Streaming is the most popular music consumption method of the current times. As the biggest streaming platform based on subscriber number, Spotify stores miscellaneous information regarding the music in the platform, including audio features. Spotify’s audio features are descriptions of songs features in form of variables such as danceability, duration, and tempo. These features are accessible via Application Programming Interface (API). On the other hand, Spotify also publishes their own charts consisting of 200 most streamed songs on the platform (based on regions) which are updated daily. By combining Spotify’s song charts and the songs’ respective audio features, this research conducted analysis on musical trends using time series modelling. First, the combined data is decomposed to extract the trend features. Second, a Vector Autoregressive (VAR) model is built and followed by forecasting of the audio features. Lastly, the performance of forecasted values and the actual observations is evaluated. As a result, this research has proven that musical trends can be forecasted in the future for a short period by using VAR model with relatively low error. 
</span></p> </div> Daffa Adra Ghifari Machmudin Mila Novita Gianinna Ardaneswari Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 613 627 10.34123/icdsos.v2023i1.375 Interest Rate Transmission on Indonesia’s Monetary Policy Analysis: Case of Banking Interest Rate https://proceedings.stis.ac.id/icdsos/article/view/378 <p>Indonesia's economic stability should be achieved by implementing monetary and fiscal policies, for instance through the policy interest rate set by Bank Indonesia (BI) as the central bank, which other banking institutions are expected to follow. Unfortunately, BI's interest rate regulation had not been able to restore economic stability because it always operated with a long time lag. This happened because increases in the policy rate had not been followed promptly by other banking institutions. Such a time lag might cause disadvantages such as long-lasting high inflation, increased poverty, and severe economic vulnerability. This research was conducted to analyze the time lag in the transmission of Bank Indonesia's interest rate policy and the response of banking institutions in Indonesia. The method used in this study was survival analysis. The results indicated that the time lag of monetary policy transmission through the interest rate in Indonesia needed to be improved, with the adjustment speed doubled, to reach the optimal point. The response of banking institutions could be improved because there was still an asymmetric response in all aspects, including types of interest rates, allocations, and direction of change. 
Meanwhile, in terms of ownership, state-owned and privately owned banks showed similar time-lag responses.</p> Adin Nugroho Prientananda Ghina Salsabila Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 628 638 10.34123/icdsos.v2023i1.378 Analysis of Factors Affecting the Open Unemployment Rate (UOR) 2022 : A Case of Banten in Indonesia https://proceedings.stis.ac.id/icdsos/article/view/386 <p>Unemployment is one of many economic problems, and open unemployment is one of its types. Banten Province has the highest Open Unemployment Rate (UOR) in Indonesia. In this study, exploratory factor analysis was used to identify the factors that contribute to the UOR in Banten Province. The data are secondary data obtained from publications by Badan Pusat Statistik. Factor 1 (human development) includes the Human Development Index (HDI), Economic Growth, Population Growth Rate, Literacy Rate, and Mean Years of Schooling (MYS), while Factor 2 (population) consists of only one variable, total population. The results show that total population, HDI, and MYS make the largest contributions to the UOR in Banten Province. 
It is hoped that the government can increase business opportunities, employment opportunities and mature human development planning to reduce UOR in Banten Province.</p> Aqilla Haya Risma Dwi Lestari Tengku Mashitah Crisanty Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 639 649 10.34123/icdsos.v2023i1.386 Energy Poverty and Its Determinants at Subnational Level of Indonesia in 2021 https://proceedings.stis.ac.id/icdsos/article/view/389 <p class="Abstract" style="margin-bottom: 1.0cm;"><span lang="EN-GB" style="font-family: 'Times New Roman',serif;">In the coming decades, the energy sector will soon be faced with three major transformations, one of which is energy poverty. The World Economic Forum defines energy poverty as people's limited access to modern energy services and products. Access to modern energy has not been fully met for all regions in Indonesia and disparities between regions still occur. For this reason, indicators are needed to measure the level of energy poverty at both the national and district/city levels. This study aims to analyze energy poverty in Indonesia and determine its determinants using the Multidimensional Energy Poverty Index (MEPI) approach. The data used is the March National Socio-Economic Survey and BPS Village Potential in 2021. This research uses Geographically Weighted Regression (GWR) to determine the determinants of Indonesia's multidimensional energy poverty at the district/city level in 2021. It was found that there were still inequalities in energy poverty conditions in most of Indonesia's districts/cities. Analysis using the GWR model resulted in 66 regional groups that were grouped based on the similarity of variables that had a significant effect. 
The level of influence of the independent variables varies across districts/cities as a consequence of spatial heterogeneity in the data.</span></p> Salbila Anandia Ramadanti Wahyuni Andriana Sofa Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 650 670 10.34123/icdsos.v2023i1.389 The Effect of Financial Development on Economic Growth in East Kalimantan in 2013-2021 https://proceedings.stis.ac.id/icdsos/article/view/392 <p><span data-sheets-root="1" data-sheets-value="{&quot;1&quot;:2,&quot;2&quot;:&quot;Indonesia has a strong commitment to realizing inclusive and sustainable economic growth. The 8th SDGs achievement program has become the government's main program implemented in all provinces in Indonesia. The economic growth of a region can be measured using growth of Gross Regional Domestic Product (GRDP). East Kalimantan is one of the largest GRDP contributing provinces in Indonesia with the mining and quarrying sector as the leading sector. However, economic growth in the province is still relatively low and has never reached national figures. This forces the government to consider and develop the potential of other sectors. The Fiscal Policy Agency stated that the financial sector with its development has driven Indonesia's economic growth in the last few decades. This study aims to analyze the general picture of economic growth and financial development as well as the influence of financial development factors on the economic growth of districts/cities in East Kalimantan Province in 2013-2021. The analytical method used in this research is panel data regression. 
The results obtained are number of bank offices per population, number of cooperatives per population, credit distribution per GRDP, and number of workers have a positive effect on the economic growth of districts/cities in East Kalimantan Province in 2013-2021.&quot;}" data-sheets-userformat="{&quot;2&quot;:33554945,&quot;3&quot;:{&quot;1&quot;:0},&quot;12&quot;:0,&quot;28&quot;:1}">Indonesia has a strong commitment to realizing inclusive and sustainable economic growth. The 8th SDGs achievement program has become the government's main program implemented in all provinces in Indonesia. The economic growth of a region can be measured using growth of Gross Regional Domestic Product (GRDP). East Kalimantan is one of the largest GRDP contributing provinces in Indonesia with the mining and quarrying sector as the leading sector. However, economic growth in the province is still relatively low and has never reached national figures. This forces the government to consider and develop the potential of other sectors. The Fiscal Policy Agency stated that the financial sector with its development has driven Indonesia's economic growth in the last few decades. This study aims to analyze the general picture of economic growth and financial development as well as the influence of financial development factors on the economic growth of districts/cities in East Kalimantan Province in 2013-2021. The analytical method used in this research is panel data regression. The results obtained are number of bank offices per population, number of cooperatives per population, credit distribution per GRDP, and number of workers have a positive effect on the economic growth of districts/cities in East Kalimantan Province in 2013-2021.</span></p> Raihan Hibatullah Aisyah Fitri Yuniasih S.S.T., S.E., M.Si. 
Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 671 682 10.34123/icdsos.v2023i1.392 Unlocking potential of data: A localized data-driven approach for stunting reduction in South Kalimantan Province https://proceedings.stis.ac.id/icdsos/article/view/394 <p><span data-sheets-root="1" data-sheets-value="{&quot;1&quot;:2,&quot;2&quot;:&quot;This study addresses the issue of stunting in South Kalimantan Province, where high stunting prevalence rates persist. Through a comprehensive analysis of factors influencing stunting prevalence, predictive modeling using machine learning, and clustering analysis of districts based on stunting rates, the research aims to support the provincial government in formulating effective and sustainable strategies. The findings highlight influential factors such as HDI, poverty rates, immunization coverage, breasfed babies, number of uninhabitable houses, and access to clean water. The study also utilise machine learning to build model that aids in predicting future stunting prevalence, while clustering analysis categorizes districts into distinct groups. These insights guide the government in prioritizing interventions, setting prevalence targets, and determining strategic areas for stunting reduction efforts.&quot;}" data-sheets-userformat="{&quot;2&quot;:33554945,&quot;3&quot;:{&quot;1&quot;:0},&quot;12&quot;:0,&quot;28&quot;:1}">This study addresses the issue of stunting in South Kalimantan Province, where high stunting prevalence rates persist. Through a comprehensive analysis of factors influencing stunting prevalence, predictive modeling using machine learning, and clustering analysis of districts based on stunting rates, the research aims to support the provincial government in formulating effective and sustainable strategies. 
The findings highlight influential factors such as HDI, poverty rates, immunization coverage, breastfed babies, number of uninhabitable houses, and access to clean water. The study also uses machine learning to build a model that aids in predicting future stunting prevalence, while clustering analysis categorizes districts into distinct groups. These insights guide the government in prioritizing interventions, setting prevalence targets, and determining strategic areas for stunting reduction efforts.</span></p> Farah Rizkiah Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 683 697 10.34123/icdsos.v2023i1.394 Role of E-Commerce on Entrepreneurial Welfare in Indonesia https://proceedings.stis.ac.id/icdsos/article/view/397 <p><span data-sheets-root="1" data-sheets-value="{&quot;1&quot;:2,&quot;2&quot;:&quot;This study aims to determine the role of the use of e-commerce on the welfare of entrepreneurs in Indonesia during the Covid-19 pandemic. Based on the August 2021 Sakernas data sourced from BPS, the estimation results using binomial logistic regression show that e-commerce has an important role in increasing the welfare of entrepreneurs in Indonesia during the Covid-19 pandemic. The use of e-commerce was able to increase the income of entrepreneurs in Indonesia. Entrepreneurial activities using e-commerce are quite promising in the midst of limited business fields and post-pandemic economic recovery conditions in Indonesia, so the government needs to provide economic support and training to develop digital entrepreneurship activities in the labor force in Indonesia.&quot;}" data-sheets-userformat="{&quot;2&quot;:33554945,&quot;3&quot;:{&quot;1&quot;:0},&quot;12&quot;:0,&quot;28&quot;:1}">This study aims to determine the role of the use of e-commerce on the welfare of entrepreneurs in Indonesia during the Covid-19 pandemic. 
Based on the August 2021 Sakernas data sourced from BPS, the estimation results using binomial logistic regression show that e-commerce has an important role in increasing the welfare of entrepreneurs in Indonesia during the Covid-19 pandemic. The use of e-commerce was able to increase the income of entrepreneurs in Indonesia. Entrepreneurial activities using e-commerce are quite promising in the midst of limited business fields and post-pandemic economic recovery conditions in Indonesia, so the government needs to provide economic support and training to develop digital entrepreneurship activities in the labor force in Indonesia.</span></p> Fitriani Aditya Putri Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 698 707 10.34123/icdsos.v2023i1.397 Agricultural Digitalization: Can This Transformation Increase Farmers' Income In East Java? https://proceedings.stis.ac.id/icdsos/article/view/412 <p><span data-sheets-root="1" data-sheets-value="{&quot;1&quot;:2,&quot;2&quot;:&quot;The era of the industrial revolution 4.0 has encouraged various economic sectors to utilize technology and information in their activities, including the agricultural sector. This study provides an overview of the impact of agricultural digitization on farmers' income and examines the characteristics of farmers in East Java who have and have not utilized agricultural digitalization as a first step toward agricultural extension targets. The data comes from the August 2022 National Labor Force Survey in East Java conducted by BPS-Statistics Indonesia with a sample size of 7.852 farmers carrying out agricultural businesses. The t-Student test results show that farmers who utilize agricultural digitization have an average income higher than those who do not utilize it. 
The binary logistic regression results also show that digitization of agriculture, gender, education, agricultural business field, and business status also affect farmers' income. The results random undersampling analysis and random oversampling classification and regression trees results show that there are two types of characteristics of farmers in East Java who take advantage of agricultural digitization, namely farmers who graduated at least junior high school and farmers who graduated elementary school/equivalent, come from X, Y, or Z generations, and work assisted by permanent workers/paid workers.&quot;}" data-sheets-userformat="{&quot;2&quot;:33554945,&quot;3&quot;:{&quot;1&quot;:0},&quot;12&quot;:0,&quot;28&quot;:1}">The era of the industrial revolution 4.0 has encouraged various economic sectors to utilize technology and information in their activities, including the agricultural sector. This study provides an overview of the impact of agricultural digitization on farmers' income and examines the characteristics of farmers in East Java who have and have not utilized agricultural digitalization as a first step toward agricultural extension targets. The data comes from the August 2022 National Labor Force Survey in East Java conducted by BPS-Statistics Indonesia with a sample size of 7.852 farmers carrying out agricultural businesses. The t-Student test results show that farmers who utilize agricultural digitization have an average income higher than those who do not utilize it. The binary logistic regression results also show that digitization of agriculture, gender, education, agricultural business field, and business status also affect farmers' income. 
The results of classification and regression trees with random undersampling and random oversampling show that there are two types of farmers in East Java who take advantage of agricultural digitization: farmers who graduated from at least junior high school, and farmers who graduated from elementary school or equivalent, come from Generation X, Y, or Z, and are assisted by permanent or paid workers.</span></p> Reni Amelia Akhmad Munim Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 708 720 10.34123/icdsos.v2023i1.412 Using Data Science to Assess the Impact of Disaster Event on Climate Change Belief: Case of Australian Bushfire Catastrophe https://proceedings.stis.ac.id/icdsos/article/view/413 <p><span data-sheets-root="1" data-sheets-value="{&quot;1&quot;:2,&quot;2&quot;:&quot;Australia, vulnerable to bushfire incidents due to its unique climatic conditions, witnessed a transformative event in the 2019-2020 bushfire season. This research examines the impact of these bushfires on public perception of climate change. Leveraging robust statistical techniques, including McNemar's hypothesis testing and logistic regression, the study deciphers survey data collated pre and post these fires. The study's hypothesis that post-fire respondents are more likely to acknowledge climate change's role is confirmed. Factors such as education, political affiliation, and support for fossil fuel reduction are identified as influential predictors of climate change belief. The analysis also highlights the complex interplay of demographic characteristics and media exposure in shaping attitudes. Notably, direct firebush exposure showed a nuanced relationship with belief. The research underscores a significant shift in Australian attitudes toward climate change following the bushfires. 
These findings contribute to our understanding of public opinion dynamics and the role of experiential factors in climate change belief.&quot;}" data-sheets-userformat="{&quot;2&quot;:33554945,&quot;3&quot;:{&quot;1&quot;:0},&quot;12&quot;:0,&quot;28&quot;:1}">Australia, vulnerable to bushfire incidents due to its unique climatic conditions, witnessed a transformative event in the 2019-2020 bushfire season. This research examines the impact of these bushfires on public perception of climate change. Using McNemar's test and logistic regression, the study analyzes survey data collected before and after these fires. The study's hypothesis that post-fire respondents are more likely to acknowledge climate change's role is confirmed. Factors such as education, political affiliation, and support for fossil fuel reduction are identified as influential predictors of climate change belief. The analysis also highlights the complex interplay of demographic characteristics and media exposure in shaping attitudes. Notably, direct bushfire exposure showed a nuanced relationship with belief. The research underscores a significant shift in Australian attitudes toward climate change following the bushfires. These findings contribute to our understanding of public opinion dynamics and the role of experiential factors in climate change belief.</span></p> Diaz Prasetyo Trisna Mulyati Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 721 736 10.34123/icdsos.v2023i1.413 Formulation of Kumaraswamy Generalized Inverse Lomax Distribution https://proceedings.stis.ac.id/icdsos/article/view/416 <p><span data-sheets-root="1" data-sheets-value="{&quot;1&quot;:2,&quot;2&quot;:&quot;Lifetime data is a type of data that consists of a waiting time until an event occurs and modelled by numerous distributions. 
One of its characteristics that is interesting to be studied is the hazard function due to the flexibility that it has compared to other characteristics of distribution. Inverse Lomax (IL) distribution is one of the distributions considered to have advantages in modelling hazard shape and extended in several ways to address the problem of non-monotone hazard which is often encountered in real life data. However, it needs to be extended to another family of distribution to increase its modelling potential and Kumaraswamy Generalized (KG) family of distribution is used as it adds two more parameters to the distribution. The newly developed distribution is called the Kumaraswamy Generalized Inverse Lomax (KGIL) distribution. The main characteristics of KGIL distribution will be derived, such as cumulative distribution function (cdf), probability density function (pdf), hazard function, and survival function. Maximum likelihood method will also be used to estimate the parameters. The application of the new model is based on head-and-neck cancer lifetime data set. The modelling results show that the KGIL distribution is the best to capture important details of the data set considered&quot;}" data-sheets-userformat="{&quot;2&quot;:33554945,&quot;3&quot;:{&quot;1&quot;:0},&quot;12&quot;:0,&quot;28&quot;:1}">Lifetime data is a type of data that consists of a waiting time until an event occurs and modelled by numerous distributions. One of its characteristics that is interesting to be studied is the hazard function due to the flexibility that it has compared to other characteristics of distribution. Inverse Lomax (IL) distribution is one of the distributions considered to have advantages in modelling hazard shape and extended in several ways to address the problem of non-monotone hazard which is often encountered in real life data. 
However, it needs to be extended to another family of distribution to increase its modelling potential and Kumaraswamy Generalized (KG) family of distribution is used as it adds two more parameters to the distribution. The newly developed distribution is called the Kumaraswamy Generalized Inverse Lomax (KGIL) distribution. The main characteristics of KGIL distribution will be derived, such as cumulative distribution function (cdf), probability density function (pdf), hazard function, and survival function. Maximum likelihood method will also be used to estimate the parameters. The application of the new model is based on head-and-neck cancer lifetime data set. The modelling results show that the KGIL distribution is the best to capture important details of the data set considered</span></p> Andrew Bony Nabasar Manurung Siti Nurrohmah Ida Fithriani Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 737 744 10.34123/icdsos.v2023i1.416 Can Paddy Growing Phase Produce an Accurate Forecast of Paddy Harvested Area in Indonesia? Analysis of the Area Sampling Frame Results https://proceedings.stis.ac.id/icdsos/article/view/316 <p><span data-sheets-root="1" data-sheets-value="{&quot;1&quot;:2,&quot;2&quot;:&quot;Our study aims to evaluate the accuracy of the forecasts produced based on the paddy growing phase obtained from the results of the Area Sampling Frame (ASF) Survey and, as a comparison, proposes an alternative forecast method taking into account the seasonal pattern and hierarchical structure of the national paddy harvested area estimation obtained from the ASF to improve the accuracy. In doing so, we calculated the MAPE by comparing the realization of paddy harvested area during the period January to September 2022 with their forecasts produced from the area of generative, late vegetative, and early vegetative phases. 
We also implemented a Hierarchical forecasting method on monthly data of the harvested area from January 2018 to August 2022 for all provinces. Specifically, we applied the bottom-up method for the reconciliation and the rolling window method to produce a three-consecutive month forecast for the period January to September 2022. We found that the accuracy prediction based on the paddy growing phase is moderately accurate. The combination of the bottom-up reconciliation method and the SARIMA model produces a much better accuracy for the national figure of paddy harvested area as shown by a lower MAPE. Our findings suggest that the Hierarchical forecasting method could be an alternative for the prediction of harvested area based on the ASF results other than the prediction obtained from the standing crops.&quot;}" data-sheets-userformat="{&quot;2&quot;:33554945,&quot;3&quot;:{&quot;1&quot;:0},&quot;12&quot;:0,&quot;28&quot;:1}">Our study aims to evaluate the accuracy of the forecasts produced based on the paddy growing phase obtained from the results of the Area Sampling Frame (ASF) Survey and, as a comparison, proposes an alternative forecast method taking into account the seasonal pattern and hierarchical structure of the national paddy harvested area estimation obtained from the ASF to improve the accuracy. In doing so, we calculated the MAPE by comparing the realization of paddy harvested area during the period January to September 2022 with their forecasts produced from the area of generative, late vegetative, and early vegetative phases. We also implemented a Hierarchical forecasting method on monthly data of the harvested area from January 2018 to August 2022 for all provinces. Specifically, we applied the bottom-up method for the reconciliation and the rolling window method to produce a three-consecutive month forecast for the period January to September 2022. We found that the accuracy prediction based on the paddy growing phase is moderately accurate. 
The combination of the bottom-up reconciliation method and the SARIMA model produces a much better accuracy for the national figure of paddy harvested area as shown by a lower MAPE. Our findings suggest that the Hierarchical forecasting method could be an alternative for the prediction of harvested area based on the ASF results other than the prediction obtained from the standing crops.</span></p> Kadir Ruslan Octavia Rizky Prasetyo Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 746 755 10.34123/icdsos.v2023i1.316 Harnessing Blockchain in BPS Microdata Dissemination https://proceedings.stis.ac.id/icdsos/article/view/325 <p><span data-sheets-root="1" data-sheets-value="{&quot;1&quot;:2,&quot;2&quot;:&quot;Towards achieving BPS-Statistics Indonesia missions, the dissemination process of statistical products must be carried out well. One of BPS-Statistics Indonesia statistical products is the microdata. In this case, the activity of disseminating microdata should be conducted through the implementation of feasible best practices. Related to that, in a fast-paced of the ever-changing world that heavily relies on the evolution of technology, the process of bringing out the best efforts in disseminating microdata must as well follow the rhythm of the moving technology to meet the current needs of the digital society, because otherwise it will be obsolete as time goes by. One of the important issues is the limitation of the existing system in tracking microdata to ensure its authenticity and integrity, in case where the users have purchased the microdata from BPS-Statistics Indonesia. In addressing this traceability issue, a solution through the implementation of the cutting-edge Blockchain technology is considered. A design is proposed to incorporate Blockchain into the existing mechanism of BPS-Statistics Indonesia microdata dissemination. 
Therefore, a system architecture and a schema for smart contract utilization are proposed to reinforce the microdata tracking.&quot;}" data-sheets-userformat="{&quot;2&quot;:33554945,&quot;3&quot;:{&quot;1&quot;:0},&quot;12&quot;:0,&quot;28&quot;:1}">To achieve the missions of BPS-Statistics Indonesia, the dissemination of statistical products must be carried out well. One of BPS-Statistics Indonesia's statistical products is microdata, which should be disseminated following feasible best practices. In a fast-paced, ever-changing world that relies heavily on the evolution of technology, microdata dissemination must keep pace with technology to meet the current needs of the digital society; otherwise it will become obsolete. One important issue is the limited ability of the existing system to track microdata and ensure its authenticity and integrity once users have purchased it from BPS-Statistics Indonesia. To address this traceability issue, a solution based on Blockchain technology is considered. A design is proposed to incorporate Blockchain into the existing mechanism of BPS-Statistics Indonesia microdata dissemination. Accordingly, a system architecture and a schema for smart contract utilization are proposed to reinforce microdata tracking.</span></p> Florencia Satwika Genah Dea Venditama Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 756 766 10.34123/icdsos.v2023i1.325 Development of Paddy Yield Gap Between Java and Outside Java: Does It Have a Contribution to Paddy Yield Improvement from 2018 to 2021? 
https://proceedings.stis.ac.id/icdsos/article/view/330 <p><span data-sheets-root="1" data-sheets-value="{&quot;1&quot;:2,&quot;2&quot;:&quot;Increasing the paddy yield is crucial for Indonesia to maintain its national rice sufficiency amid the consistent depletion of wetland paddy areas. In this regard, the yield disparities between regions are challenging, particularly between Java and outside Java. Our study aims to examine the development of the paddy yield gap between the two regions from 2018 to 2021 and its contribution to paddy yield improvement during the period. Using the results of the National Crop-cutting Survey, we found that while the paddy yield in Java outperformed the paddy yield outside Java, the yield difference between the two regions narrowed from around 26 per cent in 2018 to 22 per cent in 2021 due to the increase of the yield outside Java. The results of the Blinder-Oaxaca decomposition suggested that the narrowing gap has a significant contribution to the national paddy yield increase from 2018 to 2021. Our finding confirms that narrowing the yield gap between the two regions by increasing the yield outside Java is crucial to improving paddy yield in Indonesia. Our study also pointed out that improvement in irrigation systems, fertilizer use, and fertilizer assistance are important factors in maintaining the paddy yield and narrowing the gap.&quot;}" data-sheets-userformat="{&quot;2&quot;:513,&quot;3&quot;:{&quot;1&quot;:0},&quot;12&quot;:0}">Increasing the paddy yield is crucial for Indonesia to maintain its national rice sufficiency amid the consistent depletion of wetland paddy areas. In this regard, the yield disparities between regions are challenging, particularly between Java and outside Java. Our study aims to examine the development of the paddy yield gap between the two regions from 2018 to 2021 and its contribution to paddy yield improvement during the period. 
Using the results of the National Crop-cutting Survey, we found that while the paddy yield in Java exceeded the yield outside Java, the yield difference between the two regions narrowed from around 26 per cent in 2018 to 22 per cent in 2021 because the yield outside Java increased. The results of the Blinder-Oaxaca decomposition suggest that the narrowing gap contributed significantly to the national paddy yield increase from 2018 to 2021. Our finding confirms that narrowing the yield gap between the two regions by increasing the yield outside Java is crucial to improving paddy yield in Indonesia. Our study also shows that improvements in irrigation systems, fertilizer use, and fertilizer assistance are important factors in maintaining the paddy yield and narrowing the gap.</span></p> Kadir Ruslan Octavia Rizky Prasetyo Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 767 779 10.34123/icdsos.v2023i1.330 Analysis of Indonesian Domestic Tourist Quality https://proceedings.stis.ac.id/icdsos/article/view/350 
<p><span data-sheets-root="1">Domestic tourism is the main focus of the government's strategy to revitalize the tourism sector. It is therefore crucial to consider elements that can raise the quality of tourism in terms of domestic tourists and increase added value, rather than merely the number of trips. Quality analysis is important in tourism because it supports the idea of sustainable tourism, which is promoted in the eighth goal of the Sustainable Development Goals (SDGs). Quality analysis must be carried out through micro-level modelling that takes into account tourist characteristics and particular travel-related features, because this sector depends on tourism demand and tourist expenditure in tourist locations. Thus, the goal of this study is to give a general overview of the qualities and characteristics of domestic tourists and to examine how these attributes affect their quality. The results of the descriptive analysis indicate that the quality of Indonesian domestic tourists remains poor. 
Age, gender, education level, employment status, transportation mode, accommodation type, and travel companion affect the quality of domestic tourists.</span></p> Martha Zalukhu Neli Agustina Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 780 791 10.34123/icdsos.v2023i1.350 Prediction of Central Java’s Number of Exports to Four ASEAN Countries Using the Markov Chain Analysis https://proceedings.stis.ac.id/icdsos/article/view/371 
<p><span data-sheets-root="1">Central Java is a province with abundant natural resources and extraordinary industrial potential that can offer reliable prospects to various developed countries in ASEAN, namely Singapore, Brunei Darussalam, Malaysia, and Thailand, and become a focus of their attention. Therefore, Central Java's exports to these four ASEAN countries in 2022 and 2023 are predicted using Markov chain analysis. The predictions for exports to Singapore, Brunei Darussalam, Malaysia, and Thailand in 2022 are 0.701, 0.001, 0.239, and 0.058, respectively. The predictions for 2023 for the four countries are 0.540, 0.001, 0.409, and 0.050, respectively. Meanwhile, the steady state of the Markov chain is 0.3595 for Singapore, 0.0013 for Brunei Darussalam, 0.6001 for Malaysia, and 0.0389 for Thailand. These predictions can assist parties involved in making economic decisions related to Central Java's exports to developed countries in ASEAN. 
Information on the predicted increase or decrease in exports from one year to the next can serve as a reference for business people, governments, and related organizations in planning more appropriate and efficient economic strategies and policies.</span></p> Ria Novita Awalia Ramadhani Andreas Rony Wijaya ALIFIA ZAHRA WINESTI DESTY MAYANG PRATIWI Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 792 797 10.34123/icdsos.v2023i1.371 Development of FASIH Application for the Badan Pusat Statistisk using Flutter Framework https://proceedings.stis.ac.id/icdsos/article/view/404 
<p><span data-sheets-root="1">One of the data collection methods used by the Badan Pusat Statistik (BPS) is Computer Assisted Personal Interviewing (CAPI). The BPS CAPI application, known as FASIH, is continuously updated by BPS using the Kotlin programming language and runs on the Android platform. FASIH may eventually be needed on multiple platforms. Flutter is an alternative for multiplatform application development that could be used to develop FASIH. However, BPS has not conducted any study on developing the FASIH application with Flutter, so the strengths and weaknesses of using this technology for FASIH remain unknown. Therefore, the author aims to conduct a study on the development of the FASIH application using Flutter. The application is developed using the Rapid Application Development (RAD) Prototyping method. The resulting application is tested with black box testing and with performance testing using a third-party application, Apptim. The black box testing results indicate that the application meets the functional requirements of stakeholders. In terms of performance, the Kotlin version of FASIH outperforms the Flutter version. However, Flutter has an advantage in shortening development time. Additionally, with regard to user interface development, the Flutter version of the FASIH application can run on multiple platforms. 
Nevertheless, further integration is required to ensure the proper functioning of the Flutter version of the FASIH application.</span></p> Riofebri Prasetia Prasetia Lutfi Rahmatuti Maghfiroh Maghfiroh Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 798 809 10.34123/icdsos.v2023i1.404 Small Area Estimation of Multidimensional Poverty in East Java Province Using Satellite Imagery https://proceedings.stis.ac.id/icdsos/article/view/417 
<p><span data-sheets-root="1">The government has so far focused on a monetary approach to overcoming poverty, although poverty is multidimensional. Holistic and accurate poverty indicators, such as the Multidimensional Poverty Index (IKM), which is calculated from raw data from the National Socioeconomic Survey (SUSENAS), are needed as input for policy formulation. However, direct estimation of the multidimensional poverty headcount (AKM) is accurate only at the provincial level: the relative standard error (RSE) for several districts and cities is still above 25 percent. Increasing the sample size requires time, effort, and cost, so the Small Area Estimation (SAE) method can be an alternative. Besides official statistics, satellite imagery can supply accompanying variables, with the advantage of being up to date and available at a granular level. This study aims to estimate the AKM at the district/city level in East Java Province by utilizing satellite imagery and official statistics in SAE. The results show that SAE HB Beta-logistics, with accompanying variables that combine satellite imagery and official statistics, is more accurate than direct estimation.</span></p> Helen Cantika Laura Aisyatul Ridho Rindang Bangun Prasetyo Copyright (c) 2023 Proceedings of The International Conference on Data Science and Official Statistics 2023-12-29 2023-12-29 2023 1 810 829 10.34123/icdsos.v2023i1.417