Proceedings of The International Conference on Data Science and Official Statistics 2022-01-04T00:00:00+00:00 Open Journal Systems Extracting Consumer Opinion on Indonesian E-Commerce: A Rating Evaluation and Lexicon-Based Sentiment Analysis 2021-08-15T03:52:07+00:00 Arbi Setiyawan Arie Wahyu Wijayanto He Youshi <p><span class="NormalTextRun BCX0 SCXW127065436" data-ccp-parastyle="Abstract">E-commerce as a business platform offers abundant advantages in modern life all over the world. Both sellers and buyers benefit from online marketplaces, not least because e-commerce can be accessed anywhere and at any time. Despite these advantages, e-commerce also has disadvantages, including product quality fraud and data theft. Online marketplaces provide facilities for consumer evaluation through star ratings and consumer reviews. In this paper, we focus on Business-to-Consumer (B2C) e-commerce: we extract consumer opinion data from a leading online marketplace in Indonesia and use text mining approaches to compare the rating evaluation with sentiment analysis of consumer reviews. With 2,937 records, we investigate the relationship between star rating and lexicon-based sentiment analysis. We find that most consumers do not hesitate to give a good evaluation, as indicated by a 5-star rating and positive review sentiment. The rating distribution is quite polarized, which indicates straightforward consumer opinion. However, on further examination of the relation between rating and review, we discover inconsistencies in consumer opinion: a good rating may be accompanied by a negative review. 
Our findings offer insight for building a more integrated consumer opinion indicator in e-commerce and suggest that online marketplace sellers need to look beyond the star rating at the detailed reviews.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Knowledge Management System in Official Statistics: An Empirical Investigation on Indonesia Population Census 2021-08-10T04:26:48+00:00 Achmad Muchlis Abdi Putra Arie Wahyu Wijayanto <p>National statistical offices around the world show a strong interest in producing reliable, objective, and accurate information in compliance with high professional and scientific standards. Such information provided by government agencies is known as official statistics. To support knowledge-based business processes and deliver high-quality public services, knowledge management systems (KMS) are required. In this work, we study the impact of embracing KMS in one of the largest statistical censuses in Southeast Asia, the 2020 Indonesia Population Census (IPC2020). Regression analysis is used, with perceived usefulness as the dependent variable and perceived ease of use as the independent variable. Our findings reveal that KMS utilization has a positive influence on the perceived ease of use and usefulness among the stakeholders and organizing personnel. 
This provides an incentive to widen the implementation and improve the system and infrastructure capability to better support knowledge-driven collaboration among stakeholders of the statistical office.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Optimization of Waste Transportation Routes using Multi-objective Non-dominated Sorting Genetic Algorithm II (MNSGA-II) in the Eastern and Southern Regions of Bandung City, Indonesia 2021-10-07T16:40:23+00:00 Natasya Afira Arie Wahyu Wijayanto <p>Ensuring high-quality and effective urban waste management is an important priority in achieving the sustainable and environmentally friendly cities and communities mandated by the Sustainable Development Goals (SDGs). The rapidly growing population in urban regions of developing countries, such as Bandung City, Indonesia, leads to an increasing volume of daily goods consumption and household waste production. The waste transportation route is one of the main determinants of the cost of waste management. In this paper, we introduce the Multi-objective Non-dominated Sorting Genetic Algorithm II (MNSGA-II) to solve the waste transportation route optimization problem in the Eastern and Southern Regions of Bandung City, Indonesia. Compared to existing traditional evolutionary algorithms, MNSGA-II offers three major benefits: efficient computational complexity, no requirement for sharing parameters, and an elitism mechanism. Algorithm parameters include the number of generations, mutation rate, and crossover rate. Our extensive experiments suggest that the best solution consists of 14 routes with a total distance of 152.63 km. 
Further, our proposed route optimization is potentially beneficial in supporting the improvement of the sustainable waste management service system in Bandung City.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Preserving Women Public Restroom Privacy using Convolutional Neural Networks-Based Automatic Gender Detection 2021-10-07T16:39:59+00:00 Desi Kristiyani Arie Wahyu Wijayanto <p>Personal safety and privacy are significant concerns for women who use public restrooms/toilets, especially in developing countries such as Indonesia. Privacy-enhancing designs are expected to ensure that no men enter the rooms, either intentionally or accidentally, without prior notice. In this paper, we propose a facial recognition approach to ensure women's safety and privacy in public restroom areas using a Convolutional Neural Network (CNN) model as a gender classifier. Our main contributions are as follows: (1) an automatic gender detection model for webcam feeds using CNN, which may further be connected to a security alarm; and (2) a publicly available gender-annotated image dataset that includes Indonesian facial recognition samples. Supplementary Indonesian facial examples are taken from the student photo datasets of a government-affiliated college, Politeknik Statistika STIS. The experimental results show a promising accuracy of up to 95.84% for our proposed model. 
This study could be useful for wider implementation in supporting the safety systems of public universities, offices, and government buildings.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Enhancing Official Statistics Data Dissemination using Google Firebase Platform on Mobile Application: A User-Centered Design Approach 2021-10-07T16:39:34+00:00 Wisma Eka Nurcahyanti <p>The United Nations Fundamental Principles of Official Statistics (UNFPOS) mandate that official statistics, as publicly available information, be highly accessible to all users. Recently, with an increasing volume of data and public demand, National Statistical Offices (NSO) including Statistics Indonesia (BPS) are being challenged to provide accurate, high-quality, and user-friendly information. In this paper, we introduce our attempt to enhance official statistics data dissemination by developing an Android-based mobile application using a User-Centered Design (UCD) approach to meet the requirements of specifically targeted users. The Google Firebase platform is used to improve administrator-level usability in updating the disseminated information. The proposed mobile application, called Batu Cadas (an acronym for BAca TUjuh CAtatan DAta Statistik), was launched at BPS-Statistics Madiun Municipality, East Java Province. 
Further evaluations using Black-Box functionality testing, the System Usability Scale, and a specific needs comparison conclude that the proposed mobile application is sufficient to cover the gap between user needs and the currently existing applications.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Study of Handwriting Recognition Implementation in Data Entry of Survei Angkatan Kerja Nasional (SAKERNAS) using CNN 2021-10-07T16:39:12+00:00 Yusron Farid Mustafa Farid Ridho Siti Mariyah <p>The use of Paper and Pencil Interviewing (PAPI) at BPS requires manual data entry, which depends on the human ability to recognize handwriting. For computers, handwriting recognition is a complex task that requires complex algorithms. The Convolutional Neural Network (CNN) is an algorithm that can accommodate the complexity of handwriting recognition. This research studies the implementation of a CNN-based handwriting recognition model for recognizing handwriting on the PAPI questionnaire in data entry activities. The handwriting recognition models were built on the EMNIST dataset, separately for each character type, and achieve 89% accuracy for mixed letters and numbers, 95% for letters, and 99% for numbers. Applying handwriting recognition to the questionnaire images shows good results, with 83.33% accuracy. However, problems were found in character segmentation: some characters are not segmented correctly because the stroke of one character runs into a character that should be separate, and some characters are split apart when they should be joined. 
The results of this study are expected to inform the data entry method that BPS uses in the future.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Bayesian Network Model to Distinguish COVID-19 for Illness with Similar Symptoms 2021-10-07T16:47:28+00:00 Emir Luthfi Arie Wahyu Wijayanto <p>Numerous diseases and illnesses exhibit similar physical and medical symptoms, such as COVID-19 and the illnesses it resembles (common cold, flu, and seasonal allergies). In this study, we construct a Bayesian Network model to distinguish such symptom variables in a classification task. The Bayesian Network model has been widely used as a classifier comparable to machine learning models. We develop the model with a score-based method and implement it using a hill-climbing algorithm with the Bayesian information criterion (BIC) score. Experimental evaluations on publicly available Mayo Clinic based data yield a Directed Acyclic Graph (DAG) that shows the relationship between the similar symptoms and the type of disease, together with the Conditional Probability Table (CPT). 
This model shows a promising accuracy of up to 93.14%, which is better than that of other machine learning classifiers, including the Support Vector Machine (SVM) and ensemble approaches such as Random Forest (RF), while slightly lower than that of neural networks (NN).</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Learning Bayesian Network for Rainfall Prediction Modeling in Urban Area using Remote Sensing Satellite Data (Case Study: Jakarta, Indonesia) 2021-08-15T04:03:14+00:00 Salwa Rizqina Putri Arie Wahyu Wijayanto <p>Rainfall modeling is one of the most critical factors in agricultural monitoring and statistics, transportation schedules, and urban flood prevention. Weather anomalies during the dry season in urban coastal areas of tropical countries, such as Jakarta, Indonesia, have become a challenging issue that causes unexpected changes in rain patterns. In this paper, we propose the Bayesian Network (BN) approach to model the probabilistic nature of rain patterns in urban areas and the causal relationships among their predictor variables. Rain occurrences are predicted using temperature, relative humidity, mean-sea level (MSL) pressure, cloud cover, and precipitation variables. Data are obtained from the remote sensing sources of the National Oceanic and Atmospheric Administration (NOAA) satellite for Jakarta in 2020-2021. We compare a score-based structure learning algorithm, Hill Climbing (HC), with hybrid structure learning algorithms for Bayesian Networks, namely Max-Min Hill Climbing (MMHC), General 2-Phase Restricted Maximization (RSMAX2), and Hybrid-Hybrid Parents &amp; Children (H2PC). Further, we compare the performance of the score-based model (Hill Climbing) under five popular scoring methods: Bayesian Information Criterion (BIC), K2, Log-Likelihood, Bayesian Dirichlet Equivalent (BDE), and Akaike Information Criterion (AIC). 
The main contributions of this study are as follows: (1) insight that hybrid structure learning algorithms for Bayesian Network models are either superior in performance or at least comparable to their score-based counterparts; and (2) a best-performing Bayesian Network model that is able to predict rain occurrences in Jakarta with a promising overall accuracy of more than 81 percent.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Classification of Paddy Growth Phase with Machine Learning Algorithms to Handle Imbalanced Multi-Class Big Data 2021-08-20T04:07:40+00:00 Hady Suryono Heri Kuswanto Nur Iriawan <p>The global Sustainable Development Goals (SDGs) adopted by countries around the world have significant implications for national development planning in Indonesia in the period 2015 to 2030. The agricultural sector is one of the most important sectors in the world and makes a very important contribution to achieving the goals. Accurate paddy production data must be available to measure the level of food security. This can be done by monitoring the growth phase of paddy and classifying its growth phase accurately and precisely. The paddy growth phase has 6 classes, and the number of members per class is usually unequal (imbalanced data). This study describes the results of classifying paddy growth phases with imbalanced data in Bojonegoro Regency, East Java in 2019 using machine learning algorithms on the Google Earth Engine (GEE) platform. Classification is done with Classification and Regression Tree, Support Vector Machine, and Random Forest. An oversampling technique is used to deal with the problem of imbalanced data. The Area Sampling Frame survey conducted by BPS in 2019 was used as the label source for classification model training. 
The results showed that the Random Forest algorithm trained on the oversampled dataset achieved an overall accuracy (OA) of 82.30% and a kappa statistic of 0.76, outperforming the SVM and CART algorithms.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Knowledge-based Utilization in Organizational IT Support. A Case Study at BPS-Statistics Indonesia 2021-08-22T07:29:18+00:00 Herlambang Permadi Dana Indra Sensuse <p>Employees experience many IT problems in carrying out daily government activities. These problems often disrupt government activities in providing services to the community. This study analyzes the IT problems that are often found in organizations and their impacts. A total of 43 people participated in the survey to identify which problems are often experienced and the impact they have had. The survey started with 7 IT service groups and produced 37 IT problems. The result is an implementation of a knowledge-based system that can help employees solve IT problems on their own in their work environment.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Estimation of Education Indicators in East Java Using Multivariate Fay-Herriot Model 2021-10-07T16:38:44+00:00 Novia Permatasari Azka Ubaidillah <p>Education is an important aspect of improving human resources. Education indicators at a low administrative level are needed as a basis for education planning in that region. The problem of small sample sizes when producing data at a low administrative level can be overcome by indirect estimation, namely Small Area Estimation (SAE). SAE is able to increase the effectiveness of the survey sample size by borrowing strength from neighbouring areas and using information from auxiliary variables related to the variables of interest. 
We conduct a simulation study to compare the multivariate model with the univariate model and apply the multivariate model to estimate three education indicators obtained from the National Socio-Economic Surveys by Statistics Indonesia. Simulation results are in line with previous studies, where the multivariate Fay-Herriot model with <em>p</em> variables has a smaller mean squared error (MSE) than the univariate model. The model implementation to estimate the Crude Participation Rate (APK), School Participation Rate (APS), and Pure Participation Rate (APM) also shows that the multivariate model produces a smaller RRMSE than the direct estimates. It can be concluded that the multivariate model is able to produce more efficient estimates than direct estimation and the univariate model.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Topic Modelling in Knowledge Management Documents BPS Statistics Indonesia 2021-09-13T15:22:14+00:00 Muhammad Yunus Hendrawan Nucke Widowati Kusumo Projo <p>Knowledge management is an important activity in improving the performance of an organization. BPS Statistics Indonesia has recently implemented such a system to improve the quality and efficiency of business processes. The purposes of this research are: 1) implementing topic modelling on the BPS Knowledge Management System to identify groups of document topics; 2) recommending the best topic modelling method; 3) building a topic modelling web service for BPS that includes a data preprocessing function and a topic group recommendation function. This study applies the Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) topic modelling methods to determine the best grouping technique for the knowledge management system at BPS Statistics Indonesia. The results show that the LDA model using Mallet is the best model, with 25 topic groups and a coherence score of 0.4803. 
The performance results suggest that the best modelling method is LDA. The LDA model is then successfully implemented in a RESTful web service that provides the preprocessing function and topic recommendations for documents entered into the BPS Knowledge Management System.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics What We Know from Telemedicine Data in Indonesia? Study case using Alodokter,, and Honestdocs 2021-08-29T08:04:08+00:00 Faza Nur Fuadina Nucke Widowati Kusumo Projo Siti Mariyah <p>Internet and technology development has reached various aspects of life in Indonesia, including the health sector through e-health. Telemedicine, as a form of e-health, was still rarely used among Indonesians because it is less established than e-commerce, which is more closely related to the economic sector. The COVID-19 pandemic has limited people's movement to get health care, which in turn has driven people to use telemedicine in Indonesia. This research aims to analyze telemedicine utilization in Indonesia and examine the health phenomena captured in the data. It uses descriptive analysis and text mining, with the Named Entity Recognition (NER) and Latent Dirichlet Allocation (LDA) methods, to determine the utilization of telemedicine. In addition, a literature review is used to identify the potential use of telemedicine data in collecting health statistics in Indonesia. The results show that telemedicine has been widely used in Indonesia. The clinical teleconsultation data and article titles on telemedicine produce various health topics. Therefore, telemedicine data can potentially be used as a source for collecting health statistics.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Do Tourist Attraction Objects Implement Health Protocols? 
Analysis of Tourist Attraction Object in East Java Province Using Google Maps Review 2021-08-31T08:59:04+00:00 Disya Pratistaning Ratriatmaja Nucke Widowati Kusumo Projo <p>The COVID-19 pandemic has impacted the tourism sector, particularly the Tourist Attraction Object (TAO) in Indonesia. This research aims to: (1) analyse the implementation of health protocols and facility conditions at TAO; (2) analyse the change in visitor sentiment and rating towards TAO before and during the COVID-19 pandemic; (3) analyse the strength of the relationship between ratings and the sentiment of visitor reviews on TAO; and (4) analyse the possibility of using web scraping data to complement tourism data from BPS Statistics Indonesia. Using Google Maps reviews, this research applies the Multinomial Naïve Bayes (MNB), Term Frequency-Inverse Document Frequency (TF-IDF), pseudo-labelling, and word association methods. The results show that the health protocol has been implemented in TAO of East Java province, the available facilities are good, and there is no change in reviews of TAO during the pandemic. The Stuart-Kendall Tau-c value shows a weak positive relationship between rating and review sentiment. Calculations based on the Haversine, Jaro-Winkler, and Levenshtein measures indicate that web scraping data can complement tourism data for BPS-Statistics Indonesia.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics R Package Development for Difference Benchmarking in Small Area Estimation Fay-Herriot Model 2021-08-26T13:08:01+00:00 Zaza Yuda Perwira Azka Ubaidillah <p>In recent decades, the use of small area estimation (SAE) for producing official statistics has been widely recognized by many National Statistics Offices including BPS-Statistics Indonesia. 
For official statistics, the aggregation of small area estimates is expected to be numerically consistent with, and more efficient than, the aggregation of the unbiased direct estimates, which the Fay-Herriot model cannot guarantee. Simulation experiments are performed to assess the behaviour of the difference benchmarking method for the Fay-Herriot model and to compare the mean squared error (MSE). The results show that the difference benchmarking method can produce an aggregation consistent with the direct estimation. Furthermore, an R package was built to make the method easier to use; it is already available on the CRAN website. The package has been evaluated using validity (simulation), performance, case study, and usability tests. These evaluations show that the package is suitable for use. The methodology is also applied to estimate average household per capita consumption expenditure in districts of D.I. Yogyakarta province, Indonesia, in 2019.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Entity Matching of Shop Accounts in Online Commerce Portals 2021-10-07T16:38:16+00:00 Dina Salsabila Takdir Takdir <p>Currently, online marketplace data are valuable data sources to be analyzed for various purposes. In the data collection phase, duplicate shop accounts were found, resulting in biased analysis. This study examines the development of a mechanism to identify duplicate entities, i.e. store accounts, across different online marketplaces, a task commonly known as entity matching. Word similarity algorithms were adopted as the core elements of our approach. Additionally, we present an entity matching model by examining logistic regression, naive Bayes, and random forest to find the best model for classifying store account similarities. 
Top online marketplaces in Indonesia are the object of our study, limited to one developing municipality, i.e. Sleman, DI Yogyakarta. The results show that the best model has an accuracy of 0.961, a precision of 0.963, a recall of 0.958, and an F1-score of 0.962. These results are therefore acceptable for duplicate identification.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Study of Search Algorithm Optimization from Multi-Version Data Warehouse using NoSQL Non-relational Database 2021-10-03T09:35:10+00:00 Lutfi Rahmatuti Maghfiroh Ramadhan Azizulhakim Yusuf <p>Statistics Indonesia, which produces large-scale data, requires effective and optimal storage. Previous research developed a Multi-Version Data Warehouse (MVDW) based on document-based NoSQL for BPS data storage and proposed an algorithm to store and search data. This paper examines algorithm optimization methods to reduce the time spent storing and searching data. The algorithm proposed in this paper focuses on the data storage process, suggesting a storage model that generalizes the coding of variables in the data warehouse so that later data searches can be carried out more easily and efficiently. Query optimization methods are also applied to support and improve the optimization of the proposed algorithm. 
The two optimization methods can be considered successful: after they are applied, the data search process takes less time than it does with the algorithms developed in previous research.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Development of Question Answering System for Public Relation Division in Student Admission 2021-09-22T17:57:56+00:00 Lutfi Rahmatuti Maghfiroh Wahyudi Syahputra Ibnu Santoso <p>Politeknik Statistika STIS (Polstat STIS) holds its new student admission (PMB) every year to gather, test, and select the applicants who want to continue their study at STIS. During this event, STIS establishes a committee named the Public Relation (PR) Division, which acts as an intermediary between STIS and the applicants. One of the PR Division's many tasks is to reply to all questions from applicants about administration, procedure, or other matters concerning PMB and STIS. The PR Division faces some problems that can hinder its performance of these tasks, and addressing them is the motivation for this research. The goal of this research is to build a web-based system that is capable of solving the problems of the current system. The system is divided into two main functions. The first is FAQ management by PR Division members. The second is a chatbot that automatically answers questions using the TF-IDF algorithm. 
Testing and evaluation conclude that the system fulfils all its requirements and is feasible to use.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Determining the Stopping Point on GPS Data Using Density Based Spatial Clustering of Application with Noise and Gaussian Mixture Model Cluster 2021-09-13T15:23:06+00:00 You Ari Faeni <p><span style="font-weight: 400;">GPS data is an interesting subject of research, and various studies have sought to extract information from it. In this paper, we propose a novel model for determining stopping points in GPS data for cases of human movement without transportation modes. This information can further be used to identify human behavior such as fraud and favorite spots. The GPS data used in this research are the travel data of SUSENAS survey officers recorded while updating the census block for 27 households. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Gaussian Mixture Model (GMM) clustering are used to build the model. The model is specified as a flowchart and applied to the collected GPS data. The results show that the stopping points generated using the DBSCAN cluster model are better than those generated using the GMM cluster model</span><strong>. 
</strong><span style="font-weight: 400;">Furthermore, the results of this study will be used to build a model of surveyor fraud.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Revisiting Local Walking Based on Social Network Trust (LWSNT): Friends Recommendation Algorithm in Facebook Social Networks 2021-09-02T16:36:22+00:00 Wahidya Nurkarim Arie Wahyu Wijayanto <p><span style="font-weight: 400;">In recent decades, the internet penetration rate and the number of online social network users have grown very fast. An online social network, such as Facebook, is a platform where one can find friends without having to meet face to face. A social network is represented by a large graph because it involves many participants. Hence, it is hard to find potential friends who share the same thoughts and interests. The Local Walking Based on Social Network Trust (LWSNT) algorithm is one of the popular algorithms for social friend recommendation. This study re-examines whether the correlation between attributes produces mismatched rankings in different cases (with and without correlation). We assess the performance of LWSNT in Facebook networks in a supervised manner by comparing its F-score against similar methods. Using Kendall’s tau correlation, the results show that the correlation of attributes has no significant effect on the order of friend recommendations. In addition, LWSNT performs somewhat worse than the Common Neighbors algorithm and the Jaccard index. 
</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Mapping of the Reading Literacy Activity Index in East Java Province, Indonesia: an Unsupervised Learning Approach 2021-10-03T11:55:53+00:00 Harun Al Azies Ayu Febriana Dwi Rositawati <p><span style="font-weight: 400;">One of the educational problems faced by East Java Province is the low reading culture of the community. The level of reading culture can be indicated by the Reading Literacy Activity Index (Alibaca Index). The Alibaca Index of East Java is only 33.19, which falls in the low category. This research therefore uses the indicators that compose the Alibaca Index to group the regencies/cities in East Java Province. The analysis uses one of the unsupervised learning algorithms, namely the K-Means algorithm. Grouping the regencies/cities with K-Means on these indicators divides East Java Province into 3 clusters, the optimal number of clusters according to the elbow and silhouette methods. Cluster 1 consists of 20 regencies and cities, cluster 2 consists of 10 regencies, and cluster 3 consists of 8 cities. 
Each cluster has different characteristics. Cluster 1 has the lowest skill dimension, while cluster 2 dominates the access, alternative, and cultural dimensions. Cluster 3 does not dominate any of these 3 dimensions, which makes it the government's priority for improving reading activities. The results of the analysis can therefore help the government develop strategic policies to achieve educational equity, especially concerning literacy levels, based on the characteristics of each regency/city in East Java Province.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Antisocial Behavior Monitoring Services of Indonesian Public Twitter Using Machine Learning 2021-09-13T15:19:34+00:00 Fitri Andri Astuti <p><span style="font-weight: 400;">Antisocial behavior is a personality disorder characterized by repetitive actions that violate social norms, deceit and lying, impulsiveness, irritability and aggression, reckless disregard for the safety of oneself and others, consistent irresponsibility, and lack of remorse. Its causes include genetics, psychological conditions, interactions in the environment, and poor parenting. In social life, antisocial behavior can make people aggressive and lead them to act without feeling guilt for their actions. Thus, monitoring of antisocial behavior disorders is needed so that it can serve as a warning for the public to be more concerned about the difficulties experienced by each other. The availability of tweet data through the Twitter API opens up opportunities for monitoring antisocial behavior. 
Traditional machine learning and deep learning methods offer an opportunity to automate the labeling of Twitter data that contains elements of antisocial behavior. Based on the problems and opportunities described, this study proposes a multi-class classification monitoring service to identify public antisocial behavior on Twitter Indonesia using machine learning.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Application of Resampling and Boosting Methods Using the C5.0 Algorithm 2021-10-03T12:04:23+00:00 Hedi Kuswanto Nurtiti Sunusi Siswanto Siswanto Nirwan Nirwan <p><span style="font-weight: 400;">Hypertension is a non-communicable disease that is characterized by an increase in systolic and diastolic blood pressure of more than 140 mmHg and/or 90 mmHg. Hypertension needs more attention because it causes complications in the target organs and shows no significant symptoms in its early stages, which is why it is called a "silent disease". The study discusses the integration of resampling and boosting in predicting hypertension status using the C5.0 algorithm. Applying resampling to classification with the C5.0 algorithm improves specificity and AUC. Random oversampling (ROS) increased the specificity by 95.67% and AUC increased by 91.11%. Random over-under sampling (ROUS) increased specificity by 88.84% and AUC increased by 87.13%. In addition, applying boosting to the C5.0 algorithm after resampling increases accuracy. Random oversampling (ROS) increased accuracy by 93.86% and random over-under sampling (ROUS) increased accuracy by 89.98%. The response variables that contributed the most were high cholesterol and heart problems. 
With resampling and boosting applied, high cholesterol and heart problems always topped the list of contributing variables.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Development of Student's Dropout Early Warning System Using Analytical Hierarchy Process 2021-09-24T01:16:03+00:00 Naflah Ariqah Yunarso Anang <p><span style="font-weight: 400;">As a higher education institution, Politeknik Statistika STIS faces the same problems as universities in general: students failing to complete a year's courses and thus having to repeat them, or students dropping out. To overcome this problem, this research proposes a </span><em><span style="font-weight: 400;">Dropout Early Warning System</span></em><span style="font-weight: 400;"> (DEWS) that can provide early warnings for dropping out and repeating a class. It is hoped that this system can help institutions identify students who have the potential to drop out or repeat a class. The purpose of the system is to help academic supervisors and decision makers at Polstat STIS recognize this potential among students. The potential for students to drop out and repeat a class is measured by a potential score obtained from an assessment of 5 criteria: GPA scores, gender, economic factors, violation points, and record of repeating a class. Prediction results are presented in three categories, namely low potential, medium potential, and high potential, which are calculated from weights obtained using the </span><em><span style="font-weight: 400;">Analytical Hierarchy Process</span></em><span style="font-weight: 400;"> (AHP). 
The system is tested and verified using a </span><em><span style="font-weight: 400;">Black Box</span></em><span style="font-weight: 400;"> test, and the calculation method is evaluated using a </span><em><span style="font-weight: 400;">confusion matrix</span></em><span style="font-weight: 400;">. Based on the test results, the functions in the system work properly and meet the requirements.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Construction of Smart City Development Index in Indonesia 2021-10-07T00:49:24+00:00 Nabil Miftah Irfandha <p><span style="font-weight: 400;">Development in urban areas requires city management to solve problems that occur because of high population growth. The complexity of the issues in urban areas varies widely, including a decrease in the quality of public services, reduced availability of residential land, congestion on the highway, excessive energy consumption, waste accumulation, increased crime rates, and other social problems. City assessment tools can be used as support for decision-making in urban development as they provide assessment methodologies for cities to show progress towards defined targets. In the 21</span><span style="font-weight: 400;">st</span><span style="font-weight: 400;"> century, there has been a shift from sustainability assessment to developing smart cities. The construction of the Smart City Development Index (SCDI) is considered capable of providing a basis for formulating effective and efficient solutions in reducing existing city problems. The purpose of this study is to give a general description and identify the factors that form SCDI; obtain the results of SCDI measurements; examine the uncertainty analysis and sensitivity analysis of SCDI; and examine the relationship between SCDI and HDI (Human Development Index). 
Based on the results of factor analysis, six factors are formed. The highest SCDI among cities with a population of fewer than 200,000 people is in Madiun City (East Java Province), the highest SCDI among cities with a population between 200,000 and 1,000,000 people is in Yogyakarta City (DI Yogyakarta Province), and the highest SCDI among cities with a population of over 1,000,000 people is in Tangerang City (Banten Province). The results of uncertainty analysis and sensitivity analysis show that the formed SCDI is robust and reliable. In general, SCDI has a positive relationship with the Human Development Index (HDI). The construction of this index aims to help local and central governments review policies regarding the distribution of funds so that smart city development is in accordance with existing conditions.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Sentiment Analysis on PeduliLindungi Application Using TextBlob and VADER Library 2021-09-22T17:41:42+00:00 Fathonah Illia Migunani Puspita Eugenia Sita Aliya Rutba <p><span style="font-weight: 400;">The Covid-19 virus has caused a global pandemic, including in Indonesia. The Government has made various efforts to reduce the negative impact of this pandemic, one of which is the PeduliLindungi application. This research was conducted to obtain public sentiment towards the application using Twitter data. The data collection period is from August 31 to September 7, 2021; this period was chosen due to the emergence of news regarding vaccine data leaks associated with data leaks in the PeduliLindungi application. Sentiment analysis is carried out using the TextBlob and VADER libraries. The results of this sentiment analysis are sufficient to display public opinion, and it is hoped that decision makers can improve the application based on these opinions. 
It was also found that the VADER library can be considered better for sentiment analysis in this research because its lexicon approach is based on social media.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Detection of Public Sentiment Analysis Model on the Implementation of PPKM in Indonesia 2021-09-29T11:23:33+00:00 Renata Putri Henessa Muhammad Al-Fath Fisabilillah Windy Rahmatul Azizah <p><span style="font-weight: 400;">The Covid-19 pandemic, which has been a serious problem in Indonesia, has indirectly forced the Indonesian government to issue policies to decrease the spread of Covid-19. One of the policies is the Implementation of Restrictions on Community Activities (PPKM) in the Java-Bali region from January 11-25, 2021. Due to its continued implementation, this policy raises pros and cons in the community. This research aims to determine the best classification model and the effect of adding feature engineering in analyzing public sentiment on PPKM using data scraped from Twitter, so that public responses to PPKM can be classified automatically with the best model. The scraped Twitter dataset is first preprocessed, which includes case folding, tokenizing, filtering, stemming, and term weighting to clean the data. After preprocessing and the analysis steps, we conclude that using feature engineering can increase the accuracy of the four best selected models. Logistic regression with feature engineering, with an accuracy rate of 87.50%, is the best method. 
In conclusion, the best suggested model to analyze public sentiment using Twitter scraping towards </span><em><span style="font-weight: 400;">PPKM </span></em><span style="font-weight: 400;">is logistic regression.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics AMDA: Anchor Mobility Data Analytic for Determining Home-Work Location from Mobile Positioning Data 2021-10-03T12:12:53+00:00 Amanda Pratama Putra Wa Ode Zuhayeni Madjida Ignatius Aditya Setyadi Amin Rois Sinung Nugroho Alfatihah Reno MNSP Munaf <p><span style="font-weight: 400;">In conducting a mobility analysis using Mobile Positioning Data, the most critical step is to define each customer's usual environment. The initial concept of mobility used is the movement that occurs from and to every usual environment, so errors in determining the usual environment will cause incorrect mobility statistics. Therefore, Anchor Mobility Data Analytic (AMDA) is proposed for Home-Work Location Determination from Mobile Positioning Data. This algorithm uses clockwise reversal to make it easier to classify someone in their usual environment. Unfortunately, only about 80% of the raw data can be used to establish usual environments. The remaining 20% do not have sufficient data history. This study found that the accuracy of AMDA in determining monthly home location was 98.8% at the provincial level and 88.7% at the regency level. 
For the determination of monthly work locations, the accuracy was 98.9% at the provincial level and 70.4% at the regency level.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics SMOTE and Nearmiss Methods for Disease Classification with Unbalanced Data 2021-09-22T17:51:35+00:00 Anas Rulloh Budi Alamsyah Salsabila Rahma Anisa Nadira Sri Belinda Adi Setiawan <p><span style="font-weight: 400;">Unbalanced data are often encountered in practice. They complicate the search for a model suitable for classification. This is because the number of individuals who have a history of a disease is less than the number of individuals who do not. We analyse the IFLS 5 data on medical history of a set of patients. We split the dataset in the proportion 80:20 into training and test subsets. Of course, both datasets are unbalanced, with only a small minority of patients who had a stroke. We apply the SMOTE and Nearmiss methods and evaluate the rate of correct classification. After being treated using the two methods, the training data was transformed into balanced data. Classification is then carried out to compare the effectiveness of the two methods in solving the problem of unbalanced data. Based on the results obtained, it can be concluded that the Nearmiss method is better than SMOTE in balancing the data. This conclusion was obtained by comparing several measures such as accuracy, F-score, Kappa, sensitivity, and specificity for the SMOTE and Nearmiss methods. </span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Satu Data Indonesia in Sectoral Statistics: Concept of Satu Data Metadata Framework (SDMF) 2021-10-07T16:34:22+00:00 Hakiki Sandhika Raja Chaidir Arsyan Adlan <p><span style="font-weight: 400;">Satu Data Indonesia is a policy devised to address the problem of inadequate data governance in Indonesia. 
This policy sets 4 main principles, namely metadata, data standards, reference codes, and interoperability, as metrics of success in its implementation. In this study, we analyze the Satu Data Indonesia implementation in Kutai Timur Regency. We found that the integration of the Satu Data principles is challenging to apply technically because sectoral data in Indonesia has 2 characteristics based on the preparation of the list of data needs, namely the centralized data list and the decentralized data list. A decentralized data list is a list of data that is partially prepared in each agency without any coordination with other stakeholders for completing the Satu Data principles. To accommodate this condition, we design the Satu Data Metadata Framework (SDMF), a data standard framework that is in accordance with the conditions of data governance in Indonesia. SDMF utilizes the contextual layer and discovery layer of metadata to provide a temporal attribute called Satu Data Resource Identifier (SDRI) for integration purposes.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Analysis of Rice Field Cluster in Indonesia as an Evaluation of Food Production Availability Using Fuzzy C-Means 2021-10-03T13:35:46+00:00 Heru Setiono Totok M Dianto <p><span style="font-weight: 400;">The area of rice fields in Indonesia is getting narrower every year with the rampant construction of housing and buildings. This lowers the availability of food production, so rice has to be imported from other countries to meet demand. Clustering rice fields can serve as evaluation material to increase food production in Indonesia so that the need for rice imports can be minimized. The method used to group rice fields is the Fuzzy C-Means method, implemented in the KNIME tool with training and testing data. 
The Fuzzy C-Means program produces three data groups/clusters, namely wide, moderate, and narrow rice fields. The results of the clustering show that the areas with the most potential for food production from rice fields are East Java, Central Java, and West Java.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics The Best K-Exponential Moving Average with Missing Values: Gold Prices in Indonesia, Saudi Arabia, and Turkey during COVID-19 2021-08-26T02:30:54+00:00 Fadhlul Mubarak Atilla Aslanargun Ilyas Siklar <p>The gold price data for Indonesia, Saudi Arabia, and Turkey have missing values on weekends, so imputation was carried out to solve this problem. The missing values were imputed by replacing NAs with the latest non-NA value, a method also known as last observation carried forward (LOCF). This study selected the best k-exponential moving average based on the smallest mean absolute percentage error (MAPE) from simulations. The 2-exponential moving average was the best for gold prices with missing values in Indonesia, Saudi Arabia, and Turkey during COVID-19, while the largest MAPE values differ for each country.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics A Simple Approach using Statistical-based Machine Learning to Predict the Weapon System Operational Readiness 2021-09-18T04:09:24+00:00 Arwin Datumaya Wahyudi Sumari Dimas Shella Charlinawati Yuri Ariyanto <p>Weapon system operational readiness is a critical requirement to ensure combat readiness and guarantee the sustainability of state defense over time. 
Weapon systems are operated only by the military, and their readiness is programmed every year based on factors such as the amount of the allocated budget, the weapon system strength, and its circulation. Usually, weapon system readiness is programmed based on the planner’s experience, which is passed down over time. In this research, we propose a simple approach using a statistical-based machine learning method called linear regression to help the planner predict weapon system operational readiness given its affecting factors such as scheduled and unscheduled maintenance. We used a dataset of randomized primary data covering 5 years, from 2016 to 2020, to predict 2021. To assess the performance of the model, two measurements are used: Mean Absolute Percentage Error (MAPE) to measure its accuracy and goodness, and R-squared (R<sup>2</sup>) to measure how strongly the independent variables, the weapon system circulation, influence the dependent variable, the weapon system readiness. From the measurement results, the models, in general, achieve a MAPE of 1.99%, which is interpreted as a very accurate prediction, with an accuracy of 98.02%. In addition, the system achieves an R<sup>2 </sup>of 84.15%, which means the independent variables together have a strong influence on the dependent variable. The higher the value of R<sup>2 </sup>the better the model is. 
Our research concludes that linear regression is the proper machine learning model for predicting the weapon system operational readiness.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Estimation of Air Pollutants using Time Series Model at Coalfield Site of India 2021-09-18T04:27:04+00:00 Arti Choudhary Pradeep Kumar <p class="Abstract" style="margin-bottom: 0cm;"><span style="font-family: 'Times New Roman',serif;" lang="EN-GB">Assessment of air pollutants and air quality is an intricate task because of their dynamic nature, unpredictability and high inconsistency in space and time. In this study, a time series moving average (MA) model is employed to estimate air pollutants (PM<sub>2.5</sub>, PM<sub>10</sub>, NO<sub>2</sub>, NO<sub>X</sub>, O<sub>3</sub>, SO<sub>2</sub> and CO) over the coalfield site of India. The estimated O<sub>3</sub> with Adj. R<sup>2</sup> = 0.958 was identified as the most accurate estimation, followed by the other estimated pollutants. However, results for the estimated PM<sub>2.5 </sub>(Adj. R<sup>2</sup> = 0.950) and NO<sub>2</sub> (Adj. R<sup>2</sup> = 0.949) were almost similar to the results for O<sub>3</sub> (Adj. R<sup>2</sup> = 0.958). The estimated CO, with Adj. R<sup>2</sup> = 0.887, was the lowest among all the estimated pollutants but was still estimated very well. 
The results of the study demonstrate that the MA model permits us to precisely estimate daily pollutant concentrations for different sites in India.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics The Effect of Human Capital Inequality on Income Inequality: Evidence from Indonesia 2021-09-02T02:46:29+00:00 Hafizh Meyzar Aqil Dwi Wahyuniati <p class="Abstract" style="margin-bottom: 0cm;">Education inequality in Indonesia shows a downward trend, which indicates that education is more equally distributed from year to year. This phenomenon should lead to a reduction in income inequality. However, income inequality in Indonesia has increased compared to 9 years ago. This study intends to look at the human capital inequality condition in provinces in Indonesia and analyze the effect of human capital inequality on income inequality. The Gini coefficient concept is used to measure human capital inequality and income inequality. The annual panel data cover 34 provinces in Indonesia from 2015 to 2019. The analytical method is dynamic panel data regression using the Arellano-Bond Generalized Method of Moments (GMM) approach. The results indicate that income inequality with a lag of 1 year, literacy rate, and trade openness have a negative and significant effect on income inequality. Furthermore, human capital inequality and the average years of schooling have a positive and significant effect on income inequality. 
So, to reduce income inequality, policymakers are advised to minimize human capital inequality, especially in the education sector, by paying attention to conditions in priority provinces.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics The Effect of Shifting Large and Medium-Sized Industry Agglomeration on the Economic Development in Kanti Region in 2003-2018 2021-09-01T12:21:39+00:00 Ahmad Firman Maulana Ekaria Ekaria <p>The real Gross Domestic Regional Product (GDRP) 2010 of all cities in Kanti region increased during 2003-2018. However, the aggregate growth rate slowed during the period 2010-2018. One of the causes is the shift of large and medium-size industry (LMI) agglomeration from Kanti region to Kangga region. This study aims to find out the location and the dynamics of the shift of LMI agglomeration using the Hoover-Balassa index, presented through thematic maps. In addition, the study analyses the effect of the shift of LMI agglomeration and other factors on economic growth in Kanti region using panel data regression. The individual units used are five administrative cities in the Kanti region with annual units from 2003 to 2018. A fixed effect model with seemingly unrelated regression (FEM-SUR) is used to estimate the parameters of the economic growth model in Kanti region. The results showed that Kanti region was agglomerated in North Jakarta and East Jakarta. The labor-intensive potential factor has a negative and significant effect, while the labor productivity of LMI and domestic investment have a positive and significant effect on economic growth in Kanti region. Despite the shift of LMI agglomeration, North Jakarta was still able to increase its economic growth, while that of East Jakarta decreased. 
So, the Provincial Government of Jakarta needs to adapt the implementation of LMI agglomeration in North Jakarta to encourage economic growth in East Jakarta and West Jakarta in accordance with regional spatial planning for industry.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Covid-19 Vaccination: Health and Economic Correlations 2021-09-14T03:48:39+00:00 Zaky Musyarof Indira Nur Qomari <p>The vaccination program is an important strategy in eradicating the Covid-19 pandemic. Vaccination can accelerate the formation of herd immunity. Once herd immunity is formed, it is believed that the Covid-19 virus will gradually be eradicated and economic activity will return to normal. So, has the vaccination program run by the Indonesian government had an impact on health and economic recovery? Some claim that this vaccination program has had a positive impact. However, in-depth research is needed to really examine this impact. As a first step, it is necessary to look at the relationship between vaccination, health and economic development. This relationship will be an early indication of whether the vaccination program is successful or not. We find that vaccination was strongly correlated with a decrease in the transmission of new cases and moderately correlated with the recovery rate. Overall, vaccination is strongly correlated with health based on the canonical correlation. Meanwhile, for the economy, vaccination has a weak correlation with the poverty rate and Gini ratio. However, overall based on the canonical correlation, vaccination is strongly correlated with the economy. 
Furthermore, the development of tourism shows an indication of a correlation with vaccination.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics The Effect of the Digital Economy on Indecent Work in Indonesia 2019 2021-09-18T02:58:26+00:00 Yuniar Putri Awaliyah Risky Nucke Widowati Kusumo Projo <p>The emergence of the digital economy is indicated to affect the employment sector. The job opportunities created by the digital economy may lead workers into poor jobs that are full of risks and constitute indecent work. This study aims, first, to describe the digital economy and indecent work conditions in Indonesia; second, to investigate the direct influence of infrastructure and digital media on the digital economy; and third, to examine the direct impact of the digital economy on indecent work. The data used is secondary data with observations from 34 provinces sourced from BPS and other ministries. Using the SEM-PLS analysis method, the results show that infrastructure and digital media positively impact the digital economy. Similarly, the digital economy, reflected by e-commerce sellers and buyers, has a positive and significant relationship to indecent work as reflected by Employment Excessive Working Time (EEWT), Precarious Employment Rate (PER), and non-union workers. It can be said that the growth of the digital economy influences the conditions of indecent work.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Does Palapa Ring Project Infrastructure Bridging Connectivity and Economic Activity? 
2021-10-01T09:30:23+00:00 Realita Eschachasthi Taly Purwa Diyang Gita Cendekia <p>This study examines the impact of the Indonesian Palapa Ring Project (PRP) infrastructure on connectivity and economic activities in 46 districts in the West, Central, and East packages of PRP in 2015-2020. Connectivity is internet activity measured by the percentage of internet use, and economic activity is measured by Gross Regional Domestic Product (GRDP). A fixed effect staggered difference-in-difference is used to analyze the panel data obtained from Badan Pusat Statistik (BPS)-Statistics Indonesia. An examination of parallel trend assumptions, a robustness check, and a heterogeneity analysis are also presented. The results show that PRP infrastructure has a positive and significant impact on connectivity, yet has no significant effect on economic activity. In response to the findings, policy should be designed to intensify the coverage and quality of the internet; proliferate Information Communication Technology (ICT) facilities in rural areas; and expand education and digital literacy programs.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Study of Exchange Rate Volatility and Its Effect on Indonesian Economic Indicators With Potential Exchange Rate Crisis 2021-10-02T16:47:08+00:00 Adin Nugroho Nasrudin Nasrudin <p>Exchange rate volatility occurs when the exchange rate fluctuates wildly, which can reflect uncertainty. Since Indonesia has an open economy, it is important to keep exchange rate fluctuation under control due to its crisis potential. This research was conducted to analyze the impact of exchange rate volatility on the Indonesian economy in general and in a few related cases using time series analysis. 
ARIMA (<em>Autoregressive Integrated Moving Average</em>) and EGARCH (<em>Exponential Generalized Autoregressive Conditional Heteroscedasticity</em>) were used to measure the volatility in the period 1997-2021. Then, regressions were applied to analyze the impact of exchange rate volatility on a few macroeconomic indicators. The results show that exchange rate volatility had a significant negative effect on GDP growth rate, exports, and imports. Logistic regression was used to analyze the factors affecting the crisis potential. The results show that only a negative GDP growth rate and high volatility raised the risk that could lead to a crisis. Therefore, it is important to keep exchange rate volatility stable.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Short-Term Forecasting of Air Travellers Outflows from Bali Using Web Search Data 2021-09-29T03:56:47+00:00 Parma Dwi Widy Oktama <p>The number of air travelers has become one of the strategic indicators in the transportation sector. The official data is released by Statistics Indonesia (BPS) with a thirty-day lag, so the condition of this indicator cannot be known in real time. Using web search data, which has been evolving rapidly in recent years, this study explores the possibility of short-term forecasting to learn the general outlook of the indicator earlier. In this study, web search data and official statistics figures show a strong correlation and similar movement patterns over time. Applying web search data as a predictor in time series modeling, especially in time series regression and autoregressive models (SARIMA and SARIMAX), yields predicted values that closely approach the actual values of the response variable. In addition, the use of web search data is shown to increase model accuracy. 
The analysis results using the SARIMAX model show that the number of air traveller outflows from Bali in September and October 2021 will generally be higher than the number in August 2021. The increasing number of air travelers is thought to be due to a decrease in Covid-19 cases, which has restored the public's confidence in travelling.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Determinants of Service Export in ASEAN Member Countries 2021-09-01T03:31:17+00:00 Pidyatama Putri Situmorang Neli Agustina <p>The ASEAN economy has undergone a shift, as has the structure of the global economy. The shift has been from the agricultural sector holding the main share to the industrial sector and the service sector. From year to year, the ASEAN service sector continues to experience positive growth with an increasing contribution beyond that of other sectors. ASEAN's service exports grew faster than ASEAN's goods exports. However, several ASEAN member countries still experience a trade deficit in services, which is interesting to investigate further. This study aims to analyze the performance of service exports in 10 ASEAN member countries from 2010 to 2019. The results of panel data regression using the WLS fixed effect model show that foreign direct investment, nominal exchange rate, gross domestic product, services value added, gross domestic product, labor force, human capital and communication facilities have a significant effect on ASEAN's service export. 
The development of communication technology, development of human resources and updating of important policies are considered by the government to improve the performance of service exports of ASEAN member countries.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Spatial Panel Data Approach on Environmental Quality in Indonesia 2021-09-29T02:53:29+00:00 Debita Tejo Saputri Anugrah Alief Pratama <p>Indonesia adopted a strategic long-term development plan (2005-2025) targeting to achieve a green and everlasting Indonesia through implementing various environmental policies. One of the mandatory matters for governments is to continue environmental control by constructing Environmental Quality Indexes (EQI). This study focuses on the relationship between regional output or real Regional GDP, level of population density, and the government expenditure on environment quality on EQI in 34 provinces in Indonesia by the time period 2015 to 2019 using a spatial panel data approach. Within the context of spatial modeling, the interaction between provinces depends on their geographical location and condition. Using the geographic information system (GIS) and stata attributes, the coordinates and distances can be mapped and then defined for observation units in space via the spatial weight matrix used. From the perspective of spatial geography, this paper verifies the spatial dependence of Indonesia’s Environmental Quality Index (EQI). Pesaran's CD test indicates the spatial effect on the model and SAR with random effect can be considered a better-fitting spatial panel regression model. The results of the econometric spatial panel using SAR panel with random effect analysis show that Indonesia’s EQI in the provinces was dependent on the spatial. It was also found that regional GDP has a significant and negative effect on EQI and population density has a negative and significant effect on EQI. 
Meanwhile, fiscal policy in the environmental area aimed at improving environmental quality did not pass a significance test. Thus, it is recommended to look for ways to promote green growth in the country.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Working Age Population and CO2 Emissions in Indonesia: 2021-10-04T02:00:13+00:00 Mirta Dwi Wulandari <p><span style="font-weight: 400;">An increase in the working age population causes an increase in consumption, which in turn will have an impact on increasing CO₂ emissions. The household is an element that must be responsible for increasing emissions of greenhouse gases because of its fossil fuel consumption. This study aims to observe the relationship between the working age population and CO₂ emissions in households. This study uses data from the National Socio-Economic Survey (Susenas) 2019 with households consuming gasoline / diesel / kerosene for transportation, and LPG / kerosene for cooking as the unit of analysis. Apart from working age population as the main independent variable, socioeconomic characteristics (household size, income, residential area, poverty, age, sex, education, employment status, and access to modern fuels) are also used as control variables. Multiple regression analysis was used in this study. The results show that the working age population variable is positively correlated with total CO₂ emissions, transportation-related emissions, and cooking fuel emissions. Households dominated by members of working age (15-64 years) emitted 8.7%, 12.7%, and 3.2% more, respectively, than households dominated by members of non-working age (0-14 years and/or 65+ years). Providing a sustainable transport system can be the best solution to reduce CO₂ 
emissions.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics The Potential of Palm Oil as An Economic Recovery in Central Kalimantan in The Era of A Pandemic: Typology Klassen Analysis 2021-09-30T09:06:59+00:00 Annisa Nur Fadhilah Lita Jowanti Atika Kautsar Ilafi <p class="Abstract" style="margin-bottom: .0001pt;"><span lang="EN-GB">Palm oil is one of Central Kalimantan's leading commodities. With a plantation area of almost two million hectares, Central Kalimantan is capable of producing up to eight million tons of palm oil annually. During the pandemic, Central Kalimantan's economy experienced its deepest contraction, of up to 3.17 percent, due to restrictive policies to prevent the spread of the virus. According to Statistics Indonesia, the agriculture, forestry, and fisheries sectors are the most resilient sectors because they can grow positively amid a pandemic. The palm oil commodity could be a solution for boosting the economy of Central Kalimantan through appropriate management strategies. One strategy for recovering from the impact of the pandemic is innovation by Small and Medium Enterprises (SMEs). Based on the Klassen Typology analysis, Pulang Pisau Regency has the biggest potential for developing oil palm SMEs (quadrant I). In addition, Palangka Raya City and Kapuas Regency are in quadrant IV, which means they have the highest number of SMEs. However, their economic growth has contracted.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Analysis of Government Policy in Handling Covid-19 in Indonesia 2021-09-29T09:13:19+00:00 Atika Kautsar Ilafi Annisa Nur Fadhilah Lita Jowanti <p>The Covid-19 pandemic has affected the economy in many countries, including Indonesia. 
Until July 2021, the Government has implemented social activity policies for the community, starting from Large-Scale Social Restrictions in the first semester of last year to PPKM Level 4 to stop the spread of Covid-19. Responding to the Covid-19 pandemic, Google released data from people who access google applications using mobile devices. The Google Mobility report shows changes in population activity and mobility in several locations. This study aims to examine the effect of the PSBB and PPKM policies in Indonesia on the decline in COVID-19 cases in Indonesia using the Google Mobility Index and their impact on the economy in Indonesia. The analysis uses graphs and Pearson Correlation and Long Short-Term Memory (LSTM) method to predict Covid-19 cases and mobility data. The result shows that the mobility of people to five places has a significant effect on the number of daily cases of Covid-19, while there is a significant effect on three places of community mobility on Indonesian economic. As the results, controlling the spread of Covid-19 is better prioritized than economic condition.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Estimating Customer Lifetime Value in the E-Commerce Industry Using Multivariate Analysis 2021-09-27T00:05:21+00:00 Bagaskoro Cahyo Laksono Ika Yuni Wulansari <p>Companies can develop their business using big data to support decision-making. Big data in the e-commerce industry that includes size and speed of high transactions can be used to analyze customer behaviour and predict customer value. Nowadays, companies are starting to develop customer-oriented rather than product-oriented business interests. One way that can be used to determine customer value is by calculating Customer Lifetime Value (CLV). By knowing CLV at the individual level, it will be useful to help decision-makers to develop customer segmentation and resource allocation. 
It is important to do segmentation or customer grouping that describes customer loyalty groups. Therefore, this research aims to calculate CLV and customer segmentation using the RFM analysis method. The dimensions forming CLV include the values of Recency, Frequency, and Monetary. In this study, concepts from multivariate statistical analysis are applied, namely K-Means Clustering and factor analysis. Segmentation is done to determine the level of customers. The higher the CLV value, the more valuable the customer is to retain. In the end, the customer segmentation method built by the authors can be used to optimize a company's strategy to get maximum profit. This method can be applied to various cases and other companies.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Analysis of The Impact of The Covid-19 Pandemic on The Performance of Indonesian Non Oil and Gas Exports 2021-09-29T03:21:40+00:00 I Gusti Bagus Ngurah Diksa Dewa Ayu Srijayanti <p>In 2020, Indonesia's exports decreased by 2.61 percent due to declining global and domestic demand during the COVID-19 pandemic. The decline in exports was not too deep due to the increase in non-oil exports by 16.73 percent, while non-oil exports fell by 10.10 percent. This shows the potential for non-oil exports to support the Indonesian economy during the pandemic. The ARIMA method was then used to assess the impact of COVID-19 on export performance. Based on the research, it was found that at the beginning of the COVID-19 pandemic, Indonesia experienced a slump in export performance, especially in non-oil and gas exports. 
This is due to various policies regarding restrictions on mobility.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics The Impact of Domestic Investment, Foreign Investments, HDI, Export, and Import on the Economic Growth in Indonesia 2021-09-16T02:21:29+00:00 Lutfia Septiningrum Paramita Dewanti Fauzah Hikmawati <p class="Abstract" style="margin-bottom: .0001pt;"><span lang="EN-GB" style="font-family: 'Times New Roman',serif;">The aim of this study was to examine the causal relationship between domestic investment, foreign investment, exports, imports, and HDI and their impact on Indonesia's economic growth as measured by GDP. The data used were panel data from 18 provinces in 2016-2020, selected by stratified random sampling. The model used to fulfil the purpose of this research was panel data regression. The results of the analysis show that economic growth based on the value of GDP in each province tends to decline. Economic growth in Indonesia was modelled using panel data regression. In this research, the Hausman test is used to obtain the best panel data regression model, since the candidate models include the Random Effect Model. Based on the simultaneous test, at least one variable is significant in the model, and based on the partial tests, GDP was significantly influenced by the variables FDI, DDI, HDI, and import sectoral value. The export variable has an effect on GDP, but it is not significant. The R<sup>2</sup> is 98.9%.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Variables Affecting Eligible Women in Poor Households to Smoke in Indonesia 2017 2021-09-16T02:16:47+00:00 Maghfira Ramadhani Risni Julaeni Yuhan <p>Smoking is one of the threats to public health. Cigarette consumption affects not only a person's health but also social behavior. 
Smoking behavior in women, especially eligible women (15-49 years old) threatens women’s reproductive health and the condition of the fetus in the womb during pregnant, which may get worse in poor households. Aside from that, cigarette consumption in Indonesia occupies the second position in food consumption with a portion of 12.17 percent. Therefore, the purpose of this study is to examine the variables that affect eligible women in poor households to smoke in Indonesia. The sources of the research data are the 2017 Indonesia Demographic and Health Survey (2017 IDHS) with the Household and Eligible women questionnaires. The method of analysis used descriptive analysis and inferential analysis with binary logistic regression method in rare event with the firthlogit model. The results of the study show that eligible women in poor households in Indonesia would have a tendency to smoke when they live in urban areas, are more mature in age, their highest educational level is lower than junior high school, work, never access mass media, have partner who do not work and have a big number of household members.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Return to Education Estimation on Self-Employment Entrepreneurs and Their Comparison with Workers in Indonesia 2021-10-07T01:02:27+00:00 Dwi Wahyudi Muhammad Hanri <p>Entrepreneurship in various pieces of literature is mentioned as one aspect that adds value to a country's economy. Using Sakernas August 2019 data and the Mincer income model, this study estimates the educational investment in self-employed entrepreneurs. The results show a positive effect between years of schooling and income earned. Compared to workers, the level of assessment of entrepreneur education looks lower. In addition, this study also looks at how income among entrepreneurs. The Gini coefficient shows 0.47 for self-employed entrepreneurs and 0.41 for workers. 
There is a sizeable amount of income inequality for self-employed entrepreneurs.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Measuring The Economic Contribution of Tourism: An Improvement in Indonesia 2021-09-16T03:32:27+00:00 Akhmad Mun'im <p>The implementation of the international standard manual is an effort made by every national statistical office (NSO) in developing its official statistics so that they have comparability at the global level. The methods recommended in the international standard manual have also been refined and adapted to other standard manuals so that the resulting official statistics are consistent with each other. Statistics Indonesia (BPS) as the Indonesian NSO adopts various international standard manuals, including the International Recommendations for Tourism Statistics (IRTS) and Tourism Satellite Accounts: Recommended Methodological Framework (TSA:RMF) 2008 manuals recommended by UNWTO in calculating the tourism contribution in the Indonesian economy. Both recommend the utilization of the supply and use table (SUT) framework that explains tourism supply-demand in measuring tourism <br />contributions. This approach is an improvement from the previous approach which used shock analysis under input-output (I-O) framework in calculating tourism contributions. Through the supply-demand of tourism sector approach, the amount of tourism direct gross domestic product (TDGDP) is obtained which shows the contribution of tourism to the national economy. 
During 2016-2019, the tourism sector contributed around 4.6 – 4.9 percent to the Indonesian economy.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Impact Evaluation of Child Labor on Health in Next 7 and 14 Years in Indonesia 2021-10-01T09:26:05+00:00 Febri Hamdani Eny Sulistyaningrum <p><span style="font-weight: 400;">This study aims to determine the impact of child labor on children's health both in next 7 and 14 years. Using two health indicators, growth in height and lung capacity. Child labor indicator is using child working hours. Three waves of longitudinal data from the Indonesian Family Life Survey (IFLS) are used, IFLS-3, IFLS-4, and IFLS-5. </span><span style="font-weight: 400;">In addition to the child labor variable as the focus of this study, other variables are also used as control. The technique of analysis used is the Instrumental Variable where the head of the household’s education as the instrument variable. The robustness check is also performed to ensure the model. The analysis shows that in next 7 years, child labor has less effect on health.</span><span style="font-weight: 400;"> Child labor negatively affects height growth but does not affect lung capacity. However, in next 14 years child labor negatively affects health, for both height growth and lung capacity. This is confirmed by the result of the robustness check, where child labor is preponderant in next 14 years than 7 years.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Forecasting Hotel Occupancy Rate in Riau Province Using ARIMA and ARIMAX 2021-09-16T01:57:45+00:00 Fadlika Arsy Rizalde Sri Mulyani Nelayesiana Bachtiar <p>Hotel Occupancy Rate is one of the important leading indicators for calculating the Accommodation Sub-Category of Gross Regional Domestic Product (GRDP). 
Given the extreme decline of the Hotel Occupancy Rate due to COVID-19 and the unavailability of current data for calculating quarterly GRDP, the Hotel Occupancy Rate needs to be predicted with an appropriate forecasting method. The authors use data from Google Trends as an additional variable in predicting the Hotel Occupancy Rate using the ARIMAX model and then compare it with the ARIMA model. The results showed that the ARIMAX model had better accuracy than ARIMA, with a MAPE value of 9.64 percent and an RMSE of 4.21 percent. This research concluded that if there is no change in government policy related to social restrictions until the end of the year, the ARIMAX model predicts a December 2021 Hotel Occupancy Rate of 38.59 percent.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Spatial-Temporal Analysis of Deforestation in Sumatera Island 2011-2019 2021-09-28T05:47:05+00:00 Andrian Dwi Putra Siskarossa Ika Oktora <p>The existence of forests is threatened by deforestation, which can contribute to climate disturbances and environmental decay. This study aims to analyze the determinants of deforestation in Sumatera Island from 2011 to 2019. The dependent variable is deforestation, and the independent variables are population density, land fires, road length, GDP of agriculture, fisheries, and forestry, and GDP of mining and excavation. The results show that there is spatial-temporal heterogeneity in deforestation in Sumatera Island from 2011 to 2019. Furthermore, because of the normality violation, the Robust Geographically and Temporally Weighted Regression (RGTWR) method is used. The analysis shows that the variables affecting deforestation in Sumatera Island vary by province and change annually. Land fires were significant in every province and every year from 2011 to 2019. 
To overcome deforestation, the governments need to consider the varying causes of deforestation, more firmly to forestry regulation and establish cooperation with local communities in managing forest.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Measuring Child Poverty in Jakarta Metropolitan Area Using a Multidimensional Perspective 2021-09-21T23:35:51+00:00 Nigel Roy Tantan Erni Tri Astuti <p>This study aims to quantitatively uncover multidimensional child poverty in Jakarta Metropolitan Area, where Indonesia’s capital and its surrounding regions are located. It comprises 15 indicators in six dimensions of child wellbeing: housing, education, facility, food and nutrition, child protection, and health. It is a very alarming condition in the region that nearly one-fourth of children are deprived in at least three dimensions. These children experience, on average, 0.57 of all possible deprivations, or 3.4 deprivations, which indicates a massive high deprivation intensity. The overall deprived children are also almost two times larger than the poor children that suggest the lower monetary child poverty rate doesn’t guarantee to lower the multidimensional child poverty.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Impact of Information and Communication Technology on The Welfare of Population in Eastern Indonesia 2021-09-18T02:16:01+00:00 Dyah Makutaning Dewi Istu Indah Setyaningsih Ariful Romadhon <p><span style="font-weight: 400;">During the Covid-19 pandemic, many countries in the world are expected to experience a slowdown or decrease in Human Development Index (HDI) growth including Indonesia. The disparities in development among regions to be one of the main issues in Indonesia. 
The gaps are not only occurring in HDI, but also in Information and Communication Technology (ICT) facilities between Western Indonesia and Eastern Indonesia. The purpose of this study is to analyze how Information and Communication Technology affects the Welfare of the Population in Eastern Indonesia. Research methods use multiple linear regression methods. The data is sourced from the BPS-Statistics Indonesia which consists of 17 provinces. The results showed that the percentage of internet users had a positive and significant effect while the percentage of the poor population had a negative and significant effect on the welfare of the population in eastern Indonesia. Therefore, the distribution of infrastructure, especially ICT infrastructure, does not only focus on western Indonesia. Therefore, it is expected that the population welfare gap will be reduced. The increasing use of the internet during the Covid-19 pandemic can be used as an opportunity to be used as a bridge for the distribution of information, communication, and digital-based economic development in order to achieve equitable welfare, especially in eastern Indonesia.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Welfare Index of Person with Disabilities in Indonesia, 2018 2021-10-04T01:55:28+00:00 Ni Putu Ayu Chyntia Manikka Sari Tiodora Hadumaon Siagian <p><span style="font-weight: 400;">Most people with disabilities in Indonesia still live in vulnerable, backward, or poor conditions due to restrictions, obstacles, difficulties, and reduction or elimination of the rights of persons with disabilities. In realizing prosperity for all Indonesian people, it should be fair and not contain discriminatory elements, including persons with disabilities. For this reason, a measure of the achievement of the welfare of persons with disabilities is needed as evaluation support material in making plans and policies. 
This study aims to obtain the welfare factors of persons with disabilities and compile them into the Welfare Index of Persons with Disabilities (WIPD). The construction of the WIPD was carried out using the exploratory factor analysis method. Based on the results, 20 indicators formed five factors, namely accessibility, housing and access to information, physical and spiritual well-being, social relations and sanitation, and economic well-being. From the WIPD scores, it is known that there is a gap in WIPD achievement between the Western and the Eastern Region of Indonesia. For this reason, the government needs to prioritize inclusive development in provinces with very low and low WIPD achievements.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics A Spatial Analysis of Crime in East Java Province in 2019 2021-09-29T02:03:46+00:00 Choirul Ummah Rini Rahani <p><span style="font-weight: 400;">Crime is one of the consequences of fluctuations in the economic condition of a country. Crime incidents harm many parties. The number of criminal acts increased in 2019, especially in Sumatra and Java Island. Most provinces experienced an increasing number of criminal acts, one of them was East Java. East Java contributed more than a quarter of the number of crimes throughout Java Island. The number of criminal acts is count data with overdispersion because its variance is higher than its average. This study aims to analyze the number of criminal acts by applying Geographically Weighted Negative Binomial Regression (GWNBR). The result shows that GWNBR formed two regional groups based on significant variables. The four independent variables namely the unemployment rate, the number of poor people, the Gini ratio, and the police population ratio have a significant effect on all districts/cities. However, the mean year of schooling shows a significant effect only in certain districts/cities. 
The GWNBR is the best model in modelling the number of criminal acts in East Java.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Application of Logistic Regression Modeling for Complex Survey Data on Education Continuity of Poor Households Children 2021-09-30T08:52:18+00:00 Rudi Salam Ardi Adji <p>Many population-based surveys such as the National Socio-Economic Survey (Susenas) are built with complex sampling assumptions, namely probabilistic, stratified, and multistage sampling, with unequal weights for each observation. This complex design must be taken into account in order to have reliable results when doing modeling. The model that is often used when using survey data is logistic regression. The purpose of this study is to determine a logistic regression model with a complex sample design, and to show how it is estimated using a package survey from the R software. As an illustration, the 2019 Susenas data of East Java Province will be used as an application to correct the influence of the sample design in estimating risk factors related to the chances of children 7-18 years old in poor households continuing their education. The results show that the variables of gender and mother's education significantly affect the continuity of the education of children 7-18 years old in poor households.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics An insight into Youth Unemployment in Indonesia 2021-09-29T04:05:17+00:00 Ayuningtyas Yanindah <p>Youth unemployment in Indonesia has continued to remain at a high level relative to other age categories for several years. The case of Indonesia’s youth unemployment is grave with the presence of a low workforce participation rate, informal employment, and higher unemployment rates in young people compared with adults. 
Due to the lack of research on a country-wise view of youth unemployment, this study focuses on providing a much better understanding of the youth unemployment problem in emerging countries, especially Indonesia. The main aim of the paper is to bridge the research gap on youth unemployment with reference to microeconomic determinants, such as educational background and participation in training. This study utilized the August 2019 data of SAKERNAS (<em>Survei Angkatan Kerja Nasional</em>) and analyzed the data using the logistic regression method. Logistic regression is a special econometric model where the dependent variable is considered categorical and dichotomous (binary); in this case, it was unemployed (1) or working (0). The study found that training participation has a negative correlation with youth unemployment, while educational attainment generates mixed results.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Estimation of Inflation Threshold of Indonesia and Its Effect on Economic Growth Periode 1981-2019 2021-10-05T03:06:44+00:00 Monika Stevany Manurung Aisyah Fitri Yuniasih <p>Sustainable economic growth with a low and stable inflation rate is one of the goals of macroeconomic policy in improving people's welfare. High inflation can be detrimental to economic growth in the medium and long term, while a certain level of inflation is needed to move the economy. Therefore, the question arises about the level of inflation that does not have a negative impact on economic growth. This study aims to estimate the inflation threshold level and identify its effect on Indonesia's economic growth in 1981-2019. The research begins by determining the best model among the models that regress inflation on economic growth with quadratic regression, Hansen's (2000) threshold regression, and Mubarik's (2005) threshold regression. 
The best model is the Mubarik threshold regression model (2005) with an inflation threshold of 6.85 percent. Mubarik's (2005) threshold regression analysis was reused in the model involving the FDI variable, the inflation threshold was 7.11 percent, and FDI had a positive effect. Inflation below the inflation threshold encourages economic growth, while inflation above the inflation threshold is detrimental to economic growth. The result of the estimated threshold level is higher than the inflation target by BI, so that inflation targeting can be increased.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Resilience of Workers Affected by COVID-19 Outbreak in Maintaining Their Jobs, in Which Sector Survives Most Longer? 2021-09-17T09:25:18+00:00 Fenanda Dwitha Kurniasari Yulia Atma Putri <p class="Abstract"><span style="font-weight: 400;">Employment is one of the areas affected during the covid-19 outbreak. The government of Indonesia has taken numerous measures to restrain the growth rate of covid-19, such as the implementation of social restriction, which leads to a multidimensional problem – the employment problem. Indonesia’s unemployment in 2020 has increased compared to 2019. According to Statistics Indonesia, the open unemployment rate in August 2020 is about 1.84 percent higher than August 2019, and from the total working-age population in August 2020, 14.28 percent of them were affected by covid-19. This study aims to investigate the resilience of workers affected by the covid-19 outbreak in maintaining their jobs by comparing the survival rates in the sectors most affected by covid. The methodology used in this research is survival analysis in time resilience of workers affected by the covid-19 outbreak in maintaining their jobs. The conclusion obtained from this study is that the sector significantly influences worker’s time resilience (p-value &lt; 0.05). 
Among the six sectors most affected by covid-19, workers in the construction sector have the highest time resilience, meaning they kept their jobs longest during the covid-19 outbreak, followed by workers in the accommodation and food services, other services activities, manufacturing, and wholesale and retail trade sectors. The sector in which workers' time resilience was most affected during the COVID-19 outbreak is transportation and storage. </span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Changing in National Infrastructure Policy: How It Affect Indonesia’s Economy? 2021-10-04T23:54:09+00:00 Ade Marsinta Arsani Chaoqing Huang <p>This research aims first to determine how the new infrastructure policy affects changes in the national economic structure, and second to determine whether the new policy affects inter-regional economic linkages. This study applies economic structure, growth decomposition, location quotient, and linkage analysis to the input-output table to indicate national and inter-regional economic changes between 2010 and 2016 in Indonesia. We find that the economic structure generally remains the same; only the transportation and real estate sectors increased their contribution, which may indicate the beginning of the infrastructure development stage. From 2010 to 2016, growth was led by the expansion of domestic demand in almost all sectors; however, in some sectors technological change made a negative contribution. Furthermore, the two most linked sectors are the manufacturing and electricity sectors. The inter-regional analysis indicated that Java and Sumatera have higher power and sensitivity levels than other regions. 
The suggestions for boosting economic development are to implement technological processes and to issue policies that consider regional characteristics, which may accelerate economic equity across regions.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Wages of Workers Spatial Analysis in Indonesia Region 2019 2021-10-04T00:43:28+00:00 Maghfirah Maghfirah Omas Bulan Samosir <p><span data-offset-key="6arc0-0-0">Wage inequality among workers in Indonesia is one of the main problems that the government needs to address. The determination of the regional minimum wage by local governments has not been able to solve the problem of inequality. On a larger scal</span><span data-offset-key="6arc0-0-1">e, the wage inequality of workers can affect the stability of the national economy. Research on the spatial analysis of workers' wages is therefore important as a basis for appropriate government policy. In this study, we analyze the spatial dependence and spatial relationship between a region and the wages of its workers and identify the factors that affect the wages of workers in a region. The results reveal that spatial dependence is detected among districts, along with spatial clusters and spatial outliers, through global and local spatial autocorrelation. We apply two spatial autoregressive models, the spatial autoregressive lag model (SAL) and the spatial autoregressive error model (SEM). SAL confirmed four independent variables that are significant at the 10 percent level and have a positive relationship, namely education, age, internet, and sex ratio. SEM confirmed five independent variables that are significant at the 10 percent level and have a positive relationship, namely education, age, technology, internet, and sex ratio. 
As a policy implication, since regional wage inequality is still a major issue, better coordination and cooperation within and between regions are called for.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Micro and Macro Determinants of Precarious Employment in Indonesia: An Empirical Study of Paid Workers using Multilevel Binary Logistic Regression 2021-09-13T12:55:29+00:00 Mohammad Rifky Pontoh Nucke Widowati Kusumo Projo <p>Decent work for all is one of the goals stated in the Sustainable Development Goals (SDGs). One indicator that can represent proper work conditions is the precarious employment rate (PER). In recent periods, the precarious employment rate in Indonesia has shown an increasing trend, which indicates a decent work deficit in Indonesia. In addition, the PER differs among provinces. This study aims to analyze the micro and macro factors that influence the status of precarious employees in Indonesia. The analytical method used in this study is multilevel binary logistic regression. The results show that micro factors, namely the worker's characteristics (age, education level, employment sector, previous work status, and urban-rural area), have a significant effect on the precarious status of employees. In terms of macro factors, an increase in the output of the industrial and construction sectors can reduce the tendency of a worker to become a precarious employee. Meanwhile, an increase in labour supply increases the likelihood of workers becoming precarious employees. 
Various parties, including society and government, have to make extra efforts to reduce the precarious employment rate by improving the quality of human capital and the demand for domestic products.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Application of The Sequential Hot-deck Imputation Method for Identification of Indonesian Standard Classification of Business Fields (KBLI) 2021-09-13T08:49:47+00:00 Iman Jihad Fadillah Chaterina Dwi Puspita <p>The Covid-19 pandemic requires the adoption of new habits in daily life, including in the data collection process. One such adjustment is to use modes of data collection other than face-to-face interviews, such as the telephone and the web. Information collected through telephone interviews is less accurate than the same information collected through face-to-face interviews in terms of the level of non-response, consistency between entries, and outliers in the data, which are often identified as missing values. Missing values strongly affect data quality when they appear in important variables. One of these variables is the Standard Classification of Business Fields (KBLI). Imputation is one method that can be used to deal with this problem, and a popular imputation method is Sequential Hot-deck Imputation. Therefore, this study aims to facilitate the identification of 5-digit KBLI codes by utilizing the Sequential Hot-deck Imputation method. The results of this study indicate that using the Sequential Hot-deck Imputation method in the KBLI identification process gives very high accuracy. 
In addition, this method is very efficient in the identification process because the time required is very short, even for large datasets.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Demographics Characteristics of Smoker in Poor Households in Riau Islands Province 2021-09-14T03:37:06+00:00 Dio Dwi Saputra <p>Smoking habits in Indonesia have been formed since the colonial era. Smoking habits in poor households need particular attention. In 2020, Riau Islands Province, one of the youngest provinces in Indonesia, had a smoking prevalence of 26.16%, and 5.92% of its population was poor. This condition motivates a study that aims to determine the demographic characteristics of smokers. This study uses raw data from the National Socio-Economic Survey (SUSENAS) in Riau Islands Province in March 2020. The variables used are smoking status, gender, age group, education level, region, and recent migrant status. The results show that the tendency to smoke is greater in the male population (OR = 132.04), the age group of 46-65 (OR = 4.77), the age group of 66 and over (OR = 2.11), the junior high school level (OR = 4.66), the senior high school level (OR = 5.98), the college level (OR = 3.13), those living in urban areas (OR = 1.22), and recent migrants (OR = 3.12). 
Thus, it is necessary to make specific policies that follow the above characteristics to reduce smoking habits among poor households.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Performance Comparison of Hot-Deck Imputation, K-Nearest Neighbor Imputation, and Predictive Mean Matching in Missing Value Handling, Case Study: March 2019 SUSENAS Kor Dataset 2021-09-30T08:57:08+00:00 Tsasya Raudhatunnisa Nori Wilantika <p>Missing values can cause bias and make a dataset unrepresentative of the actual situation. The choice of method for handling missing values is important because it affects the resulting estimates. Therefore, this study aims to compare three imputation methods for handling missing values: Hot-Deck Imputation, K-Nearest Neighbor Imputation (KNNI), and Predictive Mean Matching (PMM). Differences in how the three methods work lead to different estimation results. The criteria used to compare the three methods are the Root Mean Squared Error (RMSE), Unsupervised Classification Error (UCE), Supervised Classification Error (SCE), and the time taken to run the algorithm. This study uses two types of analysis: comparative analysis and scoring analysis. The comparative analysis applies a simulation that takes the missing value mechanism into account. The mechanisms used in the simulation are Missing Completely at Random (MCAR), Missing at Random (MAR), and Missing Not at Random (MNAR). The scoring analysis then narrows down the results of the comparative analysis by assigning a score to the imputation results of the three methods. 
The results suggest that Hot-Deck Imputation performs best in dealing with missing values based on the score.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Determinant of Labor Force Resilience Against The Employment Impact of The Covid-19 Pandemic in Bali Province, Indonesia: An Application of Survival Analysis 2021-10-03T08:59:04+00:00 Ni Luh Putu Yayang Septia Ningsih Mohammad Dokhi <p>The coronavirus disease 2019 (Covid-19) pandemic has affected not only health but also the economy. The sector worst affected economically by the pandemic is tourism and its derivatives. Because it depends heavily on the tourism sector, Bali is the province where the most members of the labor force stopped working during the pandemic. In this study, data from the national labor force survey were analyzed using the Weibull-Gamma Shared Frailty Survival Model to explore the determinants of labor force resilience against the event of stopping work due to the Covid-19 pandemic. The results show that gender, education level, training experience, marital status, and age are variables that significantly affect how quickly a member of the labor force stops working. Moreover, variation among the regions where they work (regencies/cities) also has a significant effect on how quickly work stops.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Individual and Province-level Determinants of Unemployed NEET as Young People’s Productivity Indicator in Indonesia During 2020: A Multilevel Analysis Approach 2021-09-30T08:55:38+00:00 Ni Putu Gita Naraswati Yogo Aryo Jatmiko <p>Nowadays, employment has become one of the main concerns of developing countries, including Indonesia. 
This is an urgent issue that must be addressed, considering that the Indonesian population is entering the demographic dividend period. Success in achieving the demographic dividend depends heavily on the employment conditions of young people in realizing a low level of dependence. However, young people still face obstacles in education and employment, as seen from the year-on-year (YoY) percentage of NEET, which was exacerbated in 2020 by the Covid-19 pandemic. Based on these problems, it is necessary to study NEET in Indonesia in 2020. This study uses 2020 National Labor Force Survey (Sakernas) data, analyzed using multilevel binary logistic regression. The unemployed status of young NEETs is influenced by gender, age, marital status, highest education completed, disability status, classification of the area of residence, and recent migrant status. There is a multilevel effect in the NEET assessment of young people, as evidenced by the influence of Gross Domestic Product (GDP) and the Human Development Index (HDI). The research results are expected to serve as a reference for policies that optimize the mismatch program of the pre-employment card to connect young job seekers with available job opportunities. Based on the province-level variables, provincial governments are expected to make the most of the province-level variables that affect the tendency of NEETs to remain active in the labor market. 
that are targeted towards the NEET problem in Indonesia.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Estimation of Total Fertility Rate (TFR) Using Small Area Estimation (SAE) in Nusa Tenggara Timur (NTT) Province 2021-09-13T08:43:59+00:00 Mellinda Mellinda Cucu Sumarni <p>Indonesia's large population means that the provision of basic services for the population is not optimal, so the condition and distribution of the population must be addressed through fertility control methods. The Total Fertility Rate (TFR) is one of the fertility measures used in Indonesia. Estimating TFR at the district level is very important, especially for Nusa Tenggara Timur (NTT) Province, the province with the highest TFR in Indonesia. TFR data down to the district level are difficult to obtain every year due to data limitations. This study uses the National Socio-Economic Survey to address these problems. TFR estimation from survey data (direct estimation) generally results in a large Relative Standard Error (RSE), so an indirect estimate in the form of Small Area Estimation (SAE) is needed. Using the SAE Restricted Maximum Likelihood (REML) procedure, a TFR with an RSE lower than that of the direct estimate is obtained. Five districts have a medium-high TFR, namely Sumba Barat Daya, Sumba Tengah, Sabu Raijua, Sumba Barat, and Manggarai Barat. 
The government is recommended to focus more on those five districts to reduce the high TFR in NTT.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Determine Sample Size for Precision Results on Quick Count 2021-10-03T09:08:05+00:00 Yusep Ridwan Rizqon Halal Syah Aji <p>This research aims to determine the appropriate sample size for an election quick count so that the results obtained are close to the actual results. Although practical procedures are widely used to calculate the sample size in quick count methodology, in reality the results obtained often deviate from the actual results, so precision remains a topic of discussion. The size of the sample and the level of precision of the forecast results are therefore important issues to be discussed. This research is experimental, and the analysis used is the Kruskal-Wallis test. The data used are primary data from the real count results of the Sumedang regency election, collected by consultants and teams. The results show a significant difference among the seven sample size groups in vote acquisition and in the percentage of votes at the polling station (TPS), with n=408, n=500, n=875, and n=1674 being the most appropriate sample sizes for implementing the quick count.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Personalized Route Recommendation Method for Field Survey Officers using Social Media Information and Administrative Border Maps 2021-09-16T07:32:29+00:00 Eko Hardiyanto <p>Changes in business processes under pandemic conditions are a must. Field surveys were the most affected, not only in the interview process but also in route selection to survey locations. 
To support field survey officers, it is necessary to provide alternative route choices to the survey location as quickly as possible. This research proposes a methodology that combines three information sources (administrative border maps, Google Maps services, and information from social media) to provide the best recommended route to the assigned survey location. Combining these three sources can enhance existing routing that relies only on Google Maps services. Our mechanism was tested on a custom My Maps application provided by Google and evaluated using the System Usability Scale. This research aims to give personalized routes to field survey officers based on the assigned survey location and information from social media. A limitation of this research is that only a few social media channels are used; in the future, this research can be extended by integrating other platforms owned by the government and other public services to enrich the information.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Data Input Quality Metrics on Mobile Positioning Data (MPD) 2021-10-07T16:36:13+00:00 Alfatihah Reno MNSP Munaf Amanda Pratama Putra Wa Ode Zuhayeni Madjida Ignatius Aditya Setyadi Amin Rois Sinung Nugroho <p>Statistics Indonesia (BPS) has been using Mobile Positioning Data (MPD) to support official statistics since 2016. As a source of big data, MPD also has veracity characteristics, indicating uncertainty in the data. Therefore, it is necessary to check that the data are good enough to allow further analysis and the quality assurance process. Currently, there is no established international standard for quality assurance of MPD. This paper describes the quality matrix used by BPS in examining data from mobile operators. 
BPS uses thirteen indicators in conducting quality assurance, and the inspection uses several different methods, such as setting a threshold, checking data completeness, and checking the shape of the data distribution. Exploratory Data Analysis is carried out to determine whether the data meet the requirements for further analysis. We conducted this research on data from one mobile network operator for June - July 2020 as the basis for MPD analysis in 2021. Based on the inspection during this period, BPS can cooperate with this cellular operator to conduct data analysis in 2021. However, the operator must repeat the calculation of the required matrix as quality assurance every month.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Development of The Welfare Index at Sub-district Level in West Java 2020: A Small Area Estimation Approach 2021-10-07T16:37:17+00:00 Nimas Ezra Monadiyan Faisal Haris <p>Welfare, as an alternative measure of poverty, is an important indicator to measure. However, there is not yet a composite indicator that specifically measures the prosperity of a population. Considering the growing need for data and the lack of available data at smaller levels, this study develops the Welfare Index at the sub-district level for West Java in 2020 using a small area estimation approach to explain the welfare condition of the population. The sub-district-level indicators in this study were created from two kinds of data. The first type of indicator was formed from SUSENAS using Small Area Estimation, and the other type was formed from PODES aggregation. All the indicators were then processed with factor analysis to form the Welfare Index. The resulting Welfare Index ranges from 22.86 to 83.76 and is generally higher in the northern part of West Java. 
This index has a correlation of 0.798 with the Human Development Index because of the components that define both indexes. This correlation shows that the Welfare Index is able to explain the conditions/phenomena being measured.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Potencies and Threats of The Demographic Bonus on The Quality of Human Resources and Economy in Indonesia 2019 2021-10-01T22:51:28+00:00 Febby Risandini Rini Silvi <p>The success of Indonesia’s development is marked by increasing economic growth, which is in line with the demographic transition, in which the dependent population is smaller than the population that supports it. Indonesia will enter the peak of the demographic bonus in 2030, when every 100 people of productive age will support 46 to 47 people of non-productive age. The demographic bonus can have a positive impact on the economy and the quality of human resources if its potential is adequately utilized, but it becomes a threat if not maximized. Therefore, path analysis is used in this study to analyze the potencies and threats of the demographic bonus and its effect on economic growth, either directly or indirectly through the quality of human resources. The results show that the potential index, consisting of labor absorption, household savings, and women in the labor market, does not significantly influence the quality of human resources or economic growth. Meanwhile, the threats index, which consists of internet access, migration, and child marriage, has a significant positive direct effect on economic growth and a significant negative indirect effect on economic growth. 
These results indicate that the threat index has a greater influence than the potential index, and it is hoped that the government will focus on reducing the threats of the demographic bonus, accompanied by an increase in the quality of human resources.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Data Collection Improvement: Daily Self-Enumeration Accommodation Survey 2021-09-13T12:52:19+00:00 Ignatius Aditya Setyadi <p>Until now, BPS - Statistics Indonesia has conducted monthly accommodation surveys for both the star and non-star accommodation categories to provide information on commercial accommodation activities at the national and regional levels. Both categories are covered by complete enumeration in each region. The statistics include guest nights and room capacity, which are used to obtain hotel room occupancy rates. The data contain daily accommodation information that is collected every month and then entered in full in each region following the observation month. Due to the timeliness requirements for monthly press releases, BPS has implemented online data entry since 2017. Regions that attract more interest have a larger number of accommodations, which also affects the number of enumerators and may lead to problems, especially in response burden. Unfortunately, the same problem is not easily avoided in regions with fewer accommodations either, mostly due to the distance to accommodation areas and their spread across the region. Therefore, a new data collection strategy is required to make responding more convenient for respondents in order to increase response rates, as well as to reduce the workload of enumerators, which also leads to lower costs. 
<span data-offset-key="6arc0-0-0">The outbreak of COVID-19 has posed unprecedented problems for National Statistical Offices (NSOs) around the world, including BPS - Statistics Indonesia. This crisis has led us to think in new ways and make decisions that will change our statistical operations in order to meet on</span><span data-offset-key="6arc0-0-1">going data needs even throughout the epidemic. The purpose of this paper is to discuss the evolution of accommodation surveys, which are designed to not only solve problems but also achieve objectives. Currently, there are nearly 180 active users of this self-enumeration accommodation survey for about 142 distinct accommodations across Indonesia. Moreover, this addition has succeeded in increasing the average response rate from 57.17% in 2020 to 68.35% in 2021.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Multilevel Analysis: Household and Regional Factors Influence Agricultural Household Poverty in Indonesia, 2019 2021-10-04T00:51:19+00:00 Ariful Romadhon Diyana Indah Sari Bayu Rhamadani Wicaksono <p>The agriculture sector is not only a source of food but also supports the economic activities of most people in Indonesia, especially in rural areas. Unfortunately, most of them still live below the Poverty Line/<em>Garis Kemiskinan</em> (GK). The uniqueness of this study is that it uses household and regional variables to examine their effect on agricultural household poverty. Thus, the policies to be taken draw not only on the micro-economic perspective of the household but also on the macro-economic perspective. Using multilevel binary logistic regression analysis, this study aims to examine the household and regional factors that affect household poverty in the agriculture sector, a potential sector for alleviating poverty, in 2019. 
Household and regional factors that affect agricultural household poverty are education, household size, area of residence, ownership of pension social security, ownership of social assistance, credit assistance for businesses, and agricultural Gross Regional Domestic Product (GRDP) per capita. The variation in agricultural household poverty due to differences in characteristics among the 514 districts in Indonesia is 35.19 percent.</p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Measurement of Sustainable Agriculture at Household Level: Results of Indonesian Agriculture Integrated Survey (AGRIS) Pilot 2021-09-13T08:35:54+00:00 Kadir Kadir Isnaeni Nur Khasanah Eka Rudiana <p>This study aims to measure and analyze the level of agricultural sustainability at the household level using the results of the Integrated Agricultural Survey (AGRIS) pilot conducted by Statistics Indonesia in 2020. <span data-offset-key="6arc0-0-0">Applying descriptive analysis to the computation results of eleven sub-indicators of SDG indicator 2.4.1 at the household level, we analyzed the proportion of agricultural households categorized as sustainable and unsustainable for each corresponding sub-indicator of sustain</span><span data-offset-key="6arc0-0-1">ability. We also estimated the average land area managed by agricultural households for each category in each sub-indicator. We found that most agricultural households in West Java, East Java and West Nusa Tenggara are categorized as unsustainable in agricultural practices regarding land productivity. The proportion of households practising unsustainable agriculture is also quite large regarding fertilizer use and decent employment. 
We also found that lower land productivity and poor management of fertilizer use are phenomena of relatively large-scale farms.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Application of Spatial Empirical Best Linear Unbiased Prediction (SEBLUP) of Open Unemployment Rate on Sub-District Level Estimation in Banten Province 2021-09-20T22:53:24+00:00 Apriliansyah Apriliansyah Ika Yuni Wulansari <p><span data-offset-key="6arc0-0-0">The open unemployment rate is an indicator for measuring unemployment. Banten Province recorded the highest open unemployment rate in Indonesia in 2018. A high open unemployment rate indicates serious problems in society. This problem must be resolved synergisticall</span><span data-offset-key="6arc0-0-1">y from the national level to the level of small areas such as sub-districts. However, data at the small area level are not available due to the insufficient number of samples. We apply spatial EBLUP to estimate the open unemployment rates in the districts of Banten. Such a method of small area estimation is essential because some districts have small labor forces and direct estimation for them is not reliable. SEBLUP takes advantage of the correlation between neighboring districts. The data used for direct estimation are from the National Labor Survey (Sakernas) and Village Potential (Podes) 2018. This research shows that the SEBLUP model can increase precision compared with the direct estimation method or EBLUP. 
Two districts fall in the highest category of open unemployment rate, namely Curugbitung and Koroncong.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Information System Development of Master’s and Doctoral Study Task Recommendation Using Fuzzy AHP 2021-09-20T22:39:52+00:00 Nurul Ni'mah Yunarso Anang <p><span data-offset-key="6arc0-0-0">To improve the quality of its employees through education, Statistics Indonesia is assisted by the Education and Training Centre, among other means through the Study Task program at Master's and Doctoral degrees. Due to the decrease in registrants as candidate participants of this program and to </span><span data-offset-key="6arc0-0-1">facilitate the proposal of employees, an information system for providing Study Task program recommendations is needed. This research aims to use an information system to classify employees as highly recommended, moderately recommended, or not recommended to continue their education through the Study Task program. Decision-making in the system uses the Fuzzy Analytic Hierarchy Process (Fuzzy AHP) method, with assessment criteria, sub-criteria, and weights that can be changed by certain actors. The system was built using the Framework for the Application of Systems Thinking (FAST) method. The system was evaluated by Black Box Testing, with all functions running well, and by the System Usability Scale, with a score of 80.71, which means the system is acceptable to users. 
Finally, this system is expected to assist the selection of employees, provide opportunities for employees who are able to continue their education through the Study Task program efficiently, and reduce the subjectivity of the assessment.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Household Food Insecurity in DKI Jakarta Province at The Beginning of The Covid-19 Pandemic 2021-10-01T22:52:44+00:00 Lutfi Hamdani Sutikno Budiasih Budiasih <p><span data-offset-key="6arc0-0-0">Food insecurity is a global issue of concern not only in poor and developing countries but also in developed countries. Conditions have worsened since the beginning of the Covid-19 pandemic, when social restrictions and economic contraction caused many people to lose the</span><span data-offset-key="6arc0-0-1">ir jobs and incomes and increased poverty. DKI Jakarta was one of the most economically affected provinces at the beginning of the Covid-19 pandemic: economic growth in the first quarter of 2020 was recorded at 5.06 percent year on year (the lowest in the last ten years) and slowed by 0.56 percent quarter to quarter, and poverty increased by 1.11 percent, the highest in Indonesia. This study examines the effect of household characteristics in DKI Jakarta on food insecurity status at the beginning of the Covid-19 pandemic. The data used are from the March 2020 Susenas, analyzed descriptively and inferentially using Firth logistic regression. The results show that 4.47 percent of households in DKI Jakarta were food insecure at the beginning of the Covid-19 pandemic. 
In general, food-insecure households are poor, do not have social security, and have a household head who does not work and has less than a high school education.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Determinants of Unmet Need Family Planning Among Married Woman of Reproductive Age in North Sumatra (Susenas March 2019) 2021-10-01T22:53:41+00:00 Aprillia Anis Saputri Rini Rahani <p><span data-offset-key="6arc0-0-0">Unmet need is one of the obstacles for family planning programs and can reduce contraceptive prevalence. The percentage of total unmet need in North Sumatra Province is 12.1 and comparable to the total national unmet need in 2019. This study aims to determine the factors that</span><span data-offset-key="6arc0-0-1"> influence family planning needs and the tendency of married women of reproductive age in North Sumatra Province in 2019 with multinomial logistic regression. The data are sourced from the Susenas KOR 2019. The results show that married women of reproductive age with a greater tendency to experience an unmet need for limiting are characterized as 35-49 years old, living in urban areas, and having a junior high school or equivalent education. Meanwhile, married women of reproductive age (WUS) with a greater tendency to experience an unmet need for spacing are characterized as aged 15-24 years, with an age at first marriage of more than 18 years, and with a higher education level. 
Therefore, stronger commitment and support from family planning field workers in family planning counselling are needed, along with more equitable access to and higher quality of family planning services.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Big Data for Small Area Estimation: Happiness Index with Twitter Data 2021-10-01T22:54:15+00:00 Sheerin Dahwan Aziz Azka Ubaidillah <p><span data-offset-key="6arc0-0-0">Data availability at the small area level is one of the keys to the success of regional development. However, direct estimation for small areas can produce high error due to inadequate sample sizes, so the estimates are not reliable. One alternative solution to this problem is to u</span><span data-offset-key="6arc0-0-1">se the Small Area Estimation (SAE) method, which can improve precision by "borrowing strength" from information on the corresponding regions or from auxiliary variables that are strongly related to the response variable. This study uses two SAE models: the SAE EBLUP Fay-Herriot model with Podes data as auxiliary variables, and the SAE model with measurement error with Twitter data as the auxiliary variable. Estimation results using the SAE method are better than direct estimates. This is shown by the RSE values produced by the SAE method, for both the EBLUP and measurement error models, which are smaller than those of the direct estimates.
Therefore, big data can be used as an alternative auxiliary variable in the SAE model because the data are available in real time, cover even the smallest areas, and are relatively low cost.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Comparing Voluntary and Involuntary Part Time Female Workers in Maluku 2021-10-01T18:17:46+00:00 Muhamad Bagus Adji Briliyanto Titik Harsanti <p><span data-offset-key="6arc0-0-0">Maluku Province has the third highest average length of schooling (RLS) for women nationally, but the rate of female workers with below-normal working hours (part-time workers) is quite high. This study aims to provide a general description of married women aged 15-49 years as</span><span data-offset-key="6arc0-0-1"> part-time workers in Maluku, identify the determinants, and assess their tendencies based on the significant variables, using data from the National Labor Force Survey (Sakernas) August 2019. The analytical method used is multinomial logistic regression. The results indicate that the variables that significantly affect the part-time worker status of married women of reproductive age are employment status, income, and business field. The status of involuntary part-time worker (underemployed) is significantly affected by age, work sector, disability, and the presence of toddlers. The status of voluntary part-time worker is significantly affected by regional classification and education. The tendency to become underemployed is highest among those who have incomes below the minimum wage, work in the agricultural sector, and work in the informal sector. Meanwhile, the tendency to become a voluntary part-time worker is highest among those who have incomes below the minimum wage and work in the agricultural sector.
Policy makers must therefore ensure that married women get decently paid jobs.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics Analysis of Input-Output Table: Identifying Leading Sectors in Indonesia (Case Study in 2010, 2016 and 2020) 2021-09-20T03:05:41+00:00 Yoga Dwi Nugroho <p><span data-offset-key="6arc0-0-0">According to Neoclassical theory, every country has to maximize its own resources, including labor, natural resources, and physical resources, to develop its economy. Sector-based economic development must be carried out using comprehensive economic indicators, not only lookin</span><span data-offset-key="6arc0-0-1">g at the economic structure but also identifying and analyzing inter-industry relationships. One suitable tool is the analysis of the Input-Output (I-O) Table. The I-O tables used in this research are the I-O Tables for 2010, 2016, and 2020 (estimated). In this comprehensive analysis, the Forward and Backward Linkage Indexes are calculated so that the leading sectors can be identified. In addition, a multiplier analysis covering output, income, labor, and value-added multipliers is carried out to see the changes in output, income, labor, and value added caused by changes in final demand. The results show that the lever sectors are the manufacturing sector (sector 3) and the procurement of electricity and gas (sector 4). Sector 3 has the greatest potential as a leading sector for several reasons: it has a large output, value added, and input structure; it has high values for all four types of multipliers; and its Forward and Backward Linkage Indexes are high.
Because the manufacturing industry is a strong leading sector, the recommendation is that the government increase the output of the manufacturing industry by giving subsidies or reducing taxes, or lower the prices of other sectors that supply intermediate inputs to the manufacturing industry by giving subsidies.</span></p> 2022-01-04T00:00:00+00:00 Copyright (c) 2022 Proceedings of The International Conference on Data Science and Official Statistics