Analyzing Infectious Disease in Multiple District in East Nusa Tenggara (ENT) using K-Means Clustering and Correspondence Analysis

Authors

  • Fadlan Adhari, Mathematics master study program, Faculty of Mathematics and Natural Sciences, Institut Teknologi Bandung, Ganesa Street No. 10, Bandung 40132, Indonesia
  • Gabriela Lintang Sulistyoreni, Mathematics bachelor study program, Faculty of Mathematics and Natural Sciences, Institut Teknologi Bandung, Ganesa Street No. 10, Bandung 40132, Indonesia
  • Jessica Jocelyn Jakson, Mathematics bachelor study program, Faculty of Mathematics and Natural Sciences, Institut Teknologi Bandung, Ganesa Street No. 10, Bandung 40132, Indonesia
  • Angelina Sekar Larissa, Mathematics bachelor study program, Faculty of Mathematics and Natural Sciences, Institut Teknologi Bandung, Ganesa Street No. 10, Bandung 40132, Indonesia
  • Yuli Sri Afrianti, Statistics research division, Faculty of Mathematics and Natural Sciences, Institut Teknologi Bandung, Ganesa Street No. 10, Bandung 40132, Indonesia
  • Fadhil Hanif Sulaiman, Computational science master study program, Faculty of Mathematics and Natural Sciences, Institut Teknologi Bandung, Ganesa Street No. 10, Bandung 40132, Indonesia

DOI:

https://doi.org/10.34123/icdsos.v2025i1.426

Keywords:

Correspondence analysis, East Nusa Tenggara, infectious diseases, K-means clustering

Abstract

Infectious diseases remain a major public health concern in Indonesia, particularly in East Nusa Tenggara (ENT), where tuberculosis (TBC), dengue haemorrhagic fever (DHF), and HIV/AIDS record high case counts. These diseases are influenced not only by individual and environmental factors but also by spatial characteristics such as population distribution and regional infrastructure. Analyzing spatial factors is therefore crucial to understanding and managing the spread of infectious diseases in ENT. This study uses data from 2023 to 2024 across 22 districts in ENT, focusing on the prevalence of TBC, DHF, and HIV/AIDS. K-means clustering is first applied to classify the districts into three groups based on area size and population, with the aim of identifying spatial patterns of disease severity. The clustering yields a silhouette coefficient of 0.48, indicating moderate separation between the groups. Correspondence analysis is then used to examine the relationship between the resulting clusters and the three diseases. The results reveal that Cluster A, which has the highest population density, is strongly associated with all three infectious diseases. These findings suggest that population density plays a significant role in the transmission of infectious diseases and should be considered in future health intervention strategies.
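
For readers unfamiliar with the clustering step summarized in the abstract, the following is a minimal sketch of K-means followed by a silhouette coefficient, written in plain Python. It is not the authors' implementation. The district values are invented for illustration and are not the study's data, the z-score standardization is an assumption (the abstract does not state how the features were scaled), and the function names are hypothetical. The correspondence analysis step is not covered here.

```python
import math
import random

# Hypothetical (area in km^2, population) pairs; NOT the study's data.
districts = [
    (180.0, 450000), (220.0, 420000), (260.0, 390000),
    (1500.0, 150000), (1700.0, 170000), (1600.0, 140000),
    (5200.0, 260000), (5600.0, 300000), (4900.0, 280000),
]


def standardize(rows):
    """Scale each column to zero mean and unit variance (assumed preprocessing)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds))
            for row in rows]


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (range -1 to 1)."""
    scores = []
    for i, p in enumerate(points):
        dists = {}  # cluster label -> distances from p to that cluster's points
        for j, q in enumerate(points):
            if i != j:
                dists.setdefault(labels[j], []).append(math.dist(p, q))
        own = dists.get(labels[i])
        others = [sum(d) / len(d) for lab, d in dists.items()
                  if lab != labels[i]]
        if not own or not others:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(others)          # mean distance to the nearest other cluster
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


points = standardize(districts)
labels = kmeans(points, k=3)
score = silhouette(points, labels)
print(labels, round(score, 2))
```

A silhouette value near 1 indicates compact, well-separated clusters, while values near 0 indicate overlapping ones, which is why the reported 0.48 is read as moderate separation.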

Published

2025-12-22

How to Cite

Adhari, F., Lintang Sulistyoreni, G., Jocelyn Jakson, J., Sekar Larissa, A., Sri Afrianti, Y., & Hanif Sulaiman, F. (2025). Analyzing Infectious Disease in Multiple District in East Nusa Tenggara (ENT) using K-Means Clustering and Correspondence Analysis. Proceedings of The International Conference on Data Science and Official Statistics, 2025(1), 613–621. https://doi.org/10.34123/icdsos.v2025i1.426