Comparison of Imputation Methods: Traditional, Machine Learning, and Deep Learning on Multivariate Time Series with MCAR and MNAR

Ferigo Taufani Tri Hakiki; Naufal Luthfan Tasbihi; Akila Akhtar El Dafi; Nurfaudzan; Andi Shahifah Muthahharah

doi:10.34123/icdsos.v2025i1.707

Authors

Ferigo Taufani Tri Hakiki, Statistics Department, Institut Teknologi Sepuluh Nopember, Kampus ITS Sukolilo, Surabaya 60111, Indonesia
Naufal Luthfan Tasbihi, Statistics Department, Institut Teknologi Sepuluh Nopember, Kampus ITS Sukolilo, Surabaya 60111, Indonesia
Akila Akhtar El Dafi, Statistics Department, Institut Teknologi Sepuluh Nopember, Kampus ITS Sukolilo, Surabaya 60111, Indonesia
Nurfaudzan, Statistics Department, Institut Teknologi Sepuluh Nopember, Kampus ITS Sukolilo, Surabaya 60111, Indonesia
Andi Shahifah Muthahharah, Data Science and Decisions Department, The University of New South Wales, High St, Kensington NSW 2052, Australia

DOI: https://doi.org/10.34123/icdsos.v2025i1.707

Abstract

This study compares four imputation methods, Linear Interpolation, Kalman Filtering, Support Vector Regression (SVR), and a GRU-based recurrent neural network (RNN-GRU), on multivariate time series that exhibit linear trends and seasonality. Synthetic data for three variables were generated at small, medium, and large sample sizes. Missing values were then inserted under Missing Completely at Random (MCAR) and Missing Not at Random (MNAR) mechanisms at proportions of 10%, 20%, and 35%. Imputation accuracy was evaluated using RMSE, MAPE, and R² over 150 simulation repetitions per scenario. The results indicate that each method has advantages under particular conditions. Linear Interpolation is suitable for data with linear trends, small sample sizes, and low to moderate missingness levels, and is effective under both MCAR and MNAR. Kalman Filtering performs best on medium to large datasets, particularly for linear-trend and seasonal patterns with a high proportion of MCAR missingness. SVR excels on large seasonal datasets with MNAR missingness. RNN-GRU performs well at low missingness levels, particularly on small seasonal datasets with MNAR missingness. These findings emphasise that the choice of imputation method should take into account data size, trend pattern, and the missing data mechanism in order to minimise bias and preserve the integrity of the temporal structure.
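The simulation design described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it uses a single variable and a single run rather than three variables and 150 repetitions, and the series parameters, the function names, and the MNAR mechanism (larger values are more likely to be missing) are all assumptions made for illustration. Only Linear Interpolation is shown, and the three metrics are computed on the masked positions only, which is also an assumption.

```python
import numpy as np

def make_series(n, rng):
    """Synthetic series: linear trend + seasonality + noise (illustrative parameters)."""
    t = np.arange(n)
    return 10.0 + 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 12) + rng.normal(0.0, 0.3, n)

def mcar_mask(n, prop, rng):
    """MCAR: every interior point is equally likely to be dropped."""
    mask = np.zeros(n, dtype=bool)
    k = int(round(prop * n))
    # Endpoints stay observed so interpolation never has to extrapolate.
    mask[rng.choice(np.arange(1, n - 1), size=k, replace=False)] = True
    return mask

def mnar_mask(y, prop, rng):
    """MNAR (assumed mechanism): the drop probability grows with the value itself."""
    n = len(y)
    k = int(round(prop * n))
    inner = np.arange(1, n - 1)
    w = y[inner] - y[inner].min() + 1e-6  # strictly positive weights
    mask = np.zeros(n, dtype=bool)
    mask[rng.choice(inner, size=k, replace=False, p=w / w.sum())] = True
    return mask

def linear_interpolate(y, mask):
    """Fill masked points by linear interpolation between observed neighbours."""
    t = np.arange(len(y))
    out = y.copy()
    out[mask] = np.interp(t[mask], t[~mask], y[~mask])
    return out

def scores(y_true, y_hat, mask):
    """RMSE, MAPE (%), and R² on the imputed positions only."""
    a, p = y_true[mask], y_hat[mask]
    rmse = np.sqrt(np.mean((a - p) ** 2))
    mape = np.mean(np.abs((a - p) / a)) * 100
    r2 = 1 - np.sum((a - p) ** 2) / np.sum((a - a.mean()) ** 2)
    return rmse, mape, r2

rng = np.random.default_rng(0)
y = make_series(120, rng)
masks = {"MCAR": mcar_mask(len(y), 0.20, rng), "MNAR": mnar_mask(y, 0.20, rng)}
results = {}
for name, mask in masks.items():
    results[name] = scores(y, linear_interpolate(y, mask), mask)
    rmse, mape, r2 = results[name]
    print(f"{name}: RMSE={rmse:.3f}  MAPE={mape:.2f}%  R2={r2:.3f}")
```

Repeating the loop over several seeds, sample sizes, and missingness proportions, and swapping `linear_interpolate` for another imputer, would approximate the scenario grid the abstract describes.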
