Comparison of Imputation Methods: Traditional, Machine Learning, and Deep Learning on Multivariate Time Series with MCAR and MNAR
DOI:
https://doi.org/10.34123/icdsos.v2025i1.707
Abstract
This study compares four imputation methods, Linear Interpolation, Kalman Filtering, Support Vector Regression (SVR), and RNN-GRU, on multivariate time series that exhibit linear trends and seasonality. Synthetic data for three variables were generated at small, medium, and large sample sizes. Missing values were systematically inserted under Missing Completely at Random (MCAR) and Missing Not at Random (MNAR) patterns at proportions of 10%, 20%, and 35%. Imputation accuracy was evaluated using RMSE, MAPE, and R² over 150 simulation repetitions per scenario. The results indicate that each method has advantages under specific conditions. Linear Interpolation suits data with linear trends, small sample sizes, and low to moderate missingness, and is effective under both MCAR and MNAR patterns. Kalman Filtering performs best on medium to large datasets, particularly for linear trend and seasonal patterns with high proportions of MCAR missingness. SVR excels on large seasonal datasets with MNAR missingness. RNN-GRU performs well at low missingness levels, particularly on small seasonal datasets with MNAR patterns. These findings emphasise that the choice of imputation method should take into account data size, trend pattern, and the missing data mechanism in order to minimise bias and preserve the integrity of the temporal structure.
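To make the simulation design concrete, the following is a minimal sketch of one scenario, not the authors' code. It generates a single series with a linear trend and seasonality, removes 20% of the values under MCAR and under MNAR, imputes them by linear interpolation, and scores the result with RMSE, MAPE, and R². Everything specific here is an assumption for illustration: the series parameters, the function names, the seed, the use of a single variable instead of three, and in particular the MNAR mechanism (larger values are more likely to be missing), since the abstract does not say how MNAR was produced.

```python
import numpy as np

def make_series(n, rng, slope=0.05, amp=2.0, period=12, noise=0.3):
    """Linear trend + seasonality + Gaussian noise (illustrative parameters)."""
    t = np.arange(n)
    season = amp * np.sin(2 * np.pi * t / period)
    return 10.0 + slope * t + season + rng.normal(0.0, noise, n)

def mcar_mask(n, rate, rng):
    """MCAR: every interior point is equally likely to be missing."""
    k = int(round(rate * n))
    idx = rng.choice(np.arange(1, n - 1), size=k, replace=False)
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    return mask

def mnar_mask(y, rate, rng):
    """MNAR (assumed mechanism): larger values are more likely to be missing."""
    n = len(y)
    k = int(round(rate * n))
    interior = np.arange(1, n - 1)
    # Sampling weight grows with the (unobserved) value itself.
    w = y[interior] - y[interior].min() + 1e-6
    idx = rng.choice(interior, size=k, replace=False, p=w / w.sum())
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    return mask

def linear_interpolate(y, mask):
    """Fill masked points by linear interpolation between observed neighbours."""
    t = np.arange(len(y))
    filled = y.copy()
    filled[mask] = np.interp(t[mask], t[~mask], y[~mask])
    return filled

def scores(y_true, y_hat, mask):
    """RMSE, MAPE (%), and R² computed on the imputed positions only."""
    truth = y_true[mask]
    err = truth - y_hat[mask]
    rmse = np.sqrt(np.mean(err ** 2))
    mape = 100.0 * np.mean(np.abs(err / truth))
    r2 = 1.0 - np.sum(err ** 2) / np.sum((truth - truth.mean()) ** 2)
    return rmse, mape, r2

rng = np.random.default_rng(0)
y = make_series(120, rng)
masks = {"MCAR": mcar_mask(len(y), 0.20, rng), "MNAR": mnar_mask(y, 0.20, rng)}
results = {}
for name, mask in masks.items():
    results[name] = scores(y, linear_interpolate(y, mask), mask)
    rmse, mape, r2 = results[name]
    print(f"{name}: RMSE={rmse:.3f}  MAPE={mape:.2f}%  R2={r2:.3f}")
```

A full replication of the study's design would wrap this in a loop over sample sizes, missingness proportions, and 150 repetitions per scenario, and would swap `linear_interpolate` for each of the other imputers in turn. The first and last points are kept observed here so that interpolation never has to extrapolate.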