Two-Stage RFM and Macroeconomics Interaction Model for Accurate CLV Prediction in Direct Sales
DOI:
https://doi.org/10.34123/icdsos.v2025i1.642
Keywords:
Customer Lifetime Value, Direct Sales, Ensemble Learning, Macroeconomic Interactions, Two-Stage Modeling
Abstract
This study introduces a two-stage predictive model that integrates Recency, Frequency, Monetary (RFM) metrics with macroeconomic indicators to estimate Customer Lifetime Value (CLV) in direct sales, addressing dynamic customer behavior in volatile markets. Data from the Halalmart Sales Integrated System (January 2023–July 2025; 29,893 transactions; approximately 431 unique customers per month) were combined with Indonesian macroeconomic indicators: the Consumer Confidence Index and Consumer Expectation Index from Bank Indonesia, and inflation data from the Central Bureau of Statistics (BPS). The first stage uses CatBoost classification, which identifies active customers with 89.3% accuracy. The second stage uses an ensemble regression (CatBoost, XGBoost, LightGBM, Ridge, RandomForest), which yields an R² of 0.894 for CLV prediction. RFM features contribute 40.3% to classification and 16.2% to regression variance, while macroeconomic interactions dominate, contributing 59.7% and 83.8%, respectively. A key interaction, between Monetary and the Consumer Confidence Index, shows a 0.773 correlation with CLV. SHAP analysis is used to improve model interpretability. Although the dataset is skewed, with approximately 65% of CLV values equal to zero, the model supports targeted marketing strategies and offers insights for strategic decision-making in direct sales environments.
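The two-stage flow described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: interactions are built as simple products of each RFM feature with each macroeconomic indicator, the ensemble is combined with an unweighted mean, a customer is treated as active when the classifier probability reaches 0.5, and the function names `add_macro_interactions` and `two_stage_clv` are hypothetical. The models are passed in as plain callables so that any fitted classifier or regressor could be substituted.

```python
from statistics import mean
from typing import Callable, Dict, Sequence

Features = Dict[str, float]


def add_macro_interactions(rfm: Features, macro: Features) -> Features:
    """Return the RFM features plus one product term per (RFM, macro) pair.

    Assumption: an "interaction" is modelled here as a plain product,
    e.g. monetary * cci. The paper may construct it differently.
    """
    out = dict(rfm)
    for r_name, r_val in rfm.items():
        for m_name, m_val in macro.items():
            out[f"{r_name}_x_{m_name}"] = r_val * m_val
    return out


def two_stage_clv(
    features: Features,
    p_active: Callable[[Features], float],
    regressors: Sequence[Callable[[Features], float]],
    threshold: float = 0.5,  # assumed cut-off, not from the paper
) -> float:
    """Stage 1 gates on activity; stage 2 averages the regressors.

    Customers predicted inactive get a CLV of 0, which mirrors the large
    share of zero-CLV customers reported in the abstract.
    """
    if p_active(features) < threshold:
        return 0.0
    # Assumption: unweighted mean over the ensemble members.
    return mean(reg(features) for reg in regressors)


# Toy stand-ins for fitted models (hypothetical, for illustration only).
toy_classifier = lambda f: 0.9 if f["frequency"] >= 2 else 0.1
toy_regressors = [
    lambda f: 0.5 * f["monetary"],
    lambda f: f["monetary_x_cci"] / 100,
]

customer = add_macro_interactions(
    {"recency": 10.0, "frequency": 3.0, "monetary": 200.0},
    {"cci": 120.0},
)
predicted_clv = two_stage_clv(customer, toy_classifier, toy_regressors)
```

Gating the regression on the classifier output is one common way to handle a target with many exact zeros, since the regressors then only need to model the magnitude of CLV for customers expected to be active.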