Profitable Retail Customer Identification Based on a Combined Prediction Strategy of Customer Lifetime Value

Midwest Social Sciences Journal
As a fundamental concept in customer relationship management (CRM), customer lifetime value (CLV) is a crucial metric for identifying profitable retail customers. Various methods are available to predict CLV in different contexts. With the growth of consumer "big data," modern statistical and machine learning algorithms have gradually been adopted in CLV modeling. We introduce two machine learning algorithms, the gradient boosting decision tree (GBDT) and the random forest (RF), into retail customer CLV modeling and compare their predictive performance with that of two classical models, the Pareto/NBD (HB) and the Pareto/GGG. To make CLV prediction and customer identification more robust, we combine the predictions of the four models to determine which customers are the most, or least, profitable. Using 43 weeks of customer transaction data from a large retailer in China, we predict customer value over the following 20 weeks. The results show that the predictive performance of GBDT and RF is generally better than that of the Pareto/NBD (HB) and Pareto/GGG models. Because the four models' predictions are not entirely consistent, we combine them to identify the profitable and unprofitable customers.
Customer lifetime value (CLV); Pareto/NBD (HB); Pareto/GGG; Gradient boosting decision tree (GBDT); Random forest (RF); Valuable retail customer identification
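The abstract does not spell out how the four models' predictions are combined, so the following is only a minimal sketch of one plausible rule, not the authors' method: a customer is flagged as profitable (or unprofitable) only when every model places that customer in the top (or bottom) fraction of predicted CLV. The function names, the `fraction` parameter, and the prediction values are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Set, Tuple


def extreme_ids(preds: List[float], fraction: float, highest: bool) -> Set[int]:
    """Indices of customers in the top (or bottom) fraction by predicted CLV."""
    k = max(1, int(len(preds) * fraction))
    order = sorted(range(len(preds)), key=lambda i: preds[i], reverse=highest)
    return set(order[:k])


def consensus_segments(
    predictions: Dict[str, List[float]], fraction: float = 0.2
) -> Tuple[Set[int], Set[int]]:
    """Assumed combination rule: keep only customers on which all models agree."""
    tops = [extreme_ids(p, fraction, highest=True) for p in predictions.values()]
    bottoms = [extreme_ids(p, fraction, highest=False) for p in predictions.values()]
    return set.intersection(*tops), set.intersection(*bottoms)


# Made-up predicted CLV for five customers (indices 0..4); not data from the study.
predictions = {
    "gbdt": [120.0, 15.0, 300.0, 5.0, 80.0],
    "rf": [110.0, 20.0, 280.0, 8.0, 95.0],
    "pareto_nbd_hb": [90.0, 30.0, 250.0, 10.0, 130.0],
    "pareto_ggg": [100.0, 25.0, 260.0, 12.0, 70.0],
}

profitable, unprofitable = consensus_segments(predictions, fraction=0.4)
print(sorted(profitable))    # customers every model ranks in the top 40%
print(sorted(unprofitable))  # customers every model ranks in the bottom 40%
```

In this toy input, three models rank customer 0 in the top two but the Pareto/NBD (HB) column ranks customer 4 there instead, so only customer 2 survives the intersection. A strict consensus rule of this kind trades coverage for robustness: the flagged sets shrink wherever the models disagree.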