Whether you’re a data scientist, a marketer or a data leader, chances are that if you’ve Googled “Customer Lifetime Value”, you’ve been disappointed. I felt that too, back when I was leading CLV research in a data science team in the e-commerce domain. We went looking for state-of-the-art methods, but Google returned only basic tutorials with unrealistically manicured datasets, and marketing ‘fluff’ posts describing vague and unimaginative uses for CLV. There was nothing about the pros and cons of available methods when applied to real-world data, and with real-world clients. We learned all that on our own, and now I want to share it.
Presenting: all the stuff the CLV tutorials left out.
In this post, I’ll cover:
- What is CLV? (I’ll be brief, as this part you probably already know)
- Do you really need CLV prediction? Or can you start with historic CLV calculation?
- What can your company already gain from historic CLV information, especially when you combine it with other business data?
In the rest of the series, I’ll present:
- Uses for CLV prediction
- Methods for calculating and predicting CLV, and their advantages and disadvantages
- Lessons learned on how to use them correctly.
And I’ll sprinkle some data science best-practices throughout. Sound like a plan? Great, let’s go!
Customer Lifetime Value is the value generated by a customer over their ‘lifetime’ with a retailer: that is, between their first and last purchase there. ‘Value’ can be defined as pure revenue: how much the customer spent. But in my e-commerce experience, I found that more mature retailers care less about short-term revenue than they do about long-term profit. Hence, they’re more likely to consider ‘value’ as revenue minus costs. As we’ll see in part two though, knowing which costs to subtract is easier said than done…
Experienced R&D teams know that for new data science projects, it’s best to start simple. For CLV, this can be as ‘easy’ as using historic transactions to calculate lifetime value so far. You can:
- calculate a simple average over all your customers, or
- calculate an average based on logical segments, such as per demographic group.
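To make those two options concrete, here is a minimal sketch in plain Python. The transaction log, the customer IDs and the age-range segments are all made up for illustration, and ‘value’ here is pure revenue, with no costs subtracted.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical transaction log: (customer_id, segment, order_value).
transactions = [
    ("C1", "18-29", 40.0),
    ("C1", "18-29", 60.0),
    ("C2", "30-49", 200.0),
    ("C3", "18-29", 20.0),
    ("C3", "18-29", 30.0),
    ("C4", "30-49", 100.0),
]

def historic_clv(rows):
    """Sum each customer's spend so far (revenue-based historic CLV)."""
    totals = defaultdict(float)
    for customer_id, _segment, value in rows:
        totals[customer_id] += value
    return dict(totals)

def average_clv_by_segment(rows):
    """Average the historic CLV of the customers within each segment."""
    clv = historic_clv(rows)
    segment_of = {customer_id: segment for customer_id, segment, _value in rows}
    by_segment = defaultdict(list)
    for customer_id, total in clv.items():
        by_segment[segment_of[customer_id]].append(total)
    return {segment: mean(values) for segment, values in by_segment.items()}

clv = historic_clv(transactions)
overall_average = mean(clv.values())          # one number for the whole base
segment_averages = average_clv_by_segment(transactions)
```

Note that the average is taken over customers, not over transactions: each customer’s orders are summed first, and only then averaged.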
Even this rearward-facing view has many uses for a retailer’s marketing and purchasing (that is, inventory management) teams. In fact, depending on the company’s data literacy level and available resources, this might even be enough (at least to get started). Plus, data scientists can get a feel for the company’s customers’ typical spending habits, and this can be invaluable if the company does later want to predict future CLV, on a per customer basis.
To help you and the company decide whether you need historic CLV insights or future predictions, let’s view some use-cases for each. After all, you want the marketing, management, and data science teams to be aligned from the beginning on how the project’s outputs are going to be used. That’s the best way to avoid building the wrong thing, and having to start again later.
Many tutorials only discuss uses for CLV prediction, on a per-customer basis. They list obvious use-cases, like ‘try to re-engage the predicted low-spenders to get them shopping more.’ But the possibilities go so much further than that.
Whether you get your CLV information via calculation or prediction, you can amplify its business value by combining it with other data. All you need is a CLV value, or some kind of CLV level score (e.g. High, Medium, Low), per customer ID. Then you can join this with other information sources, such as:
- the products customers are buying
- the sales channels (in-store, online, etc) they’re using
- returns information
- shipping times
- and so on.
I’ve illustrated this, below. Each box shows a data table and its column names. See how each table contains a Customer_ID? That’s what allows them all to be joined. I’ll explain the columns of the CLV_Info table in part three; first, I promised you use-cases.
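If you’d like to see the join idea in code, here is a rough sketch. The three small tables and their column names (`clv_tier`, `items_returned`, `main_channel`) are hypothetical stand-ins, not the real columns of any table discussed in this series. In practice you would probably do this in SQL or a dataframe library, but the logic is the same.

```python
# Hypothetical tables, each keyed by Customer_ID.
clv_info = {
    "C1": {"clv_tier": "High"},
    "C2": {"clv_tier": "Low"},
    "C3": {"clv_tier": "Medium"},
}
returns = {"C1": {"items_returned": 1}, "C2": {"items_returned": 7}}
channels = {"C1": {"main_channel": "online"}, "C3": {"main_channel": "in-store"}}

def left_join(base, *others):
    """Attach columns from the other tables to each row of the base table.

    Customers missing from a table simply get no columns from it,
    which mirrors a left join on Customer_ID.
    """
    joined = {}
    for customer_id, row in base.items():
        merged = dict(row)
        for table in others:
            merged.update(table.get(customer_id, {}))
        joined[customer_id] = merged
    return joined

customer_view = left_join(clv_info, returns, channels)
```

Starting from the CLV table means every customer with a CLV score is kept, even when another source has no record for them.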
Let’s say you’ve ranked all your customers by total spending so far, and segmented them somehow. For example, your marketing team asked you to split the data into the Top 10% of Spenders, the Middle 20%, and the Bottom 70%. Perhaps you’ve even done this multiple times on different subgroups of your customer base, such as per country, if you have online shops around the world. And now, imagine you’ve combined this with other business data, as described above. What can your company do with this information?
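For reference, the Top 10% / Middle 20% / Bottom 70% split could be sketched like this. The function name, the spend figures and the way cut-offs are rounded are my own assumptions; how you handle ties and awkward group sizes is a decision to agree with the marketing team.

```python
def assign_tiers(clv_by_customer, top_share=0.10, middle_share=0.20):
    """Rank customers by spend and label them Top / Middle / Bottom."""
    ranked = sorted(clv_by_customer, key=clv_by_customer.get, reverse=True)
    n = len(ranked)
    top_cutoff = round(n * top_share)
    middle_cutoff = round(n * (top_share + middle_share))
    tiers = {}
    for position, customer_id in enumerate(ranked):
        if position < top_cutoff:
            tiers[customer_id] = "Top"
        elif position < middle_cutoff:
            tiers[customer_id] = "Middle"
        else:
            tiers[customer_id] = "Bottom"
    return tiers

# Hypothetical spend totals for ten customers: C1 spent 100, C2 spent 200, ...
spend = {f"C{i}": 100.0 * i for i in range(1, 11)}
tiers = assign_tiers(spend)
```

To segment per country, you would simply call the same function once per country’s customers.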
Honestly, there are so many questions you can ask of your data, and so much you can do with the answers, that I could never cover it all. I don’t have the domain knowledge you do, and that’s a massively important, massively undervalued thing in data science. But in the next few sections, I’ll give you some ideas to get you thinking like a data-driven marketer. It’s up to you to take this further.
Explore CLV segments and their needs
- What makes a top-tier customer? Are they extremely regular, modest spenders? Or do they shop less often, but spend more per transaction? Knowing this helps your marketing and inventory teams identify what kind of customers they really want to acquire — and retain! Then they can plan marketing and customer service efforts, and even inventory and product promotions, accordingly.
- Why are costs high and/or revenue low for your bottom-tier shoppers? Are they only ever purchasing items at extreme discounts? Always returning things? Or buying on credit and not paying on time? Apparently there’s a poor product-customer fit — could you improve it by showing them different products? Or here’s another question: are your bottom-tier customers always buying one product and then never shopping with you again? Maybe it’s a ‘poison product’, which should be removed from your inventory.
- Are your high CLV customers more satisfied? Why? Imagine you’re a clothing retailer and your customers have an option to save their sizing information to their account. This allows your online store to make sizing recommendations when a logged-in customer is about to add an item to their basket. You also notice that most of your high CLV customers have saved their sizes, and they have fewer returns. Hence, you suspect that sizing recommendations reduce return rates, which improves customer satisfaction, which in turn keeps shoppers loyal.
- How can you action this information? Here’s just one idea: the website team could add prompts reminding users to add their size information. Ideally this will increase revenue, decrease costs, and improve customer satisfaction, but if you’re truly data-driven then you’ll want to A/B test the change. This way you can measure the impact, controlling for outside effects, and keeping an eye on ‘guardrail’ metrics. These are metrics you would not want to see change during an A/B test, such as the number of account deletions.
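To give a flavour of what ‘measuring the impact’ might look like for the size-prompt idea, here is a sketch of one common choice: a pooled two-proportion z-test on return rates. This is my own pick for illustration, not the only valid test, and the order and return counts are invented.

```python
from math import erf, sqrt

def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    rate_a = events_a / n_a
    rate_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF expressed via the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)

# Hypothetical results: returned orders out of 1000 orders per variant.
# A = control (no prompt), B = treatment (size prompt shown).
z, p_value = two_proportion_z_test(130, 1000, 100, 1000)
```

A negative `z` here means the treatment group returned fewer orders. You would run the same comparison on each guardrail metric, hoping to see no meaningful change there.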
Explore your demographics
The last section was about CLV tiers; now I’m referring to different customer subgroups, such as those based on age range, gender, or location. There are two ways you could do this.
1. Perform the above CLV analysis on your whole customer base, and then see how your subgroups are distributed among CLV tiers, like this:
2. Split into subgroups first, and then do a CLV analysis for each.
Or, you can try both approaches! It depends on the business needs and resources available. But again, there are plenty of interesting questions:
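As a sketch of the first approach, the snippet below takes customers who already have a CLV tier and works out, for each subgroup, what share of its customers falls into each tier. The subgroups (preferred shopping device) and the records are hypothetical.

```python
from collections import Counter

# Hypothetical per-customer records: (customer_id, subgroup, clv_tier).
customers = [
    ("C1", "mobile", "Bottom"),
    ("C2", "mobile", "Bottom"),
    ("C3", "mobile", "Top"),
    ("C4", "desktop", "Top"),
    ("C5", "desktop", "Middle"),
]

def tier_shares_by_subgroup(rows):
    """For each subgroup, the fraction of its customers in each CLV tier."""
    pair_counts = Counter((subgroup, tier) for _id, subgroup, tier in rows)
    subgroup_sizes = Counter(subgroup for _id, subgroup, _tier in rows)
    return {
        (subgroup, tier): count / subgroup_sizes[subgroup]
        for (subgroup, tier), count in pair_counts.items()
    }

shares = tier_shares_by_subgroup(customers)
```

Using shares rather than raw counts keeps a large subgroup from dominating the picture just because it has more customers.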
- Which subgroups do you have? Forget the obvious ones I just listed; let’s get creative. For example, you could split customers by their original acquisition channel, or the channel they now use most: online vs. in-store, app vs. website. You could split by membership level, if you offer it. Using tracking cookies from your webstore, you can even split by preferred shopping device: desktop computer versus tablet versus mobile. Why? Well, maybe your mobile-phone-based shoppers have lower basket values, because people prefer to make big purchases on a desktop. The more domain knowledge you can build up, the better your analysis and, if it comes to it, your machine learning efforts will be.
- How does buying behaviour differ by customer subgroup? When do they shop? How often? For how much? Do they respond well to promotions and cross-sells? How long are they loyal? Do they spend often in the beginning of their lifetime and then tail off, or is it some other pattern? This kind of information can help you plan marketing activities and even estimate future revenue, and I shouldn’t need to tell you how useful that is…
- What’s a ‘typical’ customer journey? Are you acquiring most of your new customers in physical stores? Does that mean your stores are great but your website sucks? Or are your in-store workers better at getting people to sign up for membership than your website is? Either way, you could try to improve the website, or at least, be smarter about which channels you advertise on. And what about new customer offers, newsletter sign-up discounts, or friend referrals: are they attracting solid numbers of high CLV customers? If not, time to reevaluate those campaigns.
Get clever about your offering, and how you market it
- If you understand your customers better, you can serve them better. For a retailer, that could include stocking up on the types of products their best customers seem to favour. A mobile phone provider could improve the services that its high CLV customers are using, like adding features to their mobile app. Of course, you’ll want to A/B test any changes, to make sure you don’t introduce changes that customers hate. And don’t abandon your low CLV customers — instead, try to find out what’s going wrong, and how you can improve it.
- Similarly, if you understand your customers, you can speak their language. By showing the right ads, at the right time, on the right channels, you can acquire customers you want, and who want to shop with you.
Know what to spend on customer acquisition
- Ever wondered why companies start emailing you when you haven’t shopped there for a while? It’s because it’s expensive to acquire a customer, and they don’t want to lose you. That’s also why, when you browse one e-commerce site, those products follow you around the internet. Those are so-called ‘programmatic ads’, and they appear because the company paid for that first click, and they’re not willing to give you up, yet.
- As a retailer, you don’t just want to throw money at acquiring any old customer. You want to gain and retain the high value ones: those who’ll stay loyal and generate good revenues over a long lifetime. Calculating historic CLV allows you to also calculate your break-even points: how long it took each customer to ‘repay’ their acquisition cost. What’s the average, and which CLV tiers and customer demographic groups pay themselves off fastest? Knowing this will help marketing teams budget their customer acquisition campaigns and improve their new-customer welcome flows (i.e. those emails you get after the first purchase at a new shop), to increase early engagement and thus improve break-even times.
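The break-even calculation can be sketched as follows, assuming you know each customer’s acquisition cost and have their orders dated relative to acquisition. Both the data layout and the numbers are hypothetical, and ‘value’ could be revenue or margin, depending on how your company defines it.

```python
def days_to_break_even(acquisition_cost, orders):
    """Return the day on which cumulative value first covers the acquisition cost.

    `orders` is a list of (days_since_acquisition, value) pairs.
    Returns None if the customer has not yet paid back their cost.
    """
    cumulative = 0.0
    for day, value in sorted(orders):
        cumulative += value
        if cumulative >= acquisition_cost:
            return day
    return None

# Hypothetical customers: (acquisition cost, order history).
customers = {
    "C1": (30.0, [(0, 20.0), (14, 15.0), (40, 50.0)]),
    "C2": (30.0, [(0, 10.0), (90, 5.0)]),
}
break_even = {
    customer_id: days_to_break_even(cost, orders)
    for customer_id, (cost, orders) in customers.items()
}
```

Customers who return `None` deserve their own look: averaging only over those who did break even will make your acquisition spend seem healthier than it is.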
Track performance over time
- Re-evaluate to identify trends. Businesses and markets change, beyond the control of any retailer. By periodically re-calculating your historic CLV, you can continuously build your understanding of your customers and their needs, and whether you’re meeting them. How often should you re-run your analysis? That depends on your typical sales and customer acquisition velocity: a supermarket might re-evaluate more often than a furniture dealer, for example. It also depends on how often the business can actually handle getting new CLV information and using it to make data-driven decisions.
- Re-evaluate to improve. Periodically re-calculating CLV will help you ensure you’re gaining ever-more-valuable customers. And don’t forget to run extra evaluations after introducing a big strategy change, to ensure you’re not sending numbers in the wrong direction.
I know, I know… you want to talk Machine Learning, and what you can use CLV predictions for. But this post is long enough as it is, so I’ll save it for next time, along with the lessons my team learned on how to model historic CLV and predict future CLV using real-world data. Then in part three, we’ll cover the pros and cons of the available modelling and prediction methods. If you’d like a reminder of that, then don’t forget to subscribe. See you next time!