I recall an incident at a general insurance company a few months back. The team had built a cross-sell model. They had first segmented the customer base and done a product association analysis. Two particular products were found to be closely associated. For the sake of confidentiality, let's call them ... well ... Product 1 and Product 2 (not very innovative, are we?).
The next step was to find customers who had only Product 1 and not Product 2, and to treat them as the target base for scoring the propensity to purchase Product 2. A data set of the customers' demographics, transactions, etc. was assembled, and the model building and scoring process was executed. The customers were then ranked by score to give the potential base for cross-sell.
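The target selection described above can be sketched in a few lines of Python. This is only an illustration: the customer ids, the `holdings` mapping and the function name `cross_sell_targets` are made up for the example and are not taken from the actual project.

```python
# Hypothetical holdings: customer id -> set of products currently held.
holdings = {
    "C1": {"Product 1", "Product 2"},
    "C2": {"Product 1"},
    "C3": {"Product 2"},
    "C4": {"Product 1"},
}

def cross_sell_targets(holdings, have="Product 1", lack="Product 2"):
    """Return the sorted ids of customers who hold `have` but not `lack`."""
    return sorted(
        cid for cid, products in holdings.items()
        if have in products and lack not in products
    )

# Customers to be scored for their propensity to buy Product 2.
targets = cross_sell_targets(holdings)
print(targets)
```

Customers who already hold both products are not scoring targets, but they are the ones the model learns from, which is where the timing of their Product 2 purchase starts to matter.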
While the process used was appropriate, there was a major flaw in the way the scoring exercise was carried out. The data created for model building was derived from a cut of the customer database as of a particular date ... say January 31, 2010. This is where the process went drastically wrong. A cardinal mistake by statistical standards.
For the sake of explanation, consider three customers who have bought both Products 1 and 2. The following gives the timeline of purchase of the two products.
As can be observed, by taking the last 12 months of data from a cut as of January 2010, the actual purchase dates of the two products were not taken into consideration. For Customer 1, the purchase of Product 2, which is the target event in this analysis, actually happened outside the period of analysis.
The correct approach would be to identify, for each customer, the event of purchase of Product 2. Call the time of this purchase the base period. Because each customer bought Product 2 at a different time, the base period will be different for each customer. Then take the data for the 12 months preceding this base period.
The reason this needs to be done is that we are studying the pattern of behaviour a customer exhibits before purchasing the product. Thus, the period of analysis is relative to the purchase of the product, not to the date of the database cut.
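To make the idea of a customer-specific base period concrete, here is a minimal Python sketch. All dates, amounts and names (`product2_purchase`, `transactions`, `window_transactions`) are hypothetical and chosen only for illustration, and the 12-month look-back is approximated as "same day one year earlier".

```python
from datetime import date

def one_year_before(d):
    """Same calendar day one year earlier (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year - 1)
    except ValueError:
        return d.replace(year=d.year - 1, day=28)

def window_transactions(transactions, product2_purchase):
    """For each customer, keep transactions dated in the 12 months
    before that customer's own Product 2 purchase: [start, base)."""
    selected = {}
    for cid, base in product2_purchase.items():
        start = one_year_before(base)
        selected[cid] = [t for t in transactions.get(cid, [])
                         if start <= t[0] < base]
    return selected

# Hypothetical base periods: the date each customer bought Product 2.
product2_purchase = {
    "C1": date(2008, 3, 15),
    "C2": date(2009, 6, 1),
    "C3": date(2009, 11, 20),
}

# Hypothetical transactions as (date, amount) pairs per customer.
transactions = {
    "C1": [(date(2007, 6, 10), 120.0), (date(2009, 8, 5), 80.0)],
    "C2": [(date(2008, 5, 31), 50.0), (date(2008, 6, 1), 60.0),
           (date(2009, 6, 1), 70.0)],
    "C3": [(date(2009, 2, 1), 40.0)],
}

windows = window_transactions(transactions, product2_purchase)
for cid, rows in windows.items():
    print(cid, rows)
```

With these made-up dates, a fixed 12-month window ending on January 31, 2010 would pick up C1's August 2009 transaction, which happened after C1 had already bought Product 2, and would miss the 2007 transaction that actually preceded the purchase. The relative window does the opposite.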
The following figure shows the period for which data needs to be extracted for each customer. This period depends on the purchase of Product 2, so it is different for each customer. In fact, for Customer 1 this period is way in the past and extends beyond the timeline displayed. It needs to be decided whether Customer 1 is a "vintage" customer who should be excluded from the analysis.
This is a classic case of the wrong science applied to the correct art. No doubt this cross-sell campaign had a high chance of failure. And the blame would be put on the statistical model, which failed to predict the correct potential base.
If you want to avoid this and similar pitfalls, I will be glad to discuss. Contact me at michaeldsilva@gmail.com.