Tuesday, October 18, 2011

Analytical Data Mart -- is a Myth

Over the past two years, I have observed an increasing number of RFPs which include setting up of analytical data marts as part of the scope. This is a disturbing trend. It shows that the analytical projects are driven by non-analytical expertise.

I don't blame the IT guys for the way the RFPs are designed. They take a typical data warehouse / reporting approach. In that approach the requirements are known, and the data warehouse is expected to maintain and provide the data elements for the reporting needs. Analytics looks similar at first. We have an end requirement -- say, lapse prediction and its predictors. And this requirement needs data elements. This is where the similarity ends. Before the model is developed, one does not know which of the data elements are needed for the end report. In fact, one of the objectives of model building is to identify the data elements that are significant contributors towards the event under consideration, here the lapsation of a policy.


Vendors in the market strut analytical data models. These basically consist of 100s of variables and derived variables which are likely to play the role of a contributor to the observed event. The IT team, following the sales pitch of such vendors, often includes in the scope an analytical data mart containing all the 100s of variables listed.

If a company has enough budget and time (and patience), it will be okay to create an analytical data mart of over 800 variables from multiple sources of data and involving complex transformations. But this is never the case.

Now consider the models built on this data mart. Any model used in business will hardly have more than 20 variables (both direct and derived). So the effort spent on the remaining 780+ variables is wasted.

An ideal way would be to let the statisticians use data dumps to do the modelling activity. Once a model is developed, tested and found to be useful for business deployment, productionizing it only requires making its variables (anywhere from 6 to 20) available for scoring. Compare this with creating an 800+ variable data mart -- the former is a much more practical approach.
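To make the point concrete, here is a minimal sketch of shortlisting variables from a data dump. It is only an illustration: the variable names and numbers are invented, and ranking by simple correlation is a crude stand-in for whatever significance testing the statistician would actually use.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no information
    return cov / sqrt(vx * vy)

def shortlist(candidates, outcome, keep):
    """Rank candidate variables by absolute correlation with the outcome and keep the top few."""
    ranked = sorted(
        candidates,
        key=lambda name: abs(pearson(candidates[name], outcome)),
        reverse=True,
    )
    return ranked[:keep]

# Hypothetical data dump: 1 = policy lapsed, 0 = policy stayed in force.
lapsed = [1, 0, 1, 0, 1, 0]
candidates = {
    "missed_premiums": [3, 0, 2, 1, 4, 0],
    "policy_age_years": [5, 6, 5, 7, 4, 6],
    "branch_code": [1, 2, 3, 1, 2, 3],
}

# Only the shortlisted variables would need to be productionized for scoring.
scoring_variables = shortlist(candidates, lapsed, keep=2)
```

Only the names in `scoring_variables` need a production feed; the rest of the dump can stay a dump.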

I have not even dwelt on the process of modelling and data preparation. Based on the objective being tested, the data preparation will differ hugely from model to model. Often the analytical data mart gets ignored and the analysts go back to data dumps to create the analytical data set for modelling. See my earlier post on time stamped data sets for modelling (http://crmzen.blogspot.com/2010/03/time-factor-in-modeling.html) to understand the complexity in creating data for statistical modelling.

It will be good for the IT department and the data warehousing personnel to understand this difference in the analytical process, especially since the difference is not subtle. There is no need to wait 12 to 18 months (or more) for the data warehouse to be set up and populated before the analytical activity can begin. And the return on investment is much higher with predictive analytics. When compounded with the quick turnaround, the returns multiply.

Monday, October 03, 2011

Statistics hints at Existence of God

At the outset of this post let me make a few things clear. I am not an atheist. I have faith in the bible. But I do not hesitate to question facts about the bible. Now, according to some, that makes me an atheist. At least, it does not make me a fanatic. So I leave it at that.

My professor of economics, Mr. Sakhalkar, once made a statement that when one reads, one should not be selective. That creates a bias and restricts one's circle of influence. Following that advice a couple of years ago led me to read about the origins of religion.

A very interesting fact came to my attention. "God" was an invention of man to blame someone for things that man did not understand and could not control. In the old days, the most frightful element for man was fire. He could not understand it, he could not control it and it was the most destructive force. So a Fire God existed. And among all the gods, the Fire God was the most powerful one. There was a Water God, Land God, Wind God, Sun God, Star God and so on.

As man started understanding each element of nature, the importance of its God diminished, until the God itself was abolished. Now, as we sit in the twenty-first century, almost all of the element Gods are extinct. When the Gods started disappearing, man found reason to blame other men for various events. So a forest fire was because some fool dropped a lighted cigarette on dry grass. Understanding man and his nature became a prime topic of importance. God now started taking the form of man. A convenient person to blame when things go beyond explanation.

A statistical model is a function which describes the observed behaviour based on identified independent variables. But what most sales personnel (call them consultants or statisticians) leave out is the "epsilon" (ε). Every function in statistics has the epsilon attached to it. Consider the following function which forecasts the amount of sale at a retail outlet.

Y = aX + bZ + ε

where Y is the amount of sale, X is the average salary of store visitors and Z indicates whether it is raining. ε represents the epsilon or the error component. This is the statisticians' way of keeping themselves legally safe (yup... statisticians are smarter than lawyers). It implies that though the function predicts the amount of sale, there is an error component which explains the deviation of the actual amount from the predicted amount. So if the actual sales differ from the predicted amount, blame the error component and not the statistician.
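As a rough sketch of what this looks like in practice, the snippet below fits a and b by ordinary least squares and then computes the leftover ε for each observation. The six observations are invented for illustration, and the fit has no intercept simply because the function above has none.

```python
# Hypothetical observations: (average visitor salary X, raining Z as 0/1, sale amount Y).
# All numbers are invented for illustration only.
observations = [
    (30.0, 0, 62.0),
    (45.0, 1, 80.0),
    (50.0, 0, 98.0),
    (35.0, 1, 61.0),
    (60.0, 0, 123.0),
    (40.0, 1, 72.0),
]

def fit_two_variable_model(rows):
    """Least-squares estimates of a and b in Y = aX + bZ + e (no intercept)."""
    sxx = sum(x * x for x, _, _ in rows)
    szz = sum(z * z for _, z, _ in rows)
    sxz = sum(x * z for x, z, _ in rows)
    sxy = sum(x * y for x, _, y in rows)
    szy = sum(z * y for _, z, y in rows)
    det = sxx * szz - sxz * sxz  # zero only if X and Z are perfectly collinear
    a = (sxy * szz - szy * sxz) / det
    b = (szy * sxx - sxy * sxz) / det
    return a, b

a, b = fit_two_variable_model(observations)

# The epsilon for each observation: actual sale minus predicted sale.
residuals = [y - (a * x + b * z) for x, z, y in observations]
```

Even with the best-fitting a and b, the entries of `residuals` do not all come out as zero. That leftover is the ε.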

There is a whole lot of effort expended in trying to understand and explain this error component. New variables are found, new algorithms are applied, but the ε still lives on. To date there has not been any statistical model that does not have the ε in it.

Even a statement that "All crows are black." will be stated by a statistician as "It is with a 95% level of confidence that 99% of the time all crows are black." This is with the understanding that if someone sees a crow which is not all black, the statistician is safe with his statement.

So now we have an EPSILON which is the unexplained factor and responsible for all the deviations in the statistical model. In some cases, like predicting the likelihood of a patient surviving a critical operation, this EPSILON also represents a dangerous and frightening probability. A play in the equation that is unexplained and blamed for any deviation in our predictive capabilities. EUREKA ---- we have found the "STATISTICS" GOD.

ε

Friday, September 02, 2011

The Don Quixote in the Marketing Department

While reading Don Quixote, an interesting thought crossed my mind during the chapter where the character prepares to go on his mission. The knight had made a helmet and tested it with his lance. The helmet broke on the first contact. Thereafter, I quote from the book --

"He did not like its being broke with so much ease, and therefore to secure it from the like accident, he made it anew, and fenced it with thin plates of iron, which he fixed in the inside of it so artificially, that at last he had reason to be satisfied with the solidity of the work and so, without any experiment, he resolved it should pass to all intents and purposes for a full and sufficient helmet."

In case you have not noticed, that is one single sentence. I caught myself chuckling when I was reading the text. Not at the confidence of the knight in the book but at the memory of some of my discussions with marketing people. At one such discussion with a CMO, I was presenting a plan to execute the campaign. The initial phase was to test the campaign offer in a pilot. The CMO was aghast at this suggestion. He said that they know their customer, that he knows what they want and that the offer is the best that could happen to his customers. He wanted us to guarantee a minimum uptake on the offer. Since there was no history of similar offers, nor were there any results of any testing on the offer, I refused to guarantee anything unless he agreed to do a pilot campaign. The CMO refused and I did not pick up the assignment. Needless to say, the CMO did not last long in the company.

At another assignment, where I was involved in a promotion campaign, the client had designed an "interesting" campaign for his customers. On discussing, I found out that the campaign was "interesting" because people who heard about it found it interesting. These people were apparently colleagues from other departments. I did a quick dip-stick and found that no one from my project team was a customer of the client. That is, nobody bought their products. Now, when I discussed the promotion with my team members, they sure found it interesting. Then, I made one of them call up the office boy who was manning the reception desk to explain the promotion. The office boy was confused and wondered if he had intercepted a key message from the extra-terrestrials (okay I am overdoing this last part). I asked the office boy if he bought the products of the client. He said yes, and he did not seem too keen on the promotion. When I took this finding to the client, they just blew up in my face. I asked them to repeat the exercise with the security guard. But the client refused to go ahead with the experiment and moved on to the next activity in the execution plan. It should not be a surprise that the promotion did not perform as per expectation.

These two scenarios reflect the Don Quixote mindset in the real world. Sometimes we are so confident about what we believe will work with the customer that we just refuse to do any test marketing. It is surprising that when one launches a new product, there is a whole lot of science applied to the pilot launch, but when it comes to campaigns, everyone just believes the campaign offer is the best idea and wants to execute it immediately. Is it because a new product has hundreds of crores of rupees spent on it whereas a campaign would relatively cost only a couple of crores of rupees? But if one clubs all the campaign costs as well as the costs of opportunities lost when a customer signs up for do-not-disturb or moves to the competitor, the combined cost will eventually overrun the development cost.

At one telco, we were discussing campaign activities with the campaign manager. This person had recently run a "successful" campaign selling a 100 Rupee voucher to people who recharge with a 50 Rupee voucher. For this, he was giving 15 Rupees of talk time free. His definition of success was that he sold 30 crore Rupees worth of recharge vouchers. When asked how he knew whether the same customers would have bought the 100 Rupee voucher or more anyway, without any offer, in which case he would perhaps have sold 38 crore Rupees worth of recharges, he was stumped. We persisted, highlighting that he had given away at least 30% of the revenue in free talk time, which further increased the cost of the offer. We told him he should have run the offer with test and control groups before launching it. He was so angry with us that he refused to meet us for the next two months. The communication restarted when he ran into some problem on new campaigns and wanted help. So he called us back for a discussion. Well, the prodigal son deserves a feast... so we went to meet him.
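The question we put to him boils down to a simple comparison, sketched below. The group sizes and amounts are hypothetical, not the telco's actual figures; the point is only that the offer has to beat a control group that got no offer, after the free talk time is taken out.

```python
def net_lift_per_customer(test_revenue, test_size, control_revenue, control_size, offer_cost):
    """Average revenue gain per customer from the offer, net of its cost.

    offer_cost is the total value given away to the test group (e.g. free talk time).
    """
    test_avg = (test_revenue - offer_cost) / test_size
    control_avg = control_revenue / control_size
    return test_avg - control_avg

# Hypothetical pilot: 1,000 customers get the offer, 1,000 similar customers do not.
lift = net_lift_per_customer(
    test_revenue=90_000, test_size=1_000,
    control_revenue=80_000, control_size=1_000,
    offer_cost=13_500,
)
```

In this made-up pilot the test group buys more recharges than the control group, yet `lift` comes out negative once the give-away is counted. Total sales alone cannot tell a campaign manager that.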

So many Don Quixotes in real life. This reminds me of a recent post on a social platform. The post was from a "vegetarian" guy who was angry because his grocery store sent him offers on non-vegetarian fare. He was upset that the store did not bother to check that he has NEVER bought a non-vegetarian item from the store ever.

Thursday, August 11, 2011

ATM eating cash!!! A lost opportunity for CRM...

Mumbai Mirror's 10th August edition has a cover page article that screamed about an ATM that eats up the customer's cash. On reading further, apparently someone had tinkered with the ATM machine such that it would debit a higher amount than what was actually withdrawn. In one of the cases mentioned, the customer withdrew Rs. 10,000 but was debited for Rs. 40,000. Another customer made a transaction of Rs. 50,000 but was debited for Rs. 200,000.

The article has generated a decent amount of comments on how ATMs are tinkered with.

In all this melee, there was something the bank, in this case Axis Bank, could have done. I am not going to lecture on how ATMs can be made more secure. That is not my area of expertise.

In a career spanning close to two decades, I have delivered multiple projects. As with every software project, the User Acceptance Test phase is the final stage wherein the end users test the software before signing it off for deployment. During every UAT, I always set some ground rules:
1. It is a system made by man and can definitely be broken by man.
2. If your objective is to break the system, you will definitely succeed and it is not a commendable thing to achieve.

I use the same rules for this scenario. The ATM has been designed by man and so can be broken by man. There is a manual process involved where access is permitted to a person, and this opens up room for tinkering.

In line with CRM, the question is what could the Bank have done?

In both cases mentioned, the customer was the one to complain to the bank. No doubt the bank would have refunded the money to the customer. But what about the customer who did not get the SMS message or did not check his account soon enough?

This is a perfect case for event or transaction based analysis. In both scenarios, withdrawal of such a large amount may not have been a regular transaction for the customer. In fact for the second case, withdrawal of Rs. 200,000 may have been a first.

The bank could have analysed the debits for each customer and identified the unusual withdrawal. Based on past behaviour, each customer may have a different threshold for what counts as unusual. The moment this unusual behaviour was identified, the bank should have called the customer to confirm the withdrawal. A customer denying the transaction would point to possible fraud. The bank would then have had various options:
-- deactivate the debit card
-- on noticing that the disputed transactions come from the same ATM machine, decommission it immediately so more customers do not face the trouble.
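A per-customer threshold of this kind could look something like the sketch below. The rule used here (the customer's own average debit plus three standard deviations) and the sample histories are my assumptions for illustration, not what any bank actually runs.

```python
from statistics import mean, stdev

def unusual_withdrawal(past_amounts, new_amount, k=3.0):
    """Flag new_amount if it exceeds the customer's own mean + k standard deviations.

    Needs at least two past withdrawals; otherwise nothing is flagged.
    """
    if len(past_amounts) < 2:
        return False
    threshold = mean(past_amounts) + k * stdev(past_amounts)
    return new_amount > threshold

# Hypothetical withdrawal histories in rupees.
history = {
    "customer_a": [2_000, 5_000, 3_000, 4_000, 10_000],
    "customer_b": [40_000, 50_000, 45_000, 60_000],
}

# The same Rs. 40,000 debit is unusual for one customer and routine for the other.
flag_a = unusual_withdrawal(history["customer_a"], 40_000)
flag_b = unusual_withdrawal(history["customer_b"], 40_000)
```

A flagged debit would then trigger the confirmation call to the customer, and several denied debits from the same machine would point to the ATM itself.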

From a customer perspective, the bank could have assured the customer that the transactions will be actively investigated and the amount credited back to the account if the complaint is found valid.

The benefit to the bank would have been twofold: the customer would be comfortable knowing that the bank is looking into his case, and the bank could have limited the number of customers exposed to the fraudulent ATM. And more important, imagine if the press article had said that the bank identified the fraud and quickly protected more customers from facing the same by blocking the ATM. Now that article would be "priceless".

Friday, August 05, 2011

Let HRD solve Marketing Issues.....

Recently I viewed the video of Deborah Rhodes's presentation at TED. During the speech she quotes Malcolm Gladwell stating "The only time a physician and a physicist get together is when the physicist gets sick". She goes on to state that this occurrence "makes no sense, because physicians have all kinds of problems that they don't realize have solutions. And physicists have all kinds of solutions for things that they don't realize are problems." Before I continue with my post, I want anyone who has a woman to love in their life (and that's practically everyone) to view this talk by Deborah titled "A tool that finds 3x more breast tumors, and why it's not available to you."

What Malcolm Gladwell says is so very true. If you look around your office and see groups of people huddled in the conference rooms, there is a 9 out of 10 chance that they belong to the same department. They have probably got together to solve some problem or address some issue related to operations of the department.

Organizations define Key Performance Indicators and allocate them to various departments. So revenue is sales. Costing is finance. And so on... Thereafter, these KPIs are the babies of each department and nobody else is allowed to play with the baby or provide tips to the foster parents.

At one telco, there was a high amount of customer churn in the landline business after the first 3 months of activation. The role of keeping the customer active belonged to the "Retention Department." There was a whole lot of action happening in the retention department. I was called in to see if statistics could play a role. Having just two months of customer behaviour data did not excite the statistician in me. I decided to snoop around while I was at the premises of the company. I was also building CRM processes in line with eTOM for this telco. As part of this assignment, I was walking through the process with the team responsible for installing the landline phones at the consumer site. One of them stated that the sales person would ask the customer to just sign the acquisition form and would fill it up later in the office. While filling the form, the sales person would tick all options and services to be activated. When the installer went for installation, the consumer would only be interested in knowing about voice calls since that was all he wanted. However, when the bills came in, the consumer saw an 'inflated' bill since he was also charged for services that he did not need. The installer stated that he could not help the consumer since the installation process did not allow him to revalidate the consumer's services and deactivate the ones the consumer did not need.

This discussion connected the dots. The consumer saw a high amount on his bill for just 'voice telephony.' To him, the telco was overcharging him. Nobody explained to him the rentals charged for services that were activated for him because the sales person ticked the options in the application form. The consumer got upset over this payment and would request disconnection in the second month. Before he could be disconnected, the telco required him to pay his outstanding amount till date. So a second bill was generated. This explained why the majority of consumers churned in the third month.

If only the retention department had involved everyone who had a consumer touch point in trying to understand the problem at hand.

For that matter, the Human Resources and Finance departments may also have some solutions to the problems of say, the Marketing department.
 