Most of the time, while presenting analytics, I get queries such as "do you use neural network?" or "do you use support vector?". Almost all the models I have built have used Linear Regression or Decision Trees. I have often found a good fit of the predicted values to the observed values, whether for churn prediction, default prediction, offer uptake, next visit, next spend, etc. The accuracy (or classification rate) has ranged from 65% to 86%. That is good enough, considering that these models were not mission critical in the way that, say, actuarial tables are for life insurers.
So in all cases, my response to these questions was "No". This often upset the enquirer. Then we would get into a debate on why I did not use these algorithms. At every point of the argument, I bring everyone back to the uplift curve or the classification matrix. Irrespective of the algorithm, if I have achieved an acceptable accuracy in my prediction, that should be the end of the argument. But, alas, it is seldom so.
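For readers who have not met the classification matrix, here is a minimal Python sketch of how the classification rate is read off it. The labels below are made-up illustrative data, not results from any of my models, and the function names are my own.

```python
def classification_matrix(actual, predicted):
    """Count the four cells of the matrix for 0/1 labels."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1  # predicted event, event happened
        elif a == 0 and p == 1:
            counts["fp"] += 1  # predicted event, nothing happened
        elif a == 1 and p == 0:
            counts["fn"] += 1  # missed an event
        else:
            counts["tn"] += 1  # correctly predicted no event
    return counts


def classification_rate(counts):
    """Share of all cases that the model classified correctly."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total


# Hypothetical data: 1 = customer churned, 0 = customer stayed.
actual = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
predicted = [1, 0, 1, 0, 0, 1, 0, 1, 0, 0]

matrix = classification_matrix(actual, predicted)
print(matrix)
print(classification_rate(matrix))
```

The same two functions work whatever algorithm produced the predictions, which is exactly why I keep bringing the debate back to this matrix.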
All algorithms end up generating a scoring logic and giving scores on the observed variable. In reality, these scores are not much help on their own. Business owners want to know how such scores have been generated. This is where the simpler algorithms of regression and decision trees are very useful. The end result of the modelling exercise is a "human understandable" function such as:
For decision tree: if (age > 25) and (income < 15000), then (Y = 0.05)
For linear regression: Y = a + b(age) + c(income).
Here Y denotes the score. These equations explain, in plain human language, the rationale behind the score. Since the function is understandable, it also gives some additional insight into the drivers of the score. Thus, two customers with the same score may have different drivers behind it. For example, one customer may have the 'age' variable contributing significantly to the score, while for the other it is the 'income' variable. Try getting this insight from a "neural network" algorithm.
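To make the point about drivers concrete, here is a minimal Python sketch built on the linear form Y = a + b(age) + c(income). The coefficient values and the two customers are hypothetical, chosen only so that both customers land on the same score; they do not come from a fitted model.

```python
# Hypothetical coefficients for Y = a + b(age) + c(income).
A_INTERCEPT = 0.10
B_AGE = 0.004
C_INCOME = 0.000005


def score_with_drivers(age, income):
    """Return the score Y and each variable's contribution to it."""
    contributions = {
        "age": B_AGE * age,
        "income": C_INCOME * income,
    }
    score = A_INTERCEPT + sum(contributions.values())
    return score, contributions


def main_driver(contributions):
    """Name of the variable that contributes most to the score."""
    return max(contributions, key=contributions.get)


# Two made-up customers with different profiles.
customers = {
    "customer_1": {"age": 50, "income": 20000},
    "customer_2": {"age": 25, "income": 40000},
}

for name, profile in customers.items():
    score, contributions = score_with_drivers(profile["age"], profile["income"])
    print(name, round(score, 3), "main driver:", main_driver(contributions))
```

Both customers come out with the same score, yet the first is driven mainly by 'age' and the second by 'income'. Each contribution is just a coefficient multiplied by the customer's value, so this breakdown needs nothing beyond the equation itself.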
Another reason I go simple is that such models are often used to convince management to loosen the purse strings and release additional budget for some activity, such as a new campaign. The management team are good business people but often not statistical experts. The simplified functions are easy to explain and easy for this team to understand. Now try explaining a complex function and getting a budget sanctioned by the management team.
And finally, the adage Keep It Simple Stupid (KISS) is so very useful. The objective is not how complex the algorithm is but how good the model fit is. The more complex the model, the more time is taken for data preparation, for understanding the output, and for tinkering with the data to improve the uplift. And often the uplift of the complex algorithms over linear regression or a decision tree is only a few basis points. It may not be worth the time and effort involved.
Remember, I am not talking about mission critical applications here. If I were building actuarial tables or modelling drug efficacy, then I would scout around for alternative algorithms and seek the best-fitting ones. But for marketing models, where the life of the model is short and the window of opportunity is open for an even shorter time, it makes sense to keep it simple and run with the model. A 65% accurate model is better than no model at all. And a 90% accurate model achieved after the window of opportunity closes is of no use.
So the next time I am asked if I used "neural network", I guess I will just KISS and make up with the interrogator.