What to Do When the Linear Regression Assumptions Don’t Hold
Published Sep 28, 2019
You can always spot a data science newbie by the speed with which they jump to fitting a neural network.
Neural networks are cool and can do awesome things that, for many of us (myself included), are the reason why we got into data science in the first place. I mean, who goes into data science to play around with daggy old linear regression models?
Yet the irony is that, unless you are working in a specialist field like computer vision or natural language processing, simple models such as linear regression often provide a better solution to your problem than complex black box models such as neural networks and support vector machines.
After all, linear regression models are:
- fast to train and query;
- less prone to overfitting and efficient in their use of data, so they can be applied to relatively small datasets; and
- easy to explain, even to people from a non-technical background.
I’ve heard senior data scientists with experience in cutting-edge AI sing the praises of linear regression for these very reasons.