So, you’ve fit a model, made your predictions, and now you get a ridiculously high score. You’re a genius! Or… maybe not?
When it comes to building machine learning models, a common mistake beginners make is not being able to explain why a model is performing well, or why it is underperforming. How do you know whether your model is good, and how can you fix it if it isn’t?
Any machine learning model is assessed by its prediction error on a new, independent, unseen set of data. Error is nothing but the difference between the actual output and the predicted output. In this Medium post, I will tackle this issue and discuss what is known as the Bias-Variance Trade-off.
Before explaining what bias and variance are, I will start off with an example that will make things seem very intuitive. A quick glance at the figure below will give it away.
The Bias Error
Mathematically, bias is the expected error in a model’s predictions. In other words, you can think of bias as how far, on average, the predicted values fall from the actual values.
For example, in the diagram above, when we try to solve the problem by fitting a linear regression model, we are assuming that the target has a linear relationship with its features, which may not be true. The errors caused by this linearity assumption are bias errors: the linear regression model cannot capture the true relationship, and therefore it has a large bias.
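A minimal sketch of this in NumPy (the quadratic data here is made up for illustration): we force a straight line onto a curved relationship and the training error stays large no matter what, which is exactly the bias error described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: the target is actually quadratic in x, plus a little noise
x = np.linspace(-3, 3, 50)
y = x**2 + rng.normal(scale=0.5, size=x.size)

# Fit a straight line (degree-1 polynomial); the linearity assumption is wrong here
coeffs = np.polyfit(x, y, deg=1)
y_hat = np.polyval(coeffs, x)

# This error won't shrink with more data; it comes from the model's assumptions
mse = np.mean((y - y_hat) ** 2)
print(f"Training MSE of the linear fit: {mse:.2f}")
```

No amount of extra data fixes this error; only a more flexible model can.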
The Variance Error
Variance is a measure of the variability in the results predicted by a model. It describes how much the predictions scatter when the model is trained on different samples of the data. In machine learning lingo, the difference in fits between datasets is called variance.
Variance quantifies how much the predictions change when we change the training set. When the predictions on new test data differ wildly from one training set to another, we have high variance. High variance signifies that a model is overfitting.
In our nth-degree polynomial example (the “perfect” fit on the training data), we have very high variance because the model produces significant errors on new data.
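The overfitting above can be sketched with NumPy (synthetic sine-plus-noise data, chosen for illustration): a high-degree polynomial fits the training points almost perfectly, yet fails on a fresh sample drawn the same way.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_data(n=20):
    # Hypothetical ground truth: a sine curve with added noise
    x = np.linspace(-1, 1, n)
    y = np.sin(3 * x) + rng.normal(scale=0.2, size=n)
    return x, y

x_train, y_train = make_data()
x_test, y_test = make_data()  # same process, different noise

# A degree-15 polynomial is flexible enough to chase the training noise
coeffs = np.polyfit(x_train, y_train, deg=15)

train_mse = np.mean((y_train - np.polyval(coeffs, x_train)) ** 2)
test_mse = np.mean((y_test - np.polyval(coeffs, x_test)) ** 2)
print(f"train MSE: {train_mse:.4f}, test MSE: {test_mse:.4f}")
```

The gap between training and test error is the signature of high variance.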
In machine learning, the ideal algorithm achieves the lowest combination of bias and variance: the optimal balance.
Decreasing bias tends to increase variance, and decreasing variance tends to increase bias. If a model is too simple, you run the risk of underfitting, with high bias and low variance. On the other hand, a very complex model will overfit, with high variance and low bias.
Navigating this trade-off means finding a balance between the two; ideally, low bias and low variance is the target for any machine learning model.
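One common way to search for that balance is to sweep model complexity and measure error on held-out data. Below is a small NumPy sketch (synthetic data; the grid of polynomial degrees is a hypothetical choice): too low a degree underfits (bias), too high a degree overfits (variance), and the validation error is lowest somewhere in between.

```python
import numpy as np

rng = np.random.default_rng(2)

x = np.linspace(-1, 1, 60)
y = np.sin(3 * x) + rng.normal(scale=0.3, size=x.size)

# Hold out every third point as a simple validation set
val_mask = np.arange(x.size) % 3 == 0
x_tr, y_tr = x[~val_mask], y[~val_mask]
x_val, y_val = x[val_mask], y[val_mask]

# Sweep complexity (polynomial degree) and record held-out error
val_errors = {}
for deg in range(1, 13):
    c = np.polyfit(x_tr, y_tr, deg)
    val_errors[deg] = np.mean((y_val - np.polyval(c, x_val)) ** 2)

best = min(val_errors, key=val_errors.get)
print(f"degree with lowest validation error: {best}")
```

In practice you would use proper cross-validation rather than a single fixed split, but the principle is the same.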
Optimizing the Tradeoff
While there is no single answer, some common techniques are listed below:
- Dimensionality reduction, which removes features that may be contributing to the variance
- Using ensemble models, such as bagging, which average several high-variance models to reduce variance
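To see why ensembling helps, note that averaging many high-variance models cancels much of their variance while leaving the bias largely untouched. The NumPy sketch below is an idealized illustration of that idea (it trains on independent synthetic datasets; real bagging would instead use bootstrap resamples of a single dataset):

```python
import numpy as np

rng = np.random.default_rng(3)

x = np.linspace(-1, 1, 30)
truth = np.sin(3 * x)  # hypothetical "true" function, known here only because the data is synthetic

def fit_once():
    # Fit a deliberately flexible (high-variance) degree-8 polynomial
    # to one noisy sample of the data
    y = truth + rng.normal(scale=0.3, size=x.size)
    return np.polyval(np.polyfit(x, y, 8), x)

# Train the same model on 100 independent noisy datasets
fits = np.array([fit_once() for _ in range(100)])

# A single fit carries the full variance; the ensemble average mostly cancels it
mse_single = np.mean((fits[0] - truth) ** 2)
mse_avg = np.mean((fits.mean(axis=0) - truth) ** 2)
print(f"single fit MSE: {mse_single:.4f}, ensemble-average MSE: {mse_avg:.4f}")
```

The averaged prediction lands much closer to the true function than any individual fit, which is the variance reduction that methods like random forests exploit.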