When evaluating a model, you need to understand bias, variance, and the tradeoff involved in minimizing both. Knowing how to handle these two sources of error helps you build accurate models and avoid the overfitting and underfitting traps. That is why the bias-variance tradeoff is a central problem in supervised learning.
The conflict arises from trying to minimize both sources of error at the same time, since each of them prevents supervised learning algorithms from generalizing beyond their training set.
First of all, let’s define bias and variance:
Bias is the difference between the model’s average prediction and the correct value you are trying to predict. It reflects erroneous assumptions in the learning algorithm. Models with high bias oversimplify the problem and fail to capture the patterns even in the training data.
Variance is the variability of the model’s prediction for a given point when the model is trained on different training sets; it tells you how spread out those predictions are. Models with high variance tend to mimic the training data and do not generalize well to unseen data. As a consequence, such models perform well on training data but poorly on test data.
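Now, let’s dive a bit into the mathematics. Suppose you want to predict $Y$ as a function of $X$:
$$Y = f(X) + \epsilon$$
where $\epsilon$ is the error term, normally distributed with mean zero and variance $\sigma_\epsilon^2$.
Using any modeling technique, you estimate $f(X)$ with $\hat{f}(X)$. In this case, the expected squared error at a point $x$ is:
$$\mathrm{Err}(x) = E\left[\left(Y - \hat{f}(x)\right)^2\right]$$
This can be further decomposed as:
$$\mathrm{Err}(x) = \left(E[\hat{f}(x)] - f(x)\right)^2 + E\left[\left(\hat{f}(x) - E[\hat{f}(x)]\right)^2\right] + \sigma_\epsilon^2$$
$$\mathrm{Err}(x) = \mathrm{Bias}^2 + \mathrm{Variance} + \text{Irreducible Error}$$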
Even a good model cannot reduce the irreducible error, which measures the amount of randomness or noise in the data.
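To make the decomposition of the expected squared error into squared bias, variance, and irreducible error more concrete, here is a minimal simulation sketch rather than a definitive recipe. It assumes a hypothetical true function `f(x) = sin(2πx)`, Gaussian noise with standard deviation `sigma = 0.3`, and a straight-line model fitted with NumPy; none of these choices come from the text above. The code repeatedly draws a training set, refits the model, and compares the simulated squared error at a single point `x0` with the sum of the three estimated terms.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Hypothetical "true" function, assumed only for this illustration
    return np.sin(2 * np.pi * x)

sigma = 0.3        # assumed noise standard deviation
x0 = 0.25          # point at which the error is decomposed
n_train = 30       # size of each simulated training set
n_repeats = 2000   # number of simulated training sets

preds = np.empty(n_repeats)
sq_errors = np.empty(n_repeats)
for i in range(n_repeats):
    # Draw a fresh noisy training set and fit a straight line to it
    x = rng.uniform(0, 1, n_train)
    y = f(x) + rng.normal(0, sigma, n_train)
    coefs = np.polyfit(x, y, 1)
    preds[i] = np.polyval(coefs, x0)
    # Squared error against a fresh noisy observation at x0
    y0 = f(x0) + rng.normal(0, sigma)
    sq_errors[i] = (y0 - preds[i]) ** 2

bias_sq = (preds.mean() - f(x0)) ** 2   # (average prediction - true value)^2
variance = preds.var()                  # spread of predictions around their average
irreducible = sigma ** 2                # noise variance
total = sq_errors.mean()                # simulated expected squared error at x0

print(f"bias^2={bias_sq:.3f} variance={variance:.3f} noise={irreducible:.3f}")
print(f"sum={bias_sq + variance + irreducible:.3f} simulated error={total:.3f}")
```

Up to simulation noise, the two numbers on the last line should agree. For this deliberately simple model, the squared bias should be the largest of the three terms.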
In underfitting conditions, the model has high bias and low variance. This typically happens when there is too little data or when the model’s structure cannot capture the patterns in the data.
In overfitting conditions, the model captures the noise along with the underlying patterns, so it is said to mimic the training data. Such models have low bias but high variance.
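One way to see underfitting and overfitting in practice is to compare training and test error for models of different flexibility. The sketch below is only an illustration under assumed settings: the data come from a hypothetical function `sin(2πx)` with Gaussian noise, and the models are polynomials of degree 1, 3, and 12 fitted with `numpy.polynomial.Chebyshev.fit` on just 15 training points.

```python
import numpy as np
from numpy.polynomial import Chebyshev

rng = np.random.default_rng(42)

def f(x):
    # Hypothetical "true" function, assumed only for this illustration
    return np.sin(2 * np.pi * x)

sigma = 0.3  # assumed noise standard deviation
x_train = rng.uniform(0, 1, 15)
y_train = f(x_train) + rng.normal(0, sigma, x_train.size)
x_test = rng.uniform(0, 1, 500)
y_test = f(x_test) + rng.normal(0, sigma, x_test.size)

def mse(model, x, y):
    # Mean squared error of a fitted polynomial on (x, y)
    return float(np.mean((model(x) - y) ** 2))

train_mse, test_mse = {}, {}
for degree in (1, 3, 12):
    # Least-squares polynomial fit of the given degree
    model = Chebyshev.fit(x_train, y_train, degree, domain=[0, 1])
    train_mse[degree] = mse(model, x_train, y_train)
    test_mse[degree] = mse(model, x_test, y_test)
    print(f"degree {degree:2d}: train MSE={train_mse[degree]:.3f} "
          f"test MSE={test_mse[degree]:.3f}")
```

The training error can only go down as the degree grows, because each model contains the previous one. The degree-1 model is expected to show the underfitting pattern (high error on both sets), while the degree-12 model is expected to show the overfitting pattern (very low training error, much higher test error).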
But why is there a tradeoff?
On the one hand, if your model is too simple and has few parameters, it may have high bias and low variance. On the other hand, if your model has many parameters, it is prone to high variance and low bias. That is why you need to find a model that balances the two.
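The effect of the number of parameters can be sketched numerically. The example below is an assumption-laden illustration, not a general proof: it keeps 30 equally spaced inputs fixed, redraws only the noise around a hypothetical function `sin(2πx)`, and estimates the average squared bias and variance of polynomial fits with 2, 4, and 10 parameters.

```python
import numpy as np
from numpy.polynomial import Chebyshev

rng = np.random.default_rng(1)

def f(x):
    # Hypothetical "true" function, assumed only for this illustration
    return np.sin(2 * np.pi * x)

sigma = 0.3                  # assumed noise standard deviation
x = np.linspace(0, 1, 30)    # fixed inputs; only the noise changes per run
n_repeats = 500              # number of simulated training sets

results = {}
for degree in (1, 3, 9):
    preds = np.empty((n_repeats, x.size))
    for i in range(n_repeats):
        y = f(x) + rng.normal(0, sigma, x.size)
        # Fit a polynomial of this degree and predict at the same inputs
        preds[i] = Chebyshev.fit(x, y, degree)(x)
    # Squared bias and variance, each averaged over the input points
    bias_sq = float(np.mean((preds.mean(axis=0) - f(x)) ** 2))
    variance = float(np.mean(preds.var(axis=0)))
    results[degree] = (bias_sq, variance)
    print(f"{degree + 1:2d} parameters: bias^2={bias_sq:.4f} variance={variance:.4f}")
```

Under these assumptions, the squared bias should shrink and the variance should grow as parameters are added, which is the tradeoff described above.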
How can you handle it?
Having some idea of what to do can be a game-changer when dealing with complex data. The following table lists some basic ideas on how to proceed:
|Decrease Variance|Decrease Bias|
|Using dimensionality reduction and feature selection.|Adding features.|
|Increasing the training set.|Introducing more complex algorithms.|
|Introducing regularization techniques.|Mixture models and ensemble learning.|
|Bagging techniques.|Boosting techniques.|
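As an illustration of one entry in the "Decrease Variance" column, the sketch below compares an unregularized degree-9 polynomial fit with a ridge-regularized one. It is a minimal example under assumed settings (a hypothetical function `sin(2πx)`, noise level `sigma = 0.3`, and an arbitrary penalty `lam = 5.0`), and the helper `ridge_fit_predict` is written here for illustration rather than taken from any library.

```python
import numpy as np
from numpy.polynomial.chebyshev import chebvander

rng = np.random.default_rng(2)

def f(x):
    # Hypothetical "true" function, assumed only for this illustration
    return np.sin(2 * np.pi * x)

sigma = 0.3                      # assumed noise standard deviation
x = np.linspace(0, 1, 30)        # fixed inputs
Phi = chebvander(2 * x - 1, 9)   # degree-9 polynomial features on [-1, 1]
n_repeats = 500

def ridge_fit_predict(Phi, y, lam):
    # Closed-form ridge solution: (Phi^T Phi + lam * I)^-1 Phi^T y.
    # This sketch penalizes every coefficient, including the intercept.
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return Phi @ np.linalg.solve(A, Phi.T @ y)

# The same noise draws are reused for both models so the comparison is paired
noise = rng.normal(0, sigma, (n_repeats, x.size))

results = {}
for lam in (0.0, 5.0):
    preds = np.array([ridge_fit_predict(Phi, f(x) + e, lam) for e in noise])
    bias_sq = float(np.mean((preds.mean(axis=0) - f(x)) ** 2))
    variance = float(np.mean(preds.var(axis=0)))
    results[lam] = (bias_sq, variance)
    print(f"lam={lam}: bias^2={bias_sq:.4f} variance={variance:.4f}")
```

With the penalty switched on, the variance should drop while the squared bias rises. Regularization trades one source of error for the other, so the penalty strength is usually tuned, for example with cross-validation.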
In summary, the bias-variance tradeoff is present in every model training task. Identifying which source of error dominates lets you choose the proper techniques and be more cautious when judging whether a trained model is ready, so that you get the best understanding of the data without ending up with an overfitted or underfitted model.