Why is Bias-Variance Tradeoff important after all?

In evaluating a model, you should understand the concepts of bias, variance, and the tradeoff involved in minimizing them. Knowing how to handle these errors will help you build accurate models and avoid the overfitting and underfitting traps. That is why the bias-variance tradeoff is a central problem in supervised learning.


The conflict arises from trying to simultaneously minimize these two sources of error, which prevent supervised learning algorithms from generalizing beyond their training set.

First of all, let’s define bias and variance:


Bias is the difference between the model’s average prediction and the correct value you are trying to predict. It stems from erroneous assumptions in the learning algorithm. Models with significant bias oversimplify the problem and fail to capture the patterns even in the training data.


Variance is the variability of the model’s prediction for a given point, which tells you how spread out the predictions are. Models with colossal variance tend to mimic the training data and do not generalize well to unseen data. As a consequence, such models perform well on training data but poorly on test data.


Bias-Variance Trade-off

Firstly, let’s dive a bit into the mathematics: suppose you want to predict Y as a function f of X, plus some noise:

Y = f(X) + e

where e is the error term, normally distributed with mean zero.

Using any modeling technique, you try to estimate Y with \hat{Y}. In this case, the expected squared error at a point x is:

e(x) = E[(Y - \hat{Y})^2]

This e(x) can be further decomposed as:

e(x) = (E[\hat{Y}] - f(x))^2 + E[(\hat{Y} - E[\hat{Y}])^2] + \sigma_e^2

e(x) = Bias^2 + Variance + Irreducible Error

Creating good models does not reduce the irreducible error: this term measures the amount of randomness, or noise, intrinsic to the data.
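The decomposition above can be checked numerically. Below is a minimal sketch (the sine target, the noise level, and the linear model are all illustrative assumptions, not part of the original discussion): we repeatedly draw training sets, fit the same model, and measure the squared bias and the variance of its prediction at a single point.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Assumed "true" function for the illustration -- unknown in practice
    return np.sin(x)

x0 = np.pi / 2     # point at which we decompose the error
sigma_e = 0.3      # assumed noise standard deviation
n_repeats, n_train = 2000, 20

preds = []
for _ in range(n_repeats):
    # Draw a fresh training set Y = f(X) + e
    x = rng.uniform(0, 2 * np.pi, n_train)
    y = f(x) + rng.normal(0, sigma_e, n_train)
    # Fit a deliberately simple (high-bias) model: a straight line
    coefs = np.polyfit(x, y, deg=1)
    preds.append(np.polyval(coefs, x0))

preds = np.asarray(preds)
bias_sq = (preds.mean() - f(x0)) ** 2   # (E[Y_hat] - f(x0))^2
variance = preds.var()                  # E[(Y_hat - E[Y_hat])^2]
print(f"Bias^2 = {bias_sq:.3f}, Variance = {variance:.3f}, "
      f"Irreducible = {sigma_e**2:.3f}")
```

Because a straight line cannot follow a sine curve, the estimated error here is dominated by the squared bias rather than the variance.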

Under underfitting conditions, the model has high bias and low variance. This occurs when we have too little data or when the model’s structure cannot capture the patterns in the data.

Under overfitting conditions, the model captures the noise along with the patterns and is said to mimic the training data. Such models have low bias but high variance.


But why is there a tradeoff?

On the one hand, if your model is too simple and has few parameters, it may have high bias and low variance. On the other hand, if your model has many parameters, it is prone to high variance and low bias. That is why we need to find the model that best balances these two errors.
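This tradeoff can be seen directly by varying model complexity. The sketch below (the sine target and noise level are assumptions made for the illustration) fits polynomials of increasing degree to the same data and compares training and test error: the simplest degree underfits, while the highest degree drives training error down but test error up.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x):
    # Assumed "true" function for the illustration
    return np.sin(x)

x_train = np.sort(rng.uniform(0, 2 * np.pi, 15))
y_train = f(x_train) + rng.normal(0, 0.2, 15)
x_test = np.linspace(0, 2 * np.pi, 200)
y_test = f(x_test) + rng.normal(0, 0.2, 200)

results = {}
for deg in (1, 4, 12):  # too simple, balanced, overly flexible
    coefs = np.polyfit(x_train, y_train, deg)
    train_mse = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    results[deg] = (train_mse, test_mse)
    print(f"degree {deg:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

With 15 points and 13 coefficients, the degree-12 fit nearly interpolates the training data, so its training error is tiny while its test error is far larger.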

Bias-Variance trade-off illustration

How can you handle it?

Having some idea of what to do can be a game-changer when dealing with complex data. That is why, in the following table, you will find some basic ideas on how to proceed:

Decrease Variance:
- Use dimensionality reduction and feature selection.
- Increase the training set.
- Introduce regularization techniques.
- Use bagging techniques.

Decrease Bias:
- Add features.
- Introduce more complex algorithms.
- Use mixture models and ensemble learning.
- Use boosting techniques.
Approaches to deal with Bias-Variance Tradeoff

In summary, the bias-variance tradeoff will be present in every training task. Being able to identify these errors will help you apply the proper techniques and be more cautious when evaluating whether a trained model really captures the data, without ending up with an overfitted or underfitted model.
