How do we quantify the quality of fit?

The sum of squares of residuals ($S_r$) is a quantification of the error between the measured values $y_i$ and the predicted values $\hat{y}_i$ after regression:

$$S_r = \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2$$

The total sum of squares around the mean value $\bar{y}$, written $S_t$, is the magnitude of the residual error associated with the dependent variable prior to regression:

$$S_t = \sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2$$

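We can then use the difference between the total sum of squares $S_t$ and the residual sum of squares $S_r$ to quantify the improvement, or error reduction, gained by describing the data with the fitted line rather than with the mean alone. To do this, we define the coefficient of determination:

$$r^2 = \frac{S_t - S_r}{S_t}$$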

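If we have $r^2 = 1$, the residual sum of squares is zero, which means that the line explains all of the variability of the data and is therefore a “perfect fit”. On the other hand, $r^2 = 0$ would be a poor fit: the residual sum of squares equals the total sum of squares, so the line is no improvement over simply using the mean.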

Something like $r^2 = 0.86$ means that 86% of the original uncertainty is explained by the linear model.
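As a rough illustration of how these quantities could be computed, here is a minimal sketch in plain Python. It assumes the usual definitions: the residual sum of squares is the sum of squared differences between measured and predicted values, the total sum of squares is the sum of squared deviations from the mean, and the coefficient of determination is their relative difference. The helper names `fit_line` and `goodness_of_fit`, the coefficient names `a0` and `a1`, and the five data points are all made up for this example.

```python
def fit_line(x, y):
    """Least-squares straight line y = a0 + a1*x; returns (a0, a1)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    a1 = sxy / sxx               # slope
    a0 = y_mean - a1 * x_mean    # intercept
    return a0, a1


def goodness_of_fit(y, y_pred):
    """Return (S_r, S_t, r^2) for measured y and predicted y_pred."""
    y_mean = sum(y) / len(y)
    # Error remaining after regression
    s_r = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))
    # Error around the mean, before regression
    s_t = sum((yi - y_mean) ** 2 for yi in y)
    # Fraction of the original error removed by the model
    r2 = (s_t - s_r) / s_t
    return s_r, s_t, r2


# Illustrative data only (not taken from the notes)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

a0, a1 = fit_line(x, y)
y_pred = [a0 + a1 * xi for xi in x]
s_r, s_t, r2 = goodness_of_fit(y, y_pred)
print(f"S_r = {s_r:.3f}, S_t = {s_t:.3f}, r^2 = {r2:.3f}")
```

For this small data set the fitted line is roughly $y = 2.2 + 0.6x$, and the script prints `S_r = 2.400, S_t = 6.000, r^2 = 0.600`, so the line accounts for 60% of the scatter around the mean.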

It’s still important to check results visually, as a single summary number such as $r^2$ can trick you!
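One way a high score can mislead is sketched below: a straight line is fitted to data that is actually quadratic. This is an invented example, not one from the notes, and the helper name `fit_line` and the data are assumptions made for illustration. The coefficient of determination comes out above 0.9, yet the residuals are positive at both ends and negative in the middle, which is the kind of systematic pattern a plot would reveal immediately.

```python
def fit_line(x, y):
    """Least-squares straight line y = a0 + a1*x; returns (a0, a1)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    a1 = sxy / sxx
    a0 = y_mean - a1 * x_mean
    return a0, a1


# Made-up data that follows a parabola, not a line
x = [float(i) for i in range(11)]
y = [xi ** 2 for xi in x]

a0, a1 = fit_line(x, y)
residuals = [yi - (a0 + a1 * xi) for xi, yi in zip(x, y)]

y_mean = sum(y) / len(y)
s_r = sum(e ** 2 for e in residuals)        # error after regression
s_t = sum((yi - y_mean) ** 2 for yi in y)   # error around the mean
r2 = (s_t - s_r) / s_t

print(f"r^2 = {r2:.3f}")
# The sign pattern of the residuals exposes the curvature the line misses
print("".join("+" if e > 0 else "-" for e in residuals))
```

A score this high would normally suggest a good fit, but the run of same-signed residuals shows that the linear model is missing real structure in the data.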