Early Stopping

What is early stopping?

Early stopping is a technique used in machine learning to prevent overfitting a model to its training data.  How long should a model be trained on a data set, balancing its accuracy on that set against how well it generalizes? If a sufficiently complex model trains long enough on a given data set, it can eventually learn the training data exactly; it will then perform poorly on data that isn't represented in the training set (overfitting). Conversely, if the model is trained for only a few epochs, it may generalize well but will not reach a desirable accuracy (underfitting).

Early Stopping Condition

How is the sweet spot for training located?  Can we find an early stopping condition? Data sets are often split into three components: a training set, a validation set, and a test set.  The training set is used exclusively to train the model. The validation set is used to measure how well the model generalizes to unseen data. When the error on the training set begins to deviate from the error on the validation set, a threshold can be set to determine the early stopping condition and, with it, the ideal number of epochs to train.
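The threshold check described above can be sketched as a small function. This is a minimal illustration, not a specific library's API; the function name, the error values, and the threshold of 0.10 are all hypothetical.

```python
def should_stop(train_error: float, val_error: float, threshold: float) -> bool:
    """Return True when validation error deviates from training error
    by more than the given threshold (the early stopping condition)."""
    return (val_error - train_error) > threshold


# Synthetic (train_error, val_error) pairs per epoch: the two errors track
# closely at first, then the validation error stalls while training error
# keeps falling, widening the gap.
history = [
    (0.50, 0.52),
    (0.30, 0.34),
    (0.20, 0.29),
    (0.10, 0.27),
]

# First epoch at which the gap exceeds the threshold of 0.10.
stop_epoch = next(
    (i for i, (tr, va) in enumerate(history) if should_stop(tr, va, 0.10)),
    None,
)
# stop_epoch → 3 (gap of 0.17 first exceeds the 0.10 threshold there)
```

In practice the threshold is a tuning knob: too small and training halts on noise in the validation error; too large and the model is already well into overfitting before the condition fires.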

As a model begins training, it is highly biased and low in complexity, and both the training error and the validation error decrease.  This is the underfitting zone. If the model trains for long enough, a very complex, low-bias model can be learned, but the variance (and with it the validation error) increases, as seen in the overfitting zone.  The validation set, used in conjunction with early stopping, determines the optimal training zone for a model between these two extremes.
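One common way to locate that optimal zone is patience-based early stopping: halt once the validation error has failed to improve for some number of consecutive epochs. The sketch below is illustrative; the function name, the `patience` value, and the synthetic error sequence are all assumptions, not from a specific framework.

```python
def train_with_early_stopping(val_errors, patience=2):
    """Return (best_epoch, best_error) from a sequence of per-epoch
    validation errors, stopping once `patience` consecutive epochs
    pass without improvement."""
    best_epoch, best_error = 0, float("inf")
    epochs_without_improvement = 0
    for epoch, err in enumerate(val_errors):
        if err < best_error:
            # Still in (or entering) the optimal zone: record the best model.
            best_epoch, best_error = epoch, err
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # overfitting zone: validation error no longer improves
    return best_epoch, best_error


# Synthetic validation errors: falling (underfitting zone), a minimum,
# then rising (overfitting zone).
errors = [0.9, 0.6, 0.4, 0.35, 0.37, 0.40, 0.45]
best_epoch, best_error = train_with_early_stopping(errors, patience=2)
# best_epoch → 3, best_error → 0.35
```

Frameworks that implement this idea typically also restore the model weights saved at the best epoch, so the returned model comes from the optimal zone rather than from the epoch at which training happened to halt.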