Improve Your Boosting Algorithms with Early Stopping


Boosting algorithms are hugely popular in the Data Science space, and rightly so. Models that incorporate boosting deliver some of the best performance available, which is why they are commonplace in both academia and industry.

That being said, these types of algorithms will register suboptimal results if they are not configured properly.

One such feature that is often underutilized is early stopping.

Here, we give a high-level overview of early stopping and why it should be incorporated into your boosting algorithms.


Recap on Boosting

Before getting into early stopping, let's briefly discuss boosting.

In short, algorithms that leverage boosting train a series of sequential models, with each model aiming to address the error made by its predecessor.

Boosting algorithms adhere to the following steps:

  1. Train a weak model with initial weights
  2. Evaluate the "error" of this first model
  3. Train a new model with modified weights that address the issues with the previous model
  4. Evaluate the "error" of this new model
  5. Repeat steps 3 and 4 until a specific criterion is met (e.g., number of iterations, model performance, etc.)

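Boosting comes in several flavors: AdaBoost reweights training samples, while gradient boosting fits each new model to the previous model's errors (residuals). To make the loop concrete, here is a minimal gradient-boosting sketch for a regression task; the function name, squared-error setup, and shallow trees are illustrative, not taken from any particular library:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def simple_boost(X, y, n_rounds=50, learning_rate=0.1):
    """Toy gradient boosting for regression with squared error."""
    base = y.mean()                       # step 1: a trivially weak initial model
    prediction = np.full(len(y), base)
    trees = []
    for _ in range(n_rounds):             # step 5: repeat until the criterion is met
        residuals = y - prediction        # steps 2 and 4: evaluate the current error
        tree = DecisionTreeRegressor(max_depth=2)
        tree.fit(X, residuals)            # step 3: the new model targets the previous error
        prediction += learning_rate * tree.predict(X)
        trees.append(tree)
    return base, trees
```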
In theory, boosting serves as a perfect solution for determining the optimal weights for a particular model.

After all, if the model keeps learning from its previous mistakes, performing more iterations should yield better results. So, why not just perform as many iterations as possible? With a near-infinite number of models, we could achieve peak performance!

Unfortunately, that is usually not the case.


More Iterations ≠ Better

After a certain number of iterations, the model's weight updates will make it better and better at fitting the training data. However, if the algorithm goes past the ideal number of iterations, it will start capturing the noise in that data and lose its ability to perform well on unseen data.

In other words, boosting algorithms that use too many iterations become prone to overfitting.


Finding the Right Number of Iterations

Now we know that a boosting algorithm needs enough iterations to find the optimal weights for its model, but not so many that it succumbs to overfitting.

The key then is to find the "sweet spot": a number of iterations that is not too high or too low.

However, it can be challenging to determine the ideal number of iterations because it varies from case to case, influenced by factors ranging from the underlying data to the model being trained.

One way to address this is to use early stopping.


Early Stopping

Early stopping entails ending the training of a boosting model prematurely if its performance against a validation set does not improve after a given number of iterations.

Essentially, instead of training weak models for a fixed number of iterations, we can configure the algorithm to continue training only while it keeps producing better results.

The benefits of early stopping are obvious at a glance. With this technique, we can ensure that the model stops training before it succumbs to overfitting, thereby improving performance. It also reduces the run time of the training process, since fewer iterations are performed.


Case Study

The best way to demonstrate early stopping is with a case study. Let's work with the Titanic dataset, which can be fetched through the Scikit Learn library (it is hosted on OpenML rather than built into the library itself).
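Loading the data might look like the following. The exact features and preprocessing used in the original experiment aren't shown, so the column selection and split below are assumptions:

```python
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split

# Fetch the Titanic dataset (hosted on OpenML) through Scikit Learn
X, y = fetch_openml("titanic", version=1, as_frame=True, return_X_y=True)

# Keep a few numeric features; LGBM handles the missing ages natively
X = X[["pclass", "age", "sibsp", "parch", "fare"]]
y = y.astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```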

The objective is to train a light gradient boosting machine (LGBM) with and without early stopping and compare the two in terms of F1 score and run time.

  1. Without Early Stopping

Let's create an LGBM classifier that uses 1000 iterations (specified in the n_estimators hyperparameter) and evaluate it against the testing set.
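A sketch of this setup, assuming the standard LGBMClassifier interface from the lightgbm package (the random_state value is illustrative):

```python
from lightgbm import LGBMClassifier
from sklearn.metrics import f1_score

# 1000 boosting rounds, no early stopping
model = LGBMClassifier(n_estimators=1000, random_state=42)
model.fit(X_train, y_train)

# Evaluate against the testing set
print(f1_score(y_test, model.predict(X_test)))
```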


Next, let's use the %%timeit command to determine the run time for these operations:
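In a Jupyter notebook, that would look something like this:

```python
%%timeit
model = LGBMClassifier(n_estimators=1000, random_state=42)
model.fit(X_train, y_train)
f1_score(y_test, model.predict(X_test))
```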


By using a boosting algorithm with 1000 iterations, the model yields an F1 score of about 0.85 in about 473 ms.

Not bad, but do we even need 1000 iterations in the first place?

For a clearer picture, let's see how the F1 score of the model changes as we increase the number of iterations.
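One way to generate that picture is to sweep over n_estimators and record the score at each setting; the step size and range below are assumptions:

```python
import matplotlib.pyplot as plt

iterations = range(10, 1001, 10)
scores = []
for n in iterations:
    m = LGBMClassifier(n_estimators=n, random_state=42)
    m.fit(X_train, y_train)
    scores.append(f1_score(y_test, m.predict(X_test)))

plt.plot(list(iterations), scores)
plt.xlabel("Number of iterations (n_estimators)")
plt.ylabel("F1 score on the test set")
plt.show()
```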


Shockingly enough, the model's performance against the test set steadily decreases after around the first 50 iterations!

It's clear that the boosting algorithm does not need that many iterations to reach peak performance for this dataset.

  2. With Early Stopping

This time let's see how the model will perform after incorporating early stopping.

Using early stopping in an LGBM classifier requires setting two additional parameters when fitting the model. The first is eval_set, which contains the validation set. The validation set is what the model will use to evaluate its performance at each iteration.

The second parameter is early_stopping_rounds, which sets the number of consecutive iterations the model may run without improving its performance against the validation set. If performance does not improve within that window, the model stops training prematurely.

For this case, the value assigned to early_stopping_rounds is 20. This means that if the model's F1 score against the validation set does not improve on its best result within 20 iterations, the training process will stop, even though it has been configured to run for up to 1000 iterations.
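A sketch of how this might look. The original code isn't shown, so the train/validation split and the custom f1_eval metric (needed so early stopping tracks the F1 score rather than LightGBM's default log-loss) are assumptions; note that the early_stopping_rounds fit argument applies to older LightGBM versions:

```python
from lightgbm import LGBMClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Carve a validation set out of the training data
X_tr, X_val, y_tr, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=42
)

# Custom eval metric so early stopping tracks the F1 score
def f1_eval(y_true, y_pred):
    # y_pred holds the predicted probabilities for the positive class
    return "f1", f1_score(y_true, (y_pred >= 0.5).astype(int)), True

model = LGBMClassifier(n_estimators=1000, random_state=42)
model.fit(
    X_tr, y_tr,
    eval_set=[(X_val, y_val)],
    eval_metric=f1_eval,
    early_stopping_rounds=20,  # lightgbm>=4.0: use callbacks=[lgb.early_stopping(20)]
)

print(model.best_iteration_)  # the round at which training actually stopped
print(f1_score(y_test, model.predict(X_test)))
```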


With early stopping, the model yields an F1 score of about 0.92, a considerable improvement over the model that doesn't use early stopping!

Furthermore, the model should now train in less time, since it uses fewer iterations. We can confirm this with the %%timeit command.
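Timing the early-stopping run in the same way (same version caveats as above):

```python
%%timeit
m = LGBMClassifier(n_estimators=1000, random_state=42)
m.fit(X_tr, y_tr, eval_set=[(X_val, y_val)],
      eval_metric=f1_eval, early_stopping_rounds=20)
```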

As expected, the model that does early stopping is trained in a fraction of the time it takes to train the model that does not use early stopping.


Conclusion


Overall, algorithms that leverage boosting tend to be more robust, given how they "learn" from multiple weak models. However, maximizing the number of iterations used in these algorithms is not a viable solution.

Different use cases call for different numbers of iterations, which is why the number of iterations used in an algorithm should be based on the model's performance during training rather than fixed in advance.

For that reason, early stopping is a technique with immense practical value. It uses the validation set as an indicator of whether the algorithm should continue for more iterations or stop prematurely. As explained and demonstrated in the case study, early stopping enables models to achieve better performance with less training time.

I wish you the best of luck in your data science endeavors!
