Enhancing Robustness of Gradient-Boosted Decision Trees through One-Hot Encoding and Regularization

04/26/2023
by Shijie Cui, et al.

Gradient-boosted decision trees (GBDT) are a widely used and highly effective machine learning approach for tabular data modeling. However, their complex structure may lead to low robustness against small covariate perturbations in unseen data. In this study, we apply one-hot encoding to convert a GBDT model into a linear framework by encoding each tree leaf as one dummy variable. This allows for the use of linear regression techniques, as well as a novel risk decomposition for assessing the robustness of a GBDT model against covariate perturbations. We propose to enhance the robustness of GBDT models by refitting their linear regression forms with L_1 or L_2 regularization. We obtain theoretical results on the effect of regularization on model performance and robustness. Numerical experiments demonstrate that the proposed regularization approach can enhance the robustness of the one-hot-encoded GBDT models.
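The leaf encoding described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation. It assumes a single numeric feature, depth-1 trees (stumps), a fixed intercept, and an L_2 penalty minimized by plain gradient descent. All function names and the synthetic data are hypothetical and chosen only for illustration.

```python
import math
import random

def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) minimizing squared error."""
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        thr = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lv, rv)
    return best[1:]

def fit_gbdt(xs, ys, n_trees=20, lr=0.3):
    """Toy gradient boosting with squared loss; leaf values already include lr."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        thr, lv, rv = fit_stump(xs, residuals)
        stumps.append((thr, lr * lv, lr * rv))
        preds = [p + (lr * lv if x <= thr else lr * rv) for x, p in zip(xs, preds)]
    return base, stumps

def gbdt_predict(base, stumps, x):
    return base + sum(lv if x <= thr else rv for thr, lv, rv in stumps)

def one_hot_leaves(stumps, x):
    """One dummy variable per leaf: two per stump, exactly one of them active."""
    z = []
    for thr, _, _ in stumps:
        z += [1.0, 0.0] if x <= thr else [0.0, 1.0]
    return z

def linear_predict(base, w, z):
    return base + sum(wi * zi for wi, zi in zip(w, z))

def ridge_objective(Z, ys, base, w, lam):
    mse = sum((y - linear_predict(base, w, z)) ** 2 for z, y in zip(Z, ys)) / len(ys)
    return mse + lam * sum(wi * wi for wi in w)

def ridge_refit(Z, ys, base, w0, lam, n_steps=200):
    """Gradient descent on MSE + lam * ||w||^2, started from the GBDT leaf values."""
    n, n_trees = len(ys), len(w0) // 2
    # Each row of Z has n_trees ones, so 2 * n_trees + 2 * lam bounds the curvature.
    step = 1.0 / (2.0 * n_trees + 2.0 * lam)
    w = list(w0)
    for _ in range(n_steps):
        resid = [linear_predict(base, w, z) - y for z, y in zip(Z, ys)]
        grad = [2.0 * sum(r * z[j] for r, z in zip(resid, Z)) / n + 2.0 * lam * w[j]
                for j in range(len(w))]
        w = [wi - step * gi for wi, gi in zip(w, grad)]
    return w

# Synthetic data (an assumption for this sketch, not from the paper).
rng = random.Random(0)
xs = [rng.uniform(-3.0, 3.0) for _ in range(60)]
ys = [math.sin(x) + rng.gauss(0.0, 0.2) for x in xs]

base, stumps = fit_gbdt(xs, ys)
w_gbdt = [v for _, lv, rv in stumps for v in (lv, rv)]  # leaf values as linear weights
Z = [one_hot_leaves(stumps, x) for x in xs]
w_ridge = ridge_refit(Z, ys, base, w_gbdt, lam=0.05)

print("penalized objective, GBDT weights: ", ridge_objective(Z, ys, base, w_gbdt, 0.05))
print("penalized objective, ridge weights:", ridge_objective(Z, ys, base, w_ridge, 0.05))
```

With the leaf values used as weights, the linear form reproduces the boosted model's predictions, which is what makes the regularized refit a drop-in replacement. Swapping the L_2 penalty for an L_1 penalty would need a different solver (for example, coordinate descent), which this sketch does not cover.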

research
10/17/2019

Gradient Boosted Decision Tree Neural Network

In this paper we propose a method to build a neural network that is simi...
research
02/15/2018

Gradient Boosting With Piece-Wise Linear Regression Trees

Gradient boosting using decision trees as base learners, so called Gradi...
research
03/24/2017

Binarsity: a penalization for one-hot encoded features

This paper deals with the problem of large-scale linear supervised learn...
research
04/15/2022

An interpretable machine learning approach for ferroalloys consumptions

This paper is devoted to a practical method for ferroalloys consumption ...
research
08/20/2020

On ℓ_p-norm Robustness of Ensemble Stumps and Trees

Recent papers have demonstrated that ensemble stumps and trees could be ...
research
04/10/2019

Enhancing Decision Tree based Interpretation of Deep Neural Networks through L1-Orthogonal Regularization

One obstacle that so far prevents the introduction of machine learning m...
research
06/11/2023

Generating One-Hot Maps under Encryption

One-hot maps are commonly used in the AI domain. Unsurprisingly, they ca...