DART: Dropouts meet Multiple Additive Regression Trees

05/07/2015
by K. V. Rashmi, et al.

Multiple Additive Regression Trees (MART), an ensemble model of boosted regression trees, is known to deliver high prediction accuracy for diverse tasks, and it is widely used in practice. However, it suffers from an issue which we call over-specialization, wherein trees added at later iterations tend to impact the prediction of only a few instances and make a negligible contribution towards the remaining instances. This negatively affects the performance of the model on unseen data, and also makes the model over-sensitive to the contributions of the few initially added trees. We show that the commonly used tool for addressing this issue, shrinkage, alleviates the problem only to a certain extent, and the fundamental issue of over-specialization remains. In this work, we explore a different approach to addressing the problem: that of employing dropouts, a tool recently proposed in the context of learning deep neural networks. We propose a novel way of employing dropouts in MART, resulting in the DART algorithm. We evaluate DART on ranking, regression and classification tasks, using large-scale, publicly available datasets, and show that DART outperforms MART on each of the tasks by a significant margin. We also show that DART overcomes the issue of over-specialization to a considerable extent.
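The dropout idea described in the abstract can be sketched in a minimal form. The following is an illustrative implementation, not the authors' code: the function names (`fit_stump`, `dart_fit`) and the choice of a depth-1 regression stump on a single feature as the weak learner are assumptions made for brevity. At each boosting round, every existing tree is dropped with probability `p_drop` when computing residuals; the newly fitted tree is then weighted 1/(k+1) and the k dropped trees are rescaled by k/(k+1), which is the normalization DART uses to keep the ensemble's total contribution balanced:

```python
import random

def fit_stump(x, r):
    """Fit a depth-1 regression stump (one threshold split) on a 1-D feature,
    minimizing squared error against the residuals r."""
    best = None
    for t in sorted(set(x)):
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = (sum((ri - lm) ** 2 for ri in left)
               + sum((ri - rm) ** 2 for ri in right))
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda xi: lm if xi <= t else rm

def dart_fit(x, y, n_trees=20, p_drop=0.3, seed=0):
    """Boost regression stumps with DART-style dropout and normalization."""
    rng = random.Random(seed)
    trees, weights = [], []
    for _ in range(n_trees):
        # Drop each existing tree independently with probability p_drop.
        dropped = [i for i in range(len(trees)) if rng.random() < p_drop]
        kept = [i for i in range(len(trees)) if i not in dropped]
        # Residuals are computed against the ensemble WITHOUT the dropped trees,
        # so the new tree must account for their contribution too.
        pred = [sum(weights[i] * trees[i](xi) for i in kept) for xi in x]
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        new_tree = fit_stump(x, residuals)
        k = len(dropped)
        # DART normalization: dropped trees shrink by k/(k+1), the new tree
        # enters with weight 1/(k+1).
        for i in dropped:
            weights[i] *= k / (k + 1)
        trees.append(new_tree)
        weights.append(1.0 / (k + 1))
    return lambda xi: sum(w * t(xi) for w, t in zip(weights, trees))
```

Because later trees are repeatedly forced to stand in for dropped earlier trees, no single tree's weight can dominate, which is how this scheme counters the over-specialization the paper identifies.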

