Revisiting Deep Learning Models for Tabular Data

06/22/2021
by Yury Gorishniy, et al.

Whether deep learning (DL) is necessary for tabular data remains an open question that a large number of research efforts address. The recent literature on tabular DL proposes several deep architectures reported to be superior to traditional "shallow" models such as Gradient Boosted Decision Trees (GBDT). However, since existing works often use different benchmarks and tuning protocols, it is unclear whether the proposed models universally outperform GBDT. Moreover, the models are often not compared with each other, so it is challenging for practitioners to identify the best deep model. In this work, we start with a thorough review of the main families of DL models recently developed for tabular data. We carefully tune and evaluate them on a wide range of datasets and reveal two significant findings. First, we show that the choice between GBDT and DL models depends heavily on the data and that there is still no universally superior solution. Second, we demonstrate that a simple ResNet-like architecture is a surprisingly effective baseline, which outperforms most of the sophisticated models from the DL literature. Finally, we design a simple adaptation of the Transformer architecture for tabular data that becomes a new strong DL baseline and reduces the gap between GBDT and DL models on datasets where GBDT dominates.
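To make the phrase "ResNet-like architecture" concrete, the following is a minimal sketch of a generic residual block applied to one row of numerical features. It is not the authors' implementation: the abstract does not specify the block layout, so the two-layer update, the hidden width, and the helper names (`linear`, `relu`, `residual_block`, `init_layer`) are all assumptions made for illustration. Normalization, dropout, and training are omitted.

```python
import random


def linear(x, weights, bias):
    """Affine map: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise rectifier."""
    return [max(0.0, v) for v in x]


def residual_block(x, w1, b1, w2, b2):
    """Return x + f(x), where f is a small two-layer MLP (assumed layout)."""
    hidden = relu(linear(x, w1, b1))
    update = linear(hidden, w2, b2)
    # The skip connection adds the input back to the learned update.
    return [xi + ui for xi, ui in zip(x, update)]


def init_layer(n_out, n_in, rng, scale=0.1):
    """Hypothetical initializer: small uniform weights, zero bias."""
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
d, d_hidden = 4, 8  # feature count and hidden width are illustrative choices
w1, b1 = init_layer(d_hidden, d, rng)
w2, b2 = init_layer(d, d_hidden, rng)

x = [0.5, -1.2, 3.0, 0.0]  # one row of already-preprocessed numerical features
y = residual_block(x, w1, b1, w2, b2)
```

Because the block computes the input plus an update, its output has the same width as its input, so several such blocks can be stacked before a final prediction layer.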

research
07/26/2023

TabR: Unlocking the Power of Retrieval-Augmented Tabular Deep Learning

Deep learning (DL) models for tabular data problems are receiving increa...
research
03/10/2022

On Embeddings for Numerical Features in Tabular Deep Learning

Recently, Transformer-like deep architectures have shown strong performa...
research
01/06/2021

The data synergy effects of time-series deep learning models in hydrology

When fitting statistical models to variables in geoscientific discipline...
research
07/07/2022

Revisiting Pretraining Objectives for Tabular Deep Learning

Recent deep learning models for tabular data currently compete with the ...
research
12/06/2017

A trans-disciplinary review of deep learning research for water resources scientists

Deep learning (DL), a new-generation artificial neural network research,...
research
12/03/2020

Creativity of Deep Learning: Conceptualization and Assessment

While the potential of deep learning(DL) for automating simple tasks is ...
research
09/15/2020

CorDEL: A Contrastive Deep Learning Approach for Entity Linkage

Entity linkage (EL) is a critical problem in data cleaning and integrati...