Revisiting Pretraining Objectives for Tabular Deep Learning

07/07/2022
by Ivan Rubachev, et al.

Recent deep learning models for tabular data compete with traditional ML models based on gradient-boosted decision trees (GBDT). Unlike GBDT, deep models can additionally benefit from pretraining, which is the workhorse of DL for vision and NLP. Several pretraining methods have been proposed for tabular problems, but it is not entirely clear whether pretraining provides consistent, noticeable improvements and which method should be used, since the methods are often not compared to each other, or the comparison is limited to the simplest MLP architectures. In this work, we aim to identify best practices for pretraining tabular DL models that can be universally applied across different datasets and architectures. Among our findings, we show that using the objects' target labels during the pretraining stage is beneficial for downstream performance, and we advocate several target-aware pretraining objectives. Overall, our experiments demonstrate that properly performed pretraining significantly increases the performance of tabular DL models, which often leads to their superiority over GBDT.
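The abstract does not spell out the objectives themselves, so the following is only a rough sketch of what a "target-aware" pretraining objective could look like in PyTorch, not the paper's actual method: a masked-feature reconstruction loss (self-supervised) combined with a label-prediction loss on the same encoder. The MLP encoder, the masking rate, and the equal loss weighting are all assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TabularEncoder(nn.Module):
    """Simple MLP backbone for numerical tabular features (illustrative choice)."""
    def __init__(self, n_features: int, d_hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, d_hidden), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

def target_aware_pretraining_step(encoder, recon_head, target_head, x, y, mask_prob=0.15):
    """One pretraining step: reconstruct masked features AND predict the downstream label."""
    # Randomly zero out a subset of input features.
    mask = torch.rand_like(x) < mask_prob
    x_corrupted = x.masked_fill(mask, 0.0)

    z = encoder(x_corrupted)

    # Self-supervised term: recover the original feature values.
    recon_loss = F.mse_loss(recon_head(z), x)
    # Target-aware term: predict the label from the same representation.
    target_loss = F.cross_entropy(target_head(z), y)

    # Equal weighting of the two terms is an assumption for this sketch.
    return recon_loss + target_loss

# Toy usage with random data: binary classification, 10 numeric features.
n_features, n_classes = 10, 2
encoder = TabularEncoder(n_features)
recon_head = nn.Linear(128, n_features)
target_head = nn.Linear(128, n_classes)
optimizer = torch.optim.AdamW(
    list(encoder.parameters()) + list(recon_head.parameters()) + list(target_head.parameters()),
    lr=1e-3,
)

x = torch.randn(64, n_features)
y = torch.randint(0, n_classes, (64,))
loss = target_aware_pretraining_step(encoder, recon_head, target_head, x, y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

After pretraining, the encoder (and optionally the target head) would be fine-tuned on the supervised downstream task in the usual way.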


Related research

07/31/2023 · Pretrained deep models outperform GBDTs in Learning-To-Rank under label scarcity
While deep learning (DL) models are state-of-the-art in text and image d...

06/02/2019 · Pretraining Methods for Dialog Context Representation Learning
This paper examines various unsupervised pretraining objectives for lear...

12/20/2022 · Pretraining Without Attention
Transformers have been essential to pretraining success in NLP. Other ar...

05/26/2022 · Learning to segment with limited annotations: Self-supervised pretraining with regression and contrastive loss in MRI
Obtaining manual annotations for large datasets for supervised training ...

06/22/2021 · Revisiting Deep Learning Models for Tabular Data
The necessity of deep learning for tabular data is still an unanswered q...

03/10/2022 · On Embeddings for Numerical Features in Tabular Deep Learning
Recently, Transformer-like deep architectures have shown strong performa...

07/26/2023 · TabR: Unlocking the Power of Retrieval-Augmented Tabular Deep Learning
Deep learning (DL) models for tabular data problems are receiving increa...
