XTab: Cross-table Pretraining for Tabular Transformers

05/10/2023
by Bingzhao Zhu, et al.

The success of self-supervised learning in computer vision and natural language processing has motivated pretraining methods on tabular data. However, most existing tabular self-supervised learning models fail to leverage information across multiple data tables and cannot generalize to new tables. In this work, we introduce XTab, a framework for cross-table pretraining of tabular transformers on datasets from various domains. We address the challenge of inconsistent column types and quantities among tables by using independent featurizers and by pretraining the shared component with federated learning. Tested on 84 tabular prediction tasks from the OpenML-AutoML Benchmark (AMLB), we show that (1) XTab consistently boosts the generalizability, learning speed, and performance of multiple tabular transformers, and (2) by pretraining FT-Transformer via XTab, we achieve performance superior to other state-of-the-art tabular deep learning models on various tasks such as regression, binary classification, and multiclass classification.
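To illustrate the architectural idea, the following is a minimal sketch (not the authors' implementation) of how table-specific featurizers can feed a single shared transformer backbone, assuming PyTorch; the names TableFeaturizer and SharedBackbone are hypothetical, and the shared backbone is the component that would be aggregated during federated pretraining.

```python
# Hypothetical sketch of cross-table pretraining structure, assuming PyTorch.
# Each table gets its own featurizer sized to its columns; one transformer
# backbone is shared across all tables.
import torch
import torch.nn as nn


class TableFeaturizer(nn.Module):
    """Table-specific featurizer: embeds one table's numerical columns.

    Simplified for illustration; a real featurizer would also handle
    categorical columns and add a [CLS] token.
    """

    def __init__(self, n_columns: int, d_model: int):
        super().__init__()
        self.column_weight = nn.Parameter(torch.randn(n_columns, d_model))
        self.column_bias = nn.Parameter(torch.zeros(n_columns, d_model))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_columns) -> token embeddings: (batch, n_columns, d_model)
        return x.unsqueeze(-1) * self.column_weight + self.column_bias


class SharedBackbone(nn.Module):
    """Transformer backbone shared by every table (the pretrained component)."""

    def __init__(self, d_model: int = 64, n_layers: int = 2, n_heads: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.encoder(tokens)


# Usage: two tables with different column counts share one backbone.
d_model = 64
backbone = SharedBackbone(d_model=d_model)
featurizers = {
    "table_a": TableFeaturizer(n_columns=10, d_model=d_model),
    "table_b": TableFeaturizer(n_columns=23, d_model=d_model),
}

batch_a = torch.randn(8, 10)   # 8 rows from a 10-column table
batch_b = torch.randn(8, 23)   # 8 rows from a 23-column table
out_a = backbone(featurizers["table_a"](batch_a))
out_b = backbone(featurizers["table_b"](batch_b))
print(out_a.shape, out_b.shape)  # (8, 10, 64) and (8, 23, 64)
```

In this setup only the backbone parameters are common to all tables, which is why a federated-style scheme can average them across per-table (or per-client) updates while each featurizer stays local to its own table.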

