Progressive Feature Upgrade in Semi-supervised Learning on Tabular Domain

Recent semi-supervised and self-supervised methods have shown great success in the image and text domains by exploiting augmentation techniques. This success does not transfer easily to the tabular domain: domain-specific transformations from images and language are hard to adapt because tabular data mixes different data types (continuous and categorical). A few semi-supervised works on the tabular domain have focused on proposing new augmentation techniques for tabular data. These approaches have shown some improvement on datasets with low-cardinality categorical features, but the fundamental challenges remain untackled: the proposed methods either do not apply to datasets with high-cardinality categorical features or do not encode categorical data efficiently. We propose using a conditional probability representation and an efficient progressive feature upgrading framework to effectively learn representations for tabular data in semi-supervised applications. Extensive experiments show the superior performance of the proposed framework and its potential in semi-supervised settings.
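
To make the idea of a conditional probability representation concrete, the following is a minimal sketch, not the authors' implementation. It assumes the representation maps each category value v to the vector of smoothed class probabilities P(y = c | x = v) estimated from the labeled rows only. The function name `conditional_probability_encoding`, the `smoothing` parameter, and the toy data are hypothetical and chosen purely for illustration.

```python
from collections import Counter, defaultdict


def conditional_probability_encoding(values, labels, smoothing=1.0):
    """Map each category v to [P(y=c | x=v) for c in classes].

    Illustrative assumption: probabilities are estimated from labeled rows
    with additive (Laplace) smoothing. This is not taken from the paper.
    """
    classes = sorted(set(labels))
    counts = defaultdict(Counter)
    for v, y in zip(values, labels):
        counts[v][y] += 1

    encoding = {}
    for v, class_counts in counts.items():
        total = sum(class_counts.values()) + smoothing * len(classes)
        encoding[v] = [(class_counts[c] + smoothing) / total for c in classes]
    return classes, encoding


# Toy labeled data (hypothetical): one categorical column and binary labels.
cities = ["paris", "paris", "rome", "rome", "rome", "oslo"]
labels = [1, 0, 1, 1, 1, 0]

classes, enc = conditional_probability_encoding(cities, labels)
encoded_column = [enc[c] for c in cities]  # each row becomes a dense vector
```

Under this assumption the encoded width depends on the number of classes rather than the number of distinct categories, which is why such an encoding stays compact for high-cardinality columns where one-hot encoding would not.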
