G-CEALS: Gaussian Cluster Embedding in Autoencoder Latent Space for Tabular Data Representation

01/02/2023
by Manar D. Samad, et al.

The latent space of autoencoders has been improved for clustering image data by jointly learning a t-distributed embedding with a clustering algorithm, an approach inspired by the neighborhood embedding concept proposed for data visualization. However, multivariate tabular data pose different representation learning challenges than image data, and traditional machine learning often outperforms deep learning on tabular data. In this paper, we address the challenges of learning tabular data in contrast to image data and present a novel Gaussian Cluster Embedding in Autoencoder Latent Space (G-CEALS) algorithm that replaces t-distributions with multivariate Gaussian clusters. Unlike current methods, the proposed approach defines the Gaussian embedding and the target cluster distribution independently, so that any clustering algorithm can be accommodated in representation learning. A trained G-CEALS model extracts a quality embedding for unseen test data. Based on embedding clustering accuracy, the proposed G-CEALS method achieves an average rank of 1.4 (0.7), which is superior to all eight baseline clustering and cluster embedding methods on seven tabular data sets. This paper presents one of the first algorithms to jointly learn embedding and clustering to improve multivariate tabular data representation for downstream clustering.
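
The contrast between a t-distributed embedding and Gaussian cluster embedding can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the function names, the diagonal-covariance Gaussian, the Student's t kernel with `alpha` degrees of freedom, and the KL divergence used to compare a target distribution with the embedding are all assumptions made for illustration.

```python
import numpy as np


def t_soft_assign(z, centroids, alpha=1.0):
    """Soft cluster assignment with a Student's t kernel (assumed form)."""
    # Squared Euclidean distance from every latent point to every centroid.
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)


def gaussian_soft_assign(z, means, variances):
    """Soft cluster assignment with diagonal-covariance Gaussians (assumed form).

    z: (n, d) latent points; means, variances: (k, d) per-cluster parameters.
    """
    diff = z[:, None, :] - means[None, :, :]
    # Log-density of each point under each cluster's Gaussian.
    log_pdf = -0.5 * ((diff ** 2) / variances[None, :, :]).sum(axis=2)
    log_pdf = log_pdf - 0.5 * np.log(2.0 * np.pi * variances).sum(axis=1)[None, :]
    # Subtract the row maximum before exponentiating for numerical stability.
    log_pdf = log_pdf - log_pdf.max(axis=1, keepdims=True)
    q = np.exp(log_pdf)
    return q / q.sum(axis=1, keepdims=True)


def kl_divergence(p, q, eps=1e-12):
    """Mean KL(p || q) over samples, for a target p and an embedding q."""
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))) / p.shape[0])


# Toy latent space: two well-separated blobs standing in for encoder outputs.
rng = np.random.default_rng(0)
latent = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(5.0, 1.0, (50, 2))])
cluster_means = np.array([[0.0, 0.0], [5.0, 5.0]])
cluster_vars = np.ones((2, 2))

q_gauss = gaussian_soft_assign(latent, cluster_means, cluster_vars)
q_t = t_soft_assign(latent, cluster_means)

# A target distribution defined independently of the embedding; here it is
# simply one-hot labels from nearest-mean assignment as a stand-in for
# whatever clustering algorithm is plugged in.
labels = q_gauss.argmax(axis=1)
target = np.eye(2)[labels]
loss_gauss = kl_divergence(target, q_gauss)
loss_t = kl_divergence(target, q_t)
```

In this sketch the target distribution is computed separately from the embedding, which mirrors the abstract's point that the two are defined independently; in a full model a loss of this kind would be combined with the autoencoder's reconstruction loss, a detail the abstract does not specify.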


research
12/28/2022

Effectiveness of Deep Image Embedding Clustering Methods on Tabular Data

Deep learning methods in the literature are commonly benchmarked on imag...
research
12/02/2022

Improved Representation Learning Through Tensorized Autoencoders

The central question in representation learning is what constitutes a go...
research
10/20/2019

Representation Learning for Discovering Phonemic Tone Contours

Tone is a prosodic feature used to distinguish words in many languages, ...
research
08/16/2019

N2D: (Not Too) Deep Clustering via Clustering the Local Manifold of an Autoencoded Embedding

Deep clustering has increasingly been demonstrating superiority over con...
research
11/12/2019

Deep Clustering for Mars Rover image datasets

In this paper, we build autoencoders to learn a latent space from unlabe...
research
06/09/2022

Unsupervised Deep Discriminant Analysis Based Clustering

This work presents an unsupervised deep discriminant analysis for cluste...
research
09/14/2022

Efficient Unsupervised Learning for Plankton Images

Monitoring plankton populations in situ is fundamental to preserve the a...
