Graph Spectral Feature Learning for Mixed Data of Categorical and Numerical Type

05/06/2020
by Saswata Sahoo, et al.

Feature learning in the presence of mixed variable types, numerical and categorical, is an important issue for related modeling problems. For simple neighborhood queries in a mixed data space, standard practice is to treat numerical and categorical variables separately and combine them using suitable distance functions. Alternatives such as kernel learning or principal component analysis do not explicitly consider the inter-dependence structure among the mixed variable types. In this work, we propose a novel strategy that explicitly models the probabilistic dependence structure among the mixed variables with an undirected graph. Spectral decomposition of the graph Laplacian provides the desired feature transformation. The eigenspectrum of the transformed feature space shows increased separability and more prominent clusterability among the observations. The main novelty of our paper lies in capturing interactions among the mixed feature types in an unsupervised framework using a graphical model. We numerically validate the implications of the feature learning strategy.
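
The abstract does not say how the graph is estimated, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the variables (with categorical ones one-hot encoded) are the graph nodes, that a symmetric non-negative weight matrix W over those variables is already available, and that the transformed features are the data projected onto the Laplacian eigenvectors with the smallest eigenvalues. The function name `laplacian_feature_transform`, the toy data, and the use of absolute correlation as a stand-in for dependence weights are all hypothetical choices made for illustration.

```python
import numpy as np


def laplacian_feature_transform(X, W, k):
    """Project X onto k eigenvectors of the unnormalized graph Laplacian of W.

    X : (n_samples, n_vars) data matrix, categorical columns already encoded.
    W : (n_vars, n_vars) non-negative dependence weights (assumed given).
    k : number of eigenvectors to keep (smallest eigenvalues first).
    """
    W = np.asarray(W, dtype=float)
    W = (W + W.T) / 2.0            # enforce symmetry (undirected graph)
    np.fill_diagonal(W, 0.0)       # no self-loops
    D = np.diag(W.sum(axis=1))     # degree matrix
    L = D - W                      # unnormalized graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    U = eigvecs[:, :k]
    return X @ U, eigvals


# Toy mixed data: two numerical columns and one categorical column.
rng = np.random.default_rng(0)
numerical = rng.normal(size=(6, 2))
categories = np.array([0, 1, 2, 0, 1, 2])
one_hot = np.eye(3)[categories]          # one-hot encoding of the category
X = np.hstack([numerical, one_hot])      # shape (6, 5)

# Stand-in dependence weights (an assumption, not the paper's estimator).
W = np.abs(np.corrcoef(X, rowvar=False))

Z, eigvals = laplacian_feature_transform(X, W, k=2)
```

Here `Z` holds the transformed features and `eigvals` the Laplacian spectrum. Any other estimate of pairwise dependence between variables could be substituted for `W` without changing the rest of the sketch.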

research
09/21/2020

Learning Representation for Mixed Data Types with a Nonlinear Deep Encoder-Decoder Framework

Representation of data on mixed variables, numerical and categorical typ...
research
05/09/2021

Bayesian Kernelised Test of (In)dependence with Mixed-type Variables

A fundamental task in AI is to assess (in)dependence between mixed-type ...
research
10/07/2020

FairMixRep: Self-supervised Robust Representation Learning for Heterogeneous Data with Fairness constraints

Representation Learning in a heterogeneous space with mixed variables of...
research
06/02/2023

Mixed-type Distance Shrinkage and Selection for Clustering via Kernel Metric Learning

Distance-based clustering and classification are widely used in various ...
research
11/23/2021

ptype-cat: Inferring the Type and Values of Categorical Variables

Type inference is the task of identifying the type of values in a data c...
research
02/02/2022

Mold into a Graph: Efficient Bayesian Optimization over Mixed-Spaces

Real-world optimization problems are generally not just black-box proble...
research
10/11/2021

Density-based interpretable hypercube region partitioning for mixed numeric and categorical data

Consider a structured dataset of features, such as {SEX, INCOME, RACE, E...
