General Latent Feature Models for Heterogeneous Datasets

06/12/2017
by Isabel Valera, et al.

Latent feature modeling captures the latent structure responsible for generating the observed properties of a set of objects. It is often used to make predictions, either for new values of interest or for missing information in the original data, as well as to perform exploratory data analysis. However, although there is an extensive literature on latent feature models for homogeneous datasets, where all the attributes that describe each object are of the same (continuous or discrete) nature, there is a lack of work on latent feature modeling for heterogeneous datasets. In this paper, we introduce a general Bayesian nonparametric latent feature model suitable for heterogeneous datasets, where the attributes describing each object can be discrete, continuous, or mixed variables. The proposed model has several important properties. First, it accounts for heterogeneous data while keeping the properties of conjugate models, which allow us to perform inference in time linear in the number of objects and attributes. Second, its Bayesian nonparametric nature allows us to automatically infer the model complexity from the data, i.e., the number of features necessary to capture the latent structure in the data. Third, the latent features in the model are binary-valued variables, which makes the inferred features easier to interpret in exploratory data analysis. We show the flexibility of the proposed model by solving both prediction and data analysis tasks on several real-world datasets. Moreover, a software package implementing the GLFM is publicly available for other researchers to use and improve.
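
To make the idea of binary latent features with an unbounded number of columns concrete, here is a minimal sketch. It is not the authors' software package. It assumes the Indian Buffet Process (IBP), a common nonparametric prior over binary feature matrices; the abstract itself does not name the prior. The function names (`sample_poisson`, `sample_ibp`, `observe`), the concentration parameter `alpha`, and the two toy attribute mappings (a real-valued attribute and a thresholded binary one) are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_poisson(lam, rng):
    # Knuth's multiplication method; adequate for the small rates used here.
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_objects, alpha, rng):
    """Draw a binary feature matrix (list of rows) from an assumed IBP prior."""
    columns = []  # one list of 0/1 entries per latent feature
    for n in range(num_objects):
        # Existing feature k is taken with probability m_k / (n + 1),
        # where m_k counts the earlier objects that already have it.
        for col in columns:
            col.append(1 if rng.random() < sum(col) / (n + 1) else 0)
        # The object also introduces Poisson(alpha / (n + 1)) new features.
        for _ in range(sample_poisson(alpha / (n + 1), rng)):
            columns.append([0] * n + [1])
    return [[col[n] for col in columns] for n in range(num_objects)]


def observe(Z, weights, noise_std, rng):
    """Toy heterogeneous observations built from one Gaussian pseudo-observation.

    Illustrative assumption: the real-valued attribute is the pseudo-observation
    itself, and the binary attribute is its sign.
    """
    rows = []
    for z in Z:
        y = sum(zk * wk for zk, wk in zip(z, weights)) + rng.gauss(0.0, noise_std)
        rows.append((y, 1 if y > 0 else 0))
    return rows


rng = random.Random(0)
Z = sample_ibp(10, 2.0, rng)
weights = [rng.gauss(0.0, 1.0) for _ in range(len(Z[0]))]
X = observe(Z, weights, 0.1, rng)
print(len(Z), "objects,", len(Z[0]), "latent features")
```

The number of columns in `Z` is not fixed in advance: it grows with the data, which is the sense in which such a prior lets the number of features be inferred. Because each entry of `Z` is 0 or 1, a row can be read directly as the set of features an object possesses.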

research
07/26/2017

General Latent Feature Modeling for Data Exploration Tasks

This paper introduces a general Bayesian nonparametric latent feature ...
research
08/31/2021

DoGR: Disaggregated Gaussian Regression for Reproducible Analysis of Heterogeneous Data

Quantitative analysis of large-scale data is often complicated by the pr...
research
05/27/2019

Adaptive probabilistic principal component analysis

Using the linear Gaussian latent variable model as a starting point we r...
research
01/24/2020

Sparse Semi-supervised Heterogeneous Interbattery Bayesian Analysis

The Bayesian approach to feature extraction, known as factor analysis (F...
research
05/09/2012

Correlated Non-Parametric Latent Feature Models

We are often interested in explaining data through a set of hidden facto...
research
12/30/2020

Hybrid Function Representation for Heterogeneous Objects

Heterogeneous object modelling is an emerging area where geometric shape...
research
06/22/2020

Latent feature sharing: an adaptive approach to linear decomposition models

Latent feature models are canonical tools for exploratory analysis in cl...
