
Using Eigencentrality to Estimate Joint, Conditional and Marginal Probabilities from Mixed-Variable Data: Method and Applications

by Andrew Skabar, et al.

The ability to estimate joint, conditional and marginal probability distributions over some set of variables is of great utility for many common machine learning tasks. However, estimating these distributions can be challenging, particularly when the data contain a mix of discrete and continuous variables. This paper presents a non-parametric method for estimating these distributions directly from a dataset. The data are first represented as a graph consisting of object nodes and attribute value nodes. Depending on the distribution to be estimated, an appropriate eigenvector equation is constructed. Solving this equation yields the corresponding stationary distribution of the graph, from which the required distributions can be estimated and sampled. The paper demonstrates how the method can be applied to many common machine learning tasks including classification, regression, missing value imputation, outlier detection, random vector generation, and clustering.
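The general idea of reading probabilities off a graph's stationary distribution can be illustrated with a minimal sketch. This is not the paper's actual eigenvector equation, which the abstract does not give. It assumes purely discrete attributes, an unweighted bipartite graph linking each object node to its attribute value nodes, and a lazy random walk solved by power iteration. The function names (`build_bipartite_adjacency`, `stationary_distribution`, `marginal`) and the toy data are hypothetical.

```python
import numpy as np

def build_bipartite_adjacency(rows):
    """Build an undirected graph linking object nodes to attribute value nodes.

    rows: list of equal-length tuples of discrete values (assumed toy format).
    Nodes 0..n-1 are objects; the rest are (attribute index, value) pairs.
    """
    n_objects = len(rows)
    value_nodes = sorted({(j, v) for row in rows for j, v in enumerate(row)})
    index = {node: n_objects + k for k, node in enumerate(value_nodes)}
    size = n_objects + len(value_nodes)
    A = np.zeros((size, size))
    for i, row in enumerate(rows):
        for j, v in enumerate(row):
            A[i, index[(j, v)]] = 1.0
            A[index[(j, v)], i] = 1.0
    return A, value_nodes

def stationary_distribution(A, n_iter=2000, tol=1e-12):
    """Solve pi = pi P by power iteration (left eigenvector for eigenvalue 1)."""
    P = A / A.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
    P = 0.5 * (np.eye(len(A)) + P)         # lazy walk: removes bipartite periodicity
    pi = np.full(len(A), 1.0 / len(A))
    for _ in range(n_iter):
        new_pi = pi @ P
        converged = np.abs(new_pi - pi).sum() < tol
        pi = new_pi
        if converged:
            break
    return pi

def marginal(pi, value_nodes, n_objects, attribute):
    """Renormalise the stationary mass over one attribute's value nodes."""
    mass = {v: pi[n_objects + k]
            for k, (j, v) in enumerate(value_nodes) if j == attribute}
    total = sum(mass.values())
    return {v: m / total for v, m in mass.items()}

# Hypothetical toy dataset: (colour, size) for four objects.
rows = [("red", "small"), ("red", "large"), ("blue", "small"), ("red", "small")]
A, value_nodes = build_bipartite_adjacency(rows)
pi = stationary_distribution(A)
color_marginal = marginal(pi, value_nodes, len(rows), attribute=0)
print(color_marginal)
```

For a random walk on an undirected graph, the stationary mass of a node is proportional to its degree, so in this simplified setting the estimated marginal reduces to the empirical frequency of each value. Handling continuous variables and conditional or joint distributions would need the distribution-specific equations that the paper itself describes.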




A copula transformation in multivariate mixed discrete-continuous models

Copulas allow a flexible and simultaneous modeling of complicated depend...

Sufficiency, Separability and Temporal Probabilistic Models

Suppose we are given the conditional probability of one variable given s...

Identifying the Relevant Nodes Without Learning the Model

We propose a method to identify all the nodes that are relevant to compu...

GFlowNet Foundations

Generative Flow Networks (GFlowNets) have been introduced as a method to...

Joint and conditional estimation of tagging and parsing models

This paper compares two different ways of estimating statistical languag...

Coupled Generative Adversarial Networks

We propose coupled generative adversarial network (CoGAN) for learning a...

Estimating differential entropy using recursive copula splitting

A method for estimating the Shannon differential entropy of multidimensi...