Learning Clique Forests

05/06/2019
by   Guido Previde Massara, et al.

We propose a topological learning algorithm for estimating the conditional dependency structure of large sets of random variables from sparse and noisy data. The algorithm, named Maximally Filtered Clique Forest (MFCF), produces a clique forest and an associated Markov Random Field (MRF) by generalising Prim's minimum spanning tree algorithm. To the best of our knowledge, the MFCF presents three elements of novelty with respect to existing structure learning approaches. The first is the repeated application of a local topological move, the clique expansion, that preserves the decomposability of the underlying graph. Through this move, the maintenance of decomposability and the calculation of scores are performed incrementally at the variable (rather than edge) level, which provides better computational performance and an intuitive application of multivariate statistical tests. The second is the capability to accommodate a variety of score functions: while this paper focuses on multivariate normal distributions, the approach can be directly generalised to different types of statistics. The third is the variable range of allowed clique sizes, an adjustable topological constraint that acts as a topological penalizer and provides a way to tackle sparsity at the l_0 semi-norm level; this allows a clean decoupling of structure learning and parameter estimation. The MFCF produces a representation of the clique forest, together with a perfect ordering of the cliques and a perfect elimination ordering for the vertices. As an example, we propose an application to covariance selection models and show that the MFCF outperforms the Graphical Lasso for a number of classes of matrices.
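
To make the idea of clique expansion concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a Gaussian model described by a correlation matrix and scores each candidate move by the log-determinant gain `log det R_S - log det R_C`, where `S` is a separator taken from an existing clique and `C = S ∪ {v}` is the new clique. The function names `gain` and `greedy_clique_forest`, the seeding rule (the most correlated pair), the exhaustive search over separators and the absence of any gain threshold or statistical test are all assumptions made for illustration; without a threshold the result is a single connected clique tree rather than a forest.

```python
import itertools
import numpy as np


def gain(corr, v, sep):
    """Score for attaching vertex v to separator sep (assumed Gaussian score).

    Returns log det R_S - log det R_{S ∪ {v}}, which is non-negative for a
    positive-definite correlation matrix.
    """
    sep = list(sep)
    clique = sep + [v]
    _, logdet_c = np.linalg.slogdet(corr[np.ix_(clique, clique)])
    _, logdet_s = np.linalg.slogdet(corr[np.ix_(sep, sep)])
    return logdet_s - logdet_c


def greedy_clique_forest(corr, max_clique_size=3):
    """Greedy Prim-like construction of cliques and separators (hypothetical sketch)."""
    if max_clique_size < 2:
        raise ValueError("max_clique_size must be at least 2")
    p = corr.shape[0]

    # Seed with the pair of variables having the largest absolute correlation.
    off = np.abs(corr - np.diag(np.diag(corr)))
    i, j = np.unravel_index(np.argmax(off), off.shape)
    cliques = [frozenset((int(i), int(j)))]
    separators = []
    order = [int(i), int(j)]          # order in which vertices enter the graph
    remaining = set(range(p)) - set(order)

    while remaining:
        best = None
        for v in remaining:
            for c in cliques:
                # A separator is a subset of an existing clique, small enough
                # that adding v respects the maximum clique size.
                size = min(len(c), max_clique_size - 1)
                for sep in itertools.combinations(sorted(c), size):
                    g = gain(corr, v, sep)
                    if best is None or g > best[0]:
                        best = (g, v, frozenset(sep), c)

        _, v, sep, parent = best
        if sep == parent:
            # The parent clique is below the size limit: it grows by one vertex.
            cliques.remove(parent)
        else:
            # A new clique is attached to the parent through the separator.
            separators.append(sep)
        cliques.append(sep | {v})
        order.append(v)
        remaining.remove(v)

    return cliques, separators, order
```

With `max_clique_size=2` every clique is an edge and every separator a single vertex, so the sketch reduces to Prim's algorithm on the pairwise scores `-log(1 - ρ²)`. Each vertex is attached to a set that is already a clique, so every intermediate graph is decomposable and reversing `order` gives a perfect elimination ordering for the vertices.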


research
04/15/2021

On clique numbers of colored mixed graphs

An (m,n)-colored mixed graph, or simply, an (m,n)-graph is a graph havin...
research
05/01/2021

Perfect Forests in Graphs and Their Extensions

Let G be a graph on n vertices. For i∈{0,1} and a connected graph G, a s...
research
02/10/2010

A Generalization of the Chow-Liu Algorithm and its Application to Statistical Learning

We extend the Chow-Liu algorithm for general random variables while the ...
research
03/28/2019

Finding a planted clique by adaptive probing

We consider a variant of the planted clique problem where we are allowed...
research
01/19/2020

The Power of Pivoting for Exact Clique Counting

Clique counting is a fundamental task in network analysis, and even the ...
research
11/10/2022

Filtration-Domination in Bifiltered Graphs

Bifiltered graphs are a versatile tool for modelling relations between d...
research
01/16/2013

Probabilistic Models for Query Approximation with Large Sparse Binary Datasets

Large sparse sets of binary transaction data with millions of records an...