Conjoined Dirichlet Process

02/08/2020
by Michelle N. Ngo, et al.

Biclustering is a class of techniques that simultaneously cluster the rows and columns of a matrix to sort heterogeneous data into homogeneous blocks. Although many algorithms have been proposed to find biclusters, existing methods either require the number of biclusters to be specified in advance or place constraints on the model structure. To address these issues, we develop a novel, non-parametric probabilistic biclustering method based on Dirichlet processes to identify biclusters with strong co-occurrence in both rows and columns. The proposed method uses dual Dirichlet process mixture models to learn row and column clusters, with the number of resulting clusters determined by the data rather than pre-specified. Probabilistic biclusters are identified by modeling the mutual dependence between the row and column clusters. We apply our method to two applications, text mining and gene expression analysis, and demonstrate that it improves bicluster extraction in many settings compared to existing approaches.
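To illustrate how a Dirichlet process prior lets the number of clusters be determined by the data rather than fixed in advance, the following is a minimal sketch of the Chinese restaurant process, a standard sequential representation of that prior. It is not the authors' model: the function name `crp_assignments`, the concentration value `alpha=1.0`, and the matrix size (50 rows, 30 columns) are assumptions made for illustration, and the row and column partitions are drawn independently here, whereas the abstract describes modeling the mutual dependence between them.

```python
import random
from collections import Counter


def crp_assignments(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a Chinese restaurant process.

    Hypothetical helper for illustration only; alpha > 0 is the
    concentration parameter (larger alpha tends to give more clusters).
    """
    rng = random.Random(seed)
    labels = []
    counts = []  # counts[k] = number of items already assigned to cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new cluster
        else:
            counts[k] += 1     # join an existing cluster
        labels.append(k)
    return labels


# Assumed toy dimensions: a 50 x 30 data matrix.
row_labels = crp_assignments(n_items=50, alpha=1.0, seed=1)
col_labels = crp_assignments(n_items=30, alpha=1.0, seed=2)

# Every (row cluster, column cluster) pair defines a candidate block;
# block_sizes counts the matrix cells that fall into each block.
block_sizes = Counter((r, c) for r in row_labels for c in col_labels)

print(len(set(row_labels)), "row clusters;",
      len(set(col_labels)), "column clusters;",
      len(block_sizes), "candidate blocks")
```

Neither the number of row clusters nor the number of column clusters is passed in; both emerge from the sampling. In a full mixture model the prior weights would be combined with a data likelihood, so the partition would also reflect how well each item fits each cluster.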

