Consistent Biclustering

06/29/2012
by   Cheryl J. Flynn, et al.
0

Biclustering, the process of simultaneously clustering the rows and columns of a data matrix, is a popular and effective tool for finding structure in a high-dimensional dataset. Many biclustering procedures appear to work well in practice, but most do not have associated consistency guarantees. To address this shortcoming, we propose a new biclustering procedure based on profile likelihood. The procedure applies to a broad range of data modalities, including binary, count, and continuous observations. We prove that the procedure recovers the true row and column classes when the dimensions of the data matrix tend to infinity, even if the functional form of the data distribution is misspecified. The procedure requires computing a combinatorial search, which can be expensive in practice. Rather than performing this search directly, we propose a new heuristic optimization procedure based on the Kernighan-Lin heuristic, which has nice computational properties and performs well in simulations. We demonstrate our procedure with applications to congressional voting records, and microarray analysis.

READ FULL TEXT

page 15

page 28

research
03/14/2021

A two-way factor model for high-dimensional matrix data

In this article, we introduce a two-way factor model for a high-dimensio...
research
06/27/2012

Inferring Latent Structure From Mixed Real and Categorical Relational Data

We consider analysis of relational data (a matrix), in which the rows co...
research
09/09/2020

Biclustering with Alternating K-Means

Biclustering is the task of simultaneously clustering the rows and colum...
research
06/28/2023

High-Dimensional Canonical Correlation Analysis

This paper studies high-dimensional canonical correlation analysis (CCA)...
research
10/01/2013

Jointly Clustering Rows and Columns of Binary Matrices: Algorithms and Trade-offs

In standard clustering problems, data points are represented by vectors,...
research
10/03/2021

Vector or Matrix Factor Model? A Strong Rule Helps!

This paper investigates the issue of determining the dimensions of row a...

Please sign up or login with your details

Forgot password? Click here to reset