Goodness-of-fit Test on the Number of Biclusters in Relational Data Matrix

02/23/2021
by Chihiro Watanabe, et al.

Biclustering is a method for detecting homogeneous submatrices in a given observed matrix, and it is an effective tool for relational data analysis. Although many studies estimate the underlying bicluster structure of a matrix, few make it possible to determine the appropriate number of biclusters in an observed matrix. Recently, a statistical test on the number of biclusters has been proposed for a regular-grid bicluster structure, in which the latent bicluster structure is assumed to be representable by row-column clustering. However, when the latent bicluster structure does not satisfy this regular-grid assumption, the previous test requires a larger number of biclusters than necessary (i.e., a finer bicluster structure than necessary) for the null hypothesis to be accepted, which is undesirable when interpreting the accepted bicluster structure. In this study, we propose a new statistical test on the number of biclusters that does not require the regular-grid assumption, and we derive the asymptotic behavior of the proposed test statistic in both the null and alternative cases. To develop the proposed test, we construct a consistent submatrix localization algorithm, that is, an algorithm for which the probability of outputting the correct bicluster structure converges to one. We illustrate the effectiveness of the proposed method by applying it to both synthetic and practical relational data matrices.
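To make the regular-grid issue concrete, the following is a minimal Python sketch, not the test or the localization algorithm proposed in the paper. It plants two disjoint biclusters in a noisy matrix and counts how many blocks the coarsest row-column (regular-grid) clustering would need to represent them. The function names, the Gaussian noise model, the matrix size, and the block means are all assumptions chosen for illustration.

```python
import numpy as np


def make_matrix(n_rows, n_cols, biclusters, means, background_mean=0.0,
                noise_sd=1.0, seed=0):
    """Build a synthetic matrix: background plus planted submatrices.

    `biclusters` is a list of (row_indices, col_indices) pairs and `means`
    gives the mean of each planted block. Gaussian noise is an assumption
    made for this illustration only.
    """
    rng = np.random.default_rng(seed)
    mean = np.full((n_rows, n_cols), background_mean, dtype=float)
    for (rows, cols), m in zip(biclusters, means):
        mean[np.ix_(rows, cols)] = m
    return mean + rng.normal(0.0, noise_sd, size=mean.shape)


def grid_block_count(n_rows, n_cols, biclusters):
    """Count blocks in the coarsest row-column grid that separates every bicluster.

    Rows (columns) fall in the same group only if they belong to exactly
    the same set of biclusters; the grid has one block per group pair.
    """
    row_member = np.zeros((n_rows, len(biclusters)), dtype=int)
    col_member = np.zeros((n_cols, len(biclusters)), dtype=int)
    for k, (rows, cols) in enumerate(biclusters):
        row_member[rows, k] = 1
        col_member[cols, k] = 1
    n_row_groups = len(np.unique(row_member, axis=0))
    n_col_groups = len(np.unique(col_member, axis=0))
    return n_row_groups * n_col_groups


# Two disjoint biclusters that share neither rows nor columns.
biclusters = [
    (np.arange(0, 4), np.arange(0, 5)),
    (np.arange(6, 10), np.arange(7, 12)),
]
A = make_matrix(12, 14, biclusters, means=[3.0, -2.0])

print("planted biclusters:", len(biclusters))
print("regular-grid blocks:", grid_block_count(12, 14, biclusters))
```

In this toy layout the two planted submatrices induce three row groups and three column groups, so a row-column clustering needs nine blocks to describe a structure that consists of only two biclusters plus background. This is one way to read the abstract's remark that a regular-grid test accepts only a finer structure than necessary.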

research
06/10/2019

Goodness-of-fit Test for Latent Block Models

Latent Block Models are used for probabilistic biclustering, which is sh...
research
03/26/2021

Deep Two-Way Matrix Reordering for Relational Data Analysis

Matrix reordering is a task to permute the rows and columns of a given o...
research
08/16/2021

Detecting changes in covariance via random matrix theory

A novel method is proposed for detecting changes in the covariance struc...
research
07/19/2022

A Normal Test for Independence via Generalized Mutual Information

Testing hypothesis of independence between two random elements on a join...
research
12/13/2022

Testing the Graph of a Gaussian Graphical Model

The Gaussian graphical model is routinely employed to model the joint di...
research
02/05/2018

An efficient counting method for the colored triad census

The triad census is an important approach to understand local structure ...
research
10/14/2022

An hypothesis test for the domain of attraction of a random variable

In this work we address the problem of detecting whether a sampled probab...