Dual Graph-Laplacian PCA: A Closed-Form Solution for Bi-clustering to Find "Checkerboard" Structures on Gene Expression Data

01/21/2019
by Jin-Xing Liu, et al.

In cancer studies, gene expression matrices often contain internal "checkerboard" structures, which correspond to genes that are significantly up- or down-regulated in patients with specific types of tumors. In this paper, we propose a novel method called dual graph-regularization principal component analysis (DGPCA). Its main innovation is that it simultaneously considers the internal geometric structures of the condition manifold and the gene manifold. Specifically, we obtain principal components (PCs) to represent the data and approximate the cluster membership indicators through Laplacian embedding. Because the method incorporates both the condition manifold and the gene manifold, it is well suited to bi-clustering. A closed-form solution is provided for DGPCA. We apply the method to cluster genes and conditions (e.g., different samples) simultaneously, with the aim of finding internal "checkerboard" structures in gene expression data where they exist. We then use it to identify regulatory genes under particular conditions and compare the results with those of other state-of-the-art PCA-based methods. Extensive experiments on gene expression data show promising results.
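The abstract does not state the DGPCA objective, so the following is only a minimal sketch of the general idea it describes, under stated assumptions. It uses the standard single-graph Laplacian PCA objective, min ||X - U Q^T||_F^2 + alpha * tr(Q^T L Q) subject to Q^T Q = I, whose closed-form solution takes Q as the eigenvectors of -X^T X + alpha * L with the smallest eigenvalues and sets U = X Q. The sketch applies this once per manifold (conditions, then genes) rather than jointly, which is a simplification and not the authors' method. The function names, the binary kNN graph, the value of `alpha`, and the synthetic checkerboard matrix are all illustrative assumptions.

```python
import numpy as np

def knn_laplacian(points, k=3):
    """Unnormalized Laplacian L = D - W of a symmetric binary kNN graph
    built over the rows of `points` (hypothetical helper)."""
    n = points.shape[0]
    sq = np.sum(points ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * points @ points.T
    np.fill_diagonal(d2, np.inf)              # a point is not its own neighbor
    idx = np.argsort(d2, axis=1)[:, :k]       # k nearest neighbors per row
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)                    # symmetrize the adjacency
    return np.diag(W.sum(axis=1)) - W

def glpca(X, L, n_components, alpha):
    """Closed-form single-graph Laplacian PCA (assumed formulation).
    X has shape (features, items); L is the (items, items) Laplacian."""
    G = -X.T @ X + alpha * L
    G = (G + G.T) / 2.0                       # guard against round-off asymmetry
    _, eigvecs = np.linalg.eigh(G)            # eigenvalues in ascending order
    Q = eigvecs[:, :n_components]             # smallest-eigenvalue eigenvectors
    U = X @ Q                                 # optimal U for fixed orthonormal Q
    return U, Q

def objective(X, L, Q, alpha):
    """Value of the assumed objective at orthonormal Q with U = X Q."""
    U = X @ Q
    return np.linalg.norm(X - U @ Q.T) ** 2 + alpha * np.trace(Q.T @ L @ Q)

# Synthetic 40-gene x 20-condition matrix with a 2 x 2 checkerboard (made-up data).
rng = np.random.default_rng(0)
X = 0.1 * rng.normal(size=(40, 20))
X[:20, :10] += 1.0
X[20:, 10:] += 1.0

alpha = 1.0
L_cond = knn_laplacian(X.T, k=3)              # graph over conditions (columns)
L_gene = knn_laplacian(X, k=3)                # graph over genes (rows)

_, Q_cond = glpca(X, L_cond, n_components=2, alpha=alpha)    # condition embedding
_, Q_gene = glpca(X.T, L_gene, n_components=2, alpha=alpha)  # gene embedding
```

The rows of `Q_cond` and `Q_gene` are low-dimensional embeddings of the conditions and genes. Running a clustering step such as k-means on each embedding would give the two sets of labels needed to reorder the matrix and look for a checkerboard pattern.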


