Elastic Coupled Co-clustering for Single-Cell Genomic Data

03/29/2020
by Pengcheng Zeng, et al.

Recent advances in single-cell technologies have enabled us to profile genomic features at unprecedented resolution. Data sets from multiple domains are now available, including data sets that profile different types of genomic features and data sets that profile the same type of genomic features across different species. These data sets typically differ in their power to identify unknown cell types through clustering, and data integration can potentially improve the performance of clustering algorithms. In this work, we formulate the problem in an unsupervised transfer learning framework, which uses knowledge learned from an auxiliary data set to improve clustering performance on a target data set. The degree of information shared between the target and auxiliary data sets can vary, and their distributions can also differ. To address these challenges, we propose an elastic coupled co-clustering based transfer learning algorithm, which elastically propagates clustering knowledge obtained from the auxiliary data set to the target data set. Application to single-cell genomic data sets shows that our algorithm greatly improves clustering performance over traditional learning algorithms. The source code and data sets are available at https://github.com/cuhklinlab/elasticC3.
