CLMB: deep contrastive learning for robust metagenomic binning

11/18/2021
by Pengfei Zhang, et al.

The reconstruction of microbial genomes from large metagenomic datasets is a critical procedure for discovering uncultivated microbial populations and defining their functional roles. Achieving this requires metagenomic binning: clustering the assembled contigs into draft genomes. Although many computational tools exist for this task, most neglect an important property of metagenomic data, namely noise. To further improve the metagenomic binning step and reconstruct better metagenomes, we propose a deep Contrastive Learning framework for Metagenome Binning (CLMB), which efficiently eliminates the disturbance of noise and produces more stable and robust results. Essentially, instead of denoising the data explicitly, we add simulated noise to the training data and force the deep learning model to produce similar and stable representations for both the noise-free and the distorted data. Consequently, the trained model is robust to noise and handles it implicitly during use. CLMB significantly outperforms the previous state-of-the-art binning methods, recovering the most near-complete genomes on almost all the benchmarking datasets (up to 17% more reconstructed genomes than the second-best method). It also improves the performance of bin refinement, reconstructing 8-22 more high-quality genomes and 15-32 more medium-quality genomes than the second-best result. Impressively, in addition to being compatible with the binning refiner, CLMB alone recovers on average 15 more high-quality genomes than the refiner of VAMB and MaxBin on the benchmarking datasets. CLMB is open-source and available at https://github.com/zpf0117b/CLMB/.
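The core idea described above, training on simulated noise so that clean and distorted views of the same contig map to similar representations, can be illustrated with a minimal sketch. This is not the authors' implementation; the noise model, feature vectors, and InfoNCE-style loss below are illustrative assumptions.

```python
import math
import random

# Hypothetical sketch of the noise-based contrastive idea (not CLMB's code):
# perturb each contig's feature vector with simulated noise, then score how
# well the clean view and its noisy view agree relative to other contigs.

def add_noise(features, scale=0.1, rng=random.Random(0)):
    """Simulate noise by adding Gaussian jitter to a contig feature vector."""
    return [x + rng.gauss(0.0, scale) for x in features]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb + 1e-12)

def contrastive_loss(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss: pull the positive view toward the anchor,
    push the negatives (other contigs' representations) away."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Toy usage: pairing a contig with its own noisy view should yield a
# lower loss than pairing it with an unrelated contig.
contig = [0.2, 0.5, 0.1, 0.9]   # illustrative feature vector
noisy = add_noise(contig)       # simulated distorted view
other = [0.9, 0.1, 0.8, 0.2]    # an unrelated contig
loss_pos = contrastive_loss(contig, noisy, [other])
loss_neg = contrastive_loss(contig, other, [noisy])
assert loss_pos < loss_neg
```

Minimizing this kind of loss during training is what pushes the encoder to produce stable representations under noise, so no explicit denoising step is needed at inference time.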


