Comparing representations of biological data learned with different AI paradigms, augmenting and cropping strategies

03/07/2022
by Andrei Dmitrenko, et al.

Recent advances in computer vision and robotics have enabled automated large-scale biological image analysis. Various machine learning approaches have been successfully applied to phenotypic profiling, but it remains unclear how they compare in terms of biological feature extraction. In this study, we propose a simple CNN architecture and implement four representation learning approaches. We train 16 deep learning setups on a dataset of 770k cancer cell images under identical conditions, using different augmenting and cropping strategies. We compare the learned representations by evaluating multiple metrics on each of three downstream tasks: i) distance-based similarity analysis of known drugs, ii) classification of drugs versus controls, and iii) clustering within cell lines. We also compare training times and memory usage. Among the tested setups, multi-crops and random augmentations generally improved performance across tasks, as expected. Strikingly, self-supervised (implicit contrastive learning) models showed competitive performance while being up to 11 times faster to train. Self-supervised regularized learning required the most memory and computation but delivered arguably the most informative features. We observe that no single combination of augmenting and cropping strategies consistently yields top performance across tasks, and we recommend prospective research directions.
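To make the augmenting and cropping strategies concrete, below is a minimal multi-crop sketch in PyTorch/torchvision. The MultiCrop class, crop sizes, crop counts, and the specific augmentations are illustrative assumptions, not the exact configuration used in the paper.

    from torchvision import transforms

    class MultiCrop:
        """Produce several augmented crops of one cell image.

        Two large ("global") crops plus several small ("local") crops,
        each passed through the same random augmentations. Sizes and
        counts here are assumptions for illustration.
        """

        def __init__(self, global_size=64, local_size=32, n_local=4):
            augment = transforms.Compose([
                transforms.RandomHorizontalFlip(),
                transforms.RandomVerticalFlip(),
                transforms.ColorJitter(brightness=0.4, contrast=0.4),
                transforms.ToTensor(),
            ])
            # Global crops cover most of the image; local crops zoom in.
            self.global_crop = transforms.Compose([
                transforms.RandomResizedCrop(global_size, scale=(0.5, 1.0)),
                augment,
            ])
            self.local_crop = transforms.Compose([
                transforms.RandomResizedCrop(local_size, scale=(0.1, 0.5)),
                augment,
            ])
            self.n_local = n_local

        def __call__(self, image):
            # Returns a list of augmented tensor views of one image,
            # all of which would be fed to the same CNN encoder.
            crops = [self.global_crop(image) for _ in range(2)]
            crops += [self.local_crop(image) for _ in range(self.n_local)]
            return crops

For example, calling MultiCrop() on a PIL image yields six augmented views of that image; a contrastive or regularized self-supervised objective would then encourage the encoder to map views of the same cell image to nearby representations.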

