
Self-supervised Representation Learning for Ultrasound Video

by Jianbo Jiao, et al.

Recent advances in deep learning have achieved promising performance in medical image analysis, but in most cases ground-truth annotations from human experts are needed to train the deep model. In practice, such annotations are expensive to collect and can be scarce for medical imaging applications. There is therefore significant interest in learning representations from unlabelled raw data. In this paper, we propose a self-supervised learning approach that learns meaningful and transferable representations from medical imaging video without any human annotation. We assume that, to learn such a representation, the model should identify anatomical structures in the unlabelled data. We therefore force the model to address anatomy-aware tasks with free supervision from the data itself. Specifically, the model is designed to correct the order of a reshuffled video clip and, at the same time, predict the geometric transformation applied to the clip. Experiments on fetal ultrasound video show that the proposed approach effectively learns meaningful and strong representations, which transfer well to downstream tasks such as standard plane detection and saliency prediction.
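
The two pretext tasks named in the abstract (restoring frame order and predicting the applied geometric transformation) draw their labels from the data itself. The sketch below illustrates how such a training sample could be generated; it is not the authors' implementation. The abstract does not specify the transformation set or the label encoding, so the choice of rotations by multiples of 90 degrees, the permutation-as-label encoding, and the names `make_pretext_sample`, `undo_pretext` and `NUM_ROTATIONS` are all assumptions made for illustration.

```python
import numpy as np

# Assumption: the geometric transformation is one of four in-plane
# rotations (0, 90, 180, 270 degrees). The abstract does not say which
# transformations are actually used.
NUM_ROTATIONS = 4


def make_pretext_sample(clip, rng):
    """Build one self-supervised sample from an unlabelled clip.

    clip: array of shape (T, H, W), T frames of a grayscale video.
    Returns (transformed_clip, order, k), where `order` is the frame
    permutation that was applied and `k` is the number of 90-degree
    turns. Both serve as free labels.
    """
    num_frames = clip.shape[0]
    order = rng.permutation(num_frames)      # label for the ordering task
    shuffled = clip[order]                   # shuffled[i] == clip[order[i]]
    k = int(rng.integers(NUM_ROTATIONS))     # label for the transformation task
    transformed = np.rot90(shuffled, k=k, axes=(1, 2))  # rotate every frame
    return transformed, order, k


def undo_pretext(transformed, order, k):
    """Invert the pretext transformation given the true labels."""
    unrotated = np.rot90(transformed, k=-k, axes=(1, 2))
    restored = np.empty_like(unrotated)
    restored[order] = unrotated              # put each frame back in place
    return restored


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip = rng.random((4, 8, 8))             # synthetic stand-in for a video clip
    sample, order, k = make_pretext_sample(clip, rng)
    print("recovered:", np.array_equal(undo_pretext(sample, order, k), clip))
```

In a training loop, a network would receive only `transformed` and be trained to predict `order` and `k`; `undo_pretext` is included here just to check that the labels fully describe what was done to the clip.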




Self-supervised Contrastive Video-Speech Representation Learning for Ultrasound

In medical imaging, manual annotations can be expensive to acquire and s...

Self-Supervised Representation Learning for Detection of ACL Tear Injury in Knee MRI

The success and efficiency of Deep Learning based models for computer vi...

Multimodal Self-Supervised Learning for Medical Image Analysis

In this paper, we propose a self-supervised learning approach that lever...

Self-Supervised Learning of Echocardiogram Videos Enables Data-Efficient Clinical Diagnosis

Given the difficulty of obtaining high-quality labels for medical image ...

MarioNette: Self-Supervised Sprite Learning

Visual content often contains recurring elements. Text is made up of gly...

Ultrasound Image Representation Learning by Modeling Sonographer Visual Attention

Image representations are commonly learned from class labels, which are ...

Ultrasound Video Summarization using Deep Reinforcement Learning

Video is an essential imaging modality for diagnostics, e.g. in ultrasou...