Self-Supervised Learning of Echocardiogram Videos Enables Data-Efficient Clinical Diagnosis
Given the difficulty of obtaining high-quality labels for medical image recognition tasks, there is a need for deep learning techniques that can be adequately fine-tuned on small labeled data sets. Recent advances in self-supervised learning have shown that in-domain representation learning can provide a strong initialization for supervised fine-tuning, proving much more data-efficient than standard transfer learning from a supervised pretraining task. However, these techniques have not been adapted to medical diagnostics captured in video format. With this progress in mind, we developed a self-supervised learning approach tailored to echocardiogram videos, with the goal of learning strong representations for downstream fine-tuning on the task of diagnosing aortic stenosis (AS), a common and dangerous disease of the aortic valve. When fine-tuned on 1% [...] achieves 0.818 AUC (95% [...] approach reaches 0.644 AUC (95% [...] self-supervised model attends more closely to the aortic valve when predicting severe AS, as demonstrated by saliency map visualizations.