Multi-Modal Unsupervised Pre-Training for Surgical Operating Room Workflow Analysis

07/16/2022
by Muhammad Abdullah Jamal, et al.

Data-driven approaches to operating room (OR) workflow analysis depend on large curated datasets that are time-consuming and expensive to collect. Meanwhile, the field has recently shifted from supervised learning toward self-supervised and unsupervised approaches that learn representations from unlabeled data. In this paper, we leverage the unlabeled data captured in robotic surgery ORs and propose a novel way to fuse the multi-modal data for a single video frame or image. Instead of producing different augmentations (or 'views') of the same image or video frame, which is common practice in self-supervised learning, we treat the multi-modal data as different views and train the model in an unsupervised manner via clustering. We compare our method with other state-of-the-art methods, and the results show that our approach performs better on surgical video activity recognition and semantic segmentation.
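The idea of treating modalities as views and training via clustering can be illustrated with a minimal sketch. The abstract does not specify the loss, the modalities, or the clustering procedure, so everything below is an assumption: the function names, the temperatures, and the simplified swapped-prediction objective (each view's sharpened cluster assignment serves as the target for the other view's prediction) are hypothetical stand-ins, not the authors' method.

```python
import math

def softmax(scores, temperature):
    # Numerically stable softmax over a list of scores.
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def prototype_scores(z, prototypes):
    # Dot product between one embedding and every cluster prototype.
    return [sum(zi * ci for zi, ci in zip(z, c)) for c in prototypes]

def cross_entropy(target, pred):
    # Cross-entropy between a target distribution and a prediction.
    return -sum(t * math.log(max(p, 1e-12)) for t, p in zip(target, pred))

def swapped_cluster_loss(view_a, view_b, prototypes, temp=0.1, target_temp=0.05):
    # view_a and view_b hold per-frame embeddings from two hypothetical
    # modalities; row i of each list describes the same frame.
    # Assumption: targets come from a sharper softmax, not from an
    # optimal-transport step as in some clustering-based methods.
    total = 0.0
    for z_a, z_b in zip(view_a, view_b):
        s_a = prototype_scores(z_a, prototypes)
        s_b = prototype_scores(z_b, prototypes)
        q_a, q_b = softmax(s_a, target_temp), softmax(s_b, target_temp)
        p_a, p_b = softmax(s_a, temp), softmax(s_b, temp)
        # Swap: predict view b's assignment from view a, and vice versa.
        total += 0.5 * (cross_entropy(q_b, p_a) + cross_entropy(q_a, p_b))
    return total / len(view_a)
```

Under these assumptions, the loss is small when both modalities place a frame in the same cluster and large when they disagree, which is what pushes the two encoders toward a shared representation without labels or hand-crafted augmentations.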

research
12/22/2021

Fine-grained Multi-Modal Self-Supervised Learning

Multi-Modal Self-Supervised Learning from videos has been shown to impro...
research
03/16/2022

Object discovery and representation networks

The promise of self-supervised learning (SSL) is to leverage large amoun...
research
05/19/2023

SurgMAE: Masked Autoencoders for Long Surgical Video Analysis

There has been a growing interest in using deep learning models for proc...
research
08/26/2023

Reinforcement Learning Based Multi-modal Feature Fusion Network for Novel Class Discovery

With the development of deep learning techniques, supervised learning ha...
research
05/24/2017

Self-supervised learning of visual features through embedding images into text topic spaces

End-to-end training from scratch of current deep architectures for new c...
research
10/31/2020

Multimodal and self-supervised representation learning for automatic gesture recognition in surgical robotics

Self-supervised, multi-modal learning has been successful in holistic re...
research
06/18/2018

Temporal coherence-based self-supervised learning for laparoscopic workflow analysis

In order to provide the right type of assistance at the right time, comp...