Multi-Task Recurrent Convolutional Network with Correlation Loss for Surgical Video Analysis

07/13/2019
by Yueming Jin, et al.

Surgical tool presence detection and surgical phase recognition are two fundamental yet challenging tasks in surgical video analysis, and both are essential components of various applications in modern operating rooms. Although the two tasks are highly correlated in clinical practice because the surgical process is well-defined, most previous methods tackled them separately, without making full use of their relatedness. In this paper, we present a novel multi-task recurrent convolutional network with correlation loss (MTRCNet-CL) that exploits this relatedness to boost the performance of both tasks simultaneously. Specifically, our proposed MTRCNet-CL model has an end-to-end architecture with two branches, which share the earlier feature encoders to extract general visual features while keeping separate higher layers that target the specific tasks. Given that temporal information is crucial for phase recognition, long short-term memory (LSTM) is used to model the sequential dependencies in the phase recognition branch. More importantly, a novel and effective correlation loss is designed to model the relatedness between tool presence and phase identification for each video frame, by minimizing the divergence between the predictions of the two branches. By combining low-level feature sharing with high-level prediction correlation, our MTRCNet-CL method encourages the interactions between the two tasks to a large extent, so that each task benefits the other. Extensive experiments on a large surgical video dataset (Cholec80) demonstrate the outstanding performance of our proposed method, which consistently exceeds the state-of-the-art methods by a large margin (e.g., 89.1 in tool presence detection and 87.4 in phase recognition). The code can be found on our project website.
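
To make the idea of a correlation loss concrete, the following is a minimal sketch in plain Python and is not the authors' implementation. The abstract only states that the loss minimizes the divergence between the predictions of the two branches; the sketch assumes a per-tool binary KL divergence, and the `mapping` matrix that projects phase probabilities into tool space is a hypothetical stand-in for whatever learned mapping the model uses. All function names and the toy numbers are illustrative assumptions.

```python
import math


def softmax(logits):
    """Turn phase logits into a probability distribution over phases."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Turn one tool logit into a presence probability."""
    return 1.0 / (1.0 + math.exp(-x))


def bernoulli_kl(p, q, eps=1e-7):
    """KL divergence between two Bernoulli distributions, clamped for stability."""
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def correlation_loss(tool_logits, phase_logits, mapping):
    """Mean per-tool divergence between the tool branch and the mapped phase branch.

    mapping[k][t] is a hypothetical weight linking phase k to tool t
    (an assumption of this sketch, not taken from the paper).
    """
    phase_probs = softmax(phase_logits)
    num_tools = len(tool_logits)
    # Project the phase prediction into tool space.
    mapped_logits = [
        sum(phase_probs[k] * mapping[k][t] for k in range(len(phase_probs)))
        for t in range(num_tools)
    ]
    tool_probs = [sigmoid(z) for z in tool_logits]
    mapped_probs = [sigmoid(z) for z in mapped_logits]
    return sum(bernoulli_kl(p, q) for p, q in zip(tool_probs, mapped_probs)) / num_tools


# Toy setting: 3 phases, 2 tools. Phase 0 implies tool 0, phase 1 implies tool 1.
mapping = [[4.0, -4.0], [-4.0, 4.0], [0.0, 0.0]]
phase_logits = [5.0, 0.0, 0.0]  # the phase branch is confident in phase 0

loss_agree = correlation_loss([3.0, -3.0], phase_logits, mapping)
loss_disagree = correlation_loss([-3.0, 3.0], phase_logits, mapping)
print(loss_agree < loss_disagree)
```

In this toy setting the loss is small when the tool branch predicts the tool implied by the predicted phase and large when it predicts the opposite, which is the kind of cross-task consistency the abstract describes. In training, such a term would be added to the usual per-task losses.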

research
02/09/2016

EndoNet: A Deep Architecture for Recognition Tasks on Laparoscopic Videos

Surgical workflow recognition has numerous potential medical application...
research
03/30/2021

Temporal Memory Relation Network for Workflow Recognition from Surgical Video

Automatic surgical workflow recognition is a key component for developin...
research
07/10/2021

Not End-to-End: Explore Multi-Stage Architecture for Online Surgical Phase Recognition

Surgical phase recognition is of particular interest to computer assiste...
research
10/27/2016

Single- and Multi-Task Architectures for Tool Presence Detection Challenge at M2CAI 2016

The tool presence detection challenge at M2CAI 2016 consists of identify...
research
12/22/2017

SFCN-OPI: Detection and Fine-grained Classification of Nuclei Using Sibling FCN with Objectness Prior Interaction

Cell nuclei detection and fine-grained classification have been fundamen...
research
07/20/2018

Surgical Phase Recognition of Short Video Shots Based on Temporal Modeling of Deep Features

Recognizing the phases of a laparoscopic surgery (LS) operation from its...
