ST-MTL: Spatio-Temporal Multitask Learning Model to Predict Scanpath While Tracking Instruments in Robotic Surgery

12/10/2021
by   Mobarakol Islam, et al.

Representation learning of task-oriented attention while tracking instruments holds vast potential in image-guided robotic surgery. Incorporating cognitive ability into automated camera control allows the surgeon to concentrate more on handling the surgical instruments. The objective is to reduce operation time and ease the procedure for both surgeons and patients. We propose an end-to-end trainable Spatio-Temporal Multi-Task Learning (ST-MTL) model with a shared encoder and spatio-temporal decoders for real-time surgical instrument segmentation and task-oriented saliency detection. In an MTL model with shared parameters, optimizing multiple loss functions toward a common convergence point remains an open challenge. We tackle this problem with a novel asynchronous spatio-temporal optimization (ASTO) technique that calculates independent gradients for each decoder. We also design a competitive squeeze-and-excitation unit by casting a skip connection that retains weak features, excites strong features, and performs dynamic spatial and channel-wise feature recalibration. To better capture long-term spatio-temporal dependencies, we enhance the long short-term memory (LSTM) module by concatenating high-level encoder features of consecutive frames. We further introduce a Sinkhorn-regularized loss to enhance task-oriented saliency detection while preserving computational efficiency. We generate task-aware saliency maps and scanpaths of the instruments on the MICCAI 2017 robotic instrument segmentation challenge dataset. Compared with state-of-the-art segmentation and saliency methods, our model outperforms them on most evaluation metrics and produces outstanding performance on the challenge.
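
The asynchronous spatio-temporal optimization (ASTO) idea, computing independent gradients for each decoder rather than summing the task losses into one objective, can be illustrated with a minimal PyTorch-style sketch. SharedEncoder, SegDecoder, and SalDecoder below are hypothetical placeholders, not the paper's architecture; only the per-decoder optimization pattern is the point.

```python
# Minimal ASTO sketch, assuming PyTorch modules. All module definitions are
# illustrative stand-ins for the shared encoder and spatio-temporal decoders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedEncoder(nn.Module):          # hypothetical placeholder backbone
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
    def forward(self, x):
        return F.relu(self.conv(x))

class SegDecoder(nn.Module):             # spatial decoder (instrument segmentation)
    def __init__(self):
        super().__init__()
        self.head = nn.Conv2d(16, 8, 1)  # 8 instrument classes, for illustration
    def forward(self, f):
        return self.head(f)

class SalDecoder(nn.Module):             # temporal decoder (saliency detection)
    def __init__(self):
        super().__init__()
        self.head = nn.Conv2d(16, 1, 1)
    def forward(self, f):
        return torch.sigmoid(self.head(f))

encoder, seg_dec, sal_dec = SharedEncoder(), SegDecoder(), SalDecoder()
opt_seg = torch.optim.Adam(list(encoder.parameters()) + list(seg_dec.parameters()), lr=1e-4)
opt_sal = torch.optim.Adam(list(encoder.parameters()) + list(sal_dec.parameters()), lr=1e-4)

def asto_step(frame, seg_gt, sal_gt):
    """One asynchronous step: each decoder gets its own backward pass and update,
    so the task losses are never merged into a single weighted objective."""
    # Segmentation branch: gradients flow through the encoder and seg decoder only.
    opt_seg.zero_grad()
    seg_loss = F.cross_entropy(seg_dec(encoder(frame)), seg_gt)
    seg_loss.backward()
    opt_seg.step()

    # Saliency branch: an independent forward/backward pass with its own gradients.
    opt_sal.zero_grad()
    sal_loss = F.binary_cross_entropy(sal_dec(encoder(frame)), sal_gt)
    sal_loss.backward()
    opt_sal.step()
    return seg_loss.item(), sal_loss.item()
```

Because each branch runs its own forward and backward pass, no hand-tuned loss weighting is needed to push both decoders toward convergence.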
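
One plausible reading of the competitive squeeze-and-excitation unit, combining channel-wise and spatial recalibration with an identity skip connection through an element-wise max, is sketched below; the exact competition rule is an assumption drawn from the abstract, not necessarily the paper's formulation.

```python
# Sketch of a competitive squeeze-and-excitation unit, assuming PyTorch.
# The cSE/sSE branches follow the standard scSE pattern; the max-based skip
# connection (retain weak features, excite strong ones) is my interpretation.
import torch
import torch.nn as nn

class CompetitiveSCSE(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.cse = nn.Sequential(                    # channel-wise excitation
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        self.sse = nn.Sequential(                    # spatial excitation
            nn.Conv2d(channels, 1, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x_cse = x * self.cse(x)                      # channel recalibration
        x_sse = x * self.sse(x)                      # spatial recalibration
        # Competition via element-wise max; the identity skip keeps weak features alive.
        return torch.max(torch.max(x_cse, x_sse), x)
```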
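
The Sinkhorn-regularized loss can similarly be sketched as an entropy-regularized optimal-transport cost between predicted and ground-truth saliency maps, each normalized into a probability distribution over pixels, with a fixed number of Sinkhorn iterations keeping the computation cheap. The pixel-coordinate cost matrix, regularization strength, and iteration count below are illustrative assumptions, and the maps are assumed to be downsampled so the cost matrix stays small.

```python
# Sketch of a Sinkhorn-regularized saliency loss, assuming PyTorch tensors and
# small (downsampled) saliency maps; parameter values are illustrative only.
import torch

def sinkhorn_loss(pred, gt, eps=0.1, n_iters=50):
    """pred, gt: (H, W) non-negative saliency maps."""
    h, w = pred.shape
    a = pred.flatten() + 1e-8; a = a / a.sum()      # predicted-map marginal
    b = gt.flatten() + 1e-8;   b = b / b.sum()      # ground-truth marginal

    # Squared-Euclidean cost between normalized pixel coordinates.
    ys, xs = torch.meshgrid(torch.arange(h, dtype=torch.float32),
                            torch.arange(w, dtype=torch.float32), indexing="ij")
    coords = torch.stack([ys.flatten() / h, xs.flatten() / w], dim=1)   # (HW, 2)
    C = torch.cdist(coords, coords) ** 2                                # (HW, HW)

    K = torch.exp(-C / eps)                     # Gibbs kernel
    u = torch.ones_like(a)
    v = torch.ones_like(b)
    for _ in range(n_iters):                    # Sinkhorn-Knopp iterations
        u = a / (K @ v + 1e-8)
        v = b / (K.T @ u + 1e-8)
    P = u.unsqueeze(1) * K * v.unsqueeze(0)     # entropic transport plan
    return (P * C).sum()                        # regularized transport cost
```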
