Surgical Instruction Generation with Transformers

07/14/2021
by Jinglu Zhang, et al.

Automatic surgical instruction generation is a prerequisite for intra-operative context-aware surgical assistance. However, generating instructions from surgical scenes is challenging, as it requires jointly understanding the surgical activity in the current view and modelling the relationships between visual information and textual description. Inspired by neural machine translation and image captioning tasks in the open domain, we introduce a transformer-backboned encoder-decoder network with self-critical reinforcement learning to generate instructions from surgical images. We evaluate the effectiveness of our method on the DAISI dataset, which includes 290 procedures from various medical disciplines. Our approach outperforms the existing baseline on all caption evaluation metrics. The results demonstrate the benefits of a transformer-backboned encoder-decoder structure in handling multimodal context.
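
As a rough illustration of the self-critical reinforcement learning idea mentioned in the abstract, the sketch below computes the loss for one sampled caption, using the reward of the greedy-decoded caption as the baseline. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, and the unigram-F1 reward is a toy stand-in, since the abstract does not state which caption metric serves as the reward.

```python
from collections import Counter
from typing import Callable, Sequence


def unigram_f1(candidate: Sequence[str], reference: Sequence[str]) -> float:
    """Toy reward (assumption): unigram F1 between candidate and reference."""
    if not candidate or not reference:
        return 0.0
    # Clipped count of tokens shared by the two sequences.
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def self_critical_loss(
    sampled_log_probs: Sequence[float],
    sampled_tokens: Sequence[str],
    greedy_tokens: Sequence[str],
    reference_tokens: Sequence[str],
    reward_fn: Callable[[Sequence[str], Sequence[str]], float] = unigram_f1,
) -> float:
    """Self-critical policy-gradient loss for a single sampled caption.

    The greedy caption's reward acts as the baseline, so a sample is
    reinforced only when it scores higher than greedy decoding.
    """
    advantage = reward_fn(sampled_tokens, reference_tokens) - reward_fn(
        greedy_tokens, reference_tokens
    )
    # Minimising this value raises the log-probability of samples
    # with positive advantage and lowers it for negative advantage.
    return -advantage * sum(sampled_log_probs)


if __name__ == "__main__":
    reference = "insert the needle through the tissue".split()
    sampled = "insert the needle through the tissue".split()
    greedy = "insert the needle".split()
    log_probs = [-0.5] * len(sampled)  # hypothetical per-token log-probabilities
    print(self_critical_loss(log_probs, sampled, greedy, reference))
```

In a real training loop the log-probabilities would come from the decoder and the loss would be back-propagated through them; plain floats are used here only to keep the arithmetic visible.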

