DeepAI AI Chat
Log In Sign Up

Temporal Memory Relation Network for Workflow Recognition from Surgical Video

by   Yueming Jin, et al.
The Chinese University of Hong Kong

Automatic surgical workflow recognition is a key component for developing context-aware computer-assisted systems in the operating theatre. Previous works either jointly modeled the spatial features with short fixed-range temporal information, or separately learned visual and long temporal cues. In this paper, we propose a novel end-to-end temporal memory relation network (TMRNet) for relating long-range and multi-scale temporal patterns to augment the present features. We establish a long-range memory bank to serve as a memory cell storing the rich supportive information. Through our designed temporal variation layer, the supportive cues are further enhanced by multi-scale temporal-only convolutions. To effectively incorporate the two types of cues without disturbing the joint learning of spatio-temporal features, we introduce a non-local bank operator to attentively relate the past to the present. In this regard, our TMRNet enables the current feature to view the long-range temporal dependency, as well as tolerate complex temporal extents. We have extensively validated our approach on two benchmark surgical video datasets, M2CAI challenge dataset and Cholec80 dataset. Experimental results demonstrate the outstanding performance of our method, consistently exceeding the state-of-the-art methods by a large margin (e.g., 67.0 78.9


page 1

page 3

page 8

page 9

page 10


LRTD: Long-Range Temporal Dependency based Active Learning for Surgical Workflow Recognition

Automatic surgical workflow recognition in video is an essentially funda...

Efficient Global-Local Memory for Real-time Instrument Segmentation of Robotic Surgical Video

Performing a real-time and accurate instrument segmentation from videos ...

Automatic Depression Detection via Learning and Fusing Features from Visual Cues

Depression is one of the most prevalent mental disorders, which seriousl...

GLSFormer: Gated - Long, Short Sequence Transformer for Step Recognition in Surgical Videos

Automated surgical step recognition is an important task that can signif...

Multi-Task Recurrent Convolutional Network with Correlation Loss for Surgical Video Analysis

Surgical tool presence detection and surgical phase recognition are two ...

MURPHY: Relations Matter in Surgical Workflow Analysis

Autonomous robotic surgery has advanced significantly based on analysis ...

Temporal Pyramid Network for Pedestrian Trajectory Prediction with Multi-Supervision

Predicting human motion behavior in a crowd is important for many applic...

Code Repositories


[TMI'2021] Temporal Memory Relation Network for Workflow Recognition from Surgical Video

view repo