Using Human Gaze For Surgical Activity Recognition

03/09/2022
by Abdishakour Awale, et al.

Automatically recognizing surgical activities plays an important role in providing feedback to surgeons and is a fundamental step towards computer-aided surgical systems. Human gaze and visual saliency carry important information about visual attention and can be used in computer vision systems. Although state-of-the-art surgical activity recognition models learn spatio-temporal features, none of them makes use of human gaze or visual saliency. In this study, we propose to use human gaze with a spatio-temporal attention mechanism for activity recognition in surgical videos. Our model is an I3D-based architecture that learns spatio-temporal features using 3D convolutions and learns an attention map using human gaze. We evaluated our model on the Suturing task of JIGSAWS, a publicly available surgical video understanding dataset. Our evaluations on a subset of random video segments in this task suggest that our approach achieves promising results with an accuracy of 86.2
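
To make the idea of gaze-guided spatio-temporal attention concrete, here is a minimal sketch in plain Python. It is not the authors' model: the abstract describes an I3D-based network that learns its attention map, whereas this sketch assumes the gaze point is used directly as a Gaussian attention map over a small per-frame feature grid. The function names (`gaze_heatmap`, `attention_pool`, `pool_clip`) and the `sigma` parameter are hypothetical and chosen only for illustration.

```python
import math


def gaze_heatmap(gaze_xy, height, width, sigma=1.0):
    """Build an H x W attention map that sums to 1 and peaks at the gaze point.

    Assumption: gaze is given as (x, y) in feature-grid coordinates.
    """
    gx, gy = gaze_xy
    raw = [
        [math.exp(-((x - gx) ** 2 + (y - gy) ** 2) / (2 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]
    total = sum(sum(row) for row in raw)
    return [[value / total for value in row] for row in raw]


def attention_pool(features, attention):
    """Attention-weighted sum over spatial positions.

    features[y][x] is a list of C channel values; attention[y][x] is a weight.
    """
    channels = len(features[0][0])
    pooled = [0.0] * channels
    for feat_row, att_row in zip(features, attention):
        for vec, weight in zip(feat_row, att_row):
            for c in range(channels):
                pooled[c] += weight * vec[c]
    return pooled


def pool_clip(clip_features, gaze_points, sigma=1.0):
    """Pool every frame with its own gaze map, then average over time."""
    per_frame = []
    for feats, gaze in zip(clip_features, gaze_points):
        height, width = len(feats), len(feats[0])
        heatmap = gaze_heatmap(gaze, height, width, sigma)
        per_frame.append(attention_pool(feats, heatmap))
    frames = len(per_frame)
    channels = len(per_frame[0])
    return [sum(f[c] for f in per_frame) / frames for c in range(channels)]
```

In this sketch, `pool_clip` would take a list of per-frame feature grids and one gaze point per frame, and return a single clip-level feature vector that a classifier could consume. In a learned variant, the fixed Gaussian map would be replaced by a predicted attention map trained against recorded gaze.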


research
01/11/2020

Towards Generalizable Surgical Activity Recognition Using Spatial Temporal Graph Convolutional Networks

Modeling and recognition of surgical activities poses an interesting res...
research
11/08/2020

Integrating Human Gaze into Attention for Egocentric Activity Recognition

It is well known that human gaze carries significant information about v...
research
05/05/2022

Activity Detection in Long Surgical Videos using Spatio-Temporal Models

Automatic activity detection is an important component for developing te...
research
07/03/2019

Novel evaluation of surgical activity recognition models using task-based efficiency metrics

Purpose: Surgical task-based metrics (rather than entire procedure metri...
research
09/30/2017

Unsupervised Segmentation of Action Segments in Egocentric Videos using Gaze

Unsupervised segmentation of action segments in egocentric videos is a d...
research
12/29/2013

Actions in the Eye: Dynamic Gaze Datasets and Learnt Saliency Models for Visual Recognition

Systems based on bag-of-words models from image features collected at ma...
research
07/08/2021

4D Attention: Comprehensive Framework for Spatio-Temporal Gaze Mapping

This study presents a framework for capturing human attention in the spa...
