
Trans-SVNet: Accurate Phase Recognition from Surgical Videos via Hybrid Embedding Aggregation Transformer

by Xiaojie Gao, et al.

Real-time surgical phase recognition is a fundamental task in modern operating rooms. Previous works tackle this task with architectures arranged in spatio-temporal order; however, they do not consider the supportive benefits of intermediate spatial features. In this paper, we introduce the Transformer, for the first time in surgical workflow analysis, to reconsider the ignored complementary effects of spatial and temporal features for accurate surgical phase recognition. Our hybrid embedding aggregation Transformer fuses cleverly designed spatial and temporal embeddings by allowing active queries, based on spatial information, over temporal embedding sequences. More importantly, our framework is lightweight and processes the hybrid embeddings in parallel to achieve a high inference speed. Our method is thoroughly validated on two large surgical video datasets, Cholec80 and the M2CAI16 Challenge dataset, and significantly outperforms state-of-the-art approaches at a processing speed of 91 fps.
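
The idea of a spatial embedding actively querying a sequence of temporal embeddings can be illustrated with plain scaled dot-product attention. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the two-dimensional toy vectors, the single attention head, and the absence of learned projection matrices are all simplifications introduced here for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def spatial_query_attention(spatial_query, temporal_seq):
    """Attend over temporal embeddings using a spatial embedding as the query.

    Hypothetical helper: the temporal embeddings serve as both keys and
    values, and no learned projections are applied (an assumption made
    to keep the sketch short).
    """
    dim = len(spatial_query)
    # Scaled dot product between the query and every temporal embedding.
    scores = [
        sum(q * k for q, k in zip(spatial_query, key)) / math.sqrt(dim)
        for key in temporal_seq
    ]
    weights = softmax(scores)
    # Weighted sum of the temporal embeddings gives the fused embedding.
    fused = [
        sum(w * value[i] for w, value in zip(weights, temporal_seq))
        for i in range(dim)
    ]
    return weights, fused


# Toy data: one spatial embedding for the current frame and three
# temporal embeddings for the frames up to and including it.
spatial_query = [1.0, 0.0]
temporal_seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
weights, fused = spatial_query_attention(spatial_query, temporal_seq)
```

In this toy example the first and third temporal embeddings align with the query, so they receive equal and larger weights than the second. Because each frame's query is independent of the others, all frames could be processed in parallel, which is consistent with the high inference speed the abstract reports.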
