VISOLO: Grid-Based Space-Time Aggregation for Efficient Online Video Instance Segmentation

12/08/2021
by   Su Ho Han, et al.
For online video instance segmentation (VIS), making full and efficient use of information from previous frames is essential for real-time applications. Most previous methods follow a two-stage approach that requires additional computations such as RPN and RoIAlign, and they do not fully exploit the information available in the video for all subtasks in VIS. In this paper, we propose a novel single-stage framework for online VIS built on a grid-structured feature representation. The grid-based features allow us to employ fully convolutional networks for real-time processing, and to easily reuse and share features across different components. We also introduce cooperatively operating modules that aggregate information from the available frames in order to enrich the features for all subtasks in VIS. Our design efficiently takes full advantage of previous information in grid form for all tasks in VIS, and achieves new state-of-the-art accuracy (38.6 AP and 36.9 AP) and speed (40.0 FPS) on the YouTube-VIS 2019 and 2021 datasets among online VIS methods.


research
09/21/2023

TCOVIS: Temporally Consistent Online Video Instance Segmentation

In recent years, significant progress has been made in video instance se...
research
04/13/2021

Crossover Learning for Fast Online Video Instance Segmentation

Modeling temporal visual context across frames is critical for video ins...
research
01/05/2023

InsPro: Propagating Instance Query and Proposal for Online Video Instance Segmentation

Video instance segmentation (VIS) aims at segmenting and tracking object...
research
07/21/2022

In Defense of Online Models for Video Instance Segmentation

In recent years, video instance segmentation (VIS) has been largely adva...
research
06/07/2021

Video Instance Segmentation using Inter-Frame Communication Transformers

We propose a novel end-to-end solution for video instance segmentation (...
research
03/30/2023

MobileInst: Video Instance Segmentation on the Mobile

Although recent approaches aiming for video instance segmentation have a...
research
04/05/2016

Counting Grid Aggregation for Event Retrieval and Recognition

Event retrieval and recognition in a large corpus of videos necessitates...
