MobileInst: Video Instance Segmentation on the Mobile

03/30/2023
by Renhong Zhang, et al.

Although recent approaches to video instance segmentation have achieved promising results, they remain difficult to deploy in real-world applications on mobile devices, mainly because of (1) heavy computation and memory cost and (2) complicated heuristics for tracking objects. To address these issues, we present MobileInst, a lightweight and mobile-friendly framework for video instance segmentation on mobile devices. First, MobileInst adopts a mobile vision transformer to extract multi-level semantic features, and presents an efficient query-based dual-transformer instance decoder that produces mask kernels together with a semantic-enhanced mask decoder that generates instance segmentation per frame. Second, MobileInst exploits simple yet effective kernel reuse and kernel association to track objects for video instance segmentation. Further, we propose temporal query passing to enhance the tracking ability of the kernels. We conduct experiments on the COCO and YouTube-VIS datasets to demonstrate the superiority of MobileInst, and evaluate inference latency on a single mobile CPU core of a Qualcomm Snapdragon-778G without any other acceleration method. On the COCO dataset, MobileInst achieves 30.5 mask AP at 176 ms on the mobile CPU, reducing latency by 50% relative to the previous SOTA. For video instance segmentation, MobileInst achieves 35.0 AP on YouTube-VIS 2019 and 30.1 AP on YouTube-VIS 2021. Code will be made available to facilitate real-world applications and future research.
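
To make the tracking idea more concrete, the following is a minimal sketch of what kernel reuse and kernel association could look like. It is not the authors' implementation: the abstract does not specify the similarity measure or the matching rule, so cosine similarity, greedy one-to-one matching, the 0.5 threshold, and the function names `apply_kernel` and `associate_kernels` are all assumptions made for illustration. Mask kernels are treated as plain vectors that are dotted with per-pixel mask features.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def apply_kernel(kernel, features):
    """Kernel reuse (assumed form): dot one instance's mask kernel with the
    per-pixel features of any frame (H x W x C nested lists) and threshold
    the resulting logit at 0 to get a binary H x W mask."""
    return [
        [1 if sum(k * f for k, f in zip(kernel, pixel)) > 0 else 0 for pixel in row]
        for row in features
    ]


def associate_kernels(prev_kernels, curr_kernels, threshold=0.5):
    """Kernel association (assumed form): greedy one-to-one matching of
    current-frame kernels to previous-frame kernels by cosine similarity.
    Returns a dict mapping current index -> previous index."""
    pairs = sorted(
        (
            (cosine(p, c), i, j)
            for i, p in enumerate(prev_kernels)
            for j, c in enumerate(curr_kernels)
        ),
        reverse=True,
    )
    used_prev, matches = set(), {}
    for sim, i, j in pairs:
        if sim < threshold:
            break  # remaining pairs are even less similar
        if i in used_prev or j in matches:
            continue  # each kernel may be matched at most once
        used_prev.add(i)
        matches[j] = i
    return matches
```

In this sketch, a kernel predicted on one frame can be passed to `apply_kernel` with the features of a later frame, so the instance keeps its identity without a separate tracker. `associate_kernels` links identities when kernels are re-predicted on a new frame.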
