AuxAdapt: Stable and Efficient Test-Time Adaptation for Temporally Consistent Video Semantic Segmentation

10/24/2021
by   Yizhe Zhang, et al.

In video segmentation, generating temporally consistent results across frames is as important as achieving frame-wise accuracy. Existing methods rely either on optical flow regularization or fine-tuning with test data to attain temporal consistency. However, optical flow is not always available and reliable, and it is expensive to compute; fine-tuning the original model at test time is likewise costly. This paper presents AuxAdapt, an efficient, intuitive, and unsupervised online adaptation method for improving the temporal consistency of most neural network models. It does not require optical flow and only takes one pass over the video. Since inconsistency mainly arises from the model's uncertainty in its output, we propose an adaptation scheme in which the model learns from its own segmentation decisions as it streams a video, allowing it to produce more confident and temporally consistent labeling for similarly-looking pixels across frames. For stability and efficiency, we leverage a small auxiliary segmentation network (AuxNet) to assist with this adaptation. More specifically, AuxNet readjusts the decision of the original segmentation network (MainNet) by adding its own estimations to those of MainNet. At every frame, only AuxNet is updated via back-propagation while MainNet is kept fixed. We extensively evaluate our test-time adaptation approach on standard video benchmarks, including Cityscapes, CamVid, and KITTI. The results demonstrate that our approach provides label-wise accurate, temporally consistent, and computationally efficient adaptation (a more than 5-fold reduction in overhead compared to state-of-the-art test-time adaptation methods).
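The loop below is a minimal sketch of the adaptation scheme as described in the abstract: MainNet stays frozen, AuxNet's logits are added to MainNet's, and AuxNet is updated every frame by self-training on the fused prediction's own labels. The function name, the cross-entropy self-training loss, the SGD optimizer, and the learning rate are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F


def aux_adapt_stream(main_net, aux_net, frames, lr=1e-4):
    """One-pass online adaptation over a streamed video (hypothetical sketch)."""
    main_net.eval()
    for p in main_net.parameters():
        p.requires_grad_(False)          # MainNet is kept fixed throughout

    optimizer = torch.optim.SGD(aux_net.parameters(), lr=lr)
    predictions = []

    for frame in frames:                 # frame: (1, 3, H, W) tensor
        with torch.no_grad():
            main_logits = main_net(frame)        # fixed base prediction

        aux_logits = aux_net(frame)              # small auxiliary network
        fused_logits = main_logits + aux_logits  # AuxNet readjusts MainNet's decision

        # Self-training: the fused prediction's own hard labels serve as the target,
        # pushing toward more confident, temporally consistent decisions.
        pseudo_labels = fused_logits.argmax(dim=1).detach()
        loss = F.cross_entropy(fused_logits, pseudo_labels)

        optimizer.zero_grad()
        loss.backward()                  # gradients flow only into AuxNet
        optimizer.step()

        predictions.append(pseudo_labels.cpu())
    return predictions
```

Because MainNet's parameters require no gradients and its forward pass runs under `torch.no_grad()`, back-propagation through the fused logits updates only the lightweight AuxNet, which is what keeps the per-frame adaptation cheap and stable in this sketch.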


