Efficient Semantic Segmentation by Altering Resolutions for Compressed Videos

03/13/2023
by   Yubin Hu, et al.

Video semantic segmentation (VSS) is a computationally expensive task because it requires per-frame prediction on videos with high frame rates. Recent work has proposed compact models or adaptive network strategies for efficient VSS. However, these approaches do not consider a crucial factor that affects the computational cost from the input side: the input resolution. In this paper, we propose an altering resolution framework called AR-Seg for compressed videos to achieve efficient VSS. AR-Seg aims to reduce the computational cost by using low resolution for non-keyframes. To prevent the performance degradation caused by downsampling, we design a Cross Resolution Feature Fusion (CReFF) module and supervise it with a novel Feature Similarity Training (FST) strategy. Specifically, CReFF first uses the motion vectors stored in a compressed video to warp features from high-resolution keyframes to low-resolution non-keyframes for better spatial alignment, and then selectively aggregates the warped features with a local attention mechanism. Furthermore, the proposed FST supervises the aggregated features with high-resolution features through an explicit similarity loss and an implicit constraint from the shared decoding layer. Extensive experiments on CamVid and Cityscapes show that AR-Seg achieves state-of-the-art performance and is compatible with different segmentation backbones. On CamVid, AR-Seg saves 67% of the computational cost (measured in GFLOPs) with the PSPNet18 backbone while maintaining high segmentation accuracy. Code: https://github.com/THU-LYJ-Lab/AR-Seg.
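
To make the three ideas named in the abstract concrete (motion-vector warping, local attention aggregation, and a feature similarity loss), here is a minimal pure-Python sketch. It is not the authors' implementation (see the linked repository for that). It assumes single-channel feature maps stored as lists of lists, keyframe and non-keyframe features already at the same spatial size, integer motion vectors, a softmax over scalar products as the attention score, and mean squared error as the similarity loss. All function names are hypothetical.

```python
import math


def warp_with_motion_vectors(feat, mv):
    """Backward-warp a 2D feature map with integer per-location motion vectors.

    feat: H x W list of floats (keyframe features).
    mv:   H x W list of (dy, dx) offsets; location (y, x) in the output reads
          from (y + dy, x + dx) in feat. Out-of-range reads are clamped.
    """
    h, w = len(feat), len(feat[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = mv[y][x]
            sy = min(max(y + dy, 0), h - 1)
            sx = min(max(x + dx, 0), w - 1)
            out[y][x] = feat[sy][sx]
    return out


def local_attention_aggregate(query, warped, radius=1):
    """Aggregate warped features inside a (2*radius+1)^2 window per location.

    Assumption: the attention score is the product of the query value and
    each neighbouring warped value, normalised with a softmax.
    """
    h, w = len(query), len(query[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            scores, values = [], []
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    scores.append(query[y][x] * warped[ny][nx])
                    values.append(warped[ny][nx])
            top = max(scores)  # subtract the max for numerical stability
            weights = [math.exp(s - top) for s in scores]
            total = sum(weights)
            out[y][x] = sum(wt * v for wt, v in zip(weights, values)) / total
    return out


def similarity_loss(a, b):
    """Mean squared error between two feature maps of equal size
    (an assumed stand-in for the explicit similarity loss)."""
    diffs = [(p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)
```

In this sketch a non-keyframe would call `warp_with_motion_vectors` on the keyframe features, pass the result with its own low-resolution features to `local_attention_aggregate`, and, during training only, compare the aggregated features against high-resolution features with `similarity_loss`.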

research
10/13/2022

U-HRNet: Delving into Improving Semantic Representation of High Resolution Network for Dense Prediction

High resolution and advanced semantic representation are both vital for ...
research
07/07/2020

Real-time Semantic Segmentation with Fast Attention

In deep CNN based models for semantic segmentation, high accuracy relies...
research
02/13/2023

RFC-Net: Learning High Resolution Global Features for Medical Image Segmentation on a Computational Budget

Learning High-Resolution representations is essential for semantic segme...
research
12/25/2012

High Quality Image Interpolation via Local Autoregressive and Nonlocal 3-D Sparse Regularization

In this paper, we propose a novel image interpolation algorithm, which i...
research
04/19/2022

Per-clip adaptive Lagrangian multiplier optimisation with low-resolution proxies

This work focuses on reducing the computational cost of repeated video e...
research
09/15/2023

Differentiable Resolution Compression and Alignment for Efficient Video Classification and Retrieval

Optimizing video inference efficiency has become increasingly important ...
research
07/25/2023

Spectrum-guided Multi-granularity Referring Video Object Segmentation

Current referring video object segmentation (R-VOS) techniques extract c...