Inter-BMV: Interpolation with Block Motion Vectors for Fast Semantic Segmentation on Video

10/08/2018
by Samvit Jain, et al.

Models optimized for accuracy on single images are often prohibitively slow to run on each frame in a video. Recent work uses optical flow to warp image features forward from select keyframes as a means to conserve computation on video. This approach, however, achieves only limited speedup, even when optimized, because of the accuracy degradation introduced by repeated forward warping and the inference cost of optical flow estimation. To address these problems, we propose a new scheme that propagates features using the block motion vectors (BMV) present in compressed video (e.g. H.264), instead of optical flow, and bi-directionally warps and fuses features from enclosing keyframes to capture scene context on each video frame. Our technique, interpolation-BMV, enables us to accurately estimate the features of intermediate frames while keeping inference costs low. We evaluate our system on the CamVid and Cityscapes datasets, comparing to both a strong single-frame baseline and related work. We find that we are able to substantially accelerate segmentation on video, achieving near real-time frame rates (20+ frames per second) on large images (e.g. 960 x 720 pixels), while maintaining competitive accuracy. This represents an improvement of almost 6x over the single-frame baseline and 2.5x over the fastest prior work.
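As a rough illustration of the two ideas named above (block-wise feature warping and bi-directional fusion from enclosing keyframes), the following is a minimal pure-Python sketch, not the authors' implementation. The function names `warp_with_block_mvs` and `fuse`, the integer per-block `(dy, dx)` motion-vector layout, the border clamping, and the linear distance weighting used for fusion are all assumptions made for illustration; the abstract does not specify the fusion operator or how motion vectors are extracted from the codec.

```python
from typing import List, Tuple

Grid = List[List[float]]                    # one feature channel, H x W
MotionField = List[List[Tuple[int, int]]]   # one (dy, dx) per block (assumed layout)


def warp_with_block_mvs(feat: Grid, mvs: MotionField, block: int) -> Grid:
    """Warp a feature map using one motion vector per block.

    Each output position (y, x) copies the source value at (y + dy, x + dx),
    where (dy, dx) is the vector of the block containing (y, x).
    Source coordinates are clamped to the map borders (an assumption).
    """
    h, w = len(feat), len(feat[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = mvs[y // block][x // block]
            sy = min(max(y + dy, 0), h - 1)
            sx = min(max(x + dx, 0), w - 1)
            out[y][x] = feat[sy][sx]
    return out


def fuse(from_prev: Grid, from_next: Grid, t: int, n: int) -> Grid:
    """Blend features warped from the previous and next keyframes.

    t is the frame's offset from the previous keyframe and n is the keyframe
    interval. Linear distance weighting is assumed here for illustration.
    """
    w_next = t / n
    w_prev = 1.0 - w_next
    return [
        [w_prev * a + w_next * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(from_prev, from_next)
    ]


# Toy example: a 4 x 4 feature map with 2 x 2 blocks.
prev_key = [[float(4 * r + c) for c in range(4)] for r in range(4)]
next_key = [[1.0] * 4 for _ in range(4)]
mvs_from_prev = [[(0, 0), (0, -1)], [(1, 0), (0, 0)]]
mvs_from_next = [[(0, 0), (0, 0)], [(0, 0), (0, 0)]]

warped_prev = warp_with_block_mvs(prev_key, mvs_from_prev, block=2)
warped_next = warp_with_block_mvs(next_key, mvs_from_next, block=2)
estimate = fuse(warped_prev, warped_next, t=1, n=4)
```

In this sketch the expensive segmentation network would run only on keyframes, while intermediate frames reuse warped keyframe features; how the real system chains motion vectors across several frames is not covered here.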

research
03/21/2018

Fast Semantic Segmentation on Video Using Motion Vector-Based Feature Interpolation

Models optimized for accuracy on challenging, dense prediction tasks suc...
research
12/11/2019

Deep motion estimation for parallel inter-frame prediction in video compression

Standard video codecs rely on optical flow to guide inter-frame predicti...
research
07/17/2018

Accel: A Corrective Fusion Network for Efficient Semantic Segmentation on Video

In this paper, we present Accel, a novel semantic video segmentation sys...
research
04/04/2019

Architecture Search of Dynamic Cells for Semantic Video Segmentation

In semantic video segmentation the goal is to acquire consistent dense s...
research
03/17/2022

Transframer: Arbitrary Frame Prediction with Generative Models

We present a general-purpose framework for image modelling and vision ta...
research
07/25/2022

Error-Aware Spatial Ensembles for Video Frame Interpolation

Video frame interpolation (VFI) algorithms have improved considerably in...
research
08/10/2017

Semantic Video CNNs through Representation Warping

In this work, we propose a technique to convert CNN models for semantic ...
