Patchwork: A Patch-wise Attention Network for Efficient Object Detection and Segmentation in Video Streams

04/03/2019
by   Yuning Chai, et al.

Recent advances in single-frame object detection and segmentation techniques have motivated a wide range of works that extend these methods to video streams. In this paper, we explore the idea of hard attention aimed at latency-sensitive applications. Instead of reasoning about every frame separately, our method selects and processes only a small sub-window of the frame. Our technique then makes predictions for the full frame based on the sub-windows from previous frames and the update from the current sub-window. The latency reduction achieved by this hard attention mechanism comes at the cost of degraded accuracy. We make two contributions to address this. First, we propose a specialized memory cell that recovers lost context when processing sub-windows. Second, we adopt a Q-learning-based policy training strategy that enables our approach to intelligently select the sub-windows such that the staleness in the memory hurts performance the least. Our experiments suggest that our approach reduces latency by approximately a factor of four without significantly sacrificing accuracy on the ImageNet VID video object detection dataset and the DAVIS video object segmentation dataset. We further demonstrate that the saved computation can be reinvested into other parts of the network, resulting in an accuracy increase at a computational cost comparable to that of the original system and outperforming other recently proposed state-of-the-art methods in the low-latency range.
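The processing loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the names `split_into_patches`, `expensive_features`, `PatchMemory`, `select_window`, and `process_stream` are hypothetical, the "network" is replaced by a mean over pixel values, the memory is a plain dictionary rather than the specialized memory cell, and the learned Q-values are replaced by patch staleness as a stand-in. It only shows the control flow: pick one sub-window per frame, compute features for it alone, and read the full-frame state from memory.

```python
import random


def split_into_patches(frame, grid):
    """Split a 2-D list `frame` into grid x grid equally sized patches."""
    h, w = len(frame), len(frame[0])
    ph, pw = h // grid, w // grid
    return {
        (gy, gx): [row[gx * pw:(gx + 1) * pw] for row in frame[gy * ph:(gy + 1) * ph]]
        for gy in range(grid)
        for gx in range(grid)
    }


def expensive_features(patch):
    """Stand-in for the per-window network: here just the mean intensity."""
    values = [v for row in patch for v in row]
    return sum(values) / len(values)


class PatchMemory:
    """Keeps the most recent feature per patch and how stale each one is."""

    def __init__(self, grid):
        keys = [(gy, gx) for gy in range(grid) for gx in range(grid)]
        self.features = {k: 0.0 for k in keys}
        self.age = {k: 0 for k in keys}

    def update(self, key, feature):
        for k in self.age:
            self.age[k] += 1  # every patch gets one frame older
        self.features[key] = feature
        self.age[key] = 0  # the refreshed patch is fresh again


def select_window(q_values, epsilon, rng):
    """Epsilon-greedy choice over per-window Q-values."""
    keys = sorted(q_values)
    if rng.random() < epsilon:
        return rng.choice(keys)
    return max(keys, key=q_values.get)  # ties resolve to the first key


def process_stream(frames, grid=2, epsilon=0.0, seed=0):
    rng = random.Random(seed)
    memory = PatchMemory(grid)
    outputs = []
    for frame in frames:
        # Assumption: staleness replaces the learned Q-function of the paper.
        q_values = {k: float(age) for k, age in memory.age.items()}
        key = select_window(q_values, epsilon, rng)
        patch = split_into_patches(frame, grid)[key]  # only this window is processed
        memory.update(key, expensive_features(patch))
        outputs.append(dict(memory.features))  # full-frame state read from memory
    return outputs


frames = [[[1.0] * 4 for _ in range(4)] for _ in range(4)]
outputs = process_stream(frames, grid=2)
```

With a 2 x 2 grid, each frame touches one of four windows, so the per-frame feature computation in this toy covers a quarter of the pixels. The full-frame state becomes complete only after every window has been visited once, which is the staleness trade-off the abstract refers to.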


