Looking Fast and Slow: Memory-Guided Mobile Video Object Detection

03/25/2019
by Mason Liu, et al.

With a single eye fixation lasting a fraction of a second, the human visual system can form a rich representation of a complex environment, reaching a holistic understanding that facilitates object recognition and detection. This phenomenon is known as recognizing the "gist" of the scene and is accomplished by relying on relevant prior knowledge. This paper addresses the analogous question of whether using memory in computer vision systems can not only improve the accuracy of object detection in video streams, but also reduce the computation time. By interleaving conventional feature extractors with extremely lightweight ones that only need to recognize the gist of the scene, we show that minimal computation is required to produce accurate detections when temporal memory is present. In addition, we show that the memory contains enough information for deploying reinforcement learning algorithms to learn an adaptive inference policy. Our model achieves state-of-the-art performance among mobile methods on the ImageNet VID 2015 dataset, while running at speeds of up to 70+ FPS on a Pixel 3 phone.
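The interleaving idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the memory module or the schedule, so the code below assumes a fixed schedule (one heavy frame every `interval` frames) and uses a simple running average as a stand-in for the learned memory. All names (`interleaved_schedule`, `update_memory`, `run_video`, and the weight parameters) are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def interleaved_schedule(num_frames: int, interval: int) -> List[str]:
    """Assumed fixed policy: heavy extractor every `interval` frames, light otherwise."""
    if interval < 1:
        raise ValueError("interval must be >= 1")
    return ["heavy" if i % interval == 0 else "light" for i in range(num_frames)]


def update_memory(memory: Vector, features: Vector, weight: float) -> Vector:
    """Blend new features into memory. A running average stands in for a learned memory."""
    return [(1.0 - weight) * m + weight * f for m, f in zip(memory, features)]


def run_video(
    frames: Sequence[Vector],
    heavy: Callable[[Vector], Vector],
    light: Callable[[Vector], Vector],
    interval: int = 4,
    heavy_weight: float = 0.9,
    light_weight: float = 0.2,
) -> List[Tuple[str, Vector]]:
    """Run one extractor per frame and return the memory state after each frame."""
    memory = None
    outputs = []
    schedule = interleaved_schedule(len(frames), interval)
    for frame, kind in zip(frames, schedule):
        if kind == "heavy":
            features, weight = heavy(frame), heavy_weight
        else:
            features, weight = light(frame), light_weight
        # The first frame initializes memory; later frames refine it.
        memory = list(features) if memory is None else update_memory(memory, features, weight)
        outputs.append((kind, list(memory)))
    return outputs
```

In this sketch a detection head would read from the memory state rather than from the per-frame features, which is why most frames can use the cheap extractor. The adaptive policy mentioned in the abstract would replace `interleaved_schedule` with a learned choice of extractor per frame.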

