MotionDeltaCNN: Sparse CNN Inference of Frame Differences in Moving Camera Videos

10/18/2022
by Mathias Parger, et al.

Convolutional neural network inference on video input is computationally expensive and requires high memory bandwidth. Recent work reduces the cost of processing subsequent frames by processing only the pixels that changed significantly. With sparse convolutions, the sparsity of frame differences translates into speedups on current inference devices. However, previous work relied on static cameras. Moving cameras add a new challenge: newly unveiled image regions must be fused efficiently with already processed regions to minimize the update rate, without increasing memory overhead and without knowing the camera extrinsics of future frames. In this work, we propose MotionDeltaCNN, a CNN framework that supports moving cameras and variable-resolution input. We propose a spherical buffer that enables seamless fusion of newly unveiled regions and previously processed regions without increasing the memory footprint. Our evaluations show that, by explicitly adding support for moving camera input, we significantly outperform previous work.
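The two ideas in the abstract, thresholded frame differences and a buffer that absorbs newly unveiled regions without growing, can be illustrated with a small sketch. This is not the authors' implementation. It assumes the spherical buffer can be approximated by wrap-around (modulo) addressing in world coordinates, and the names `changed_pixels` and `WrappedBuffer` are hypothetical. Pure-Python lists stand in for the feature maps a real system would keep on the GPU.

```python
def changed_pixels(prev, curr, threshold):
    """Return a boolean mask that is True where |curr - prev| > threshold.

    Hypothetical helper: only the pixels marked True would need to be
    reprocessed by a sparse convolution.
    """
    return [[abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)]


class WrappedBuffer:
    """Fixed-size 2D buffer addressed in world coordinates (assumption).

    Coordinates wrap around with a modulo, so a region that enters the
    view on one side reuses the storage of the region that left the view
    on the opposite side. The memory footprint never grows.
    """

    def __init__(self, height, width, fill=0.0):
        self.height = height
        self.width = width
        self.data = [[fill] * width for _ in range(height)]

    def write(self, top, left, tile):
        """Store a tile whose top-left corner is at world position (top, left)."""
        for dy, row in enumerate(tile):
            for dx, value in enumerate(row):
                y = (top + dy) % self.height
                x = (left + dx) % self.width
                self.data[y][x] = value

    def read(self, top, left, h, w):
        """Read an h-by-w window whose top-left corner is at (top, left)."""
        return [[self.data[(top + dy) % self.height][(left + dx) % self.width]
                 for dx in range(w)]
                for dy in range(h)]


# Usage: a 2x3 view is stored, then the camera pans right by one pixel.
buf = WrappedBuffer(2, 3)
buf.write(0, 0, [[1, 2, 3], [4, 5, 6]])   # initial view
buf.write(0, 3, [[7], [8]])               # only the newly unveiled column
view = buf.read(0, 1, 2, 3)               # current view after the pan
```

In this sketch the new column overwrites the storage slot of the column that left the view, so the pan costs one column of writes instead of a full-frame copy.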


