Visual Descriptor Learning from Monocular Video

04/15/2020
by Umashankar Deekshith, et al.

Correspondence estimation is one of the most widely researched and yet only partially solved areas of computer vision, with many applications in tracking, mapping, and recognition of objects and environments. In this paper, we propose a novel way to estimate dense correspondence on an RGB image, where visual descriptors are learned from video examples by training a fully convolutional network. Most deep learning methods address this either by training the network on a large set of expensive labeled data or by generating labels through strong 3D generative models from RGB-D videos. Our method instead learns from RGB videos using a contrastive loss, where relative labeling is estimated from optical flow. We demonstrate the functionality in a quantitative analysis on rendered videos, where ground truth information is available. Not only does the method perform well on test data with the same background, it also generalizes to situations with a new background. The descriptors learned are unique, and the representations determined by the network are global. We further show the applicability of the method to real-world videos.
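The idea of a contrastive loss with relative labels taken from optical flow can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact loss, margin, or sampling strategy, so the sketch uses one common form of the pixelwise contrastive loss. The function names `flow_matches` and `contrastive_loss`, the `(H, W, 2)` flow layout holding `(dx, dy)`, and the margin value are all assumptions made for illustration.

```python
import numpy as np


def flow_matches(flow, pixels):
    """Map (row, col) pixels of frame A into frame B using a dense flow field.

    Assumption: flow has shape (H, W, 2) and stores (dx, dy) per pixel.
    Returns the matched pixel coordinates in B and a boolean mask marking
    the matches that land inside the image.
    """
    h, w, _ = flow.shape
    rows, cols = pixels[:, 0], pixels[:, 1]
    new_rows = np.rint(rows + flow[rows, cols, 1]).astype(int)
    new_cols = np.rint(cols + flow[rows, cols, 0]).astype(int)
    valid = (new_rows >= 0) & (new_rows < h) & (new_cols >= 0) & (new_cols < w)
    return np.stack([new_rows, new_cols], axis=1), valid


def contrastive_loss(desc_a, desc_b, px_a, px_b_match, px_b_nonmatch, margin=0.5):
    """One common form of the pixelwise contrastive loss (hypothetical setup).

    desc_a, desc_b: dense descriptor images of shape (H, W, D).
    px_a: (N, 2) pixels sampled in frame A.
    px_b_match: (N, 2) corresponding pixels in frame B (e.g. from flow).
    px_b_nonmatch: (N, 2) pixels in frame B that do not correspond.
    """
    da = desc_a[px_a[:, 0], px_a[:, 1]]
    db_pos = desc_b[px_b_match[:, 0], px_b_match[:, 1]]
    db_neg = desc_b[px_b_nonmatch[:, 0], px_b_nonmatch[:, 1]]

    # Matches are pulled together: squared descriptor distance.
    match_term = np.sum((da - db_pos) ** 2, axis=1)

    # Non-matches are pushed apart until they are at least `margin` away.
    neg_dist = np.linalg.norm(da - db_neg, axis=1)
    nonmatch_term = np.maximum(0.0, margin - neg_dist) ** 2

    return match_term.mean() + nonmatch_term.mean()
```

In a training loop, the descriptor images would come from the fully convolutional network applied to two frames of the same video, and matches flagged as invalid by `flow_matches` would be dropped before the loss is evaluated. How non-matching pixels are sampled is left open here.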

research
09/10/2021

Temporally Coherent Person Matting Trained on Fake-Motion Dataset

We propose a novel neural-network-based method to perform matting of vid...
research
03/26/2016

Nonrigid Optical Flow Ground Truth for Real-World Scenes with Time-Varying Shading Effects

In this paper we present a dense ground truth dataset of nonrigidly defo...
research
12/21/2016

Learning Motion Patterns in Videos

The problem of determining whether an object is in motion, irrespective ...
research
04/14/2021

Adaptive Intermediate Representations for Video Understanding

A common strategy to video understanding is to incorporate spatial and m...
research
07/10/2018

SceneEDNet: A Deep Learning Approach for Scene Flow Estimation

Estimating scene flow in RGB-D videos is attracting much interest of the...
research
02/26/2022

Optical flow-based branch segmentation for complex orchard environments

Machine vision is a critical subsystem for enabling robots to be able to...
research
01/19/2022

CAST: Character labeling in Animation using Self-supervision by Tracking

Cartoons and animation domain videos have very different characteristics...
