Taskology: Utilizing Task Relations at Scale

05/14/2020
by Yao Lu, et al.

Jointly training computer vision tasks with shared network components is known to improve the performance of each individual task. Training tasks together lets the models learn the inherent relationships among them; however, this requires large sets of labeled data. Instead, we argue that explicitly utilizing the known relationships between tasks improves their performance with less labeled data. To this end, we aim to establish and explore a novel approach for the collective training of computer vision tasks. In particular, we focus on utilizing the inherent relations of tasks by employing consistency constraints derived from physics, geometry, and logic. We show that collections of models can be trained without shared components, interacting only through the consistency constraints as supervision (peer-supervision). The consistency constraints enforce the structural priors between tasks, which enables their mutually consistent training and, in turn, leads to overall higher performance. Treating individual tasks as modules, agnostic to their implementation, minimizes the engineering overhead of collectively training many tasks. Furthermore, the collective training can be distributed among multiple compute nodes, which further facilitates training at scale. We demonstrate our framework on subsets of the following collection of tasks: depth and normal prediction, semantic segmentation, 3D motion estimation, and object tracking and detection in point clouds.
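To make the idea of a consistency constraint concrete, here is a minimal sketch for one task pair named in the abstract, depth and normal prediction. It is not the authors' loss. It assumes an orthographic camera, derives normals from a predicted depth map by finite differences, and penalizes the angle between those derived normals and a separately predicted normal map. The function names `normals_from_depth` and `consistency_loss` are hypothetical.

```python
import numpy as np

def normals_from_depth(depth):
    """Approximate unit surface normals from an (H, W) depth map.

    Assumption: orthographic projection, so the surface z = depth(x, y)
    has normal direction (-dz/dx, -dz/dy, 1).
    """
    dz_dy, dz_dx = np.gradient(depth)  # gradients along rows (y), then columns (x)
    n = np.stack([-dz_dx, -dz_dy, np.ones_like(depth)], axis=-1)
    return n / np.linalg.norm(n, axis=-1, keepdims=True)

def consistency_loss(depth_pred, normal_pred):
    """Mean (1 - cosine similarity) between normals implied by the depth
    prediction and normals predicted by a separate model.

    Illustrative only: each model would receive this value as a training
    signal, with no shared weights between the two.
    normal_pred must contain non-zero vectors of shape (H, W, 3).
    """
    derived = normals_from_depth(depth_pred)
    predicted = normal_pred / np.linalg.norm(normal_pred, axis=-1, keepdims=True)
    cos = np.sum(derived * predicted, axis=-1)
    return float(np.mean(1.0 - cos))

if __name__ == "__main__":
    # A fronto-parallel plane agrees with normals pointing along +z.
    flat_depth = np.full((4, 5), 2.0)
    up = np.zeros((4, 5, 3))
    up[..., 2] = 1.0
    print(consistency_loss(flat_depth, up))

    # A tilted plane disagrees with the same normals, so the loss is positive.
    tilted_depth = 0.5 * np.tile(np.arange(5.0), (4, 1))
    print(consistency_loss(tilted_depth, up))
```

Because the loss depends only on the two models' outputs, either model can be replaced without touching the other, which matches the abstract's description of tasks as implementation-agnostic modules.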


