The RGB-D Triathlon: Towards Agile Visual Toolboxes for Robots

04/01/2019
by Fabio Cermelli, et al.

Deep networks have brought significant advances in robot perception, improving the capabilities of robots in several visual tasks, from object detection and recognition to pose estimation, semantic scene segmentation, and many others. Still, most approaches address visual tasks in isolation, resulting in overspecialized models that achieve strong performance in specific applications but work poorly in other (often related) tasks. This is clearly sub-optimal for a robot, which is often required to perform multiple visual recognition tasks simultaneously in order to properly act and interact with the environment. The problem is exacerbated by the limited computational and memory resources typically available onboard a robotic platform. The challenge of learning flexible models that can handle multiple tasks in a lightweight manner has recently gained attention in the computer vision community, and benchmarks supporting this research have been proposed. In this work we study this problem in the robot vision context, proposing a new benchmark, the RGB-D Triathlon, and evaluating state-of-the-art algorithms in this novel and challenging scenario. We also define a new evaluation protocol better suited to the robot vision setting. Results shed light on the strengths and weaknesses of existing approaches and on open issues, suggesting directions for future research.
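The "flexible, lightweight" design the abstract argues for is commonly realized as a single shared feature extractor serving several small task-specific heads, so that adding a task adds only a head rather than a whole network. The sketch below illustrates that pattern in minimal plain Python; all class and function names are illustrative assumptions, not the paper's actual architecture.

```python
class SharedBackbone:
    """Stand-in for a deep feature extractor shared across all tasks.

    In a real system this would be a CNN running over the RGB-D input;
    here we just average each channel to keep the sketch runnable.
    """
    def extract(self, rgbd_image):
        return [sum(channel) / len(channel) for channel in rgbd_image]


class MultiTaskModel:
    """One backbone, many lightweight heads.

    Backbone parameters are shared, so memory grows only by the
    (small) per-task head when a new capability is added -- the key
    point for resource-constrained robotic platforms.
    """
    def __init__(self):
        self.backbone = SharedBackbone()
        self.heads = {}  # task name -> head function

    def add_task(self, name, head):
        self.heads[name] = head

    def predict(self, rgbd_image):
        # The backbone runs once; every head reuses its features.
        features = self.backbone.extract(rgbd_image)
        return {task: head(features) for task, head in self.heads.items()}


model = MultiTaskModel()
# Two toy heads standing in for, e.g., object recognition and a
# depth-related task (hypothetical examples).
model.add_task("classification",
               lambda f: "object" if f[0] > 0.5 else "background")
model.add_task("depth_estimate", lambda f: f[-1])

# A tiny fake "RGB-D image": three channels of two values each.
predictions = model.predict([[0.9, 0.8], [0.2, 0.3], [1.5, 1.7]])
```

In a deep-learning framework the same structure would appear as a shared trunk module with multiple output branches, trained jointly or sequentially depending on the benchmark protocol.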

