ALP: Action-Aware Embodied Learning for Perception

06/16/2023
by Xinran Liang, et al.

Current methods for training and benchmarking vision models rely heavily on passive, curated datasets. Although models trained on these datasets perform strongly on a wide variety of tasks such as classification, detection, and segmentation, they are fundamentally unable to generalize to an ever-evolving world, because the input data constantly shifts out of distribution. Therefore, instead of training on fixed datasets, can we approach learning in a more human-centric and adaptive manner? In this paper, we introduce Action-aware Embodied Learning for Perception (ALP), an embodied learning framework that incorporates action information into representation learning by combining two objectives: policy gradient optimization through reinforcement learning and inverse dynamics prediction. Our method actively explores complex 3D environments both to learn generalizable, task-agnostic representations and to collect downstream training data. We show that ALP outperforms existing baselines in object detection and semantic segmentation. In addition, we show that by training on actively collected data more relevant to the environment and task, our method generalizes more robustly to downstream tasks than models pre-trained on fixed datasets such as ImageNet.
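As a rough illustration of how a policy gradient objective and an inverse dynamics objective might be combined, the sketch below adds the two losses with a weighting factor. It is a minimal sketch under stated assumptions, not the authors' implementation: the linear prediction head `W`, the REINFORCE-style surrogate, the weight `inv_weight`, and all function names are hypothetical choices made for this example, and the abstract does not specify any of them.

```python
import numpy as np


def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy between unnormalized logits and integer labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def inverse_dynamics_loss(feat_t, feat_next, actions, W):
    """Predict the action taken between two consecutive observations.

    feat_t, feat_next: (batch, d) features from a shared visual encoder.
    actions: (batch,) integer ids of the actions actually taken.
    W: (2 * d, num_actions) weights of a hypothetical linear head.
    """
    pair = np.concatenate([feat_t, feat_next], axis=1)
    return softmax_cross_entropy(pair @ W, actions)


def policy_gradient_loss(log_probs_taken, advantages):
    """REINFORCE-style surrogate (an assumption; the abstract names no algorithm).

    Minimizing it raises the log-probability of actions with positive advantage.
    """
    return -(log_probs_taken * advantages).mean()


def combined_loss(pg_loss, inv_loss, inv_weight=1.0):
    """Weighted sum of the two objectives; inv_weight is a hypothetical knob."""
    return pg_loss + inv_weight * inv_loss


# Toy usage with random features and an untrained (all-zero) head.
rng = np.random.default_rng(0)
feat_t = rng.normal(size=(3, 2))
feat_next = rng.normal(size=(3, 2))
actions = np.array([0, 1, 3])
W = np.zeros((4, 4))

inv = inverse_dynamics_loss(feat_t, feat_next, actions, W)
pg = policy_gradient_loss(np.array([-1.0, -2.0]), np.array([1.0, 0.5]))
total = combined_loss(pg, inv, inv_weight=0.5)
```

In a setup like this, both losses would backpropagate into the same visual encoder, which is one way action information could shape the learned representation.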


