A Computational Account Of Self-Supervised Visual Learning From Egocentric Object Play

05/30/2023
by   Deepayan Sanyal, et al.

Research in child development has shown that embodied experience handling physical objects contributes to many cognitive abilities, including visual learning. One characteristic of such experience is that the learner sees the same object from several different viewpoints. In this paper, we study how learning signals that equate different viewpoints – e.g., assigning similar representations to different views of a single object – can support robust visual learning. We use the Toybox dataset, which contains egocentric videos of humans manipulating different objects, and conduct experiments using a computer vision framework for self-supervised contrastive learning. We find that representations learned by equating different physical viewpoints of an object benefit downstream image classification accuracy. Further experiments show that this performance improvement is robust to variations in the gaps between viewpoints, and that the benefits transfer to several different image classification tasks.
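To make the idea of "equating different viewpoints" concrete, the following is a minimal sketch of a contrastive objective that pulls together the embeddings of two views of the same object and pushes apart the embeddings of different objects. The abstract does not specify the exact loss or framework, so the NT-Xent-style formulation, the function names `cosine` and `viewpoint_contrastive_loss`, and the `temperature` value are all assumptions made for illustration, not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def viewpoint_contrastive_loss(views_a, views_b, temperature=0.5):
    """NT-Xent-style loss (assumed form, not taken from the paper).

    views_a[i] and views_b[i] are embeddings of two different viewpoints
    of object i. Each embedding treats its paired viewpoint as the
    positive and every other embedding in the batch as a negative.
    """
    n = len(views_a)
    embeddings = list(views_a) + list(views_b)  # 2n embeddings in total
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same object
        # Scaled similarities to every other embedding (self is excluded).
        logits = [
            cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(2 * n)
            if j != i
        ]
        positive_logit = cosine(embeddings[i], embeddings[positive]) / temperature
        # Cross-entropy with the paired viewpoint as the target class.
        total += math.log(sum(math.exp(l) for l in logits)) - positive_logit
    return total / (2 * n)
```

Under this sketch, a batch in which paired views have similar embeddings yields a lower loss than the same batch with the pairings swapped, which is the training signal the abstract describes in words.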


