Embodied vision for learning object representations

05/12/2022
by   Arthur Aubret, et al.

Recent time-contrastive learning approaches manage to learn invariant object representations without supervision. This is achieved by mapping successive views of an object onto close-by internal representations. When considering this learning approach as a model of the development of human object recognition, it is important to consider what visual input a toddler would typically observe while interacting with objects. First, human vision is highly foveated, with high resolution only available in the central region of the field of view. Second, objects may be seen against a blurry background due to infants' limited depth of field. Third, because a toddler's arms are rather short, the objects it manipulates are mostly seen up close and fill a large part of the field of view. Here, we study how these effects impact the quality of visual representations learnt through time-contrastive learning. To this end, we let a visually embodied agent "play" with objects in different locations of a near-photorealistic flat. During each play session, the agent views an object in multiple orientations before turning its body to view another object. The resulting sequence of views feeds a time-contrastive learning algorithm. Our results show that visual statistics mimicking those of a toddler improve object recognition accuracy in both familiar and novel environments. We argue that this effect is caused by three factors: fewer features being extracted from the background, a neural network bias towards large features in the image, and a greater similarity between novel and familiar background regions. We conclude that the embodied nature of visual learning may be crucial for understanding the development of human object perception.
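To make the idea of "mapping successive views onto close-by internal representations" concrete, the following is a minimal sketch of a time-contrastive objective. It is not the authors' implementation: it assumes an InfoNCE-style loss with cosine similarity, treats view t+1 as the only positive for view t, and uses every other view in the sequence as a negative. The function names, the `temperature` value, and the toy two-dimensional embeddings are all hypothetical, and the code is kept dependency-free on purpose.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Small constant avoids division by zero for all-zero vectors.
    return dot / (norm_u * norm_v + 1e-12)


def time_contrastive_loss(embeddings, temperature=0.1):
    """InfoNCE-style loss over a time-ordered list of view embeddings.

    Assumption (not from the paper): the positive for view t is view t+1,
    and all remaining views except the anchor itself act as negatives.
    """
    n = len(embeddings)
    if n < 3:
        raise ValueError("need at least three views (anchor, positive, negative)")

    total = 0.0
    for t in range(n - 1):
        anchor = embeddings[t]
        # Scaled similarities to every view except the anchor itself.
        logits = [
            cosine(anchor, embeddings[j]) / temperature
            for j in range(n)
            if j != t
        ]
        # After dropping index t, the successor t+1 sits at position t.
        positive = logits[t]
        # Numerically stable log-sum-exp over positive and negatives.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - positive
    return total / (n - 1)


# Toy sequences: two views of object A followed by two views of object B,
# versus the same embeddings interleaved so successive views disagree.
ordered_views = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
interleaved_views = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]

ordered_loss = time_contrastive_loss(ordered_views)
interleaved_loss = time_contrastive_loss(interleaved_views)
```

In this toy setting the loss is lower when temporally adjacent views already have similar embeddings, which is the property the training signal rewards. Note that the pair straddling the switch from object A to object B is still treated as a positive here. That mirrors the fact that an agent turning to a new object produces successive views of different objects, but how such transitions are handled in practice is a design choice this sketch does not settle.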


