From simple innate biases to complex visual concepts

09/01/2021
by Daniel Harari et al.

Early in development, infants learn to solve visual problems that are highly challenging for current computational methods. We present a model that deals with two fundamental problems in which the gap between computational difficulty and infant learning is particularly striking: learning to recognize hands and learning to recognize gaze direction. The model is shown a stream of natural videos and learns without any supervision to detect human hands by appearance and by context, as well as direction of gaze, in complex natural scenes. The algorithm is guided by an empirically motivated innate mechanism: the detection of "mover" events in dynamic images, events in which a moving image region causes a stationary region to move or change after contact. Mover events provide an internal teaching signal, which is shown to be more effective than alternative cues and sufficient for the efficient acquisition of hand and gaze representations. The implications go beyond the specific tasks by showing how domain-specific "proto concepts" can guide the system to acquire meaningful concepts, which are significant to the observer but statistically inconspicuous in the sensory input.
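
The abstract describes the mover-event cue only at a conceptual level. As a rough illustration, the sketch below flags candidate mover events by frame differencing, assuming grayscale frames supplied as NumPy arrays and a precomputed binary mask for the candidate stationary region. The helper names (motion_mask, regions_touch, detect_mover_events), the thresholds, and the dilation-based contact test are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

# Illustrative constants (not from the paper): per-pixel difference threshold
# for 8-bit grayscale frames, and the pixel distance counted as "contact".
MOTION_THRESH = 15
CONTACT_DIST = 2

def motion_mask(prev_frame, frame, thresh=MOTION_THRESH):
    """Boolean mask of pixels that changed noticeably between consecutive frames."""
    return np.abs(frame.astype(float) - prev_frame.astype(float)) > thresh

def regions_touch(mask_a, mask_b, dist=CONTACT_DIST):
    """True if any pixel of mask_a lies within `dist` pixels of mask_b
    (a crude dilation of mask_b via array shifts, ignoring image borders)."""
    dilated = mask_b.copy()
    for _ in range(dist):
        dilated = (dilated |
                   np.roll(dilated, 1, axis=0) | np.roll(dilated, -1, axis=0) |
                   np.roll(dilated, 1, axis=1) | np.roll(dilated, -1, axis=1))
    return bool((mask_a & dilated).any())

def detect_mover_events(frames, target_mask):
    """Return frame indices where a moving region touches the (still stationary)
    target region and the target then starts to change right after contact."""
    events = []
    for t in range(1, len(frames) - 1):
        moving_now = motion_mask(frames[t - 1], frames[t])
        # the target must be stationary up to the moment of contact
        target_still = not (moving_now & target_mask).any()
        # a moving region outside the target comes into contact with it
        contact = regions_touch(moving_now & ~target_mask, target_mask)
        # the target changes in the following frame, i.e. it was "moved"
        moving_next = motion_mask(frames[t], frames[t + 1])
        target_changes = bool((moving_next & target_mask).any())
        if target_still and contact and target_changes:
            events.append(t)
    return events
```

In this toy reading, calling detect_mover_events(frames, target_mask) over a clip would yield the frames at which contact-and-change occurs; the image patches around those contact points could then serve as the internally generated training examples the abstract refers to, with no external labels involved.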
