Learning What and Where – Unsupervised Disentangling Location and Identity Tracking

05/26/2022
by   Manuel Traub, et al.

Our brain can almost effortlessly decompose visual data streams into background and salient objects. Moreover, it can track the objects and anticipate their motion and interactions. In contrast, recent object reasoning datasets, such as CATER, have revealed fundamental shortcomings of current vision-based AI systems, particularly with respect to explicit object encodings, object permanence, and object reasoning. We introduce an unsupervised disentangled LOCation and Identity tracking system (Loci), which excels on the CATER tracking challenge. Inspired by the dorsal and ventral pathways in the brain, Loci tackles the what-and-where binding problem by means of a self-supervised segregation mechanism. Our autoregressive neural network partitions and distributes the visual input stream across separate, identically parameterized, and autonomously recruited neural network modules. Each module binds what with where, that is, compressed Gestalt encodings with locations. Interaction dynamics are processed at the deep latent encoding levels. Loci exhibits superior performance on current benchmarks; beyond that, we propose that it may set the stage for deeper, explanation-oriented video processing, akin to some deeper networked processes in the brain that appear to integrate individual entity and spatiotemporal interaction dynamics into event structures.
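As a rough illustration of what "binding what with where" and "partitioning the input across identically parameterized modules" could mean in code, the sketch below applies one shared function to every slot and lets the slots compete for each pixel through a softmax. It is not the Loci implementation: the names `Slot`, `log_location_map`, and `partition`, the isotropic Gaussian form of the location code, and the constant `background_logit` are all assumptions made for this example, and no learning or decoding of the Gestalt code is shown.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Slot:
    """Hypothetical object slot: a 'what' code bound to a 'where' code."""
    gestalt: List[float]  # compressed identity code (carried along, not decoded here)
    x: float              # location, in grid (column) coordinates
    y: float              # location, in grid (row) coordinates
    sigma: float          # spatial extent (assumed isotropic)


def log_location_map(slot: Slot, height: int, width: int) -> List[List[float]]:
    """Unnormalized log of a Gaussian centred on the slot's location.

    The same function (same 'parameters') is applied to every slot.
    """
    return [
        [-((c - slot.x) ** 2 + (r - slot.y) ** 2) / (2.0 * slot.sigma ** 2)
         for c in range(width)]
        for r in range(height)
    ]


def partition(slots: List[Slot], height: int, width: int,
              background_logit: float = -4.0) -> List[List[List[float]]]:
    """Softmax over slots plus a constant background at every pixel.

    Returns masks[k][r][c]; the last index k == len(slots) is the background.
    """
    logits = [log_location_map(s, height, width) for s in slots]
    logits.append([[background_logit] * width for _ in range(height)])
    masks = [[[0.0] * width for _ in range(height)] for _ in logits]
    for r in range(height):
        for c in range(width):
            column = [layer[r][c] for layer in logits]
            peak = max(column)  # subtract the max for numerical stability
            exps = [math.exp(v - peak) for v in column]
            total = sum(exps)
            for k, e in enumerate(exps):
                masks[k][r][c] = e / total
    return masks


# Two illustrative slots on an 8x8 grid.
slots = [
    Slot(gestalt=[1.0, 0.0], x=1.0, y=1.0, sigma=1.0),
    Slot(gestalt=[0.0, 1.0], x=6.0, y=6.0, sigma=1.0),
]
masks = partition(slots, height=8, width=8)
```

Each pixel ends up softly assigned to the slot whose location is nearest, or to the background when no slot is close. In a trained system the masks would weight each slot's reconstruction of the frame; here they only show the competition step.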


