Reason from Context with Self-supervised Learning

11/23/2022
by Xiao Liu, et al.

A tiny object in the sky cannot be an elephant. Context reasoning is critical in visual recognition, where current inputs need to be interpreted in light of previous experience and knowledge. To date, research into contextual reasoning in visual recognition has largely relied on supervised learning methods. Whether contextual knowledge can be captured with self-supervised learning regimes remains under-explored. Here, we establish a methodology for context-aware self-supervised learning. We propose a novel Self-supervised Learning Method for Context Reasoning (SeCo), whose only inputs are unlabeled images of natural scenes containing multiple objects. Similar to the distinction between fovea and periphery in human vision, SeCo processes self-proposed target object regions and their contexts separately, and then employs a learnable external memory for retrieving and updating context-relevant target information. To evaluate the contextual associations learned by computational models, we introduce two evaluation protocols, lift-the-flap and object priming, which address the problems of "what" and "where" in context reasoning, respectively. In both tasks, SeCo outperforms all state-of-the-art (SOTA) self-supervised learning methods by a significant margin. Our network analysis reveals that the external memory in SeCo learns to store prior contextual knowledge, facilitating target identity inference in the lift-the-flap task. Moreover, we conducted psychophysics experiments and introduce a Human benchmark in Object Priming dataset (HOP). Our quantitative and qualitative results demonstrate that SeCo approximates human-level performance and exhibits human-like behavior. All our source code and data are publicly available.
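
To make the idea of retrieving context-relevant information from an external memory more concrete, the following is a minimal sketch, not the SeCo implementation. It assumes a memory made of key and value slots that is queried with a context feature through scaled dot-product attention; the abstract does not specify the retrieval mechanism, so this choice, along with the names `retrieve_from_memory`, `memory_keys`, `memory_values`, and all dimensions, is hypothetical.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()


def retrieve_from_memory(context_feature, memory_keys, memory_values):
    """Read from a slot-based memory using a context feature as the query.

    Assumption: scaled dot-product attention is used as the read operation.
    This is an illustrative stand-in, not the mechanism confirmed by the paper.
    """
    dim = context_feature.shape[0]
    # Similarity between the context query and every memory slot.
    scores = memory_keys @ context_feature / np.sqrt(dim)
    # Normalized read weights over the slots.
    weights = softmax(scores)
    # Weighted sum of slot contents: the retrieved "prior" for the target.
    retrieved = weights @ memory_values
    return retrieved, weights


# Hypothetical sizes: 8 memory slots, 16-dimensional features.
rng = np.random.default_rng(0)
num_slots, dim = 8, 16
memory_keys = rng.normal(size=(num_slots, dim))
memory_values = rng.normal(size=(num_slots, dim))
context_feature = rng.normal(size=dim)

retrieved, weights = retrieve_from_memory(context_feature, memory_keys, memory_values)
```

In a trained model of this kind, the retrieved vector would be compared against the feature of the target region, and the memory slots would be learnable parameters updated by the training objective; both steps are omitted here because the abstract does not describe them.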


research
03/28/2020

Exploit Clues from Views: Self-Supervised and Regularized Learning for Multiview Object Recognition

Multiview recognition has been well studied in the literature and achiev...
research
02/20/2023

A Novel Collaborative Self-Supervised Learning Method for Radiomic Data

The computer-aided disease diagnosis from radiomic data is important in ...
research
04/06/2021

When Pigs Fly: Contextual Reasoning in Synthetic and Natural Scenes

Context is of fundamental importance to both human and machine vision – ...
research
09/23/2021

How much "human-like" visual experience do current self-supervised learning algorithms need to achieve human-level object recognition?

This paper addresses a fundamental question: how good are our current se...
research
04/06/2023

VLPD: Context-Aware Pedestrian Detection via Vision-Language Semantic Self-Supervision

Detecting pedestrians accurately in urban scenes is significant for real...
research
08/07/2023

Scaling may be all you need for achieving human-level object recognition capacity with human-like visual experience

This paper asks whether current self-supervised learning methods, if suf...
research
02/09/2021

Improving Visual Reasoning by Exploiting The Knowledge in Texts

This paper presents a new framework for training image-based classifiers...
