Visual Noise from Natural Scene Statistics Reveals Human Scene Category Representations

by   Michelle R. Greene, et al.

Our perceptions are guided both by the bottom-up information entering our eyes, as well as our top-down expectations of what we will see. Although bottom-up visual processing has been extensively studied, comparatively little is known about top-down signals. Here, we describe REVEAL (Representations Envisioned Via Evolutionary ALgorithm), a method for visualizing an observer's internal representation of a complex, real-world scene, allowing us to, for the first time, visualize the top-down information in an observer's mind. REVEAL rests on two innovations for solving this high dimensional problem: visual noise that samples from natural image statistics, and a computer algorithm that collaborates with human observers to efficiently obtain a solution. In this work, we visualize observers' internal representations of a visual scene category (street) using an experiment in which the observer views the naturalistic visual noise and collaborates with the algorithm to externalize his internal representation. As no scene information was presented, observers had to use their internal knowledge of the target, matching it with the visual features in the noise. We matched reconstructed images with images of real-world street scenes to enhance visualization. Critically, we show that the visualized mental images can be used to predict rapid scene detection performance, as each observer had faster and more accurate responses to detecting real-world images that were the most similar to his reconstructed street templates. These results show that it is possible to visualize previously unobservable mental representations of real world stimuli. More broadly, REVEAL provides a general method for objectively examining the content of previously private, subjective mental experiences.



page 4

page 7

page 10

page 12

page 14

page 16

page 32


Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations

A classical problem in computer vision is to infer a 3D scene representa...

Beyond Visual Semantics: Exploring the Role of Scene Text in Image Understanding

Images with visual and scene text content are ubiquitous in everyday lif...

GRF: Learning a General Radiance Field for 3D Scene Representation and Rendering

We present a simple yet powerful implicit neural function that can repre...

What Makes Natural Scene Memorable?

Recent studies on image memorability have shed light on the visual featu...

Non-parametric spatially constrained local prior for scene parsing on real-world data

Scene parsing aims to recognize the object category of every pixel in sc...

Validation of Modulation Transfer Functions and Noise Power Spectra from Natural Scenes

The Modulation Transfer Function (MTF) and the Noise Power Spectrum (NPS...

Joint Defogging and Demosaicking

Image defogging is a technique used extensively for enhancing visual qua...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.