How Deep is the Feature Analysis underlying Rapid Visual Categorization?

by Sven Eberhardt et al.

Rapid categorization paradigms have a long history in experimental psychology: Characterized by short presentation times and speedy behavioral responses, these tasks highlight the efficiency with which our visual system processes natural object categories. Previous studies have shown that feed-forward hierarchical models of the visual cortex provide a good fit to human visual decisions. At the same time, recent work in computer vision has demonstrated significant gains in object recognition accuracy with increasingly deep hierarchical architectures. But it is unclear how well these models account for human visual decisions and what they may reveal about the underlying brain processes. We have conducted a large-scale psychophysics study to assess the correlation between computational models and human participants on a rapid animal vs. non-animal categorization task. We considered visual representations of varying complexity by analyzing the output of different stages of processing in three state-of-the-art deep networks. We found that recognition accuracy increases with higher stages of visual processing (with higher-level stages indeed outperforming human participants on the same task) but that human decisions agree best with predictions from intermediate stages. Overall, these results suggest that human participants may rely on visual features of intermediate complexity and that the complexity of visual representations afforded by modern deep network models may exceed that of the representations used by human participants during rapid categorization.
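The comparison described above, scoring each processing stage both for accuracy and for agreement with human decisions, can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the abstract only says "correlation", so the use of Spearman rank correlation is an assumption, and every name here (`average_ranks`, `spearman`, `accuracy`, `agreement_by_stage`) and every number is a hypothetical placeholder.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def accuracy(scores, is_animal):
    """Fraction of images where the sign of the score matches the true label."""
    hits = sum((s > 0) == label for s, label in zip(scores, is_animal))
    return hits / len(scores)


def agreement_by_stage(stage_scores, human_scores):
    """Map each stage name to its rank correlation with per-image human scores."""
    return {name: spearman(scores, human_scores) for name, scores in stage_scores.items()}


# Placeholder inputs (made up, not data from the study):
# one decision score per image for each model stage (positive = "animal"),
# and the fraction of participants who answered "animal" for each image.
stage_scores = {
    "early": [0.2, -0.1, 0.4, -0.3],
    "intermediate": [0.9, -0.6, 0.1, -0.8],
    "late": [1.5, -1.2, 0.7, -1.9],
}
human_scores = [0.95, 0.20, 0.55, 0.10]
is_animal = [True, False, True, False]

per_stage_agreement = agreement_by_stage(stage_scores, human_scores)
per_stage_accuracy = {name: accuracy(s, is_animal) for name, s in stage_scores.items()}
```

Keeping accuracy and agreement as separate per-stage quantities matters because, as the abstract reports, the two need not peak at the same stage.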




