High-Level Perceptual Similarity is Enabled by Learning Diverse Tasks

03/26/2019
by Amir Rosenfeld, et al.

Predicting human perceptual similarity is a challenging subject of ongoing research. The visual process underlying this aspect of human vision is thought to employ multiple different levels of visual analysis (shapes, objects, texture, layout, color, etc.). In this paper, we postulate that the perception of image similarity is not an explicitly learned capability, but rather one that is a byproduct of learning others. This claim is supported by leveraging representations learned from a diverse set of visual tasks and using them jointly to predict perceptual similarity. This is done via simple feature concatenation, without any further learning. Nevertheless, experiments performed on the challenging Totally-Looks-Like (TLL) benchmark significantly surpass recent baselines, closing much of the reported gap towards prediction of human perceptual similarity. We provide an analysis of these results and discuss them in a broader context of emergent visual capabilities and their implications for the course of machine-vision research.
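
To make the idea of "simple feature concatenation, without any further learning" concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the per-task L2 normalization, the use of cosine similarity, the recall@1 metric, and the random stand-in features (in place of real network activations) are all assumptions made for this example, and the function names are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length (assumed step, so no task dominates by magnitude)."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def concat_task_features(per_task_features):
    """Concatenate per-task feature matrices (each n_images x d_task) along the feature axis."""
    return np.concatenate([l2_normalize(f) for f in per_task_features], axis=1)

def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    return l2_normalize(a) @ l2_normalize(b).T

def recall_at_1(left, right):
    """Fraction of left images whose most similar right image is the true match (same row index)."""
    sims = cosine_similarity_matrix(left, right)
    return float(np.mean(np.argmax(sims, axis=1) == np.arange(len(left))))

# Toy stand-in data: three hypothetical "tasks" with different feature sizes.
rng = np.random.default_rng(0)
n_pairs = 50
task_dims = [16, 32, 8]
left_feats = [rng.normal(size=(n_pairs, d)) for d in task_dims]
# Each right image is a slightly perturbed copy of its left partner.
right_feats = [f + 0.1 * rng.normal(size=f.shape) for f in left_feats]

left = concat_task_features(left_feats)
right = concat_task_features(right_feats)
score = recall_at_1(left, right)
print(f"joint feature size: {left.shape[1]}, recall@1: {score:.2f}")
```

No parameters are fitted anywhere in this sketch: the joint representation is just the stacked per-task features, and matching is a nearest-neighbor lookup. With real data, each entry of `left_feats` would hold activations from a network trained on a different visual task.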

