Learning similarity preserving representations with neural similarity encoders

02/06/2017
by Franziska Horn, et al.

Many dimensionality reduction or manifold learning algorithms optimize for retaining the pairwise similarities, distances, or local neighborhoods of data points. Spectral methods like Kernel PCA (kPCA) or isomap achieve this by computing the singular value decomposition (SVD) of some similarity matrix to obtain a low dimensional representation of the original data. However, this is computationally expensive when many training examples are available. Additionally, representations for new (out-of-sample) data points can only be created when the similarities to the original training examples can be computed. We introduce similarity encoders (SimEc), which learn similarity preserving representations by using a feed-forward neural network to map data into an embedding space where the original similarities can be approximated linearly. The model optimizes the same objective as kPCA, but in the process it learns a linear or non-linear embedding function (in the form of the tuned neural network) with which the representations of novel data points can be computed, even if the original pairwise similarities of the training set were generated by an unknown process such as human ratings. By creating embeddings for both image and text datasets, we demonstrate that SimEc can, on the one hand, reach the same solution as spectral methods and, on the other hand, obtain meaningful embeddings from similarities based on human labels.
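The idea of mapping data into an embedding space from which the target similarities are approximated linearly can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes the simplest possible case of a purely linear network trained with plain full-batch gradient descent on a mean squared error, and it uses a linear kernel on synthetic data as the target similarity matrix. The function name `train_linear_simec`, the weight names `W0` and `W1`, and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

def train_linear_simec(X, S, embedding_dim=2, lr=0.05, epochs=2000, seed=0):
    """Fit a linear similarity encoder (illustrative sketch, not the paper's code).

    Embedding:   Y = X @ W0          (n_samples x embedding_dim)
    Readout:     S_hat = Y @ W1      (n_samples x n_samples)
    Objective:   mean((S_hat - S) ** 2)
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W0 = rng.normal(scale=0.1, size=(d, embedding_dim))  # embedding function
    W1 = rng.normal(scale=0.1, size=(embedding_dim, n))  # linear similarity readout
    losses = []
    for _ in range(epochs):
        Y = X @ W0
        err = Y @ W1 - S
        losses.append(float(np.mean(err ** 2)))
        scale = 2.0 / err.size
        # Gradients of the mean squared error with respect to both weight matrices.
        grad_W1 = scale * (Y.T @ err)
        grad_W0 = scale * (X.T @ (err @ W1.T))
        W0 -= lr * grad_W0
        W1 -= lr * grad_W1
    return W0, W1, losses

# Synthetic data (assumption: centered Gaussian features, linear-kernel targets).
rng = np.random.default_rng(42)
X = rng.normal(size=(50, 5))
X -= X.mean(axis=0)
S = X @ X.T  # target pairwise similarities

W0, W1, losses = train_linear_simec(X, S)
Y_train = X @ W0

# Out-of-sample points only need the learned mapping, not similarities to the training set.
X_new = rng.normal(size=(3, 5))
Y_new = X_new @ W0

print(f"initial loss: {losses[0]:.3f}, final loss: {losses[-1]:.3f}")
```

The second weight matrix only serves as a training-time readout; once training is done, embedding a new point is a single matrix product with `W0`. A non-linear variant would replace `X @ W0` with a deeper network while keeping the linear readout.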


