A Probabilistic Theory of Supervised Similarity Learning for Pointwise ROC Curve Optimization

07/18/2018
by Robin Vogel, et al.

The performance of many machine learning techniques depends on the choice of an appropriate similarity or distance measure on the input space. Similarity learning (or metric learning) aims to build such a measure from training data so that observations with the same (resp. different) label are as close (resp. far) as possible. In this paper, similarity learning is investigated from the perspective of pairwise bipartite ranking, where the goal is to rank the elements of a database in decreasing order of the probability that they share the same label with some query data point, based on the similarity scores. A natural performance criterion in this setting is pointwise ROC optimization: maximize the true positive rate under a fixed false positive rate. We study this novel perspective on similarity learning through a rigorous probabilistic framework. The empirical version of the problem gives rise to a constrained optimization formulation involving U-statistics, for which we derive universal learning rates as well as faster rates under a noise assumption on the data distribution. We also address the large-scale setting by analyzing the effect of sampling-based approximations. Our theoretical results are supported by illustrative numerical experiments.
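
To make the pointwise ROC criterion concrete, here is a minimal Python sketch of its empirical version for a fixed similarity function. It is an illustration under stated assumptions, not the paper's procedure: the function names (`pair_scores`, `sampled_pair_scores`, `tpr_at_fpr`), the negative squared Euclidean similarity, and the uniform pair-sampling scheme are all hypothetical choices made here. Pairs with equal labels are treated as positives and pairs with different labels as negatives, so the empirical rates are averages over pairs, which is where the U-statistics come from.

```python
import numpy as np


def neg_sq_dist(a, b):
    # Assumed similarity for illustration: larger means more similar.
    return -np.sum((a - b) ** 2, axis=1)


def pair_scores(X, y, sim=neg_sq_dist):
    """Score every unordered pair (i < j) and flag same-label pairs."""
    i, j = np.triu_indices(len(X), k=1)
    return sim(X[i], X[j]), y[i] == y[j]


def sampled_pair_scores(X, y, n_pairs, rng, sim=neg_sq_dist):
    """Score n_pairs pairs drawn uniformly with replacement (i != j).

    One simple way to avoid the quadratic cost of all pairs; not
    necessarily the sampling scheme analyzed in the paper.
    """
    n = len(X)
    i = rng.integers(0, n, size=n_pairs)
    j = rng.integers(0, n - 1, size=n_pairs)
    j = j + (j >= i)  # shift indices at or above i so that j != i
    return sim(X[i], X[j]), y[i] == y[j]


def tpr_at_fpr(scores, same, alpha):
    """Empirical TPR at the smallest observed threshold with FPR <= alpha."""
    assert 0.0 <= alpha < 1.0
    neg = np.sort(scores[~same])
    pos = scores[same]
    k = int(np.floor(alpha * len(neg)))
    t = neg[len(neg) - k - 1]  # at most k negatives score strictly above t
    fpr = float(np.mean(neg > t))
    tpr = float(np.mean(pos > t))
    return tpr, fpr, float(t)


# Toy data: two well-separated one-dimensional clusters.
X = np.array([[0.0], [0.1], [5.0], [5.2]])
y = np.array([0, 0, 1, 1])
scores, same = pair_scores(X, y)
tpr, fpr, t = tpr_at_fpr(scores, same, alpha=0.25)
```

In the learning problem described above, the similarity itself would be chosen from a class of candidates to maximize the empirical true positive rate subject to the false positive rate constraint; the sketch only evaluates that objective for one fixed candidate.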


