Differentiable Sorting using Optimal Transport: The Sinkhorn CDF and Quantile Operator

by Marco Cuturi, et al.

Sorting an array is a fundamental routine in machine learning, one that is used to compute rank-based statistics, cumulative distribution functions (CDFs), quantiles, or to select closest neighbors and labels. The sorting function is, however, piecewise constant (the sorting permutation of a vector does not change if the entries of that vector are infinitesimally perturbed) and therefore has no gradient information to back-propagate. We propose a framework to sort elements that is algorithmically differentiable. We leverage the fact that sorting can be seen as a particular instance of the optimal transport (OT) problem on R, from input values to a predefined array of sorted values (e.g. 1, 2, ..., n if the input array has n elements). Building upon this link, we propose generalized CDFs and quantile operators by varying the size and weights of the target presorted array. Because this amounts to using the so-called Kantorovich formulation of OT, we call these quantities K-sorts, K-CDFs and K-quantiles. We recover differentiable algorithms by adding an entropic regularization to the OT problem and approximating its solution with a few Sinkhorn iterations. We call these operators S-sorts, S-CDFs and S-quantiles, and use them in various learning settings: we benchmark them against the recently proposed NeuralSort [Grover et al. 2019], propose applications to quantile regression, and introduce differentiable formulations of the top-k accuracy that deliver state-of-the-art performance.
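
The construction described in the abstract can be illustrated with a short NumPy sketch. It is not the authors' implementation. It assumes uniform weights on both sides, a squared-distance cost, a min-max rescaling of the inputs to [0, 1], and a target grid of n evenly spaced points. The function name `sinkhorn_sort` and the defaults `eps` and `n_iter` are hypothetical choices made for this example. The soft sorted values and soft ranks are read off the regularized transport plan P as weighted averages.

```python
import numpy as np


def _logsumexp(z, axis):
    # Numerically stable log(sum(exp(z))) along one axis.
    m = np.max(z, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(z - m), axis=axis))


def sinkhorn_sort(x, eps=1e-2, n_iter=100):
    """Soft sort and soft ranks from entropy-regularized OT (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    a = np.full(n, 1.0 / n)  # weights on the input values (assumed uniform)
    b = np.full(n, 1.0 / n)  # weights on the sorted target (assumed uniform)

    # Assumption: rescale inputs to [0, 1] so that eps has a comparable meaning
    # across inputs; the target is an increasing grid on the same interval.
    span = x.max() - x.min()
    x_scaled = (x - x.min()) / span if span > 0 else np.zeros(n)
    y = np.linspace(0.0, 1.0, n)

    # Negative squared-distance cost divided by the regularization strength.
    M = -((x_scaled[:, None] - y[None, :]) ** 2) / eps

    # Sinkhorn iterations in the log domain: alternately match row and column sums.
    f = np.zeros(n)
    g = np.zeros(n)
    for _ in range(n_iter):
        f = np.log(a) - _logsumexp(M + g[None, :], axis=1)
        g = np.log(b) - _logsumexp(M + f[:, None], axis=0)

    P = np.exp(f[:, None] + M + g[None, :])  # approximate transport plan

    soft_sorted = (P.T @ x) / b              # each target slot averages inputs
    soft_ranks = n * (P @ np.cumsum(b)) / a  # each input averages target CDF values
    return soft_sorted, soft_ranks


values, ranks = sinkhorn_sort([0.3, -1.2, 2.5, 0.9])
print(values)  # close to the sorted input for small eps
print(ranks)   # close to the 1-based ranks for small eps
```

Every step is a composition of smooth operations, so the same code written with an automatic differentiation library would provide gradients with respect to the input. A larger `eps` gives smoother but blurrier outputs, while a smaller `eps` approaches the hard sort.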




Monotonic Differentiable Sorting Networks

Differentiable sorting algorithms allow training with sorting and rankin...

Fast Differentiable Sorting and Ranking

The sorting operation is one of the most basic and commonly used buildin...

Supervised Quantile Normalization for Low-rank Matrix Approximation

Low rank matrix factorization is a fundamental building block in machine...

Differentiable Top-k Operator with Optimal Transport

The top-k operation, i.e., finding the k largest or smallest elements fr...

GenDR: A Generalized Differentiable Renderer

In this work, we present and study a generalized family of differentiabl...

Learning with Differentiable Perturbed Optimizers

Machine learning pipelines often rely on optimization procedures to make...

PiRank: Learning To Rank via Differentiable Sorting

A key challenge with machine learning approaches for ranking is the gap ...
