Stochastic Optimization of Sorting Networks via Continuous Relaxations

by Aditya Grover, et al.
Stanford University

Sorting input objects is an important step in many machine learning pipelines. However, the sorting operator is non-differentiable with respect to its inputs, which prohibits end-to-end gradient-based optimization. In this work, we propose NeuralSort, a general-purpose continuous relaxation of the output of the sorting operator from permutation matrices to the set of unimodal row-stochastic matrices, where every row sums to one and has a distinct arg max. This relaxation permits straight-through optimization of any computational graph that involves a sorting operation. Further, we use this relaxation to enable gradient-based stochastic optimization over the combinatorially large space of permutations by deriving a reparameterized gradient estimator for the Plackett-Luce family of distributions over permutations. We demonstrate the usefulness of our framework on three tasks that require learning semantic orderings of high-dimensional objects, including a fully differentiable, parameterized extension of the k-nearest neighbors algorithm.
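
To make the relaxation concrete, here is a minimal NumPy sketch. The abstract does not state the formula, so the row-wise softmax form used below, softmax(((n + 1 - 2i) s - A_s 1) / tau) with A_s[j, k] = |s_j - s_k|, is an assumption taken from the usual presentation of NeuralSort and should be checked against the paper. The function names `neuralsort` and `sample_relaxed_plackett_luce`, the temperature parameter `tau`, and the example scores are all illustrative. The second function relies on the standard fact that sorting log-scores perturbed by i.i.d. Gumbel noise yields a Plackett-Luce sample; it is a sketch of the reparameterization idea, not the authors' implementation.

```python
import numpy as np

def neuralsort(s, tau=1.0):
    """Relaxed permutation matrix for sorting scores s in descending order.

    Assumed form: row i (1-indexed) is
    softmax(((n + 1 - 2i) * s - A_s @ 1) / tau), with A_s[j, k] = |s_j - s_k|.
    """
    s = np.asarray(s, dtype=float)
    n = s.shape[0]
    pairwise = np.abs(s[:, None] - s[None, :])      # A_s, shape (n, n)
    row_sums = pairwise.sum(axis=1)                 # A_s @ 1, shape (n,)
    scaling = n + 1 - 2 * np.arange(1, n + 1)       # (n + 1 - 2i) for i = 1..n
    logits = (scaling[:, None] * s[None, :] - row_sums[None, :]) / tau
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)  # each row sums to one

def sample_relaxed_plackett_luce(log_scores, tau=1.0, rng=None):
    """Relaxed sample from a Plackett-Luce distribution via Gumbel perturbation."""
    rng = np.random.default_rng() if rng is None else rng
    log_scores = np.asarray(log_scores, dtype=float)
    gumbel_noise = rng.gumbel(size=log_scores.shape)
    return neuralsort(log_scores + gumbel_noise, tau)

scores = np.array([0.3, 2.0, -1.0, 0.9])   # hypothetical example scores
P_hat = neuralsort(scores, tau=0.1)
print(P_hat.argmax(axis=1))   # index of the largest score first
print(P_hat @ scores)         # approximately the scores in descending order
```

As `tau` decreases, each row concentrates on a single column and the matrix approaches a hard permutation matrix; larger `tau` gives smoother rows, which in a differentiable framework would trade sharper outputs for better-behaved gradients.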



Monotonic Differentiable Sorting Networks

Differentiable sorting algorithms allow training with sorting and rankin...

SoftSort: A Continuous Relaxation for the argsort Operator

While sorting is an important procedure in computer science, the argsort...

Reparameterizing the Birkhoff Polytope for Variational Permutation Inference

Many matching, tracking, sorting, and ranking problems require probabili...

Learning with Differentiable Perturbed Optimizers

Machine learning pipelines often rely on optimization procedures to make...

Permutation Invariant Representations with Applications to Graph Deep Learning

This paper presents primarily two Euclidean embeddings of the quotient s...

PiRank: Learning To Rank via Differentiable Sorting

A key challenge with machine learning approaches for ranking is the gap ...

Differentiable Sorting using Optimal Transport: The Sinkhorn CDF and Quantile Operator

Sorting an array is a fundamental routine in machine learning, one that ...