
Adversarial Computation of Optimal Transport Maps
Computing optimal transport maps between high-dimensional and continuous...
06/24/2019 ∙ by Jacob Leygonie, et al.

Entropy-regularized Optimal Transport Generative Models
We investigate the use of entropy-regularized optimal transport (EOT) co...
11/16/2018 ∙ by Dong Liu, et al.

Interpolating between Optimal Transport and MMD using Sinkhorn Divergences
Comparing probability distributions is a fundamental problem in data sci...
10/18/2018 ∙ by Jean Feydy, et al.

Learning with minibatch Wasserstein: asymptotic and gradient properties
Optimal transport distances are powerful tools to compare probability di...
10/09/2019 ∙ by Kilian Fatras, et al.

Generating the support with extreme value losses
When optimizing against the mean loss over a distribution of predictions...
02/08/2019 ∙ by Nicholas Guttenberg, et al.

Wasserstein Dictionary Learning: Optimal Transport-based unsupervised nonlinear dictionary learning
This article introduces a new nonlinear dictionary learning method for ...
08/07/2017 ∙ by Morgan A. Schmitz, et al.

GEAR: Geometry-Aware Rényi Information
Shannon's seminal theory of information has been of paramount importance...
06/19/2019 ∙ by Jose Gallego, et al.

Learning Generative Models with Sinkhorn Divergences
The ability to compare two degenerate probability distributions (i.e. two probability distributions supported on two distinct low-dimensional manifolds living in a much higher-dimensional space) is a crucial problem arising in the estimation of generative models for high-dimensional observations such as those arising in computer vision or natural language. It is known that optimal transport metrics can remedy this problem, since they were specifically designed as an alternative to information divergences for handling such problematic scenarios. Unfortunately, training generative machines using OT raises formidable computational and statistical challenges, because of (i) the computational burden of evaluating OT losses, (ii) the instability and lack of smoothness of these losses, and (iii) the difficulty of robustly estimating these losses and their gradients in high dimension. This paper presents the first tractable computational method to train large-scale generative models using an optimal transport loss, and tackles these three issues by relying on two key ideas: (a) entropic smoothing, which turns the original OT loss into one that can be computed using Sinkhorn fixed-point iterations; (b) algorithmic (automatic) differentiation of these iterations. These two approximations result in a robust and differentiable approximation of the OT loss with streamlined GPU execution. Entropic smoothing generates a family of losses interpolating between Wasserstein (OT) and Maximum Mean Discrepancy (MMD), thus allowing one to find a sweet spot that leverages the geometry of OT and the favorable high-dimensional sample complexity of MMD, which comes with unbiased gradient estimates. The resulting computational architecture nicely complements standard deep-network generative models with a stack of extra layers implementing the loss function.
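To make the core idea concrete, the entropy-smoothed OT loss mentioned above can be computed with a few lines of log-domain Sinkhorn fixed-point iterations. The sketch below is a minimal NumPy illustration under stated assumptions (uniform weights on two point clouds, squared-Euclidean ground cost); it is not the paper's implementation, and the function name `sinkhorn_loss` and its parameters are illustrative. In a training setting one would write the same iterations in an autodiff framework and differentiate through them, as the abstract describes.

```python
import numpy as np

def logsumexp(z, axis):
    """Numerically stable log-sum-exp along the given axis."""
    zmax = z.max(axis=axis, keepdims=True)
    return np.squeeze(zmax, axis=axis) + np.log(np.exp(z - zmax).sum(axis=axis))

def sinkhorn_loss(x, y, epsilon=0.1, n_iters=100):
    """Entropy-regularized OT cost between two uniform point clouds,
    computed with log-domain Sinkhorn fixed-point iterations."""
    # Squared-Euclidean ground cost: C[i, j] = ||x_i - y_j||^2
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    n, m = C.shape
    log_a = np.full(n, -np.log(n))   # uniform weights on x
    log_b = np.full(m, -np.log(m))   # uniform weights on y
    f, g = np.zeros(n), np.zeros(m)  # dual potentials
    for _ in range(n_iters):
        # Alternating softmin updates; each assignment is one Sinkhorn half-step
        f = -epsilon * logsumexp(log_b[None, :] + (g[None, :] - C) / epsilon, axis=1)
        g = -epsilon * logsumexp(log_a[:, None] + (f[:, None] - C) / epsilon, axis=0)
    # Recover the primal transport plan P and return the transport cost <P, C>
    log_P = log_a[:, None] + log_b[None, :] + (f[:, None] + g[None, :] - C) / epsilon
    return (np.exp(log_P) * C).sum()
```

As `epsilon` shrinks the loss approaches the unregularized Wasserstein cost, while large `epsilon` moves it toward an MMD-like regime — the interpolation the abstract refers to. Since every step is a differentiable softmin, automatic differentiation can backpropagate through the loop, which is what allows the iterations to sit as extra layers on top of a generator network.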