Gradient Flows in Dataset Space

10/24/2020
by David Alvarez-Melis, et al.

Machine learning practice is traditionally model-centric: problems are cast as optimization over model parameters, while the data is assumed to be either fixed or subject to extrinsic and inevitable change. On one hand, this paradigm fails to capture important existing aspects of machine learning, such as the substantial data manipulation (e.g., augmentation) that goes into most state-of-the-art pipelines. On the other hand, it is ill-suited to formalizing novel data-centric problems, such as model-agnostic transfer learning or dataset synthesis. In this work, we view these and other problems through the lens of dataset optimization, casting them as optimization over data-generating distributions. We approach this class of problems through Wasserstein gradient flows in probability space, and derive practical and efficient particle-based methods for a flexible but well-behaved class of objective functions. Through various experiments on synthetic and real datasets, we show that this framework provides a principled and effective approach to dataset shaping, transfer, and interpolation.
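To make the idea of a particle-based gradient flow concrete, here is a minimal sketch. It is not the authors' method or objective. It assumes the simplest possible functional, a potential energy F(ρ) = ∫ V dρ with a hypothetical quadratic potential V(x) = ½‖x − target‖², for which the Wasserstein gradient flow moves each particle along −∇V. The function names, step size, and target point are all illustrative assumptions.

```python
import random


def grad_potential(x, target):
    """Gradient of the assumed potential V(x) = 0.5 * ||x - target||^2."""
    return [xi - ti for xi, ti in zip(x, target)]


def particle_flow(particles, target, step=0.1, n_steps=100):
    """Forward-Euler discretization of x' = -grad V(x), applied to every particle.

    For a potential-energy functional, each particle evolves independently;
    richer functionals (e.g. interaction terms) would couple the particles.
    """
    pts = [list(p) for p in particles]  # copy so the input is not mutated
    for _ in range(n_steps):
        pts = [
            [xi - step * gi for xi, gi in zip(p, grad_potential(p, target))]
            for p in pts
        ]
    return pts


random.seed(0)
# A toy "dataset": 50 two-dimensional points drawn from a standard Gaussian.
initial = [[random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)] for _ in range(50)]
final = particle_flow(initial, target=[3.0, -1.0])
mean = [sum(p[d] for p in final) / len(final) for d in range(2)]
```

With this potential every particle contracts toward the target by a factor of 0.9 per step, so the point cloud collapses onto it. Objectives of practical interest would keep the samples spread out while reshaping the distribution.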


