An Optimal Transport Kernel for Feature Aggregation and its Relationship to Attention

06/22/2020
by Grégoire Mialon, et al.

We introduce a kernel for sets of features based on an optimal transport distance, along with an explicit embedding function. Our approach addresses the problem of feature aggregation, or pooling, for sets that exhibit long-range dependencies between their members. More precisely, our embedding aggregates the features of a given set according to the transport plan between the set and a reference shared across the data set. Unlike traditional hand-crafted kernels, our embedding can be optimized for a specific task or data set. It also has a natural connection to attention mechanisms in neural networks, which are commonly used to deal with sets, yet it requires less data. Our embedding is particularly suited for biological sequence classification tasks and shows promising results for natural language sequences. We provide an implementation of our embedding that can be used alone or as a module in larger learning models. Our code is freely available at https://github.com/claying/OTK.
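
To make the aggregation step concrete, here is a minimal NumPy sketch of pooling a set of features against a shared reference through a transport plan. It is not the code from the repository above. The function names (`sinkhorn_plan`, `ot_pool`), the Gaussian similarity, the entropic regularization solved with Sinkhorn iterations, the uniform marginals, and the parameters `eps` and `sigma` are all assumptions chosen for illustration. The transport plan `P` plays a role loosely comparable to attention weights between set elements and reference elements.

```python
import numpy as np


def sinkhorn_plan(K, n_iters=100):
    """Approximate transport plan with uniform marginals.

    K is a strictly positive (n x p) Gibbs kernel. The uniform marginals
    are an assumption made for this sketch.
    """
    n, p = K.shape
    a = np.full(n, 1.0 / n)  # target row marginals
    b = np.full(p, 1.0 / p)  # target column marginals
    u = np.ones(n)
    v = np.ones(p)
    for _ in range(n_iters):
        u = a / (K @ v)      # rescale rows
        v = b / (K.T @ u)    # rescale columns
    return u[:, None] * K * v[None, :]


def ot_pool(X, Z, eps=0.5, sigma=1.0, n_iters=100):
    """Pool a set X (n x d) into a fixed-size (p x d) embedding.

    Z (p x d) is the reference shared across the data set. The cost is
    the negative Gaussian similarity between features and reference
    elements (an illustrative choice).
    """
    sq_dists = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    kappa = np.exp(-sq_dists / (2.0 * sigma ** 2))  # similarity in (0, 1]
    K = np.exp(kappa / eps)                         # Gibbs kernel for cost -kappa
    P = sinkhorn_plan(K, n_iters)                   # (n x p) transport plan
    # Each reference element gathers the features transported to it.
    return np.sqrt(Z.shape[0]) * (P.T @ X)


rng = np.random.default_rng(0)
Z = rng.normal(size=(4, 3))      # hypothetical reference with p = 4 elements
X_short = rng.normal(size=(7, 3))
X_long = rng.normal(size=(12, 3))
print(ot_pool(X_short, Z).shape, ot_pool(X_long, Z).shape)
```

Sets of different sizes map to embeddings of the same shape, which is what allows the pooled output to feed a fixed-size downstream model. In a trainable setting, the reference `Z` would be a learned parameter rather than random values.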

