Sampled Transformer for Point Sets

02/28/2023
by Shidi Li, et al.

The sparse transformer can reduce the computational complexity of the self-attention layers to O(n), whilst still being a universal approximator of continuous sequence-to-sequence functions. However, this permutation-variant operation is not appropriate for direct application to sets. In this paper, we propose an O(n) complexity sampled transformer that can process point set elements directly without any additional inductive bias. Our sampled transformer introduces random element sampling, which randomly splits point sets into subsets, followed by applying a shared Hamiltonian self-attention mechanism to each subset. The overall attention mechanism can be viewed as a Hamiltonian cycle in the complete attention graph, and the permutation of point set elements is equivalent to randomly sampling Hamiltonian cycles. This mechanism implements a Monte Carlo simulation of the O(n^2) dense attention connections. We show that it is a universal approximator for continuous set-to-set functions. Experimental results on point clouds show comparable or better accuracy with significantly reduced computational complexity compared to the dense transformer or alternative sparse attention schemes.
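The following is a minimal sketch of the sampled-attention idea as described in the abstract, not the authors' reference implementation: points are randomly permuted (equivalent to sampling one Hamiltonian cycle from the complete attention graph), split into fixed-size subsets, and a shared self-attention is applied within each subset. The function name `sampled_attention`, the subset size `m`, and the use of unparameterized single-head attention are assumptions for illustration.

```python
import torch

def sampled_attention(x, m):
    """Apply shared self-attention to randomly sampled subsets of a point set.

    x: (n, d) point features; m: subset size (assumed to divide n here).
    A random permutation of the n elements corresponds to sampling one
    Hamiltonian cycle in the complete attention graph; repeating this across
    layers/iterations Monte Carlo-approximates dense O(n^2) attention.
    """
    n, d = x.shape
    perm = torch.randperm(n)                  # random element sampling
    groups = x[perm].reshape(n // m, m, d)    # split into n/m subsets of size m

    # Shared (single-head, unparameterized) self-attention within each subset:
    # each subset costs O(m^2), so the layer costs O((n/m) * m^2) = O(n) for fixed m.
    scores = torch.einsum('bqd,bkd->bqk', groups, groups) / d ** 0.5
    out = torch.einsum('bqk,bkd->bqd', scores.softmax(dim=-1), groups)

    # Undo the permutation so outputs align with the original element order.
    y = torch.empty_like(x)
    y[perm] = out.reshape(n, d)
    return y

# Example: 1024 points with 64-dim features, attended in subsets of 128.
y = sampled_attention(torch.randn(1024, 64), m=128)
```

Because the random permutation is resampled each time the layer is applied, different subset groupings are visited over training, which is how the sparse per-subset attention approximates the full attention graph in expectation.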


