Universal Approximation Under Constraints is Possible with Transformers

10/07/2021
by Anastasis Kratsios, et al.

Many practical problems need the output of a machine learning model to satisfy a set of constraints K. Nevertheless, there is no known guarantee that classical neural network architectures can exactly encode constraints while simultaneously achieving universality. We provide a quantitative constrained universal approximation theorem which guarantees that for any non-convex compact set K and any continuous function f: ℝ^n → K, there is a probabilistic transformer F̂ whose randomized outputs all lie in K and whose expected output uniformly approximates f. Our second main result is a "deep neural version" of Berge's Maximum Theorem (1963). It guarantees that, given an objective function L, a constraint set K, and a family of soft constraint sets, there is a probabilistic transformer F̂ that approximately minimizes L, whose outputs belong to K, and which approximately satisfies the soft constraints. Our results imply the first universal approximation theorem for classical transformers with exact convex constraint satisfaction. They also yield a chart-free universal approximation theorem for Riemannian manifold-valued functions subject to suitable geodesically convex constraints.
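To make the convex case concrete: if a model's output is a softmax-weighted average of fixed anchor points drawn from a convex set K, then every output is a convex combination of points of K and therefore lies in K exactly. Below is a minimal, hypothetical NumPy sketch of this mechanism; the class ConstrainedHead, its anchor points, and the single-matrix attention scores are illustrative assumptions of ours, not the paper's actual construction.

# Illustrative sketch only: an attention-style head whose output is a
# softmax-weighted mixture of fixed anchor points lying in a convex set K.
# Softmax weights form a probability vector, so each output is a convex
# combination of the anchors and hence lies in K exactly when K is convex.
# All names here (ConstrainedHead, anchors, ...) are hypothetical.

import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class ConstrainedHead:
    """Maps x in R^n to a convex combination of anchor points inside K."""

    def __init__(self, n_in, anchors):
        self.anchors = np.asarray(anchors)            # shape (m, d): points of K
        self.W = 0.01 * rng.standard_normal((n_in, len(self.anchors)))

    def __call__(self, x):
        logits = x @ self.W                           # attention-style scores
        p = softmax(logits)                           # probability weights over anchors
        return p @ self.anchors                       # lies in conv(anchors), a subset of K

# Example: K is the unit square [0,1]^2, with its four corners as anchors.
anchors = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
head = ConstrainedHead(n_in=1, anchors=anchors)
x = rng.standard_normal((5, 1))
y = head(x)
assert np.all((y >= 0.0) & (y <= 1.0))  # every output satisfies the constraint exactly

For non-convex K this averaging argument breaks down, which is why the paper's F̂ is probabilistic: each randomized output is itself a point of K, and only the expected output is required to uniformly approximate f.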


