Log-Precision Transformers are Constant-Depth Uniform Threshold Circuits

07/02/2022
by William Merrill, et al.

We prove that transformer neural networks with logarithmic precision in the input length (and where the feedforward subnetworks are computable using linear space in their input length) can be simulated by constant-depth uniform threshold circuits. Thus, such transformers only recognize formal languages in TC^0, the class of languages defined by constant-depth, poly-size threshold circuits. This demonstrates a connection between a practical claim in NLP and a theoretical conjecture in computational complexity theory: "attention is all you need" (Vaswani et al., 2017), i.e., transformers are capable of all efficient computation, only if all efficiently computable problems can be solved with log space, i.e., L = P. We also construct a transformer that can evaluate any constant-depth threshold circuit on any input, proving that transformers can follow instructions that are representable in TC^0.
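To spell out the reasoning behind the "only if L = P" claim: the argument combines the paper's upper bound with standard containments between uniform circuit classes and space classes (textbook facts, not results of this paper). In LaTeX notation, writing the class of languages recognized by log-precision transformers informally:

\[
\text{log-precision transformers} \subseteq \mathsf{TC}^0 \subseteq \mathsf{NC}^1 \subseteq \mathsf{L} \subseteq \mathsf{P}
\]

If transformers could perform all efficient computation, then P would sit inside uniform TC^0, and hence inside L, collapsing the chain to L = P.

For readers unfamiliar with the circuit class: the gates of a TC^0 circuit are threshold gates. Below is a minimal illustrative Python sketch of a threshold gate and of MAJORITY, the canonical TC^0 function; the function names and the example are mine, assuming only the standard definition, and none of this is code from the paper.

def threshold_gate(bits, k):
    """A threshold gate: outputs 1 iff at least k of the 0/1 inputs are 1."""
    return 1 if sum(bits) >= k else 0

def majority(bits):
    """MAJORITY: a single threshold gate over all n input bits with
    threshold floor(n/2) + 1, i.e., a strict majority."""
    return threshold_gate(bits, len(bits) // 2 + 1)

# Example: three of five bits set -> 1; two of five -> 0.
assert majority([1, 0, 1, 1, 0]) == 1
assert majority([1, 0, 0, 1, 0]) == 0

A TC^0 circuit composes polynomially many such gates to constant depth, which is the object the paper's simulation result targets.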

Related research

08/06/2023 · Average-Hard Attention Transformers are Constant-Depth Uniform Threshold Circuits
Transformers have emerged as a widely used neural network model for vari...

10/06/2022 · Transformers Can Be Expressed In First-Order Logic with Majority
Characterizing the implicit structure of the computation within neural n...

06/30/2021 · On the Power of Saturated Transformers: A View from Circuit Complexity
Transformers have become a standard architecture for many NLP problems. ...

04/13/2022 · Formal Language Recognition by Hard Attention Transformers: Perspectives from Circuit Complexity
This paper analyzes three formal models of Transformer encoders that dif...

07/02/2019 · Efficient Circuit Simulation in MapReduce
The MapReduce framework has firmly established itself as one of the most...

01/25/2023 · Tighter Bounds on the Expressivity of Transformer Encoders
Characterizing neural networks in terms of better-understood formal syst...

06/05/2023 · Representational Strengths and Limitations of Transformers
Attention layers, as commonly used in transformers, form the backbone of...
