Towards Incremental Transformers: An Empirical Analysis of Transformer Models for Incremental NLU

09/15/2021
by Patrick Kahardipraja, et al.

Incremental processing allows interactive systems to respond based on partial inputs, a desirable property in, e.g., dialogue agents. The currently popular Transformer architecture inherently processes sequences as a whole, abstracting away the notion of time. Recent work attempts to apply Transformers incrementally via restart-incrementality: repeatedly feeding increasingly longer input prefixes to an unchanged model to produce partial outputs. However, this approach is computationally costly and does not scale efficiently for long sequences. In parallel, we witness efforts to make Transformers more efficient, e.g. the Linear Transformer (LT) with a recurrence mechanism. In this work, we examine the feasibility of LT for incremental NLU in English. Our results show that the recurrent LT model has better incremental performance and faster inference speed compared to the standard Transformer and LT with restart-incrementality, at the cost of some of the non-incremental (full-sequence) quality. We show that the performance drop can be mitigated by training the model to wait for right context before committing to an output, and that training with input prefixes is beneficial for delivering correct partial outputs.
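To make the contrast between the two inference strategies concrete, the sketch below shows the control flow of restart-incrementality (re-running an unchanged model on every growing prefix) versus a recurrent, Linear Transformer-style update that touches each token once. The model functions are hypothetical stand-ins, not the paper's actual implementation; only the loop structure and cost behavior are the point.

```python
# Hypothetical sketch: restart-incremental vs. recurrent incremental inference.
from typing import List

def full_sequence_labeler(prefix: List[str]) -> List[str]:
    """Stand-in for a non-incremental model that labels a whole prefix at once."""
    return [f"label({tok})" for tok in prefix]

def restart_incremental(tokens: List[str]) -> List[str]:
    """Restart-incrementality: re-run the unchanged model on every prefix.
    Work grows quadratically (1 + 2 + ... + n token positions processed),
    which is why this does not scale well to long sequences."""
    outputs, prefix = [], []
    for tok in tokens:
        prefix.append(tok)
        partial = full_sequence_labeler(prefix)  # full forward pass each step
        outputs.append(partial[-1])              # commit a label for the newest token
    return outputs

def recurrent_incremental(tokens: List[str]) -> List[str]:
    """Recurrent view (as in the Linear Transformer): carry a running state and
    process each new token exactly once, so the cost is linear in length."""
    state = 0.0                                  # stand-in for the recurrent memory
    outputs = []
    for tok in tokens:
        state += len(tok)                        # stand-in for the state update
        outputs.append(f"label({tok}|state={state})")
    return outputs

if __name__ == "__main__":
    sentence = "incremental processing allows partial outputs".split()
    print(restart_incremental(sentence))
    print(recurrent_incremental(sentence))
```

In this toy form, both loops emit one partial output per incoming token; the difference is that the restart-incremental loop pays for the entire prefix at every step, while the recurrent loop only updates a fixed-size state.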

