Query-Key Normalization for Transformers

10/08/2020
by Alex Henry, et al.

Low-resource language translation is a challenging but socially valuable NLP task. Building on recent work adapting the Transformer's normalization to this setting, we propose QKNorm, a normalization technique that modifies the attention mechanism to make the softmax function less prone to arbitrary saturation without sacrificing expressivity. Specifically, we apply ℓ_2 normalization along the head dimension of each query and key matrix prior to multiplying them and then scale up by a learnable parameter instead of dividing by the square root of the embedding dimension. We show improvements averaging 0.928 BLEU over state-of-the-art bilingual benchmarks for 5 low-resource translation pairs from the TED Talks corpus and IWSLT'15.
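
The mechanism described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the tensor layout (heads, sequence length, head dimension), the choice of the last axis as the "head dimension", and the fixed value of the scale `g` (a learnable parameter in the paper, held constant here) are all assumptions made for illustration.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-6):
    # Divide each vector along `axis` by its l2 norm (eps guards against division by zero).
    norm = np.sqrt(np.sum(x * x, axis=axis, keepdims=True))
    return x / np.maximum(norm, eps)

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def qknorm_attention(q, k, v, g):
    """Sketch of QKNorm attention.

    q, k, v: arrays of shape (heads, seq_len, head_dim) (assumed layout).
    g: scalar scale; learnable in the paper, a plain float here.
    """
    q_hat = l2_normalize(q)  # unit-length queries
    k_hat = l2_normalize(k)  # unit-length keys
    # Each entry is a cosine similarity in [-1, 1], scaled up by g
    # instead of being divided by the square root of the dimension.
    scores = g * (q_hat @ np.swapaxes(k_hat, -1, -2))
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

# Hypothetical inputs: 2 heads, 4 positions, head dimension 8.
rng = np.random.default_rng(0)
q = rng.normal(size=(2, 4, 8))
k = rng.normal(size=(2, 4, 8))
v = rng.normal(size=(2, 4, 8))
out, weights = qknorm_attention(q, k, v, g=10.0)
```

Because the queries and keys are normalized, every pre-softmax score lies in [-g, g] regardless of how large the raw activations grow, so rescaling `q` or `k` leaves the attention weights unchanged. How sharp the softmax can become is then governed by `g` alone.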
