Injecting Hierarchy with U-Net Transformers

10/16/2019
by David Donahue, et al.

The Transformer architecture has become increasingly popular over the past couple of years, owing to its impressive performance on a number of natural language processing (NLP) tasks. However, it may be argued that the Transformer lacks an explicit hierarchical representation: all computations occur on word-level representations alone, so learning structure poses a challenge for Transformer models. In the present work, we introduce hierarchical processing into the Transformer model, taking inspiration from the U-Net architecture, which is popular in computer vision for its hierarchical view of natural images. We propose a novel architecture that combines ideas from the Transformer and U-Net models to incorporate hierarchy at multiple levels of abstraction. We empirically demonstrate that the proposed architecture outperforms the vanilla Transformer and strong baselines in the chit-chat dialogue and machine translation domains.
