Multi-Stream Transformers

07/21/2021
by Mikhail Burtsev, et al.

Transformer-based encoder-decoder models produce a fused token-wise representation after every encoder layer. We investigate the effects of allowing the encoder to preserve and explore alternative hypotheses, combined at the end of the encoding process. To that end, we design and examine a Multi-stream Transformer architecture and find that splitting the Transformer encoder into multiple encoder streams and allowing the model to merge multiple representational hypotheses improves performance, with further improvement obtained by adding a skip connection between the first and the final encoder layer.

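As a concrete illustration of the architecture the abstract describes, below is a minimal PyTorch sketch of a multi-stream encoder: a shared first layer, several parallel streams of Transformer layers, a merge step that combines the streams' hypotheses, and a skip connection from the first to the final encoder stage. The stream count, layer counts, averaging-based merge, and the MultiStreamEncoder name are illustrative assumptions, not the authors' exact implementation.

# Minimal sketch of a multi-stream Transformer encoder (PyTorch).
# The stream count, merge-by-averaging step, and skip connection placement
# are illustrative assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn


class MultiStreamEncoder(nn.Module):
    def __init__(self, d_model=512, nhead=8, num_streams=2,
                 layers_per_stream=4, num_merge_layers=2):
        super().__init__()

        def make_layers(n):
            return nn.ModuleList(
                nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
                for _ in range(n)
            )

        # Shared first layer; its output also feeds the skip connection
        # to the final encoder stage.
        self.first_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        # Independent parallel streams that each refine the shared representation.
        self.streams = nn.ModuleList(
            make_layers(layers_per_stream) for _ in range(num_streams)
        )
        # Layers applied after the streams are merged back into one representation.
        self.merge_layers = make_layers(num_merge_layers)

    def forward(self, x):
        h0 = self.first_layer(x)

        # Each stream explores its own representational hypothesis.
        stream_outputs = []
        for stream in self.streams:
            h = h0
            for layer in stream:
                h = layer(h)
            stream_outputs.append(h)

        # Merge the hypotheses (here: a simple mean; other fusions are possible).
        h = torch.stack(stream_outputs, dim=0).mean(dim=0)

        # Skip connection from the first encoder layer into the final stage.
        h = h + h0
        for layer in self.merge_layers:
            h = layer(h)
        return h


# Usage: encode a batch of 8 sequences of length 16 with model width 512.
encoder = MultiStreamEncoder()
tokens = torch.randn(8, 16, 512)
print(encoder(tokens).shape)  # torch.Size([8, 16, 512])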