Layer-wise Representation Fusion for Compositional Generalization

07/20/2023
by   Yafang Zheng, et al.

Despite successes across a broad range of applications, the solutions constructed by sequence-to-sequence models are argued to be less compositional than human-like generalization. There is mounting evidence that one of the reasons hindering compositional generalization is that the representations in the uppermost layers of the encoder and decoder are entangled; in other words, the syntactic and semantic representations of sequences are twisted together inappropriately. However, most previous studies mainly concentrate on enhancing token-level semantic information to alleviate this representation entanglement, rather than composing and using the syntactic and semantic representations of sequences appropriately, as humans do. In addition, we explain why the entanglement problem arises from the perspective of recent studies on training deeper Transformers: it is mainly due to the “shallow” residual connections and their simple, one-step fusion operations, which fail to fuse previous layers' information effectively. Starting from this finding and inspired by humans' strategies, we propose FuSion (Fusing Syntactic and Semantic Representations), an extension to sequence-to-sequence models that learns to fuse previous layers' information back into the encoding and decoding process appropriately by introducing a fuse-attention module at each encoder and decoder layer. FuSion achieves competitive and even state-of-the-art results on two realistic benchmarks, which empirically demonstrates the effectiveness of our proposal.
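
To make the idea concrete, below is a minimal sketch of what a per-layer "fuse-attention" module could look like in a PyTorch Transformer stack. The class name FuseAttention, the per-position attention over the layer history, and the residual/LayerNorm placement are illustrative assumptions based only on the abstract; the paper's exact design may differ.

    from typing import List

    import torch
    import torch.nn as nn


    class FuseAttention(nn.Module):
        """Lets each position attend over its own representations from all previous layers."""

        def __init__(self, d_model: int, n_heads: int = 8, dropout: float = 0.1):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
            self.norm = nn.LayerNorm(d_model)

        def forward(self, x: torch.Tensor, layer_outputs: List[torch.Tensor]) -> torch.Tensor:
            # x:             (batch, seq_len, d_model) -- current layer's output
            # layer_outputs: embedding output plus all previous layers' outputs,
            #                each of shape (batch, seq_len, d_model)
            bsz, seq_len, d_model = x.shape
            history = torch.stack(layer_outputs, dim=2)      # (batch, seq, n_prev, d_model)
            n_prev = history.size(2)
            # Treat each position independently: a length-1 query attends over
            # the n_prev layer representations of that same position.
            q = x.reshape(bsz * seq_len, 1, d_model)
            kv = history.reshape(bsz * seq_len, n_prev, d_model)
            fused, _ = self.attn(q, kv, kv)                  # (batch*seq, 1, d_model)
            fused = fused.reshape(bsz, seq_len, d_model)
            return self.norm(x + fused)                      # fuse via residual + LayerNorm


    if __name__ == "__main__":
        fuse = FuseAttention(d_model=512)
        x = torch.randn(2, 10, 512)                          # current layer's output
        prev = [torch.randn(2, 10, 512) for _ in range(3)]   # embedding + two earlier layers
        print(fuse(x, prev).shape)                           # torch.Size([2, 10, 512])

In an encoder or decoder stack, such a module would be applied after each layer, with the list of earlier layers' outputs growing as depth increases, so that the fused representation combines lower-layer (more syntactic/token-level) and higher-layer (more semantic) information rather than relying on the one-step residual connection alone.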


