Don't Take It Literally: An Edit-Invariant Sequence Loss for Text Generation

06/29/2021
by Guangyi Liu, et al.

Neural text generation models are typically trained by maximizing log-likelihood with the sequence cross-entropy loss, which encourages an exact token-by-token match between a target sequence and a generated sequence. Such a training objective is sub-optimal when the target sequence is not perfect, e.g., when the target sequence is corrupted with noise, or when only weak sequence supervision is available. To address this challenge, we propose a novel Edit-Invariant Sequence Loss (EISL), which computes the matching loss of a target n-gram against all n-grams in the generated sequence. EISL draws inspiration from convolutional networks (ConvNets), which are shift-invariant to images; it is hence robust to shifts of n-grams and can tolerate edits in the target sequences. Moreover, the computation of EISL is essentially a convolution operation with target n-grams as kernels, which is easy to implement with existing libraries. To demonstrate the effectiveness of EISL, we conduct experiments on three tasks: machine translation with noisy target sequences, unsupervised text style transfer, and non-autoregressive machine translation. Experimental results show that our method significantly outperforms the cross-entropy loss on all three tasks.
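To make the convolution view concrete, below is a minimal PyTorch sketch of an EISL-style loss. It is not the authors' reference implementation: the function name eisl_loss, the logsumexp aggregation over generated windows, and the uniform averaging over target n-grams are illustrative assumptions. Gathering the target-token log-probabilities and summing along diagonals is equivalent to convolving the output distribution with each target n-gram as a kernel.

import torch
import torch.nn.functional as F

def eisl_loss(log_probs: torch.Tensor, target: torch.Tensor, n: int = 4) -> torch.Tensor:
    """Sketch of an edit-invariant sequence loss (assumes T_gen >= n, T_tgt >= n).

    log_probs: (T_gen, V) token log-probabilities from the generator.
    target:    (T_tgt,)  target token ids.
    n:         n-gram size.
    """
    # M[j, i] = log p(target token i | generated position j)
    M = log_probs[:, target]                      # (T_gen, T_tgt)
    T_gen, T_tgt = M.shape
    # Score of the target n-gram starting at i matched against the generated
    # window starting at j: the sum of log-probs along the (j, i) diagonal.
    # This plays the role of the convolution with target n-grams as kernels.
    scores = sum(M[k : T_gen - n + 1 + k, k : T_tgt - n + 1 + k] for k in range(n))
    # Soft-maximize over all generated windows (logsumexp) so each target
    # n-gram may match anywhere in the output: this is the shift-invariance.
    # (Aggregation choice is an assumption of this sketch.)
    per_ngram = torch.logsumexp(scores, dim=0)    # (T_tgt - n + 1,)
    return -per_ngram.mean()

# Toy usage: vocabulary of 10 tokens, generated length 8, target length 6.
logits = torch.randn(8, 10, requires_grad=True)
loss = eisl_loss(F.log_softmax(logits, dim=-1), torch.tensor([1, 4, 2, 7, 3, 5]), n=2)
loss.backward()

Because every operation is differentiable, the loss can replace (or be mixed with) the usual cross-entropy term during training.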


Related research

06/09/2021  Order-Agnostic Cross Entropy for Non-Autoregressive Machine Translation
We propose a new training objective named order-agnostic cross entropy (...

10/20/2022  Multi-Granularity Optimization for Non-Autoregressive Translation
Despite low latency, non-autoregressive machine translation (NAT) suffer...

02/08/2022  Differentiable N-gram Objective on Abstractive Summarization
ROUGE is a standard automatic evaluation metric based on n-grams for seq...

10/30/2022  DiffusER: Discrete Diffusion via Edit-based Reconstruction
In text generation, models that generate text from scratch one token at ...

06/08/2023  SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking
In many domains, autoregressive models can achieve low log-likelihood on...

05/19/2023  Reducing Sequence Length by Predicting Edit Operations with Large Language Models
Large Language Models (LLMs) have demonstrated remarkable performance in...

09/03/2019  Encode, Tag, Realize: High-Precision Text Editing
We propose LaserTagger - a sequence tagging approach that casts text gen...
