Show Me How To Revise: Improving Lexically Constrained Sentence Generation with XLNet

09/13/2021
by Xingwei He, et al.

Lexically constrained sentence generation incorporates prior knowledge, in the form of lexical constraints, into the output. The technique has been applied to machine translation and dialog response generation. Previous work usually relied on Markov Chain Monte Carlo (MCMC) sampling to generate lexically constrained sentences, but those methods chose the position to edit and the action to take at random, which resulted in many invalid refinements. To overcome this challenge, we used a classifier to instruct MCMC-based models where and how to refine candidate sentences. First, we developed two methods to create synthetic data, on which a pre-trained model is fine-tuned to obtain a reliable classifier. Next, we proposed a two-step approach, "Predict and Revise", for constrained sentence generation. In the predict step, we used the classifier to compute a learned prior for the candidate sentence. In the revise step, we used MCMC sampling to revise the candidate sentence by applying a sampled action at a sampled position, both drawn from the learned prior. We compared our proposed models with many strong baselines on two tasks: generating sentences with lexical constraints and text infilling. Experimental results demonstrate that our proposed model outperforms previous work in terms of sentence fluency and diversity. Our code and pre-trained models are available at https://github.com/NLPCode/MCMCXLNet.
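
To make the "Predict and Revise" loop concrete, the following is a minimal, illustrative sketch and not the authors' implementation. Every name in it (`toy_classifier`, `sample_edit`, `revise`, `predict_and_revise`) is hypothetical. The fine-tuned classifier is replaced by a hand-written scoring rule, new words are drawn uniformly from a tiny vocabulary instead of from a language model, and the MCMC acceptance test is omitted, so this shows only how a learned prior can steer where and how a candidate is edited.

```python
import random

ACTIONS = ("replace", "insert", "delete")


def toy_classifier(tokens, constraints):
    """Hypothetical stand-in for the fine-tuned classifier (predict step).

    Returns one dict of unnormalized action scores per token position.
    """
    scores = []
    for tok in tokens:
        if tok in constraints:
            # Constraint words must survive: only allow inserting after them.
            scores.append({"replace": 0.0, "insert": 1.0, "delete": 0.0})
        else:
            scores.append({"replace": 1.0, "insert": 1.0, "delete": 0.5})
    return scores


def sample_edit(scores, rng):
    """Draw a (position, action) pair in proportion to the prior scores."""
    pairs = [(i, a) for i in range(len(scores)) for a in ACTIONS]
    weights = [scores[i][a] for i, a in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]


def revise(tokens, position, action, vocab, rng):
    """Apply one edit (revise step). Words are proposed uniformly here;
    this is an assumption of the sketch, not the paper's proposal model."""
    out = list(tokens)
    if action == "replace":
        out[position] = rng.choice(vocab)
    elif action == "insert":
        out.insert(position + 1, rng.choice(vocab))
    else:  # delete
        del out[position]
    return out


def predict_and_revise(constraints, vocab, steps=20, seed=0):
    """Start from the constraint words and refine for a fixed number of steps.

    Assumes `constraints` is non-empty and shares no words with `vocab`.
    """
    rng = random.Random(seed)
    tokens = list(constraints)
    for _ in range(steps):
        scores = toy_classifier(tokens, constraints)    # predict
        position, action = sample_edit(scores, rng)     # sample from prior
        tokens = revise(tokens, position, action, vocab, rng)  # revise
    return tokens


if __name__ == "__main__":
    sentence = predict_and_revise(
        ["generation", "constraints"],
        ["the", "with", "lexical", "sentence"],
    )
    print(" ".join(sentence))
```

Because the toy prior gives zero weight to replacing or deleting a constraint word, every constraint remains in the output in its original order. A uniform choice of position and action, as in the earlier MCMC approaches described above, offers no such guarantee without a separate check.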

research
11/24/2020

Language Generation via Combinatorial Constraint Satisfaction: A Tree Search Enhanced Monte-Carlo Approach

Generating natural language under complex constraints is a principled fo...
research
09/26/2021

Parallel Refinements for Lexically Constrained Text Generation with BART

Lexically constrained text generation aims to control the generated text...
research
12/30/2020

Enhancing Pre-trained Language Model with Lexical Simplification

For both human readers and pre-trained language models (PrLMs), lexical ...
research
11/14/2018

CGMH: Constrained Sentence Generation by Metropolis-Hastings Sampling

In real-world applications of natural language generation, there are oft...
research
02/25/2018

Incorporating Discriminator in Sentence Generation: a Gibbs Sampling Method

Generating plausible and fluent sentence with desired properties has lon...
research
08/29/2022

Reweighting Strategy based on Synthetic Data Identification for Sentence Similarity

Semantically meaningful sentence embeddings are important for numerous t...
research
11/03/2018

Unsupervised Identification of Study Descriptors in Toxicology Research: An Experimental Study

Identifying and extracting data elements such as study descriptors in pu...
