Enhancing Pre-trained Models with Text Structure Knowledge for Question Generation

09/09/2022
by Zichen Wu, et al.

Today, pre-trained language models achieve great success on the question generation (QG) task and significantly outperform traditional sequence-to-sequence approaches. However, these pre-trained models treat the input passage as a flat sequence and are therefore unaware of its text structure. For the QG task, we model text structure as answer position and syntactic dependency, and propose answer localness modeling and syntactic mask attention to address these limitations. Specifically, we present localness modeling with a Gaussian bias to enable the model to focus on the context surrounding the answer, and propose a mask attention mechanism to make the syntactic structure of the input passage accessible during question generation. Experiments on the SQuAD dataset show that each of the two proposed modules improves performance over the strong pre-trained model ProphetNet, and that combining them achieves highly competitive results with the state-of-the-art pre-trained model.
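
The abstract does not spell out the exact formulation of the two modules, so the following is a minimal, illustrative sketch of how a Gaussian localness bias and a syntactic dependency mask could be injected into a single attention head. All function names, the sigma parameter, and the toy dependency edges are assumptions made for illustration and are not the paper's implementation.

import math
import torch

def gaussian_localness_bias(seq_len, answer_start, answer_end, sigma=3.0):
    # Log-space bias that peaks at the centre of the answer span and decays
    # with distance, so attention concentrates on answer-surrounding context.
    center = (answer_start + answer_end) / 2.0
    positions = torch.arange(seq_len, dtype=torch.float32)
    return -((positions - center) ** 2) / (2.0 * sigma ** 2)

def syntactic_mask(seq_len, dependency_edges):
    # Additive mask that keeps only token pairs linked by a dependency arc
    # (plus self-attention); all other pairs receive -inf before the softmax.
    mask = torch.full((seq_len, seq_len), float("-inf"))
    mask.fill_diagonal_(0.0)
    for head, dep in dependency_edges:
        mask[head, dep] = 0.0
        mask[dep, head] = 0.0
    return mask

def structure_aware_attention(q, k, v, answer_span, dependency_edges, sigma=3.0):
    # Standard scaled dot-product attention with both biases added to the
    # attention logits before the softmax.
    seq_len, d_k = q.shape
    scores = q @ k.T / math.sqrt(d_k)
    scores = scores + gaussian_localness_bias(seq_len, *answer_span, sigma=sigma)
    scores = scores + syntactic_mask(seq_len, dependency_edges)
    return torch.softmax(scores, dim=-1) @ v

# Toy usage: 6 tokens, answer at positions 2-3, a few hypothetical dependency arcs.
q = k = v = torch.randn(6, 16)
out = structure_aware_attention(q, k, v, answer_span=(2, 3),
                                dependency_edges=[(0, 1), (1, 2), (2, 4), (4, 5)])

In this sketch both signals are simple additive terms on the attention logits; in practice the dependency edges would come from an external syntactic parser applied to the input passage.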


