AttenWalker: Unsupervised Long-Document Question Answering via Attention-based Graph Walking

05/03/2023
by Yuxiang Nie et al.

Annotating long-document question answering (long-document QA) pairs is time-consuming and expensive. To alleviate this problem, long-document QA pairs could be generated via unsupervised question answering (UQA) methods. However, existing UQA tasks are based on short documents and can hardly incorporate long-range information. To tackle this, we propose a new task, named unsupervised long-document question answering (ULQA), which aims to generate high-quality long-document QA instances in an unsupervised manner. In addition, we propose AttenWalker, a novel unsupervised method that aggregates and generates answers with long-range dependencies so as to construct long-document QA pairs. Specifically, AttenWalker is composed of three modules: a span collector, a span linker, and an answer aggregator. First, the span collector uses constituency parsing and a reconstruction loss to select informative candidate spans for constructing answers. Second, by traversing the attention graph of a pre-trained long-document model, potentially interrelated text spans (which may be far apart) are linked together via an attention-walking algorithm. Third, in the answer aggregator, linked spans are merged into the final answer via the mask-filling ability of a pre-trained model. Extensive experiments show that AttenWalker outperforms previous methods on Qasper and NarrativeQA. AttenWalker also shows strong performance in the few-shot learning setting.
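The span-linking step described above can be illustrated with a small sketch. The code below is not the paper's implementation; it is a minimal, hypothetical version of an attention-walking linker, assuming we already have a token-to-token attention matrix (e.g., head-averaged attention from a pre-trained long-document model) and a list of candidate spans from the span collector. The function names, the greedy hop strategy, and the threshold are illustrative assumptions.

```python
# Hedged sketch of attention walking (illustrative; the paper's exact
# algorithm may differ). Spans are (start, end) token-index pairs and
# `attn` is a token-to-token attention matrix (rows attend to columns).

def span_attention(attn, src, dst):
    """Mean attention flowing from tokens in span `src` to span `dst`."""
    s0, s1 = src
    d0, d1 = dst
    vals = [attn[i][j] for i in range(s0, s1) for j in range(d0, d1)]
    return sum(vals) / len(vals)

def attention_walk(attn, spans, start_idx, threshold=0.1, max_hops=3):
    """Greedily hop from the current span to the unvisited span it
    attends to most, chaining together potentially distant spans."""
    chain = [start_idx]
    visited = {start_idx}
    current = start_idx
    for _ in range(max_hops):
        scores = [
            (span_attention(attn, spans[current], spans[j]), j)
            for j in range(len(spans))
            if j not in visited
        ]
        if not scores:
            break
        best_score, best_j = max(scores)
        if best_score < threshold:
            break  # no remaining span is attended strongly enough
        chain.append(best_j)
        visited.add(best_j)
        current = best_j
    return chain

# Toy example: span 0 attends strongly to span 2, which in turn
# attends to span 1, so the walk links all three despite distance.
attn = [[0.0] * 6 for _ in range(6)]
for i in (0, 1):
    for j in (4, 5):
        attn[i][j] = 0.5   # span 0 -> span 2
for i in (4, 5):
    for j in (2, 3):
        attn[i][j] = 0.4   # span 2 -> span 1
spans = [(0, 2), (2, 4), (4, 6)]
print(attention_walk(attn, spans, 0))  # → [0, 2, 1]
```

In the full method, the linked spans produced by such a walk would then be handed to the answer aggregator, which fuses them into a fluent answer via mask filling.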

Related research:

- 08/23/2022 — Unsupervised Question Answering via Answer Diversifying
- 10/11/2022 — Capturing Global Structural Information in Long Document Question Answering with Compressive Graph Selector Network
- 05/10/2021 — Poolingformer: Long Document Modeling with Pooling Attention
- 05/27/2022 — V-Doc: Visual questions answers with Documents
- 05/06/2020 — Harvesting and Refining Question-Answer Pairs for Unsupervised QA
- 05/08/2021 — D2S: Document-to-Slide Generation Via Query-Based Text Summarization
- 08/28/2020 — Rethinking the objectives of extractive question answering
