AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization

11/11/2021
by Alexander R. Fabbri, et al.

Community Question Answering (CQA) fora such as Stack Overflow and Yahoo! Answers contain a rich resource of answers to a wide range of community-based questions. Each question thread can receive a large number of answers with different perspectives. One goal of answer summarization is to produce a summary that reflects the range of answer perspectives. A major obstacle for abstractive answer summarization is the absence of a dataset to provide supervision for producing such summaries. Recent work proposes heuristics to create such data, but the resulting data are often noisy and do not cover all perspectives present in the answers. This work introduces a novel dataset of 4,631 CQA threads for answer summarization, curated by professional linguists. Our pipeline gathers annotations for all subtasks involved in answer summarization: selecting answer sentences relevant to the question, grouping these sentences by perspective, summarizing each perspective, and producing an overall summary. We analyze and benchmark state-of-the-art models on these subtasks and introduce a novel unsupervised approach for multi-perspective data augmentation, which further boosts overall summarization performance according to automatic evaluation. Finally, we propose reinforcement learning rewards to improve factual consistency and answer coverage, and we analyze areas for improvement.
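To make the four subtasks concrete, here is a minimal Python sketch of how one thread could flow through them. It is not the authors' pipeline: in the dataset these steps are carried out by professional linguists, and the word-overlap heuristics below are only stand-ins. The names `AnnotatedThread`, `annotate`, `min_overlap`, and `sim_threshold`, as well as the example thread, are hypothetical and chosen for illustration.

```python
import re
from dataclasses import dataclass, field

# Small, hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "or", "in",
             "it", "i", "you", "my", "for", "on", "do", "how", "can", "with",
             "from", "this", "that"}


def tokens(text):
    """Return the set of lowercased non-stopword tokens in `text`."""
    return {w for w in re.findall(r"[a-z0-9']+", text.lower())
            if w not in STOPWORDS}


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


@dataclass
class AnnotatedThread:
    question: str
    relevant: list = field(default_factory=list)           # subtask 1
    clusters: list = field(default_factory=list)           # subtask 2
    cluster_summaries: list = field(default_factory=list)  # subtask 3
    summary: str = ""                                       # subtask 4


def annotate(question, answers, min_overlap=1, sim_threshold=0.2):
    q = tokens(question)
    thread = AnnotatedThread(question)

    # Subtask 1: keep answer sentences that share words with the question
    # (assumed heuristic standing in for human relevance judgments).
    for answer in answers:
        for sent in re.split(r"(?<=[.!?])\s+", answer.strip()):
            if sent and len(tokens(sent) & q) >= min_overlap:
                thread.relevant.append(sent)

    # Subtask 2: greedily group sentences by similarity to each cluster's
    # first sentence (assumed stand-in for perspective grouping).
    for sent in thread.relevant:
        for cluster in thread.clusters:
            if jaccard(tokens(sent), tokens(cluster[0])) >= sim_threshold:
                cluster.append(sent)
                break
        else:
            thread.clusters.append([sent])

    # Subtask 3: "summarize" each cluster extractively by picking the
    # sentence with the most question overlap (the dataset's summaries
    # are written by annotators, not extracted).
    thread.cluster_summaries = [
        max(c, key=lambda s: len(tokens(s) & q)) for c in thread.clusters
    ]

    # Subtask 4: join the per-perspective summaries into one summary.
    thread.summary = " ".join(thread.cluster_summaries)
    return thread


# Hypothetical example thread.
thread = annotate(
    "How do I remove rust from a cast iron pan?",
    [
        "Scrub the rust off with steel wool. Then season the pan with oil.",
        "Soak the pan in vinegar to dissolve the rust. I love cooking!",
        "Steel wool will scrub the rust away quickly.",
    ],
)
print(len(thread.relevant), "relevant sentences in",
      len(thread.clusters), "perspective clusters")
```

The sketch keeps each subtask's output as a separate field so that models can be trained or evaluated on any single step, which mirrors the abstract's point that annotations are gathered for every subtask rather than only for the final summary.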

research
04/17/2021

Multi-Perspective Abstractive Answer Summarization

Community Question Answering (CQA) forums such as Stack Overflow and Yah...
research
11/22/2019

Joint Learning of Answer Selection and Answer Summary Generation in Community Question Answering

Community question answering (CQA) gains increasing popularity in both a...
research
05/30/2023

Concise Answers to Complex Questions: Summarization of Long-form Answers

Long-form question answering systems provide rich information by present...
research
11/12/2018

CQASUMM: Building References for Community Question Answering Summarization Corpora

Community Question Answering forums such as Quora, Stackoverflow are ric...
research
10/08/2020

Multi-hop Inference for Question-driven Summarization

Question-driven summarization has been recently studied as an effective ...
research
05/06/2021

Text similarity analysis for evaluation of descriptive answers

Keeping in mind the necessity of intelligent system in educational secto...
research
12/04/2014

Deep Learning for Answer Sentence Selection

Answer sentence selection is the task of identifying sentences that cont...
