Determining Question-Answer Plausibility in Crowdsourced Datasets Using Multi-Task Learning

11/10/2020
by Rachel Gardner, et al.

Datasets extracted from social networks and online forums are often prone to the pitfalls of natural language, namely the presence of unstructured and noisy data. In this work, we seek to enable the collection of high-quality question-answer datasets from social media by proposing a novel task for automated quality analysis and data cleaning: question-answer (QA) plausibility. Given a machine- or user-generated question and a crowdsourced response from a social media user, we determine whether the question and the response are valid; if so, we identify the answer within the free-form response. We design BERT-based models to perform the QA plausibility task, and we evaluate the ability of our models to generate a clean, usable question-answer dataset. Our highest-performing approach consists of a single-task model that determines the plausibility of the question, followed by a multi-task model that evaluates the plausibility of the response and extracts the answer (Question Plausibility AUROC=0.75, Response Plausibility AUROC=0.78, Answer Extraction F1=0.665).
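The two-stage filtering described above can be sketched roughly as follows. This is only an illustration of the control flow, not the authors' implementation: the names `clean_pairs`, `toy_question_scorer`, and `toy_response_model`, the 0.5 thresholds, and the character-offset span format are all assumptions, and the toy scorers are simple stand-ins for the BERT-based models.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass
class QAPair:
    question: str
    answer: str


# Hypothetical interfaces (not taken from the paper):
#   stage 1 maps a question to a plausibility score in [0, 1];
#   stage 2 maps (question, response) to a response plausibility score
#   plus an optional (start, end) character span for the answer.
QuestionScorer = Callable[[str], float]
ResponseModel = Callable[[str, str], Tuple[float, Optional[Tuple[int, int]]]]


def clean_pairs(
    examples: Iterable[Tuple[str, str]],
    question_scorer: QuestionScorer,
    response_model: ResponseModel,
    q_threshold: float = 0.5,  # assumed cutoff, not from the paper
    r_threshold: float = 0.5,  # assumed cutoff, not from the paper
) -> List[QAPair]:
    """Keep only plausible questions with plausible responses, and cut out the answer."""
    cleaned = []
    for question, response in examples:
        # Stage 1: single-task question plausibility check.
        if question_scorer(question) < q_threshold:
            continue
        # Stage 2: joint response plausibility and answer extraction.
        r_score, span = response_model(question, response)
        if r_score < r_threshold or span is None:
            continue
        start, end = span
        answer = response[start:end].strip()
        if answer:
            cleaned.append(QAPair(question, answer))
    return cleaned


def toy_question_scorer(question: str) -> float:
    # Stand-in heuristic: treat anything ending in "?" as a plausible question.
    return 1.0 if question.strip().endswith("?") else 0.0


def toy_response_model(question: str, response: str) -> Tuple[float, Optional[Tuple[int, int]]]:
    # Stand-in heuristic: a non-empty response is plausible and is its own answer.
    stripped = response.strip()
    if not stripped:
        return 0.0, None
    start = len(response) - len(response.lstrip())
    return 1.0, (start, start + len(stripped))


examples = [
    ("What breed is your dog?", " a golden retriever "),
    ("nice pic", "thanks"),
    ("Where was this taken?", ""),
]
result = clean_pairs(examples, toy_question_scorer, toy_response_model)
```

In this sketch the question check runs first, so implausible questions are discarded before the second model is called; in practice the two callables would wrap trained classifiers rather than string heuristics.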

research
06/24/2021

VOGUE: Answer Verbalization through Multi-Task Learning

In recent years, there have been significant developments in Question An...
research
04/28/2020

Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection

Nowadays, offensive content in social media has become a serious problem...
research
08/29/2021

MEDIC: A Multi-Task Learning Dataset for Disaster Image Classification

Recent research in disaster informatics demonstrates a practical and imp...
research
07/14/2019

TWEETQA: A Social Media Focused Question Answering Dataset

With social media becoming increasingly popular on which lots of news a...
research
04/09/2019

Quizbowl: The Case for Incremental Question Answering

Quizbowl is a scholastic trivia competition that tests human knowledge a...
research
07/08/2017

Predicting the Quality of Short Narratives from Social Media

An important and difficult challenge in building computational models fo...
research
12/06/2015

Want Answers? A Reddit Inspired Study on How to Pose Questions

Questions form an integral part of our everyday communication, both offl...
