Determining Question-Answer Plausibility in Crowdsourced Datasets Using Multi-Task Learning

11/10/2020
by Rachel Gardner, et al.

Datasets extracted from social networks and online forums are often prone to the pitfalls of natural language, namely the presence of unstructured and noisy data. In this work, we seek to enable the collection of high-quality question-answer datasets from social media by proposing a novel task for automated quality analysis and data cleaning: question-answer (QA) plausibility. Given a machine- or user-generated question and a crowdsourced response from a social media user, we determine whether the question and the response are valid; if so, we identify the answer within the free-form response. We design BERT-based models to perform the QA plausibility task, and we evaluate the ability of our models to generate a clean, usable question-answer dataset. Our highest-performing approach consists of a single-task model that determines the plausibility of the question, followed by a multi-task model that evaluates the plausibility of the response and extracts the answer (Question Plausibility AUROC=0.75, Response Plausibility AUROC=0.78, Answer Extraction F1=0.665).
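
The two-stage approach described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `clean_pair`, `QuestionScorer`, `ResponseModel`, and `CleanedPair`, the 0.5 threshold, and the character-span output format are all assumptions. The two toy scorers are placeholders for the fine-tuned BERT-based models.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical interfaces (assumptions, not from the paper):
# stage 1 maps a question to a plausibility score in [0, 1];
# stage 2 maps (question, response) to a response plausibility score
# plus an optional (start, end) character span for the answer.
QuestionScorer = Callable[[str], float]
ResponseModel = Callable[[str, str], Tuple[float, Optional[Tuple[int, int]]]]


@dataclass
class CleanedPair:
    question: str
    answer: str


def clean_pair(
    question: str,
    response: str,
    question_scorer: QuestionScorer,
    response_model: ResponseModel,
    threshold: float = 0.5,  # assumed cutoff; the source does not state one
) -> Optional[CleanedPair]:
    """Return a cleaned QA pair, or None if either stage rejects the input."""
    # Stage 1: single-task question plausibility.
    if question_scorer(question) < threshold:
        return None
    # Stage 2: multi-task response plausibility and answer extraction.
    plausibility, span = response_model(question, response)
    if plausibility < threshold or span is None:
        return None
    start, end = span
    return CleanedPair(question=question, answer=response[start:end])


# Toy stand-ins for the trained models, for demonstration only.
def toy_question_scorer(question: str) -> float:
    return 0.9 if question.strip().endswith("?") else 0.1


def toy_response_model(
    question: str, response: str
) -> Tuple[float, Optional[Tuple[int, int]]]:
    idx = response.lower().find("paris")
    if idx == -1:
        return 0.2, None
    return 0.8, (idx, idx + len("paris"))


result = clean_pair(
    "Where was this photo taken?",
    "I think it was Paris, last summer",
    toy_question_scorer,
    toy_response_model,
)
rejected = clean_pair(
    "nice pic",
    "I think it was Paris, last summer",
    toy_question_scorer,
    toy_response_model,
)
```

Running the question check first means implausible questions are discarded before the response model is ever called, which mirrors the ordering described in the abstract.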

06/24/2021

VOGUE: Answer Verbalization through Multi-Task Learning

In recent years, there have been significant developments in Question An...
04/28/2020

Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection

Nowadays, offensive content in social media has become a serious problem...
08/29/2021

MEDIC: A Multi-Task Learning Dataset for Disaster Image Classification

Recent research in disaster informatics demonstrates a practical and imp...
07/14/2019

TWEETQA: A Social Media Focused Question Answering Dataset

With social media becoming increasingly popular on which lots of news a...
11/18/2021

How to Build Robust FAQ Chatbot with Controllable Question Generator?

Many unanswerable adversarial questions fool the question-answer (QA) sy...
07/08/2017

Predicting the Quality of Short Narratives from Social Media

An important and difficult challenge in building computational models fo...
12/06/2015

Want Answers? A Reddit Inspired Study on How to Pose Questions

Questions form an integral part of our everyday communication, both offl...