What Does My QA Model Know? Devising Controlled Probes using Expert Knowledge

12/31/2019
by Kyle Richardson, et al.

Open-domain question answering (QA) is known to involve several underlying knowledge and reasoning challenges, but are models actually learning such knowledge when trained on benchmark tasks? To investigate this, we introduce several new challenge tasks that probe whether state-of-the-art QA models have general knowledge about word definitions and general taxonomic reasoning, both of which are fundamental to more complex forms of reasoning and are widespread in benchmark datasets. As an alternative to expensive crowd-sourcing, we introduce a methodology for automatically building datasets from various types of expert knowledge (e.g., knowledge graphs and lexical taxonomies), allowing for systematic control over the resulting probes and for a more comprehensive evaluation. We find automatically constructing probes to be vulnerable to annotation artifacts, which we carefully control for. Our evaluation confirms that transformer-based QA models are already predisposed to recognize certain types of structural lexical knowledge. However, it also reveals a more nuanced picture: their performance degrades substantially with even a slight increase in the number of hops in the underlying taxonomic hierarchy, or as more challenging distractor candidate answers are introduced. Further, even when these models succeed at the standard instance-level evaluation, they leave much room for improvement when assessed at the level of clusters of semantically connected probes (e.g., all Isa questions about a concept).
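
To make the distinction between instance-level and cluster-level evaluation concrete, here is a minimal Python sketch. It assumes a strict cluster criterion, where a cluster counts as correct only if every probe in it is answered correctly; the abstract does not specify the exact metric, so this criterion, the function names, and the sample data are all illustrative assumptions rather than the authors' implementation.

```python
from collections import defaultdict


def instance_accuracy(results):
    """Fraction of individual probes answered correctly."""
    return sum(correct for _, correct in results) / len(results)


def cluster_accuracy(results):
    """Fraction of clusters in which every probe is answered correctly.

    Assumption: a strict all-or-nothing criterion per cluster.
    """
    clusters = defaultdict(list)
    for cluster_id, correct in results:
        clusters[cluster_id].append(correct)
    return sum(all(flags) for flags in clusters.values()) / len(clusters)


# Hypothetical results: (concept the probe belongs to, answered correctly?).
results = [
    ("dog", True), ("dog", True), ("dog", False),
    ("oak", True), ("oak", True),
]

print(instance_accuracy(results))  # 4 of 5 probes correct
print(cluster_accuracy(results))   # 1 of 2 clusters fully correct
```

In this made-up example a single wrong answer about "dog" costs one probe at the instance level but the entire cluster at the cluster level, which is why a model can score well on the former and still leave room for improvement on the latter.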


