A Quantitative Evaluation of Natural Language Question Interpretation for Question Answering Systems

09/20/2018
by Takuto Asakura, et al.

Systematic benchmark evaluation plays an important role in improving technologies for Question Answering (QA) systems. While a number of evaluation methods already exist for natural language (NL) QA systems, most of them consider only the final answers, limiting them to black-box style evaluation. Herein, we propose a subdivided evaluation approach that enables finer-grained evaluation of QA systems, and present an evaluation tool targeting the NL question (NLQ) interpretation step, the initial step of a QA pipeline. The results of experiments on two public benchmark datasets suggest that the proposed approach yields deeper insight into a QA system's performance than black-box style approaches, and thus provides better guidance for improving such systems.
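The contrast between black-box and subdivided evaluation can be illustrated with a minimal sketch. Everything here is a hypothetical toy, not the paper's actual tool or datasets: `interpret` and `execute` stand in for the two pipeline steps, and the gold query/answer pairs are invented. The point is only that scoring the interpretation step separately reveals *where* a failure occurs, which a final-answer comparison cannot.

```python
# Toy two-step QA pipeline: NLQ interpretation followed by answer retrieval.
# All names and data below are illustrative assumptions.

def interpret(nlq: str) -> str:
    """NLQ-interpretation step: map a question to a structured query."""
    table = {"Who wrote Hamlet?": "SELECT author WHERE work='Hamlet'"}
    return table.get(nlq, "UNKNOWN")

def execute(query: str) -> str:
    """Answer-retrieval step: run the structured query against a tiny KB."""
    kb = {"SELECT author WHERE work='Hamlet'": "Shakespeare"}
    return kb.get(query, "no answer")

def black_box_eval(nlq: str, gold_answer: str) -> bool:
    # Only the final answer is checked; an interpretation error and a
    # retrieval error are indistinguishable.
    return execute(interpret(nlq)) == gold_answer

def subdivided_eval(nlq: str, gold_query: str, gold_answer: str) -> dict:
    # The interpretation step is scored against a gold query as well,
    # pinpointing which step introduced an error.
    query = interpret(nlq)
    return {
        "interpretation_ok": query == gold_query,
        "answer_ok": execute(query) == gold_answer,
    }
```

Under this sketch, a question answered wrongly because of a bad query would fail both metrics, while one interpreted correctly but retrieved wrongly would fail only `answer_ok`, separating the two error sources.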

