SimpleQuestions Nearly Solved: A New Upperbound and Baseline Approach

04/24/2018
by Michael Petrochuk, et al.

The SimpleQuestions dataset is one of the most commonly used benchmarks for studying single-relation factoid questions. In this paper, we present new evidence that this benchmark can be nearly solved by standard methods. First, we show that ambiguity in the data bounds performance on this benchmark at 83.4%; there are often multiple answers that cannot be disambiguated from the linguistic signal alone. Second, we introduce a baseline that sets a new state-of-the-art performance level at 78.1% accuracy, despite using standard methods. Finally, we report an empirical analysis showing that the upperbound is loose; roughly a third of the remaining errors are also not resolvable from the linguistic signal. Together, these results suggest that the SimpleQuestions dataset is nearly solved.
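To make the idea of an ambiguity-driven upperbound concrete, here is a minimal sketch, not the authors' procedure. It assumes that questions with identical text but different gold answers cannot be told apart from the text alone, so the best any text-only system can do for such a group is to predict its most frequent answer. The function name `ambiguity_upper_bound`, the grouping key (exact question text), and the toy examples are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def ambiguity_upper_bound(examples):
    """Return the best accuracy reachable when identical questions
    must all receive the same predicted answer.

    `examples` is an iterable of (question_text, gold_answer) pairs.
    Grouping by exact question text is an illustrative assumption.
    """
    answers_by_question = defaultdict(Counter)
    for question, answer in examples:
        answers_by_question[question][answer] += 1

    total = sum(sum(c.values()) for c in answers_by_question.values())
    if total == 0:
        raise ValueError("examples must not be empty")

    # For each group, a system can get at most the majority answer right.
    best_correct = sum(max(c.values()) for c in answers_by_question.values())
    return best_correct / total


# Hypothetical toy data: one ambiguous question asked three times
# with two different gold answers, plus one unambiguous question.
toy_examples = [
    ("who wrote gulliver's travels", "book_author"),
    ("who wrote gulliver's travels", "film_writer"),
    ("who wrote gulliver's travels", "book_author"),
    ("where was obama born", "place_of_birth"),
]
bound = ambiguity_upper_bound(toy_examples)
print(f"upper bound on accuracy: {bound:.2f}")
```

On the toy data, at most three of the four examples can be answered correctly, so the bound is 0.75. A bound computed this way is only as tight as the grouping allows, which fits the abstract's remark that the reported upperbound is loose.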


research
09/25/2021

Systematic Generalization on gSCAN: What is Nearly Solved and What is Next?

We analyze the grounded SCAN (gSCAN) benchmark, which was recently propo...
research
09/13/2021

SituatedQA: Incorporating Extra-Linguistic Contexts into QA

Answers to the same question may change depending on the extra-linguisti...
research
11/22/2019

Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation

In this work, we introduce Panoptic-DeepLab, a simple, strong, and fast ...
research
07/08/2023

Answering Ambiguous Questions via Iterative Prompting

In open-domain question answering, due to the ambiguity of questions, mu...
research
06/12/2017

Evolutionary Multitasking for Single-objective Continuous Optimization: Benchmark Problems, Performance Metric, and Baseline Results

In this report, we suggest nine test problems for multi-task single-obje...
research
05/28/2019

Global forensic geolocation with deep neural networks

An important problem in forensic analyses is identifying the provenance ...
research
07/17/2023

Revisiting Scene Text Recognition: A Data Perspective

This paper aims to re-assess scene text recognition (STR) from a data-or...
