WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia

08/11/2016
by Daniel Hewlett, et al.

We present WikiReading, a large-scale natural language understanding task and publicly available dataset with 18 million instances. The task is to predict textual values from the structured knowledge base Wikidata by reading the text of the corresponding Wikipedia articles. The task contains a rich variety of challenging classification and extraction sub-tasks, making it well-suited for end-to-end models such as deep neural networks (DNNs). We compare various state-of-the-art DNN-based architectures for document classification, information extraction, and question answering. We find that models supporting a rich answer space, such as word or character sequences, perform best. Our best-performing model, a word-level sequence-to-sequence model with a mechanism to copy out-of-vocabulary words, obtains an accuracy of 71.8%.
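To make the task format and the idea of copying out-of-vocabulary words more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the `Instance` fields, the `<oov_N>` placeholder scheme, and the helper names are all hypothetical, and the toy article and vocabulary are invented for the example. The sketch treats an instance as a (document, property, answer) triple and shows one simple way a word-level model with a fixed vocabulary could still emit a rare word, by replacing it with a per-document placeholder on input and mapping the placeholder back on output.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class Instance:
    """Hypothetical shape of one task instance (field names are assumptions)."""
    document: List[str]  # tokenized Wikipedia article text
    prop: str            # name of the Wikidata property to predict
    answer: List[str]    # tokenized target value


def encode_with_placeholders(
    tokens: List[str], vocab: Set[str]
) -> Tuple[List[str], Dict[str, str]]:
    """Replace each out-of-vocabulary token with a per-document placeholder."""
    mapping: Dict[str, str] = {}
    encoded: List[str] = []
    for tok in tokens:
        if tok in vocab:
            encoded.append(tok)
        else:
            # The same OOV word always maps to the same placeholder.
            if tok not in mapping:
                mapping[tok] = f"<oov_{len(mapping)}>"
            encoded.append(mapping[tok])
    return encoded, mapping


def decode_placeholders(tokens: List[str], mapping: Dict[str, str]) -> List[str]:
    """Map placeholders in a predicted sequence back to the original words."""
    inverse = {placeholder: word for word, placeholder in mapping.items()}
    return [inverse.get(tok, tok) for tok in tokens]


# Toy example: invented article text and a deliberately tiny vocabulary.
vocab = {"was", "born", "in", "the", "city", "of"}
inst = Instance(
    document="Folkart was born in the city of Zzyzx".split(),
    prop="place of birth",
    answer=["Zzyzx"],
)
encoded, mapping = encode_with_placeholders(inst.document, vocab)

# A model would be trained to output the placeholder; here we fake its prediction.
predicted = [mapping["Zzyzx"]]
restored = decode_placeholders(predicted, mapping)
```

In this toy run the answer word is not in the vocabulary, yet the faked prediction decodes back to the correct answer, which is the property a copying mechanism is meant to provide. How the actual model selects which token to copy is not described in the abstract and is not modeled here.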


