Beyond I.I.D.: Three Levels of Generalization for Question Answering on Knowledge Bases

by Yu Gu, et al.

Existing studies on question answering on knowledge bases (KBQA) mainly operate with the standard i.i.d. assumption, i.e., the training distribution over questions is the same as the test distribution. However, i.i.d. may be neither reasonably achievable nor desirable on large-scale KBs because 1) the true user distribution is hard to capture and 2) randomly sampling training examples from the enormous space would be highly data-inefficient. Instead, we suggest that KBQA models should have three levels of built-in generalization: i.i.d., compositional, and zero-shot. To facilitate the development of KBQA models with stronger generalization, we construct and release a new large-scale, high-quality dataset with 64,331 questions, GrailQA, and provide evaluation settings for all three levels of generalization. In addition, we propose a novel BERT-based KBQA model. The combination of our dataset and model enables us to thoroughly examine and demonstrate, for the first time, the key role of pre-trained contextual embeddings like BERT in the generalization of KBQA.




Zero-shot Commonsense Question Answering with Cloze Translation and Consistency Optimization

Commonsense question answering (CQA) aims to test if models can answer q...

Challenges in Generalization in Open Domain Question Answering

Recent work on Open Domain Question Answering has shown that there is a ...

ClarQ: A large-scale and diverse dataset for Clarification Question Generation

Question answering and conversational systems are often baffled and need...

CFO: Conditional Focused Neural Question Answering with Large-scale Knowledge Bases

How can we enable computers to automatically answer questions like "Who ...

Knowledge Graph Question Answering Datasets and Their Generalizability: Are They Enough for Future Research?

Existing approaches on Question Answering over Knowledge Graphs (KGQA) h...

Paired Examples as Indirect Supervision in Latent Decision Models

Compositional, structured models are appealing because they explicitly d...

Entity-Based Knowledge Conflicts in Question Answering

Knowledge-dependent tasks typically use two sources of knowledge: parame...