Topic Transferable Table Question Answering

09/15/2021
by Saneem Ahmed Chemmengath, et al.

Weakly-supervised table question-answering (TableQA) models have achieved state-of-the-art performance by using a pre-trained BERT transformer to jointly encode a question and a table and produce a structured query for the question. However, in practical settings, TableQA systems are deployed over table corpora whose topic and word distributions are quite distinct from those of BERT's pretraining corpus. In this work, we simulate the practical topic-shift scenario by designing novel challenge benchmarks, WikiSQL-TS and WikiTQ-TS, consisting of train-dev-test splits in five distinct topic groups, based on the popular WikiSQL and WikiTableQuestions datasets. We empirically show that, despite pre-training on large open-domain text, the performance of models degrades significantly when they are evaluated on unseen topics. In response, we propose T3QA (Topic Transferable Table Question Answering), a pragmatic adaptation framework for TableQA comprising: (1) topic-specific vocabulary injection into BERT, (2) a novel natural language question generation pipeline, based on a text-to-text transformer generator (such as T5 or GPT2), focused on generating topic-specific training data, and (3) a logical form reranker. We show that T3QA provides a reasonably good baseline for our topic-shift benchmarks. We believe our topic-split benchmarks will lead to robust TableQA solutions that are better suited for practical deployment.
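
As a rough illustration of component (1), the sketch below picks frequent topic-specific words that are missing from a base vocabulary, which is one plausible first step before injecting them into a model's vocabulary. The abstract does not describe the actual selection procedure, so the function name `select_topic_tokens`, its parameters, the whitespace tokenization, and the toy data are all assumptions made for illustration and are not the authors' method.

```python
from collections import Counter


def select_topic_tokens(topic_texts, base_vocab, max_new_tokens=3, min_count=2):
    """Return frequent topic words that are absent from the base vocabulary.

    Hypothetical helper: the frequency criterion and thresholds are
    illustrative assumptions, not taken from the paper.
    """
    # Count lowercase, whitespace-separated words across the topic corpus.
    counts = Counter(
        word
        for text in topic_texts
        for word in text.lower().split()
    )
    # Keep words the base vocabulary lacks and that occur often enough.
    candidates = [
        (word, n)
        for word, n in counts.items()
        if word not in base_vocab and n >= min_count
    ]
    # Most frequent first; ties broken alphabetically for determinism.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [word for word, _ in candidates[:max_new_tokens]]


# Toy stand-ins for a pretraining vocabulary and a target-topic corpus.
base_vocab = {"the", "of", "in", "player", "team", "score"}
sports_texts = [
    "the quarterback of the team",
    "quarterback score in the playoffs",
    "playoffs team roster",
]

new_tokens = select_topic_tokens(sports_texts, base_vocab)
extended_vocab = base_vocab | set(new_tokens)
```

In a real system the selected words would still need embeddings in the encoder; how T3QA initializes and trains them is not stated in the abstract.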


