DocuT5: Seq2seq SQL Generation with Table Documentation

11/11/2022
by Elena Soare, et al.

Current SQL generators based on pre-trained language models struggle to answer complex questions that require domain context or an understanding of fine-grained table structure. Humans would deal with these unknowns by reasoning over the documentation of the tables. Based on this hypothesis, we propose DocuT5, which uses an off-the-shelf language model architecture and injects knowledge from external "documentation" to improve domain generalization. We perform experiments on the Spider family of datasets, which contain complex cross-domain, multi-table questions. Specifically, we develop a new text-to-SQL failure taxonomy and find that 19.6% of errors are due to foreign key mistakes, and 49.2% are due to a lack of domain knowledge. DocuT5 captures knowledge from (1) the table structure context of foreign keys and (2) domain knowledge, through contextualizing tables and columns. Both types of knowledge improve over state-of-the-art T5 with constrained decoding on Spider, and domain knowledge yields effectiveness comparable to the state of the art on the Spider-DK and Spider-SYN datasets.
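To make the idea of injecting foreign keys and column documentation into a seq2seq input more concrete, here is a minimal sketch. It is not the authors' implementation: the `Column`, `Table`, and `serialize` names, the ` | ` and ` , ` separators, the `( ... )` wrapping of descriptions, and the `->` foreign-key marker are all assumptions chosen for illustration, and the abstract does not specify the actual serialization format.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    """Hypothetical column record; field names are illustrative only."""
    name: str
    description: str = ""   # free-text documentation for the column, if any
    references: str = ""    # "table.column" target when this is a foreign key


@dataclass
class Table:
    """Hypothetical table record holding its documented columns."""
    name: str
    columns: list = field(default_factory=list)


def serialize(question, db_id, tables):
    """Flatten a question plus a documented schema into one input string.

    The layout (assumed, not taken from the paper) is:
    question | db_id | table : col ( description ) -> fk_target , ...
    """
    parts = [question.strip(), db_id]
    for table in tables:
        cols = []
        for col in table.columns:
            text = col.name
            if col.description:
                # Domain knowledge: attach the column's documentation.
                text += f" ( {col.description} )"
            if col.references:
                # Table structure: mark the foreign-key target.
                text += f" -> {col.references}"
            cols.append(text)
        parts.append(f"{table.name} : " + " , ".join(cols))
    return " | ".join(parts)
```

The resulting string could then be tokenized and passed to any sequence-to-sequence model as the encoder input; how the model consumes it is outside the scope of this sketch.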
