Adapt and Decompose: Efficient Generalization of Text-to-SQL via Domain Adapted Least-To-Most Prompting

08/01/2023
by Aseem Arora, et al.

Cross-domain and cross-compositional generalization of Text-to-SQL semantic parsing is a challenging task. Existing Large Language Model (LLM) based solutions rely on inference-time retrieval of few-shot exemplars from the training set to synthesize a run-time prompt for each Natural Language (NL) test query. In contrast, we devise an algorithm that performs offline sampling of a minimal set of few-shot exemplars from the training data, with complete coverage of SQL clauses, operators, and functions, and maximal domain coverage within the allowed token length. This allows for the synthesis of a fixed Generic Prompt (GP), with a diverse set of exemplars common across NL test queries, avoiding expensive test-time exemplar retrieval. We further auto-adapt the GP to the target database domain (DA-GP) to better handle cross-domain generalization, followed by a decomposed Least-To-Most Prompting (LTMP-DA-GP) to handle cross-compositional generalization. The synthesis of LTMP-DA-GP is an offline task, performed once per new database with minimal human intervention. Our approach demonstrates superior performance on the KaggleDBQA dataset, which is designed to evaluate generalizability for the Text-to-SQL task. We further show consistent performance improvement of LTMP-DA-GP over GP across LLMs and databases of KaggleDBQA, highlighting the efficacy and model-agnostic benefits of our prompt-based adapt-and-decompose approach.
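
The offline sampling step described above can be pictured as a coverage problem: choose few exemplars so that every SQL clause, operator, and function of interest appears at least once, while spreading the choices across domains and staying within a token budget. The sketch below is only an illustration of that idea using a plain greedy set-cover heuristic; it is not the algorithm from the paper. The feature list `FEATURES`, the example pool `POOL`, the per-example token costs, and the helper names `features_of` and `select_exemplars` are all hypothetical, and feature detection is a naive substring match.

```python
from typing import List, Set, Tuple

# Hypothetical exemplar record: (NL question, SQL query, domain, token cost).
Example = Tuple[str, str, str, int]

# Assumed, deliberately small inventory of SQL features to cover.
FEATURES = ["SELECT", "WHERE", "GROUP BY", "ORDER BY", "JOIN", "COUNT", "MAX", "LIKE"]


def features_of(sql: str) -> Set[str]:
    """Return the tracked features found in a query (naive substring match)."""
    upper = sql.upper()
    return {f for f in FEATURES if f in upper}


def select_exemplars(pool: List[Example], token_budget: int) -> List[Example]:
    """Greedy set cover: repeatedly take the affordable exemplar that adds the
    most uncovered SQL features, breaking ties in favour of an unseen domain."""
    covered: Set[str] = set()
    domains: Set[str] = set()
    chosen: List[Example] = []
    remaining = token_budget
    candidates = list(pool)

    def gain(e: Example) -> Tuple[int, int]:
        return (len(features_of(e[1]) - covered), int(e[2] not in domains))

    while candidates:
        fitting = [e for e in candidates if e[3] <= remaining]
        if not fitting:
            break  # nothing left that fits in the budget
        best = max(fitting, key=gain)
        if gain(best) == (0, 0):
            break  # no new feature and no new domain to gain
        chosen.append(best)
        covered |= features_of(best[1])
        domains.add(best[2])
        remaining -= best[3]
        candidates.remove(best)
    return chosen


# Hypothetical training pool, invented for illustration only.
POOL: List[Example] = [
    ("How many schools are there?",
     "SELECT COUNT(*) FROM schools", "education", 20),
    ("List players by name.",
     "SELECT name FROM players ORDER BY name", "sports", 20),
    ("How many theft-like crimes per city?",
     "SELECT city, COUNT(*) FROM crimes WHERE type LIKE '%theft%' GROUP BY city",
     "crime", 35),
    ("What is the highest salary on each team?",
     "SELECT t.name, MAX(p.salary) FROM players p "
     "JOIN teams t ON p.team_id = t.id GROUP BY t.name",
     "sports", 40),
]

if __name__ == "__main__":
    for question, sql, domain, cost in select_exemplars(POOL, token_budget=100):
        print(f"[{domain}, {cost} tokens] {question} -> {sql}")
```

Because the selection depends only on the training pool and the budget, it can run once offline, and the resulting exemplars can be reused as a fixed prompt for every test query. A real implementation would need a proper SQL parser for feature extraction and an actual tokenizer for the costs.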


