Augmenting Multi-Turn Text-to-SQL Datasets with Self-Play

10/21/2022
by   Qi Liu, et al.
0

The task of context-dependent text-to-SQL aims to convert multi-turn user utterances to formal SQL queries. This is a challenging task due to both the scarcity of training data from which to learn complex contextual dependencies and to generalize to unseen databases. In this paper we explore augmenting the training datasets using self-play, which leverages contextual information to synthesize new interactions to adapt the model to new databases. We first design a SQL-to-text model conditioned on a sampled goal query, which represents a user's intent, that then converses with a text-to-SQL semantic parser to generate new interactions. We then filter the synthesized interactions and retrain the models with the augmented data. We find that self-play improves the accuracy of a strong baseline on SParC and CoSQL, two widely used cross-domain text-to-SQL datasets. Our analysis shows that self-play simulates various conversational thematic relations, enhances cross-domain generalization and improves beam-search.

READ FULL TEXT
research
06/05/2019

SParC: Cross-Domain Semantic Parsing in Context

We present SParC, a dataset for cross-domainSemanticParsing inContext th...
research
10/11/2018

SyntaxSQLNet: Syntax Tree Networks for Complex and Cross-DomainText-to-SQL Task

Most existing studies in text-to-SQL tasks do not require generating com...
research
05/25/2023

CSS: A Large-scale Cross-schema Chinese Text-to-SQL Medical Dataset

The cross-domain text-to-SQL task aims to build a system that can parse ...
research
05/11/2023

QURG: Question Rewriting Guided Context-Dependent Text-to-SQL Semantic Parsing

Context-dependent Text-to-SQL aims to translate multi-turn natural langu...
research
07/30/2020

Photon: A Robust Cross-Domain Text-to-SQL System

Natural language interfaces to databases (NLIDB) democratize end user ac...
research
08/01/2023

Adapt and Decompose: Efficient Generalization of Text-to-SQL via Domain Adapted Least-To-Most Prompting

Cross-domain and cross-compositional generalization of Text-to-SQL seman...
research
06/02/2021

LGESQL: Line Graph Enhanced Text-to-SQL Model with Mixed Local and Non-Local Relations

This work aims to tackle the challenging heterogeneous graph encoding pr...

Please sign up or login with your details

Forgot password? Click here to reset