Data Augmentation with Hierarchical SQL-to-Question Generation for Cross-domain Text-to-SQL Parsing

03/03/2021
by Ao Zhang, et al.

Data augmentation has attracted considerable research attention in the deep learning era for its ability to alleviate data sparseness. The lack of data for unseen evaluation databases is precisely the major challenge in cross-domain text-to-SQL parsing. Previous works either require human intervention to guarantee the quality of the generated data or fail to handle complex SQL queries. This paper presents a simple yet effective data augmentation framework. First, given a database, we automatically produce a large number of SQL queries based on an abstract syntax tree grammar. We require that the generated queries cover at least 80% of the SQL patterns in the training data, for better distribution matching. Second, we propose a hierarchical SQL-to-question generation model to obtain high-quality natural language questions, which is the major contribution of this work. Experiments on three cross-domain datasets, i.e., WikiSQL and Spider in English and DuSQL in Chinese, show that the proposed framework consistently improves performance over strong baselines, and that the hierarchical generation model in particular is key to the improvement.
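The first step, producing SQL queries from a grammar until they cover enough of the patterns seen in training, can be illustrated with a minimal sketch. This is not the paper's grammar or code: the toy schema, the one-rule grammar, the way a "pattern" is computed (masking tables, columns, and numeric values), and the 0.8 default target are all assumptions made for illustration.

```python
import random
import re

# Hypothetical toy schema (table -> columns); not taken from the paper.
SCHEMA = {"singer": ["name", "age", "country"], "concert": ["title", "year"]}

AGGS = ["", "COUNT", "MAX", "MIN"]  # "" means no aggregation
OPS = ["=", ">", "<"]


def sample_query(schema, rng):
    """Expand a tiny grammar: SELECT [agg] col FROM table [WHERE col op value]."""
    table = rng.choice(sorted(schema))
    cols = schema[table]
    col = rng.choice(cols)
    agg = rng.choice(AGGS)
    select = f"{agg}({col})" if agg else col
    sql = f"SELECT {select} FROM {table}"
    if rng.random() < 0.5:  # optionally attach a WHERE clause
        sql += f" WHERE {rng.choice(cols)} {rng.choice(OPS)} {rng.randint(1, 100)}"
    return sql


def pattern(sql, schema):
    """Mask tables, columns, and numbers so same-shaped queries share a pattern."""
    tables = set(schema)
    columns = {c for cols in schema.values() for c in cols}
    out = []
    for tok in re.findall(r"\w+|[^\w\s]", sql):
        if tok in tables:
            out.append("TAB")
        elif tok in columns:
            out.append("COL")
        elif tok.isdigit():
            out.append("VALUE")
        else:
            out.append(tok.upper())
    return " ".join(out)


def coverage(generated, reference, schema):
    """Fraction of reference patterns that also occur among generated queries."""
    gen = {pattern(q, schema) for q in generated}
    ref = {pattern(q, schema) for q in reference}
    return len(ref & gen) / len(ref) if ref else 1.0


def augment_until_covered(schema, reference, target=0.8, max_tries=1000, seed=0):
    """Sample queries until the coverage target is met or max_tries is reached."""
    rng = random.Random(seed)
    ref = {pattern(q, schema) for q in reference}
    queries, seen = [], set()
    for _ in range(max_tries):
        if not ref or len(ref & seen) / len(ref) >= target:
            break
        query = sample_query(schema, rng)
        queries.append(query)
        seen.add(pattern(query, schema))
    return queries
```

Calling `augment_until_covered(SCHEMA, training_queries)` with a list of training SQL strings returns sampled queries, and `coverage(...)` reports how much of the training pattern set they reach. A real system would need a far richer grammar (joins, nesting, grouping) and a proper SQL parser in place of the regex tokenizer used here.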


research
10/11/2018

SyntaxSQLNet: Syntax Tree Networks for Complex and Cross-Domain Text-to-SQL Task

Most existing studies in text-to-SQL tasks do not require generating com...
research
04/27/2023

Controllable Data Augmentation for Context-Dependent Text-to-SQL

The limited scale of annotated data constraints existing context-depende...
research
12/04/2021

Hierarchical Neural Data Synthesis for Semantic Parsing

Semantic parsing datasets are expensive to collect. Moreover, even the q...
research
10/19/2020

ColloQL: Robust Cross-Domain Text-to-SQL Over Search Queries

Translating natural language utterances to executable queries is a helpf...
research
10/29/2022

Diverse Parallel Data Synthesis for Cross-Database Adaptation of Text-to-SQL Parsers

Text-to-SQL parsers typically struggle with databases unseen during the ...
research
10/23/2019

AnnaParser: Semantic Parsing for Tabular Data Analysis

This paper presents a novel approach to translating natural language que...
research
10/23/2019

A Hybrid Semantic Parsing Approach for Tabular Data Analysis

This paper presents a novel approach to translating natural language que...
