DeepAI AI Chat
Log In Sign Up

Learning to Synthesize Data for Semantic Parsing

by   Bailin Wang, et al.

Synthesizing data for semantic parsing has gained increasing attention recently. However, most methods require handcrafted (high-precision) rules in their generative process, hindering the exploration of diverse unseen data. In this work, we propose a generative model which features a (non-neural) PCFG that models the composition of programs (e.g., SQL), and a BART-based translation model that maps a program to an utterance. Due to the simplicity of PCFG and pre-trained BART, our generative model can be efficiently learned from existing data at hand. Moreover, explicitly modeling compositions using PCFG leads to a better exploration of unseen programs, thus generate more diverse data. We evaluate our method in both in-domain and out-of-domain settings of text-to-SQL parsing on the standard benchmarks of GeoQuery and Spider, respectively. Our empirical results show that the synthesized data generated from our model can substantially help a semantic parser achieve better compositional and domain generalization.


page 1

page 2

page 3

page 4


Robust Text-to-SQL Generation with Execution-Guided Decoding

We consider the problem of neural semantic parsing, which translates nat...

Hierarchical Neural Data Synthesis for Semantic Parsing

Semantic parsing datasets are expensive to collect. Moreover, even the q...

Towards Generalizable and Robust Text-to-SQL Parsing

Text-to-SQL parsing tackles the problem of mapping natural language ques...

Importance of Synthesizing High-quality Data for Text-to-SQL Parsing

Recently, there has been increasing interest in synthesizing data to imp...

Decoupled Dialogue Modeling and Semantic Parsing for Multi-Turn Text-to-SQL

Recently, Text-to-SQL for multi-turn dialogue has attracted great intere...

Neural Semantic Parsing in Low-Resource Settings with Back-Translation and Meta-Learning

Neural semantic parsing has achieved impressive results in recent years,...

Neural-Symbolic Inference for Robust Autoregressive Graph Parsing via Compositional Uncertainty Quantification

Pre-trained seq2seq models excel at graph semantic parsing with rich ann...