Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task

09/24/2018
by   Tao Yu, et al.

We present Spider, a large-scale, complex and cross-domain semantic parsing and text-to-SQL dataset annotated by 11 college students. It consists of 10,181 questions and 5,693 unique complex SQL queries on 200 databases with multiple tables, covering 138 different domains. We define a new complex and cross-domain semantic parsing and text-to-SQL task in which different complex SQL queries and databases appear in the train and test sets. The task therefore requires the model to generalize well to both new SQL queries and new database schemas. Spider is distinct from most previous semantic parsing tasks, which use a single database and the exact same programs in the train set and the test set. We experiment with various state-of-the-art models, and the best model achieves only 14.3% exact matching accuracy on a database split setting. This shows that Spider presents a strong challenge for future research. Our dataset and task are publicly available at https://yale-lily.github.io/spider.
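The cross-database setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' released split or tooling. It only shows the idea that every database is assigned wholly to either the train set or the test set, so a model is evaluated on schemas it has never seen. The record fields (`db_id`, `question`, `query`), the function name `split_by_database`, and the `test_fraction` and `seed` parameters are assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_by_database(examples, test_fraction=0.2, seed=0):
    """Split examples so that no database appears in both train and test.

    `examples` is a list of dicts with a "db_id" key (an assumed field name).
    """
    # Group examples by the database they were written against.
    by_db = defaultdict(list)
    for ex in examples:
        by_db[ex["db_id"]].append(ex)

    # Shuffle database ids deterministically, then hold out a fraction of them.
    db_ids = sorted(by_db)
    random.Random(seed).shuffle(db_ids)
    n_test = max(1, int(len(db_ids) * test_fraction))
    test_dbs = set(db_ids[:n_test])

    train = [ex for db in db_ids if db not in test_dbs for ex in by_db[db]]
    test = [ex for db in db_ids if db in test_dbs for ex in by_db[db]]
    return train, test


# Toy data: 5 hypothetical databases with 3 question/query pairs each.
examples = [
    {"db_id": f"db_{d}", "question": f"question {i}", "query": "SELECT 1"}
    for d in range(5)
    for i in range(3)
]
train, test = split_by_database(examples)
train_dbs = {ex["db_id"] for ex in train}
test_dbs = {ex["db_id"] for ex in test}
```

Splitting at the database level, rather than at the example level, is what forces generalization to new schemas: a random split over examples would typically place questions about the same database on both sides.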


