CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases

09/11/2019
by Tao Yu, et al.

We present CoSQL, a corpus for building cross-domain, general-purpose database (DB) querying dialogue systems. It consists of 30k+ turns plus 10k+ annotated SQL queries, obtained from a Wizard-of-Oz (WOZ) collection of 3k dialogues querying 200 complex DBs spanning 138 domains. Each dialogue simulates a real-world DB query scenario in which a crowd worker acts as a user exploring the DB and a SQL expert retrieves answers with SQL, clarifies ambiguous questions, or otherwise informs the user of unanswerable questions. When user questions are answerable by SQL, the expert describes the SQL and execution results to the user, thereby maintaining a natural interaction flow. CoSQL introduces new challenges compared to existing task-oriented dialogue datasets: (1) the dialogue states are grounded in SQL, a domain-independent executable representation, instead of domain-specific slot-value pairs, and (2) because testing is done on unseen databases, success requires generalizing to new domains. CoSQL includes three tasks: SQL-grounded dialogue state tracking, response generation from query results, and user dialogue act prediction. We evaluate a set of strong baselines for each task and show that CoSQL presents significant challenges for future research. The dataset, baselines, and leaderboard will be released at https://yale-lily.github.io/cosql.
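As a rough illustration of what "dialogue states grounded in SQL" can mean in practice, the following minimal Python sketch treats the most recent SQL query in a dialogue as the current state and executes it against a database. Everything in it is an assumption made for illustration: the `Turn` class, the `current_state` helper, the `student` table, the example utterances, and the dialogue act labels are hypothetical and are not taken from the CoSQL data format or its label set.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Turn:
    """One dialogue turn (hypothetical structure, not the CoSQL file format)."""
    speaker: str                 # "user" or "expert"
    utterance: str
    dialogue_act: str            # placeholder label, not the dataset's label set
    sql: Optional[str] = None    # set only when the expert answers with SQL


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with a made-up schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
    )
    conn.executemany(
        "INSERT INTO student (name, age) VALUES (?, ?)",
        [("Ana", 20), ("Ben", 23), ("Cy", 19)],
    )
    return conn


def current_state(turns: List[Turn]) -> Optional[str]:
    """Return the most recent SQL query in the dialogue, or None if there is none."""
    for turn in reversed(turns):
        if turn.sql is not None:
            return turn.sql
    return None


dialogue = [
    Turn("user", "How many students are there?", "ask_question"),
    Turn("expert", "There are 3 students.", "give_answer",
         sql="SELECT COUNT(*) FROM student"),
    Turn("user", "Which of them are older than 19?", "ask_question"),
    Turn("expert", "Ana and Ben.", "give_answer",
         sql="SELECT name FROM student WHERE age > 19 ORDER BY name"),
]

conn = build_demo_db()
state = current_state(dialogue)          # SQL string acting as the dialogue state
rows = conn.execute(state).fetchall()    # execution result the expert would describe
print(state)
print(rows)
```

Because the state is an executable query rather than a set of domain-specific slot-value pairs, the same representation carries over unchanged to a database with a different schema, which is the property the abstract highlights for unseen domains.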

research
06/05/2019

SParC: Cross-Domain Semantic Parsing in Context

We present SParC, a dataset for cross-domain Semantic Parsing in Context th...
research
10/11/2018

SyntaxSQLNet: Syntax Tree Networks for Complex and Cross-Domain Text-to-SQL Task

Most existing studies in text-to-SQL tasks do not require generating com...
research
04/07/2020

RYANSQL: Recursively Applying Sketch-based Slot Fillings for Complex Text-to-SQL in Cross-Domain Databases

Text-to-SQL is the problem of converting a user question into an SQL que...
research
12/17/2022

Know What I don't Know: Handling Ambiguous and Unanswerable Questions for Text-to-SQL

The task of text-to-SQL is to convert a natural language question to its...
research
07/30/2020

Photon: A Robust Cross-Domain Text-to-SQL System

Natural language interfaces to databases (NLIDB) democratize end user ac...
research
07/01/2023

JoinBoost: Grow Trees Over Normalized Data Using Only SQL

Although dominant for tabular data, ML libraries that train tree models ...
research
12/23/2020

Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing

We present BRIDGE, a powerful sequential architecture for modeling depen...
