Embedding Individual Table Columns for Resilient SQL Chatbots

11/01/2018
by Bojan Petrovski, et al.

Most of the world's data is stored in relational databases. Accessing these requires specialized knowledge of the Structured Query Language (SQL), putting them out of the reach of many people. A recent research thread in Natural Language Processing (NLP) aims to alleviate this problem by automatically translating natural language questions into SQL queries. While the proposed solutions are a great start, they lack robustness and do not easily generalize: the methods require high-quality descriptions of the database table columns, and the most widely used training dataset, WikiSQL, is heavily biased towards using those descriptions as part of the questions. In this work, we propose solutions to both problems: we entirely eliminate the need for column descriptions by relying solely on column contents, and we augment the WikiSQL dataset by paraphrasing column names to reduce bias. We show that the accuracy of existing methods drops when trained on our augmented, column-agnostic dataset, and that our own method reaches state-of-the-art accuracy while relying on column contents only.
