Learning to Generate Structured Queries from Natural Language with Indirect Supervision

09/10/2018 ∙ by Ziwei Bai, et al.

Generating structured query language (SQL) from natural language is an emerging research topic. This paper presents a new learning paradigm based on indirect supervision from the answers to natural language questions, instead of SQL queries. This paradigm eases the acquisition of training data, since question-answer pairs for various domains are abundant on the Internet, and removes the need for difficult SQL annotation. An end-to-end neural model integrated with reinforcement learning is proposed to learn the SQL generation policy within the answer-driven learning paradigm. The model is evaluated on datasets of different domains, including movies and academic publications. Experimental results show that our model outperforms the baseline models.




1 Introduction

Nowadays, task-oriented dialogue systems allow intuitive interaction through natural language, where natural language understanding (NLU) is an essential part. Structured Query Language (SQL) is a standard language for accessing knowledge bases or relational databases. Thus, SQL generation from text is crucial for many NLU applications. However, SQL is very difficult for users without technical training, so natural language interfaces to databases have been widely studied Androutsopoulos et al. (1995); Popescu et al. (2004); Li and Jagadish (2014). Most of this work adopts one or more of the following techniques: rule-based pattern matching, parse tree mapping based on syntactic grammars, and constituent tree mapping based on semantic grammars. Some work Clarke et al. (2010); Liang et al. (2011); Cai and Yates (2013); Zettlemoyer and Collins (2005, 2007); Artzi et al. (2015); Yih et al. (2015) treats the problem as a subtask of semantic parsing. These techniques focus on grammar parsing for specific domains and cannot be easily generalized to different databases or application domains.

Several approaches to SQL generation from natural language (NL) have been proposed recently. Zhong et al. (2017) propose Seq2SQL, a SQL generation model based on pointer networks Vinyals et al. (2015), together with WikiSQL, a corpus of natural language questions, SQL queries and tables from Wikipedia. Subsequent work Xu et al. (2017); Yu et al. (2018) follows Seq2SQL and proposes various approaches to improve performance on the WikiSQL task. Cai et al. (2017) propose a SQL generation model integrated with SQL grammar. All of these approaches require training on datasets containing NL questions and the corresponding SQL queries. Such data is hard to collect, since SQL annotation requires full knowledge of SQL grammar and of the relations between all database tables. Therefore, we propose to learn SQL parsers from indirect supervision, where each NL sentence is labeled with the answer instead of the SQL query. This learning paradigm facilitates data acquisition, since the training data can be easily acquired from the Internet or from annotation by non-expert users.

In this paper, we propose a reinforcement learning based SQL generator (SQLGen), learned from indirect supervision of natural language questions and corresponding answers. SQLGen takes COPYNET Gu et al. (2016), an encoder-decoder structure, as the neural network component. Policy-based reinforcement learning is used to guide the learning of SQL generation, and two types of rewards are proposed. The rewards reflect how correct the generated SQL is, combining correctness of logic with correctness of query execution. To provide more precise supervision, the rewards are designed to be vectors instead of scalars, where each element is assigned to a corresponding word in the generated SQL query.

The main contributions of this paper are as follows. (1) We propose a novel learning paradigm for SQL generation without annotated SQL queries for the first time. (2) We design an end-to-end neural model based on COPYNET with policy-based reinforcement learning for the answer-driven learning paradigm. (3) We design a compound point-wise reward assignment mechanism for SQL generation policy learning.

2 Related Work

Semantic parsing, the problem of converting a natural language sentence to a formal meaning representation, has attracted researchers’ attention in recent years Clarke et al. (2010); Liang et al. (2011); Cai and Yates (2013). Some research has focused on learning semantic parsers that generate logical forms executable on knowledge bases Zelle and Mooney (1996); Zettlemoyer and Collins (2005, 2007); Artzi et al. (2015). Recently, there has been some work attempting to learn parsers using the results of query execution as indirect supervision Reddy et al. (2014); Yih et al. (2015); Pasupat and Liang (2015); Guu et al. (2017). However, the grammar structure of SQL is much more complicated than the logical forms in semantic parsing Cai et al. (2017), and it is non-trivial to adapt semantic parsing techniques to the SQL generation domain.

Although translating natural language into SQL queries has been extensively studied Warren and Pereira (1982); Androutsopoulos et al. (1995); Popescu et al. (2004); Giordani and Moschitti (2012), most work focuses on grammar parsing or on building interactive interfaces that rely heavily on the grammar, and the proposed methods are difficult to generalize to new databases. Iyer et al. (2017) propose a neural system based on the Seq2Seq model Sutskever et al. (2014) to translate natural language to SQL queries with user feedback, which requires gathering user feedback to improve accuracy or adapt to new domains. There has also been some work on answering natural language questions based on knowledge bases Lu et al. (2016); Mou et al. (2016).

The most relevant work includes the following. Seq2SQL Zhong et al. (2017) proposes a neural architecture based on pointer networks Vinyals et al. (2015) to generate SQL queries with reinforcement learning. Seq2SQL also proposes a WikiSQL corpus of natural language questions, SQL queries and tables from Wikipedia source. SQLNet Xu et al. (2017) follows the work of Seq2SQL and proposes a sequence-to-set-based approach without reinforcement learning, which improves the performance of WikiSQL task. TYPESQL Yu et al. (2018) employs a slot filling model to predict the attribute values in SQL. All methods split a SQL query into several parts, and predict each part using a different neural module. Furthermore, WikiSQL task only considers generating SQL queries with respect to one table. Cai et al. (2017) proposes an encoder-decoder framework integrated with SQL grammatical structures for SQL generation. It requires preprocessing of annotating the potential attribute values in natural language questions. Compared to the three methods, our approach has the following differences. (1) Our approach learns SQL queries with respect to multiple tables from indirect supervision of natural language question and answer pairs, instead of question and SQL pairs. (2) Our approach adopts an end-to-end learning framework without segmenting SQL queries and learning separately.

Our work is also related to the work on attentional Seq2Seq models, which show promising performance on neural machine translation Bahdanau et al. (2014); Tu et al. (2016), dialog generation Serban et al. (2017); Shang et al. (2015), question answering Chen et al. (2017); Xiong et al. (2016), etc. Our work adopts the framework of COPYNET Gu et al. (2016), which incorporates the copying mechanism into the attentional encoder-decoder model. The intuition is that the words from the source sequence may appear in the target sequence, which is true for SQL generation.

3 Task Description

The SQL generation task from natural language questions is described as follows. The input is the natural language question querying the database. The output is a SQL query, the meaning of which should be equivalent to that of the input question.

We show an example in Figure 1. The “Movie” table contains the “name”, “genre”, “director”, “year”, “vote” and “language” of each movie, with “name” as the primary key. The input question asks for the names of movies from 2001 starring Jackie Chan, and the output SQL query is shown in the figure, where a table join operation is needed.

Figure 1: An example of the SQL generation task. The two tables are sampled from a movie database. The question asks for the movies starring Jackie Chan in 2001, and the correct SQL query is shown. The information in the brackets of both tables is the translation of the Chinese words.

In order to make the problem more tractable, we make a similar assumption to WikiSQL, i.e., any non-SQL token in the generated SQL query should be a substring of the natural language question. Here the SQL tokens refer to all the SQL keywords (e.g. “select”, “from”, “where”, etc.) and the names (including aliases) of tables and columns. For the example in Figure 1, the non-SQL tokens in the SQL query are “Jackie Chan” and “2001”, which should appear in the question. This assumption also facilitates the utilization of COPYNET model, which learns to extract useful keywords from the questions.

Compared to WikiSQL task, our task has the following differences. (1) Our task learns from indirect supervision of the answers to natural language questions instead of SQL queries. (2) Our task considers generating a SQL query with respect to multiple tables, while WikiSQL considers only one table.

4 Approach

In this section, we introduce our SQL generator SQLGen (shown in Figure 2), where an encoder-decoder based architecture COPYNET is employed for SQL generation. We also design a reward assignment mechanism based on the generated SQL queries and the answers. Thus, the generation policy can be supervised by reinforcement learning using the designed reward mechanism.

4.1 Copying Mechanism for SQL Generation

Figure 2: The overview of our SQL generator SQLGen. An example of SQL generation process is shown. The input natural language question asks to “recommend some movies that are produced in China in 2012”. The SQL query is generated on the basis of a COPYNET structure, and the point-wise reward is computed for learning the generation policy by reinforcement learning.

An encoder-decoder framework, COPYNET, is employed, which incorporates the copying mechanism into decoding. As shown in Figure 2, the input natural language question is transformed by the encoder (e.g. a bidirectional RNN) into a hidden representation, and the decoder generates the output SQL query by predicting words with a mixed probabilistic model of two modes, the generate-mode and the copy-mode. While decoding, COPYNET performs not only an attentive read of this representation but also a selective read of it, which allows words to be generated from both the designated vocabulary and the source sequence.

Vocabulary. The vocabulary in the SQL generation domain consists of two parts, since the generated SQL query should contain both SQL tokens (as defined in Section 3) and non-SQL tokens (the attribute values appearing in the source sequence).

The first portion of the vocabulary contains the SQL keywords, operators and database symbols.

  • The SQL keyword set contains all the SQL keywords, such as “select”, “where”.

  • The comparator set contains all the comparative operators, such as “=”.

  • The database symbol set contains all the names of database tables and columns.

Here we further introduce the constituents of the database symbol set. The database has a set of tables, each identified by its name, and each table has a set of columns, each identified by its name. Both table names and column names are database symbols. In order to reduce the exploration space of reinforcement learning, we represent each column by a table-qualified attribute. Taking the example in Figure 1, the attribute set for the “Movie” table contains “movie.movie_id”, “movie.movie_name”, “movie.director”, and so on. The database symbol set covers the table set and the attribute sets of all tables.

Thus, the first portion of the vocabulary is the union of the keyword set, the comparator set and the database symbol set.

The second portion of the vocabulary covers all the unique words that appear in the natural language questions. The whole vocabulary is the union of the two portions.
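The two-part vocabulary described above can be sketched in a few lines. This is only an illustration: the function name `build_vocabulary`, the dictionary form of the schema, and whitespace tokenization of questions are assumptions made here (Chinese questions would need word segmentation first), not details given in the paper.

```python
def build_vocabulary(keywords, comparators, schema, questions):
    """Return (sql_vocab, question_vocab, full_vocab) as sorted lists.

    schema maps each table name to a list of its column names
    (a hypothetical input format chosen for this sketch).
    """
    # Database symbols: table names plus table-qualified attributes.
    symbols = set(schema)
    for table, columns in schema.items():
        symbols.update(f"{table}.{column}" for column in columns)

    # First portion: SQL keywords, comparators and database symbols.
    sql_vocab = set(keywords) | set(comparators) | symbols

    # Second portion: every unique word appearing in the questions.
    # Assumption: whitespace tokenization is good enough for the sketch.
    question_vocab = {word for q in questions for word in q.split()}

    return (sorted(sql_vocab),
            sorted(question_vocab),
            sorted(sql_vocab | question_vocab))
```

For a schema such as `{"movie": ["movie_id", "movie_name"]}`, the SQL portion would contain “movie”, “movie.movie_id” and “movie.movie_name” alongside the keywords and comparators.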

Encoder. As shown in Figure 2, the input sequence (“Recommend some movies that are produced in China in 2012”) is converted into a representation by an RNN encoder. A bidirectional GRU Cho et al. (2014) is used in this work.

This representation is accessed by the decoder during the process of SQL generation.

Decoder. A GRU layer is used as the decoder to predict the target sequence. A standard attention mechanism over the encoder representation yields a context vector at each decoding step.

Given the decoder state, the context vector and the encoder representation, the probability of generating a word is the sum of two terms, one for the generate-mode and one for the copy-mode. Both terms share a single normalization term, so the two modes compete within one distribution. Each mode has its own scoring function with its own learnable parameters, and the generate-mode scoring function uses the one-hot indicator vector of the candidate word.
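Since the model follows COPYNET, the mixture of the two modes can be written out in the notation of Gu et al. (2016). The symbols below ($s_t$ for the decoder state, $c_t$ for the context vector, $M = \{h_1, \dots, h_{T_S}\}$ for the encoder representation, $\mathcal{V}$ for the generation vocabulary, $x_1, \dots, x_{T_S}$ for the source words, $W_o$ and $W_c$ for the parameters) are taken from that paper and are an assumption about the notation used here. The normalization sums over source positions so that the two modes together form a proper distribution.

```latex
\begin{align}
p(y_t \mid s_t, c_t, M) &= p(y_t, \mathrm{g} \mid s_t, c_t, M) + p(y_t, \mathrm{c} \mid s_t, c_t, M) \\
p(y_t, \mathrm{g} \mid \cdot) &=
  \begin{cases}
    \frac{1}{Z}\, e^{\psi_g(y_t)}, & y_t \in \mathcal{V} \\
    0, & \text{otherwise}
  \end{cases} \\
p(y_t, \mathrm{c} \mid \cdot) &=
  \begin{cases}
    \frac{1}{Z} \sum_{j:\, x_j = y_t} e^{\psi_c(x_j)}, & y_t \in \{x_1, \dots, x_{T_S}\} \\
    0, & \text{otherwise}
  \end{cases} \\
Z &= \sum_{v \in \mathcal{V}} e^{\psi_g(v)} + \sum_{j=1}^{T_S} e^{\psi_c(x_j)} \\
\psi_g(y_t = v_i) &= \mathbf{v}_i^{\top} W_o\, s_t \\
\psi_c(y_t = x_j) &= \sigma\!\left(h_j^{\top} W_c\right) s_t
\end{align}
```

Here $\mathbf{v}_i$ is the one-hot indicator vector for $v_i$ and $\sigma$ is a non-linear activation. A word that is both in $\mathcal{V}$ and in the source receives probability mass from both modes.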

Note that a specific state update mechanism is introduced in COPYNET, which can be eliminated if Chinese word segmentation or English chunking is done during preprocessing, or reserved otherwise. The state update mechanism helps to copy a consecutive sub-sequence in the source text, while an attribute value to be copied should be words in a single chunk after preprocessing in our task.

Mask. We rely on reinforcement learning to learn the generation policy since there are no correct SQL queries to serve as direct supervision. However, the exploration space is enormous due to the complexity of natural language and SQL logic. To solve this problem, we introduce a masking mechanism to reduce the exploration space. When the decoder is predicting the next target word, a mask vector is introduced to indicate whether a word is legal for generation given the previous word(s), as illustrated in Figure 2. The mask has one entry per vocabulary word, which is 1 if the word is legal and 0 otherwise.

The mask mechanism can be easily implemented based on SQL grammar. For example, if the previous generated word is the SQL keyword “from”, the current word should be the name of a certain table, thus the other words are illegal. Therefore, the mask mechanism helps to generate grammatically correct SQL queries.
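A minimal sketch of such a grammar-based mask follows. Only the rule for “from” comes from the text; the unrestricted default for other contexts, the function names, and the renormalization step in `apply_mask` are assumptions made for illustration.

```python
def build_mask(vocab, previous_word, table_names):
    """Return a 0/1 list over vocab: 1 if the word may follow previous_word."""
    if previous_word == "from":
        # Rule stated in the text: after "from", only table names are legal.
        return [1 if word in table_names else 0 for word in vocab]
    # Hypothetical default: contexts not modelled in this sketch are unrestricted.
    return [1] * len(vocab)


def apply_mask(probs, mask):
    """Zero out illegal words and renormalize the remaining probability mass.

    Assumption: the mask is applied multiplicatively to the output
    distribution; the paper does not spell out this step.
    """
    masked = [p * m for p, m in zip(probs, mask)]
    total = sum(masked)
    if total == 0:
        raise ValueError("mask removed every candidate word")
    return [p / total for p in masked]
```

A full implementation would add one rule per grammatical context (for example after “select” or after a comparator), each returning a mask of the same shape.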

4.2 Reinforcement Learning with Compound Reward

We apply reinforcement learning to learn a SQL generation policy under the indirect supervision of answers. Unlike Zhong et al. (2017)’s work, which assigns a scalar reward to a generated SQL query, we design a compound point-wise reward that acts on each token of the generated SQL query. This mechanism helps to guide the learning of SQL generation policy more precisely.

Figure 3: An illustration of two types of rewards, which act on different parts of the SQL query.

The point-wise reward mechanism is composed of two types of rewards, the coverage reward and the execution reward, which act on different portions of SQL queries. As illustrated in Figure 3, the coverage reward acts on the attribute-value words in the where-conditions, the operators (“and”, “or”) connecting where-conditions, and the end-of-sentence (EOS) token, while the execution reward acts on all the other words as well as the same connecting operators.

Coverage reward. The coverage reward aims to guide the learning of word selection from the source text; its computation is shown in Algorithm 1. To better supervise the copy-mode learning of COPYNET, a vocabulary of attribute values is extracted from the database, covering the possible values of queried attributes. The attribute values in the source text can then be identified with this attribute-value vocabulary. Correctly copied words in the generated SQL query are assigned a positive reward, while incorrect words and duplicates of correct words are assigned a negative reward.

Similarly, the correct operators in the generated SQL query are assigned equal positive rewards, while incorrect operators are assigned non-positive rewards. Since there is no direct supervision from the correct SQL query, it is impossible to know whether a generated operator is semantically correct. What we do know is the number of attributes in the correct SQL, based on the source text and the attribute-value vocabulary. Hence, a correct operator here is one of the leading operators that this number of attributes requires, while an incorrect operator is any further, redundant operator. Only the first incorrect operator is assigned a negative reward; the others receive no penalty, so that operators are not excessively penalized.

For the EOS token, we reward EOS in SQL queries with the correct number of attributes and penalize EOS in those with too few attributes; EOS receives no penalty in other cases.

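Input: SQL query, source text
Output: coverage reward
  Extract the set of attribute values in the source text
  for each copied word in the SQL query do
     if the word is in the attribute-value set and has not been copied before then
        Set its reward to the positive value
     else
        Set its reward to the negative value
  for each operator in the SQL query do
     Set its reward according to its position among the operators
  Count the number of operators in the SQL query
  Set the reward of EOS according to this count
Algorithm 1 Coverage reward computation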
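The coverage reward can be sketched in Python as follows. The reward magnitudes did not survive in this copy of the paper, so `pos` and `neg` are placeholder parameters. Two further assumptions are made here: n attribute values require n - 1 connecting operators, and the number of conditions in the generated query is the number of operators plus one.

```python
def coverage_reward(copied_words, num_operators, source_values,
                    pos=1.0, neg=-1.0):
    """Return (word_rewards, operator_rewards, eos_reward).

    copied_words:  attribute-value words copied into the SQL query, in order.
    num_operators: number of "and"/"or" operators in the SQL query.
    source_values: set of attribute values found in the source text.
    pos, neg:      placeholder reward values (not taken from the paper).
    """
    seen = set()
    word_rewards = []
    for word in copied_words:
        if word in source_values and word not in seen:
            seen.add(word)
            word_rewards.append(pos)   # correct word, first occurrence
        else:
            word_rewards.append(neg)   # incorrect word or duplicate

    # Assumption: n attribute values need n - 1 connecting operators.
    needed = max(len(source_values) - 1, 0)
    operator_rewards = []
    for k in range(num_operators):
        if k < needed:
            operator_rewards.append(pos)
        elif k == needed:
            operator_rewards.append(neg)   # only the first redundant operator
        else:
            operator_rewards.append(0.0)   # later redundant ones: no penalty

    # Assumption: a query with k operators has k + 1 conditions.
    conditions = num_operators + 1
    if conditions == len(source_values):
        eos_reward = pos
    elif conditions < len(source_values):
        eos_reward = neg
    else:
        eos_reward = 0.0
    return word_rewards, operator_rewards, eos_reward
```

For the question in Figure 1, copying “Jackie Chan” and “2001” with a single “and” would give every targeted token the positive reward.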

Execution reward. The execution reward aims to guide the learning of the SQL representation of natural language logic. The procedure of execution reward computation is shown in Algorithm 2. The execution rewards act on three types of SQL segments: the text segment from “select” to “where”, the condition clauses without attribute values, and the operators connecting condition clauses. For the example in Figure 3, the first segment is “select where”, the condition clauses are {“MA.actor_name=”, “M.year=”}, and the operators are {“and”}. The words in these SQL segments constitute the targeted word set for the execution reward.

The generated SQL query is executed. If the query result is equal to the answer, the query is taken to be correctly generated and all targeted words receive the reward for success. Otherwise, the words in the select segment and the connecting operators are assigned fixed rewards, and each condition clause is checked separately: the SQL query with that single condition is executed, and if its result and the answer set have common elements, the clause is rewarded, since the attribute-value pair in the condition should be correct; otherwise it is not. In this way, the execution reward guides the reinforcement learning model by assigning higher rewards to correct SQL queries.

Note that we assume the form of the condition clause to be “attribute=value”, which restricts the comparator to “=”. The reasons are two fold. First, the value types are mostly strings in our movie domain, thus equality is the most common comparator, while there is rare data with other comparators. Second, considering all comparators significantly raises the learning complexity, which we hope to study in our future work.

For a SQL query, the whole point-wise reward is a combination of the coverage reward and the execution reward, each of which acts on its own word set. As described above, the two word sets overlap only in the operators connecting condition clauses. The whole reward is computed for each word in the union of the two sets.

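Input: SQL query, answer set
Output: execution reward
  Segment the SQL query by “where” and operators
  Take the text from “select” to “where” as the select segment
  # condition clause form: “attribute=value”
  Take the set of condition clauses
  Execute the SQL query and get the result
  if the result equals the answer set then
     Set the rewards for all targeted words to the success value
     return
  Set the rewards for words in the select segment to their failure value
  Set the rewards for the operators to their failure value
  for each condition clause do
     Concatenate the select segment with the clause, get a single-condition SQL query
     Execute the single-condition SQL query, get its result
     if the result and the answer set have common elements then
        Set the rewards for words in the clause to the positive value
     else
        Set the rewards for words in the clause to the negative value
Algorithm 2 Execution reward computation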
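The execution reward can be sketched against an SQLite database using Python's standard `sqlite3` module. The reward values are again placeholders (`r_correct`, `r_wrong`), and the choice to give the select segment and the operators `r_wrong` when the full query fails is an assumption, since the original values are not recoverable here. The query is assumed to arrive already segmented and to contain at least one condition.

```python
import sqlite3


def run_query(conn, sql):
    """Execute sql and return the first column of the result as a set."""
    try:
        return {row[0] for row in conn.execute(sql)}
    except sqlite3.Error:
        # A query that fails to execute simply yields no result.
        return set()


def execution_reward(conn, select_segment, conditions, operators, answer,
                     r_correct=1.0, r_wrong=-1.0):
    """Return rewards for the select segment, operators and each condition.

    select_segment: text from "select" up to and including "where".
    conditions:     non-empty list of "attribute=value" clauses.
    operators:      operators joining consecutive conditions.
    answer:         set of gold answers for the question.
    """
    full_query = select_segment + " " + conditions[0]
    for op, cond in zip(operators, conditions[1:]):
        full_query += f" {op} {cond}"

    if run_query(conn, full_query) == answer:
        return {"select": r_correct, "operators": r_correct,
                "conditions": [r_correct] * len(conditions)}

    # Assumption: both segments get the same failure value.
    rewards = {"select": r_wrong, "operators": r_wrong, "conditions": []}
    for cond in conditions:
        partial = run_query(conn, select_segment + " " + cond)
        # A condition is credited when its own result overlaps the answer.
        rewards["conditions"].append(r_correct if partial & answer else r_wrong)
    return rewards
```

The per-condition check is what lets a partly correct query still earn credit for the clauses that are consistent with the answer.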

Learning. We define the accumulative reward of a SQL query from the point-wise rewards of its words. The loss function is the negative expected accumulative reward over possible SQL queries. Using an equality shown in Schulman et al. (2015), the policy gradient of the loss function can be derived, where the expected gradient is approximated with a single Monte-Carlo sample in the last step of the derivation.
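The derivation itself is missing from this copy, so the following is only the standard score-function (REINFORCE) form that matches the description. The notation is assumed: $x$ is the question, $y = (y_1, \dots, y_T)$ a generated SQL query, $\pi_\theta$ the generation policy, $r(y_k)$ the point-wise reward of the $k$-th word, and $\hat{y}$ a single sampled query. The paper's point-wise scheme may weight each word's log-probability by its own reward rather than by the sequence total shown here.

```latex
\begin{align}
R(y) &= \sum_{k=1}^{T} r(y_k),
\qquad
L(\theta) = -\,\mathbb{E}_{y \sim \pi_\theta}\!\left[ R(y) \right] \\
\nabla_\theta\, \mathbb{E}_{y \sim \pi_\theta}\!\left[ R(y) \right]
  &= \mathbb{E}_{y \sim \pi_\theta}\!\left[ R(y)\, \nabla_\theta \log \pi_\theta(y \mid x) \right] \\
\nabla_\theta L(\theta)
  &= -\,\mathbb{E}_{y \sim \pi_\theta}\!\left[ R(y) \sum_{k=1}^{T} \nabla_\theta \log \pi_\theta(y_k \mid y_{<k}, x) \right] \\
  &\approx -\,R(\hat{y}) \sum_{k=1}^{T} \nabla_\theta \log \pi_\theta(\hat{y}_k \mid \hat{y}_{<k}, x)
\end{align}
```

The second line is the equality referred to in the text, and the last line is the single-sample Monte-Carlo approximation.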

5 Experiments


We collect three datasets for evaluation: a Chinese dataset in the movie domain, and two English datasets in the domains of academic publications and movies. The datasets consist of natural language questions, corresponding answers and database tables. For comparison with directly supervised learning methods, we ask volunteers to label the questions with SQL queries.

Movie-Chinese dataset. The questions and answers are collected from a Chinese QA community (Baidu Zhidao), and the database is constructed using data collected from a Chinese movie community (Douban). There are 3 tables in the database, containing information on the actors, directors, types, areas and languages of movies. We preprocess the data to remove invalid entries, such as confusing questions and incorrect answers.

The proportion of questions involving multiple tables is 78%, while that involving multiple conditions is 43%. The questions involving multiple tables have a high proportion because most SQL queries contain at least the “movie” table, since users tend to query their interested movie names.

Different from Movie-Chinese, the other two datasets are synthetic, where the databases are constructed by data collection from Internet and question-answer pairs are generated by templates.

Academic dataset. The database is constructed using the data from Roy et al. (2013), where we select 3 tables for our task, containing records of papers, researchers and conferences.

Movie dataset. The database is constructed using an open-source IMDB dataset (https://github.com/sundeepblue/movie_rating_prediction). The dataset contains the same attributes as Movie-Chinese.

Each dataset contains around 10,000 question-answer pairs, and is randomly partitioned into training set, validation set and test set with the proportion of , , , respectively. The datasets can be referred to the supplementary materials in our submission.

Baselines. (1) Seq2Seq-RL is an attentional Seq2Seq model with reinforcement learning using our point-wise rewards. (2) CopyNet-Seq2SQL is a COPYNET model with reinforcement learning using rewards in Seq2SQL Zhong et al. (2017). (3) CopyNet-SL is a COPYNET model supervised by the annotated SQL queries.

We also study the performance of SQLGen with pretraining by the annotated SQL queries, which we denote by SQLGen-Pretrain.


Two evaluation metrics are used, accuracy and redundancy.

Accuracy refers to the ratio of correct SQL queries, where a query is correct if it executes to the correct result. Redundancy refers to the ratio of redundant SQL queries, where a query is redundant if it joins the tables that are in none of the conditions.
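The two metrics can be computed as in the sketch below. The function names and input formats are hypothetical. Comparing results and answers as sets (ignoring order and duplicates) is an assumption, and redundancy follows the definition above literally, counting a query as redundant when it joins any table that no condition refers to.

```python
def accuracy(results, answers):
    """Ratio of queries whose execution result equals the gold answer.

    results, answers: parallel lists of iterables of result values.
    """
    pairs = list(zip(results, answers))
    correct = sum(set(r) == set(a) for r, a in pairs)
    return correct / len(pairs)


def redundancy(queries):
    """Ratio of queries that join a table used in none of their conditions.

    queries: list of (joined_tables, condition_tables) pairs, where each
    element is an iterable of table names.
    """
    redundant = sum(bool(set(joined) - set(used)) for joined, used in queries)
    return redundant / len(queries)
```

Both functions return ratios in [0, 1]; the tables in this section report them as percentages.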


The hidden unit size of the encoder and decoder is 32 and 64, respectively. The embedding size is set to 50 due to the small vocabulary size. The models are trained for at most 100 epochs with early stopping, using the Adam optimizer. While decoding, we either randomly sample a word from the predicted distribution with a fixed exploration probability, or otherwise pick the highest-scoring word, which gives reinforcement learning more opportunities for exploration.
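The decoding rule can be sketched as follows. The exploration probability used in the experiments is not recoverable from this copy, so it is left as the parameter `epsilon`; the function name and the `rng` argument are hypothetical.

```python
import random


def choose_word(probs, epsilon, rng=random):
    """Pick the index of the next word.

    With probability epsilon, sample an index from the distribution probs;
    otherwise return the index of the highest-scoring word.
    """
    if rng.random() < epsilon:
        # Exploration: draw from the model's own distribution.
        return rng.choices(range(len(probs)), weights=probs, k=1)[0]
    # Exploitation: greedy choice.
    return max(range(len(probs)), key=probs.__getitem__)
```

Setting `epsilon` to 0 recovers greedy decoding, which is the natural choice at test time.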

Models              Movie-Chinese             Academic                  Movie
                    Accuracy   Redundancy     Accuracy   Redundancy     Accuracy   Redundancy
Seq2Seq-RL             8.1        24.7           0.0        -              0.0        -
CopyNet-Seq2SQL       17.0       100.0           2.5       29.4            0.0        -
CopyNet-SL            56.9         0.2          62.6        0.0           51.9        0.0
SQLGen                59.8        68.6          64.6       70.2           70.0       76.0
SQLGen-Pretrain       80.4        75.4          67.8       99.4           73.6       97.4
Table 1: The accuracy and redundancy (%) of SQLGen and the baselines on three datasets.

5.1 Main Results

Table 1 shows the accuracy and redundancy results of SQLGen and the baselines on the three datasets. The first two baselines, Seq2Seq-RL and CopyNet-Seq2SQL, have very low accuracy, which shows the difficulty of the proposed learning paradigm with indirect supervision. We also tried the Seq2Seq model with the typical Seq2SQL reward, but it hardly learns anything, so we do not take it as a baseline model. CopyNet-SL performs better than the other two baselines since it learns from the direct supervision of correct SQL queries. SQLGen has higher accuracy than CopyNet-SL. A probable reason is that CopyNet-SL learns from supervised SQL queries and therefore penalizes correct SQL queries whose table joins or conditions appear in a different order.

SQLGen-Pretrain has higher accuracy than SQLGen on all three datasets, by 3.2 to 20.6 points in Table 1 (about 5% to 34% in relative terms). This demonstrates that supervised pretraining helps improve the subsequent policy learning but requires manual annotation. Thus, a suitable method can be selected based on a tradeoff between performance and annotation cost in practical scenarios.

We report redundancy only for models with non-zero accuracy. SQLGen has a redundancy of about 69% to 76% across the datasets. The reason is that the space of combinations of table joins and conditions is enormous, which makes indirectly supervised learning very difficult. Thus, the exploration of reinforcement learning tends to join more potential tables, which gives a higher probability of generating correct SQL queries. This tendency results in relatively high redundancy. Note that SQLGen does not generate SQL queries with duplicate tables or conditions, thanks to the mask proposed in Section 4.1.

CopyNet-SL has very low redundancy since the model learns from direct supervision of SQL queries, which has a low probability of joining redundant tables. SQLGen-Pretrain has higher redundancy than SQLGen. The reason is that the pretrained model has a tendency of joining more tables than a randomly initialized model, since the training data involving multiple tables takes a high proportion, as shown in the data description.

Table 2 shows the accuracy on Movie-Chinese dataset in different cases, including SQL queries containing single and multiple conditions (tables). For SQLGen and the baselines, the accuracy of SQL queries with single condition is higher than that with multiple conditions, because the natural language related to a single condition is easier to learn. SQLGen has much lower accuracy for SQL queries with single table than those with multiple tables. By observing the test cases, we find most of the incorrect SQL queries predict the wrong attributes in the condition clauses. As shown in Figure 4, the attribute is difficult to learn since the patterns querying different attributes could be similar due to the characteristics of Chinese language. In Movie-Chinese domain, such patterns mostly occur in the cases where single table is involved.

Models              Single condition   Multiple conditions   Single table   Multiple tables
Seq2Seq-RL               13.6                 0.7                 1.1              9.7
CopyNet-Seq2SQL          29.7                 0.0                 0.0             21.0
CopyNet-SL               65.6                45.2                90.4             49.2
SQLGen                   61.8                57.1                34.2             65.7
SQLGen-Pretrain          91.1                57.6                90.9             73.6
Table 2: The accuracy analysis (%) on the Movie-Chinese dataset, for SQL queries with a single condition, multiple conditions, a single table and multiple tables.
Figure 4: An illustration of similar patterns.

Compared to CopyNet-SL, SQLGen shows higher accuracy on SQL queries with multiple conditions (tables) but lower accuracy for single condition (table), because CopyNet-SL penalizes correct SQLs with multiple conditions (tables) with different orders from the training data. SQLGen-Pretrain outperforms SQLGen by better learning attributes of values in natural language, which helps to improve the accuracy for single condition and table.

6 Conclusion and Future Work

We propose a SQL generation learning paradigm from indirect supervision of question-answer pairs in this paper. A COPYNET-based neural model integrating policy-based reinforcement learning is proposed, where a compound reward mechanism is designed to precisely learn the generation policy. Experimental results show that our model has higher accuracy than baselines on various datasets.

In the future work, we would like to design models that can generate more complex SQL queries, e.g. queries with more operators and comparators in the condition clauses.

References


  • Androutsopoulos et al. (1995) Ion Androutsopoulos, Graeme D. Ritchie, and Peter Thanisch. 1995. Natural language interfaces to databases - an introduction. Natural Language Engineering, 1(1):29–81.
  • Artzi et al. (2015) Yoav Artzi, Kenton Lee, and Luke Zettlemoyer. 2015. Broad-coverage CCG semantic parsing with AMR. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, September 17-21, 2015, pages 1699–1710.
  • Bahdanau et al. (2014) Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. CoRR, abs/1409.0473.
  • Cai and Yates (2013) Qingqing Cai and Alexander Yates. 2013. Large-scale semantic parsing via schema matching and lexicon extension. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013, 4-9 August 2013, Sofia, Bulgaria, Volume 1: Long Papers, pages 423–433.
  • Cai et al. (2017) Ruichu Cai, Boyan Xu, Xiaoyan Yang, Zhenjie Zhang, and Zijian Li. 2017. An encoder-decoder framework translating natural language to database queries. CoRR, abs/1711.06061.
  • Chen et al. (2017) Danqi Chen, Adam Fisch, Jason Weston, and Antoine Bordes. 2017. Reading wikipedia to answer open-domain questions. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Volume 1: Long Papers, pages 1870–1879.
  • Cho et al. (2014) Kyunghyun Cho, Bart van Merrienboer, Çaglar Gülçehre, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. CoRR, abs/1406.1078.
  • Clarke et al. (2010) James Clarke, Dan Goldwasser, Ming-Wei Chang, and Dan Roth. 2010. Driving semantic parsing from the world’s response. In Proceedings of the Fourteenth Conference on Computational Natural Language Learning, CoNLL 2010, pages 18–27, Stroudsburg, PA, USA. Association for Computational Linguistics.
  • Giordani and Moschitti (2012) Alessandra Giordani and Alessandro Moschitti. 2012. Translating questions to SQL queries with generative parsers discriminatively reranked. In COLING 2012, 24th International Conference on Computational Linguistics, Proceedings of the Conference: Posters, 8-15 December 2012, Mumbai, India, pages 401–410.
  • Gu et al. (2016) Jiatao Gu, Zhengdong Lu, Hang Li, and Victor O. K. Li. 2016. Incorporating copying mechanism in sequence-to-sequence learning. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016, August 7-12, 2016, Berlin, Germany, Volume 1: Long Papers.
  • Guu et al. (2017) Kelvin Guu, Panupong Pasupat, Evan Zheran Liu, and Percy Liang. 2017. From language to programs: Bridging reinforcement learning and maximum marginal likelihood. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Volume 1: Long Papers, pages 1051–1062.
  • Iyer et al. (2017) Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, Jayant Krishnamurthy, and Luke Zettlemoyer. 2017. Learning a neural semantic parser from user feedback. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Volume 1: Long Papers, pages 963–973.
  • Li and Jagadish (2014) Fei Li and H. V. Jagadish. 2014. Constructing an interactive natural language interface for relational databases. Proc. VLDB Endow., 8(1):73–84.
  • Liang et al. (2011) Percy Liang, Michael I. Jordan, and Dan Klein. 2011. Learning dependency-based compositional semantics. In The 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 19-24 June, 2011, Portland, Oregon, USA, pages 590–599.
  • Lu et al. (2016) Zhengdong Lu, Hang Li, and Ben Kao. 2016. Neural enquirer: learning to query tables in natural language. IEEE Data Eng. Bull., 39(3):63–73.
  • Mou et al. (2016) Lili Mou, Zhengdong Lu, Hang Li, and Zhi Jin. 2016. Coupling distributed and symbolic execution for natural language queries. CoRR, abs/1612.02741.
  • Pasupat and Liang (2015) Panupong Pasupat and Percy Liang. 2015. Compositional semantic parsing on semi-structured tables. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, ACL 2015, July 26-31, 2015, Beijing, China, Volume 1: Long Papers, pages 1470–1480.
  • Popescu et al. (2004) Ana-Maria Popescu, Alex Armanasu, Oren Etzioni, David Ko, and Alexander Yates. 2004. Modern natural language interfaces to databases: Composing statistical parsing with semantic tractability. In COLING 2004, 20th International Conference on Computational Linguistics, Proceedings of the Conference, 23-27 August 2004, Geneva, Switzerland.
  • Reddy et al. (2014) Siva Reddy, Mirella Lapata, and Mark Steedman. 2014. Large-scale semantic parsing without question-answer pairs. TACL, 2:377–392.
  • Roy et al. (2013) Senjuti Basu Roy, Martine De Cock, Vani Mandava, Swapna Savanna, Brian Dalessandro, Claudia Perlich, William Cukierski, and Ben Hamner. 2013. The Microsoft Academic Search dataset and KDD Cup 2013. In KDD Cup 2013 Workshop, page 1.
  • Schulman et al. (2015) John Schulman, Nicolas Heess, Theophane Weber, and Pieter Abbeel. 2015. Gradient estimation using stochastic computation graphs. In Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 2015, Montreal, Quebec, Canada, pages 3528–3536.
  • Serban et al. (2017) Iulian Vlad Serban, Alessandro Sordoni, Ryan Lowe, Laurent Charlin, Joelle Pineau, Aaron C. Courville, and Yoshua Bengio. 2017. A hierarchical latent variable encoder-decoder model for generating dialogues. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, February 4-9, 2017, San Francisco, California, USA, pages 3295–3301.
  • Shang et al. (2015) Lifeng Shang, Zhengdong Lu, and Hang Li. 2015. Neural responding machine for short-text conversation. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, ACL 2015, July 26-31, 2015, Beijing, China, Volume 1: Long Papers, pages 1577–1586.
  • Sutskever et al. (2014) Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, December 8-13 2014, Montreal, Quebec, Canada, pages 3104–3112.
  • Tu et al. (2016) Zhaopeng Tu, Zhengdong Lu, Yang Liu, Xiaohua Liu, and Hang Li. 2016. Modeling coverage for neural machine translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016, August 7-12, 2016, Berlin, Germany, Volume 1: Long Papers.
  • Vinyals et al. (2015) Oriol Vinyals, Meire Fortunato, and Navdeep Jaitly. 2015. Pointer networks. In Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 2015, Montreal, Quebec, Canada, pages 2692–2700.
  • Warren and Pereira (1982) David H. D. Warren and Fernando C. N. Pereira. 1982. An efficient easily adaptable system for interpreting natural language queries. American Journal of Computational Linguistics, 8(3-4):110–122.
  • Xiong et al. (2016) Caiming Xiong, Victor Zhong, and Richard Socher. 2016. Dynamic coattention networks for question answering. CoRR, abs/1611.01604.
  • Xu et al. (2017) Xiaojun Xu, Chang Liu, and Dawn Song. 2017. SQLNet: Generating structured queries from natural language without reinforcement learning. CoRR, abs/1711.04436.
  • Yih et al. (2015) Wen-tau Yih, Ming-Wei Chang, Xiaodong He, and Jianfeng Gao. 2015. Semantic parsing via staged query graph generation: Question answering with knowledge base. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, ACL 2015, July 26-31, 2015, Beijing, China, Volume 1: Long Papers, pages 1321–1331.
  • Yu et al. (2018) Tao Yu, Zifan Li, Zilin Zhang, Rui Zhang, and Dragomir R. Radev. 2018. TypeSQL: Knowledge-based type-aware neural text-to-SQL generation. CoRR, abs/1804.09769.
  • Zelle and Mooney (1996) John M. Zelle and Raymond J. Mooney. 1996. Learning to parse database queries using inductive logic programming. In Proceedings of the Thirteenth National Conference on Artificial Intelligence - Volume 2, AAAI 1996, pages 1050–1055. AAAI Press.
  • Zettlemoyer and Collins (2005) Luke S. Zettlemoyer and Michael Collins. 2005. Learning to map sentences to logical form: Structured classification with probabilistic categorial grammars. In Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence, UAI 2005, pages 658–666, Arlington, Virginia, United States. AUAI Press.
  • Zettlemoyer and Collins (2007) Luke S. Zettlemoyer and Michael Collins. 2007. Online learning of relaxed CCG grammars for parsing to logical form. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL-2007), pages 678–687.
  • Zhong et al. (2017) Victor Zhong, Caiming Xiong, and Richard Socher. 2017. Seq2SQL: Generating structured queries from natural language using reinforcement learning. CoRR, abs/1709.00103.