The E2E Dataset: New Challenges For End-to-End Generation

06/28/2017
by Jekaterina Novikova, et al.

This paper describes the E2E dataset, a new dataset for training end-to-end, data-driven natural language generation systems in the restaurant domain, which is ten times bigger than existing, frequently used datasets in this area. The E2E dataset poses new challenges: (1) its human reference texts show more lexical richness and syntactic variation, including discourse phenomena; (2) generating from this set requires content selection. As such, learning from this dataset promises more natural, more varied, and less template-like system utterances. We also establish a baseline on this dataset, which illustrates some of the difficulties associated with this data.
