nvBench: A Large-Scale Synthesized Dataset for Cross-Domain Natural Language to Visualization Task

12/24/2021
by   Yuyu Luo, et al.

NL2VIS, which translates natural language (NL) queries into corresponding visualizations (VIS), has attracted growing attention from both commercial visualization vendors and academic researchers. In the last few years, deep learning-based models have achieved human-like performance on many natural language processing (NLP) tasks, which suggests that deep learning-based techniques are a promising way to advance the field of NL2VIS. However, a major obstacle is the lack of benchmarks with many (NL, VIS) pairs. We present nvBench, the first large-scale NL2VIS benchmark, containing 25,750 (NL, VIS) pairs from 750 tables over 105 domains, synthesized from (NL, SQL) benchmarks to support the cross-domain NL2VIS task. The quality of nvBench has been extensively validated by 23 experts and 300+ crowd workers. Deep learning-based models trained on nvBench demonstrate that nvBench can advance the field of NL2VIS.
