Neural Data Augmentation via Example Extrapolation

02/02/2021
by Kenton Lee, et al.

In many applications of machine learning, certain categories of examples may be underrepresented in the training data, causing systems to underperform on such "few-shot" cases at test time. A common remedy is data augmentation, for example duplicating underrepresented examples or heuristically synthesizing new ones. But these remedies often fail to cover the full diversity and complexity of real examples. We propose a data augmentation approach that performs neural Example Extrapolation (Ex2). Given a handful of exemplars sampled from some distribution, Ex2 synthesizes new examples that belong to the same distribution. The Ex2 model is learned by simulating the example generation procedure on data-rich slices of the data, and it is then applied to underrepresented, few-shot slices. We apply Ex2 to a range of language understanding tasks and significantly improve over state-of-the-art methods on multiple few-shot learning benchmarks, including relation extraction (FewRel) and intent classification + slot filling (SNIPS).
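The training setup described in the abstract can be illustrated with a minimal sketch of the data-preparation step only. It assumes that each slice is a list of text examples, that a slice counts as data-rich when it has at least `min_size` examples, and that one training pair consists of a few sampled exemplars as input with one held-out example from the same slice as the target. The function names, parameters, and toy utterances are hypothetical and are not taken from the paper; the neural sequence model that would be trained on these pairs is not shown.

```python
import random
from typing import Dict, List, Tuple

Pair = Tuple[List[str], str]  # (input exemplars, held-out target example)


def split_slices(
    slices: Dict[str, List[str]], min_size: int
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Separate slices into data-rich and few-shot groups (assumed threshold rule)."""
    rich = {name: ex for name, ex in slices.items() if len(ex) >= min_size}
    few = {name: ex for name, ex in slices.items() if len(ex) < min_size}
    return rich, few


def build_extrapolation_pairs(
    rich_slices: Dict[str, List[str]],
    num_exemplars: int = 3,
    pairs_per_slice: int = 2,
    seed: int = 0,
) -> List[Pair]:
    """Simulate example generation on data-rich slices.

    For each slice, repeatedly sample num_exemplars + 1 distinct examples:
    the first num_exemplars form the input, the last one is the target.
    """
    rng = random.Random(seed)
    pairs: List[Pair] = []
    for name in sorted(rich_slices):
        examples = rich_slices[name]
        if len(examples) <= num_exemplars:
            continue  # not enough examples to hold one out as a target
        for _ in range(pairs_per_slice):
            sampled = rng.sample(examples, num_exemplars + 1)
            pairs.append((sampled[:-1], sampled[-1]))
    return pairs


# Toy data (hypothetical utterances, for illustration only).
toy_slices = {
    "play_music": [
        "play jazz",
        "play some rock",
        "put on a song",
        "start my playlist",
        "play the beatles",
    ],
    "book_flight": ["book a flight to paris", "fly me to rome"],
}

rich, few = split_slices(toy_slices, min_size=4)
train_pairs = build_extrapolation_pairs(rich, num_exemplars=3, pairs_per_slice=2)
```

At augmentation time, the few examples in each slice of `few` would play the role of the exemplars, and a model trained on `train_pairs` would be asked to generate additional examples for that slice.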
