Word Embedding Perturbation for Sentence Classification

04/22/2018
by Dongxu Zhang, et al.

In this technical report, we aim to mitigate overfitting in natural language models by applying data augmentation methods. Specifically, we perturb the input word embeddings with several types of noise, such as Gaussian noise, Bernoulli noise, and adversarial noise. We also apply several constraints to the different types of noise. With these data augmentation methods, the baseline models gain improvements on several sentence classification tasks.
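The three perturbation types named in the abstract can be sketched in NumPy as below. This is a minimal illustration, not the authors' implementation: the function names, noise scales, and the L-inf constraint used for the adversarial variant are assumptions for the example (FGSM-style perturbation along the gradient sign is one common way to realize a norm-constrained adversarial noise).

```python
import numpy as np

def gaussian_noise(emb, sigma=0.1, rng=None):
    """Add zero-mean Gaussian noise to every embedding dimension."""
    rng = rng or np.random.default_rng(0)
    return emb + rng.normal(0.0, sigma, size=emb.shape)

def bernoulli_noise(emb, drop_prob=0.1, rng=None):
    """Randomly zero out embedding dimensions (dropout-style mask)."""
    rng = rng or np.random.default_rng(0)
    mask = (rng.random(emb.shape) >= drop_prob).astype(emb.dtype)
    return emb * mask

def adversarial_noise(emb, grad, epsilon=0.05):
    """FGSM-style perturbation: step along the sign of the loss
    gradient, constrained to an L-inf ball of radius epsilon."""
    return emb + epsilon * np.sign(grad)

# Toy usage: a "sentence" of 3 tokens with 4-dim embeddings.
emb = np.ones((3, 4))
grad = np.array([[ 1.0, -2.0, 0.5, -0.1],
                 [ 3.0, -4.0, 0.2,  0.0],
                 [-1.5,  2.5, 0.0,  1.0]])
noisy_g = gaussian_noise(emb)
noisy_b = bernoulli_noise(emb, drop_prob=0.5)
noisy_a = adversarial_noise(emb, grad)
```

In practice the gradient for the adversarial case would come from backpropagating the classification loss to the embedding layer; here it is supplied directly to keep the sketch self-contained.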


Related research

- RPN: A Word Vector Level Data Augmentation Algorithm in Deep Learning for Language Understanding (12/12/2022)
  This paper presents a new data augmentation algorithm for natural unders...

- On SkipGram Word Embedding Models with Negative Sampling: Unified Framework and Impact of Noise Distributions (09/02/2020)
  SkipGram word embedding models with negative sampling, or SGN in short, ...

- A Comparison of Speech Data Augmentation Methods Using S3PRL Toolkit (02/27/2023)
  Data augmentations are known to improve robustness in speech-processing ...

- Learning to Perturb Word Embeddings for Out-of-distribution QA (05/06/2021)
  QA models based on pretrained language models have achieved remarkable ...

- Sarcasm Detection in Twitter – Performance Impact while using Data Augmentation: Word Embeddings (08/23/2021)
  Sarcasm is the use of words usually used to either mock or annoy someone...

- ImportantAug: a data augmentation agent for speech (12/14/2021)
  We introduce ImportantAug, a technique to augment training data for spee...

- DeepSentiPers: Novel Deep Learning Models Trained Over Proposed Augmented Persian Sentiment Corpus (04/11/2020)
  This paper focuses on how to extract opinions over each Persian sentence...
