Contextual Text Denoising with Masked Language Models

10/30/2019
by   Yifu Sun, et al.

Deep learning models have recently driven significant advances across Natural Language Processing (NLP) tasks. Unfortunately, state-of-the-art models are vulnerable to noisy text. We propose a new contextual text denoising algorithm based on an off-the-shelf masked language model. The proposed algorithm does not require retraining the model and can be integrated into any NLP system without additional training on paired noisy and clean data. We evaluate our method under both synthetic and natural noise and show that it can use context information to correct noisy text and improve performance on noisy inputs in several downstream tasks.
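
The abstract does not spell out the procedure, so the following is only a minimal sketch of one plausible reading, not the authors' implementation: treat out-of-vocabulary tokens as noisy, mask each one, ask a context scorer for fillers, and keep the best-scoring filler that is close in spelling to the noisy token. Everything named here is hypothetical: `make_toy_scorer` is a bigram-count stand-in for a real masked language model, and `denoise`, `edit_distance`, and the `max_dist` threshold are illustrative assumptions.

```python
from collections import Counter
from typing import Callable, Dict, List, Set

Scorer = Callable[[List[str], int], Dict[str, float]]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def make_toy_scorer(corpus: List[str]) -> Scorer:
    """Hypothetical stand-in for a masked language model.

    Scores every vocabulary word as a filler for the masked position
    using bigram counts with the left and right neighbors.
    """
    bigrams: Counter = Counter()
    words_seen: Set[str] = set()
    for sentence in corpus:
        words = sentence.split()
        words_seen.update(words)
        bigrams.update(zip(words, words[1:]))

    def score(tokens: List[str], pos: int) -> Dict[str, float]:
        left = tokens[pos - 1] if pos > 0 else None
        right = tokens[pos + 1] if pos + 1 < len(tokens) else None
        return {
            w: 1.0 + bigrams[(left, w)] + bigrams[(w, right)]
            for w in words_seen
        }

    return score


def denoise(tokens: List[str], vocab: Set[str], score_mask: Scorer,
            max_dist: int = 2) -> List[str]:
    """Replace each out-of-vocabulary token using its context (assumed procedure)."""
    cleaned = list(tokens)
    for pos, tok in enumerate(tokens):
        if tok in vocab:
            continue  # assumption: in-vocabulary tokens are treated as clean
        scores = score_mask(cleaned, pos)
        # Keep only fillers whose spelling is close to the noisy token.
        candidates = [w for w in scores if edit_distance(w, tok) <= max_dist]
        if candidates:
            cleaned[pos] = max(
                candidates,
                key=lambda w: (scores[w], -edit_distance(w, tok), w),
            )
    return cleaned


corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat ate the fish",
]
vocab = {w for s in corpus for w in s.split()}
scorer = make_toy_scorer(corpus)

result = denoise("the cta sat on the mat".split(), vocab, scorer)
print(" ".join(result))  # the cat sat on the mat
```

In this toy run, "cta" is within edit distance 2 of "cat", "a", and "ate", and the context "the _ sat" favors "cat". A real system would swap the bigram scorer for predictions from a pretrained masked language model, which is what lets such an approach run without any retraining.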

research
04/17/2021

Robust Embeddings Via Distributions

Despite recent monumental advances in the field, many Natural Language P...
research
09/10/2021

How May I Help You? Using Neural Text Simplification to Improve Downstream NLP Tasks

The general goal of text simplification (TS) is to reduce text complexit...
research
09/14/2019

Ouroboros: On Accelerating Training of Transformer-Based Language Models

Language models are essential for natural language processing (NLP) task...
research
10/14/2021

Compressibility of Distributed Document Representations

Contemporary natural language processing (NLP) revolves around learning ...
research
07/15/2021

Robust Learning for Text Classification with Multi-source Noise Simulation and Hard Example Mining

Many real-world applications involve the use of Optical Character Recogn...
research
11/15/2022

When to Use What: An In-Depth Comparative Empirical Analysis of OpenIE Systems for Downstream Applications

Open Information Extraction (OpenIE) has been used in the pipelines of v...
research
10/03/2016

Nonsymbolic Text Representation

We introduce the first generic text representation model that is complet...