
Spelling Correction with Denoising Transformer

05/12/2021
by Alex Kuznetsov, et al.

We present a novel method of performing spelling correction on short input strings, such as search queries or individual words. At its core lies a procedure for generating artificial typos that closely follow the error patterns manifested by humans. This procedure is used to train the production spelling correction model based on a transformer architecture. This model is currently served in the HubSpot product search. We show that our approach to typo generation is superior to the widespread practice of adding noise that ignores human error patterns. We also demonstrate how our approach can be extended to resource-scarce settings and train spelling correction models for the Arabic, Greek, Russian, and Setswana languages without using any labeled data.
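To make the idea of human-like typo generation concrete, the sketch below shows one way (noisy, clean) training pairs for a denoising model could be produced. It is not the paper's implementation: the QWERTY adjacency table, the end-of-word position bias, and the helper names add_typo and make_training_pairs are illustrative assumptions, whereas the method described in the abstract models the error patterns actually manifested by humans.

```python
import random

# Partial QWERTY adjacency map (an illustrative assumption, not the
# paper's learned character-confusion statistics).
QWERTY_NEIGHBORS = {
    "a": "qwsz", "b": "vghn", "c": "xdfv", "d": "serfcx", "e": "wsdr",
    "f": "drtgvc", "g": "ftyhbv", "h": "gyujnb", "i": "ujko", "j": "huikmn",
    "k": "jiolm", "l": "kop", "m": "njk", "n": "bhjm", "o": "iklp",
    "p": "ol", "q": "wa", "r": "edft", "s": "awedxz", "t": "rfgy",
    "u": "yhji", "v": "cfgb", "w": "qase", "x": "zsdc", "y": "tghu",
    "z": "asx",
}

def add_typo(word: str, rng: random.Random) -> str:
    """Apply one edit (substitute, delete, insert, or transpose) at a random position.

    The position is sampled with a bias toward the end of the word, a crude
    stand-in for position-dependent human error patterns.
    """
    if len(word) < 2:
        return word
    # Quadratic weighting toward later positions (illustrative choice).
    weights = [(i + 1) ** 2 for i in range(len(word))]
    pos = rng.choices(range(len(word)), weights=weights, k=1)[0]
    op = rng.choice(["substitute", "delete", "insert", "transpose"])
    chars = list(word)
    c = chars[pos].lower()
    if op == "substitute" and c in QWERTY_NEIGHBORS:
        chars[pos] = rng.choice(QWERTY_NEIGHBORS[c])
    elif op == "delete":
        del chars[pos]
    elif op == "insert" and c in QWERTY_NEIGHBORS:
        chars.insert(pos, rng.choice(QWERTY_NEIGHBORS[c]))
    elif op == "transpose" and pos < len(word) - 1:
        chars[pos], chars[pos + 1] = chars[pos + 1], chars[pos]
    return "".join(chars)

def make_training_pairs(queries, rng=None):
    """Produce (noisy, clean) pairs for training a denoising sequence-to-sequence model."""
    rng = rng or random.Random(0)
    return [(add_typo(q, rng), q) for q in queries]

if __name__ == "__main__":
    # Hypothetical short search-query strings used only to show the output format.
    print(make_training_pairs(["hubspot crm", "marketing email", "contact import"]))
```

A denoising transformer trained on such pairs learns to map the corrupted string back to the clean one; the quality of the correction model then hinges on how faithfully the synthetic noise matches real human typos, which is the central point of the abstract above.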

Related research

Learning From How Human Correct (01/30/2021)
In industry NLP application, our manually labeled data has a certain num...

Comparison of Grammatical Error Correction Using Back-Translation Models (04/16/2021)
Grammatical error correction (GEC) suffers from a lack of sufficient par...

Post-OCR Document Correction with large Ensembles of Character Sequence Models (09/13/2021)
In this paper, we propose a novel method based on character sequence-to-...

VSEC: Transformer-based Model for Vietnamese Spelling Correction (11/01/2021)
Spelling error correction is one of topics which have a long history in ...

Correcting Arabic Soft Spelling Mistakes using BiLSTM-based Machine Learning (08/02/2021)
Soft spelling errors are a class of spelling mistakes that is widespread...