NeuSpell: A Neural Spelling Correction Toolkit

10/21/2020
by Sai Muralidhar Jayanthi, et al.

We introduce NeuSpell, an open-source toolkit for spelling correction in English. Our toolkit comprises ten different models and benchmarks them on naturally occurring misspellings from multiple sources. We find that many systems do not adequately leverage the context around the misspelt token. To remedy this, (i) we train neural models using spelling errors in context, synthetically constructed by reverse engineering isolated misspellings; and (ii) we use contextual representations. By training on our synthetic examples, correction rates improve by 9% (absolute) compared to the case when models are trained on randomly sampled character perturbations. Using richer contextual representations boosts the correction rate by another 3%. Our toolkit enables practitioners to use our proposed and existing spelling correction systems, both via a unified command line and via a web interface. Among many potential applications, we demonstrate the utility of our spell-checkers in combating adversarial misspellings. The toolkit can be accessed at neuspell.github.io. Code and pretrained models are available at http://github.com/neuspell/neuspell.
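
As a rough illustration of point (i), the sketch below builds a noisy/clean training pair by replacing words in a clean sentence with misspellings drawn from a table of isolated misspellings. It is not the toolkit's code: the `MISSPELLINGS` table, the `add_noise` function, and the `rate` parameter are all hypothetical names chosen for this example, and the abstract does not specify how the actual construction is done.

```python
import random

# Hypothetical lookup table: correct word -> isolated misspellings seen for it.
# A real table would come from a corpus of naturally occurring misspellings.
MISSPELLINGS = {
    "receive": ["recieve", "receve"],
    "definitely": ["definately"],
    "their": ["thier"],
}


def add_noise(sentence, lookup, rate=0.2, rng=None):
    """Return (noisy, clean), where some words are swapped for known misspellings.

    Each word found in `lookup` is replaced with probability `rate`.
    Tokenization is plain whitespace splitting, so the noisy and clean
    sentences always have the same number of tokens.
    """
    rng = rng or random.Random()
    noisy_words = []
    for word in sentence.split():
        options = lookup.get(word.lower())
        if options and rng.random() < rate:
            # Substitute a misspelling that was observed in isolation.
            noisy_words.append(rng.choice(options))
        else:
            noisy_words.append(word)
    return " ".join(noisy_words), sentence


if __name__ == "__main__":
    noisy, clean = add_noise(
        "they will definitely receive their mail", MISSPELLINGS, rate=0.5
    )
    print(noisy)
    print(clean)
```

Because the misspelling is placed back into a full sentence, a model trained on such pairs sees each error together with its surrounding context, which is the property the abstract contrasts with randomly sampled character perturbations.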


