RedPenNet for Grammatical Error Correction: Outputs to Tokens, Attentions to Spans

09/19/2023
by Bohdan Didenko, et al.

Text editing tasks, including sentence fusion, sentence splitting and rephrasing, text simplification, and Grammatical Error Correction (GEC), share a common trait: their input and output sequences are highly similar. This area of research lies at the intersection of two well-established fields: (i) fully autoregressive sequence-to-sequence approaches commonly used in tasks such as Neural Machine Translation (NMT) and (ii) sequence tagging techniques commonly used for tasks such as part-of-speech tagging and named-entity recognition (NER). In the pursuit of a balanced architecture, researchers have come up with numerous imaginative and unconventional solutions, which we discuss in the Related Works section. Our approach to text editing tasks, RedPenNet, aims to reduce the architectural and parametric redundancies present in specific Sequence-To-Edits models while preserving their semi-autoregressive advantages. Our models achieve F_0.5 scores of 77.60 on the BEA-2019 (test) benchmark, which can be considered state-of-the-art with the sole exception of system combinations, and 67.71 on the UAGEC+Fluency (test) benchmark. This research was conducted in the context of the UNLP 2023 workshop, where it was presented as a paper for the Shared Task in Grammatical Error Correction (GEC) for Ukrainian. This study aims to apply the RedPenNet approach to the GEC problem in the Ukrainian language.

research
05/24/2022

EdiT5: Semi-Autoregressive Text-Editing with T5 Warm-Start

We present EdiT5 - a novel semi-autoregressive text-editing approach des...
research
09/23/2020

Seq2Edits: Sequence Transduction Using Span-level Edit Operations

We propose Seq2Edits, an open-vocabulary approach to sequence editing fo...
research
03/24/2020

Felix: Flexible Text Editing Through Tagging and Insertion

We present Felix — a flexible text-editing approach for generation, desi...
research
12/14/2020

Vartani Spellcheck – Automatic Context-Sensitive Spelling Correction of OCR-generated Hindi Text Using BERT and Levenshtein Distance

Traditional Optical Character Recognition (OCR) systems that generate te...
research
05/20/2022

Lossless Acceleration for Seq2seq Generation with Aggressive Decoding

We study lossless acceleration for seq2seq generation with a novel decod...
research
10/21/2022

Text Editing as Imitation Game

Text editing, such as grammatical error correction, arises naturally fro...
research
10/28/2021

Diversity-Driven Combination for Grammatical Error Correction

Grammatical error correction (GEC) is the task of detecting and correcti...
