A Character-Level Approach to the Text Normalization Problem Based on a New Causal Encoder

03/06/2019
by   Adrián Javaloy Bornás, et al.
0

Text normalization is a ubiquitous process that appears as the first step of many Natural Language Processing problems. However, previous Deep Learning approaches have suffered from so-called silly errors, which are undetectable on unsupervised frameworks, making those models unsuitable for deployment. In this work, we make use of an attention-based encoder-decoder architecture that overcomes these undetectable errors by using a fine-grained character-level approach rather than a word-level one. Furthermore, our new general-purpose encoder based on causal convolutions, called Causal Feature Extractor (CFE), is introduced and compared to other common encoders. The experimental results show the feasibility of this encoder, which leverages the attention mechanisms the most and obtains better results in terms of accuracy, number of parameters and convergence time. While our method results in a slightly worse initial accuracy (92.74 obtaining a more robust model for deployment. Furthermore, there is still plenty of room for future improvements that will push even further these advantages.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/16/2016

Sentence-Level Grammatical Error Identification as Sequence-to-Sequence Correction

We demonstrate that an attention-based encoder-decoder model can be used...
research
12/28/2019

Improved Multi-Stage Training of Online Attention-based Encoder-Decoder Models

In this paper, we propose a refined multi-stage multi-task training stra...
research
06/13/2021

Representation and Correlation Enhanced Encoder-Decoder Framework for Scene Text Recognition

Attention-based encoder-decoder framework is widely used in the scene te...
research
05/18/2022

Evaluation of Transfer Learning for Polish with a Text-to-Text Model

We introduce a new benchmark for assessing the quality of text-to-text m...
research
05/14/2021

Empirical Analysis of Image Caption Generation using Deep Learning

Automated image captioning is one of the applications of Deep Learning w...
research
08/26/2019

Adaptive Embedding Gate for Attention-Based Scene Text Recognition

Scene text recognition has attracted particular research interest becaus...

Please sign up or login with your details

Forgot password? Click here to reset