A hybrid text normalization system using multi-head self-attention for Mandarin

11/11/2019
by Junhui Zhang, et al.

In this paper, we propose a hybrid text normalization system using multi-head self-attention. The system combines the advantages of a rule-based model and a neural model for text preprocessing tasks. Previous studies in Mandarin text normalization usually rely on a set of hand-written rules, which are hard to improve for general cases. The idea of our proposed system is motivated by the neural models of recent studies, and it achieves better performance on our internal news corpus. This paper also describes several attempts to deal with the imbalanced pattern distribution of the dataset. Overall, the system's performance improves by over 1.5% and can be improved further.
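The neural component described above is built on multi-head self-attention. As a rough illustration only (not the authors' code; all function and variable names here are hypothetical), the core operation can be sketched in NumPy: project the input into per-head queries, keys, and values, apply scaled dot-product attention in each head, then concatenate and project the result.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, wq, wk, wv, wo, num_heads):
    """x: (seq_len, d_model); wq/wk/wv/wo: (d_model, d_model) weights."""
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    def project(w):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return (x @ w).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = project(wq), project(wk), project(wv)
    # Scaled dot-product attention, computed independently per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    heads = softmax(scores) @ v                      # (num_heads, seq_len, d_head)
    # Concatenate the heads and apply the output projection.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ wo

# Example: a 5-token sequence with d_model=8 and 2 heads.
rng = np.random.default_rng(0)
x = rng.standard_normal((5, 8))
wq, wk, wv, wo = (rng.standard_normal((8, 8)) * 0.1 for _ in range(4))
y = multi_head_self_attention(x, wq, wk, wv, wo, num_heads=2)
```

The output `y` keeps the input's shape `(seq_len, d_model)`, so such a layer can be stacked; the actual system would add learned positional information, residual connections, and layer normalization around it.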

research
05/31/2020

CNRL at SemEval-2020 Task 5: Modelling Causal Reasoning in Language with Multi-Head Self-Attention Weights based Counterfactual Detection

In this paper, we describe an approach for modelling causal reasoning in...
research
07/27/2022

Are Neighbors Enough? Multi-Head Neural n-gram can be Alternative to Self-attention

Impressive performance of Transformer has been attributed to self-attent...
research
11/15/2021

Searching for TrioNet: Combining Convolution with Local and Global Self-Attention

Recently, self-attention operators have shown superior performance as a ...
research
06/13/2023

Hybrid lemmatization in HuSpaCy

Lemmatization is still not a trivial task for morphologically rich langu...
research
04/07/2022

tmVar 3.0: an improved variant concept recognition and normalization tool

Previous studies have shown that automated text-mining tools are becomin...
research
01/14/2021

Interpretable Multi-Head Self-Attention model for Sarcasm Detection in social media

Sarcasm is a linguistic expression often used to communicate the opposit...
research
12/07/2021

Hybrid Self-Attention NEAT: A novel evolutionary approach to improve the NEAT algorithm

This article presents a "Hybrid Self-Attention NEAT" method to improve t...
