KNPTC: Knowledge and Neural Machine Translation Powered Chinese Pinyin Typo Correction

05/02/2018
by   Hengyi Cai, et al.

Chinese pinyin input methods are central to Chinese language processing, yet users inevitably make typos when they type pinyin. Pinyin typo correction has become an increasingly important task with the popularity of smartphones and the mobile Internet. How to exploit knowledge of users' typing behaviors and how to support typo correction for acronym pinyin remain challenging problems. To tackle these challenges, we propose KNPTC, a novel approach based on neural machine translation (NMT). In contrast to previous work, KNPTC integrates explicit knowledge into NMT for pinyin typo correction and learns to correct a variety of typos without the guidance of manually selected constraints or language-specific features. In this approach, we first obtain the transition probabilities between adjacent letters from large-scale real-life datasets. We then use these probabilities to construct the "ground-truth" alignments of training sentence pairs. Finally, these alignments are integrated into NMT to capture sensible pinyin typo correction patterns. Applied to typos in real-life datasets, KNPTC achieves a 32.77% increase in typo-correction accuracy compared against the state-of-the-art system.
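
The first step of the approach, estimating letter-level probabilities from typing data, can be illustrated with a minimal sketch. The abstract does not specify how the probabilities are computed, so everything here is an assumption: the function name `estimate_substitution_probs` is hypothetical, only same-length substitution typos are counted (insertions and deletions are skipped), and the toy pairs are made up rather than taken from the paper's datasets.

```python
from collections import Counter, defaultdict


def estimate_substitution_probs(pairs):
    """Estimate P(typed letter | intended letter) from (typed, intended) pairs.

    Assumption: only equal-length pairs are used, so each position is either
    a correct keystroke or a single-letter substitution.
    """
    counts = defaultdict(Counter)
    for typed, intended in pairs:
        if len(typed) != len(intended):
            continue  # this sketch ignores insertions and deletions
        for typed_char, intended_char in zip(typed, intended):
            counts[intended_char][typed_char] += 1

    probs = {}
    for intended_char, typed_counts in counts.items():
        total = sum(typed_counts.values())
        # Normalize the counts for each intended letter into a distribution.
        probs[intended_char] = {t: n / total for t, n in typed_counts.items()}
    return probs


# Toy, made-up (typed, intended) pinyin pairs for illustration only.
toy_pairs = [
    ("nihso", "nihao"),
    ("nihao", "nihao"),
    ("zhongguo", "zhongguo"),
    ("shsng", "shang"),
]
probs = estimate_substitution_probs(toy_pairs)
print(probs["a"])
```

A table of this kind could then serve as a prior over which source letter a typed letter most plausibly aligns to; how KNPTC actually builds its alignments from the probabilities is described in the full paper, not in this abstract.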

