Disentangled Phonetic Representation for Chinese Spelling Correction

05/24/2023
by Zihong Liang, et al.

Chinese Spelling Correction (CSC) aims to detect and correct erroneous characters in Chinese texts. Although prior work has introduced phonetic information (Hanyu Pinyin) into this task, it typically merges phonetic representations with character representations, which tends to weaken the representations of normal text. In this work, we propose to disentangle the two types of features to allow for direct interaction between textual and phonetic information. To learn useful phonetic representations, we introduce a pinyin-to-character objective that asks the model to predict the correct characters based solely on phonetic information, where a separation mask is imposed to disable attention from the phonetic input to the text. To avoid overfitting to the phonetics, we further design a self-distillation module to ensure that semantic information plays the major role in the prediction. Extensive experiments on three CSC benchmarks demonstrate the superiority of our method in using phonetic information.
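
The separation mask described above can be illustrated with a small sketch. It assumes the input is a single sequence in which text tokens come first and pinyin tokens follow, and that only the direction named in the abstract (phonetic queries attending to text keys) is blocked. The function name `build_separation_mask` and the layout are hypothetical, chosen for illustration only, and are not taken from the paper's code.

```python
from typing import List


def build_separation_mask(num_text: int, num_pinyin: int) -> List[List[int]]:
    """Return mask[q][k] = 1 if query position q may attend to key position k.

    Assumed layout (hypothetical): positions [0, num_text) hold text tokens,
    positions [num_text, num_text + num_pinyin) hold pinyin tokens.
    """
    total = num_text + num_pinyin
    mask = []
    for q in range(total):
        query_is_pinyin = q >= num_text
        row = []
        for k in range(total):
            key_is_text = k < num_text
            # Block only pinyin -> text attention, so the pinyin positions
            # must predict characters from phonetic information alone.
            row.append(0 if (query_is_pinyin and key_is_text) else 1)
        mask.append(row)
    return mask


mask = build_separation_mask(num_text=3, num_pinyin=3)
for row in mask:
    print(row)
```

In this sketch the text rows are all ones, so text positions can still attend to the phonetic input, while the pinyin rows are zero over the text columns. In a Transformer, a mask of this shape would typically be applied to the attention scores before the softmax.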

