MFE-NER: Multi-feature Fusion Embedding for Chinese Named Entity Recognition

09/16/2021
by Jiatong Li, et al.

Pre-trained language models have brought Named Entity Recognition (NER) into a new era, but they still need additional knowledge to perform well on specific problems. In Chinese NER, character substitution is a complicated linguistic phenomenon. Some Chinese characters closely resemble one another because they share components or have similar pronunciations. People replace characters in a named entity with such similar characters to create a new collocation that still refers to the same object. This practice has become even more common in the Internet age and is often used to avoid Internet censorship or just for fun. Character substitution is a problem for pre-trained language models because the new collocations occur only occasionally. As a result, substituted entities go unrecognized or are recognized incorrectly in the NER task. In this paper, we propose a new method, Multi-Feature Fusion Embedding for Chinese Named Entity Recognition (MFE-NER), to strengthen the modeling of Chinese language patterns and handle the character substitution problem in Chinese NER. MFE fuses semantic, glyph, and phonetic features. In the glyph domain, we decompose Chinese characters into components that denote structural features, so that characters with similar structures have close representations in the embedding space. We also propose an improved phonetic system, which makes it reasonable to compute phonetic similarity among Chinese characters. Experiments demonstrate that our method improves the overall performance of Chinese NER and performs especially well in informal language environments.
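
The fusion idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the paper's implementation: the component and pinyin tables are tiny hand-written examples, plain pinyin (initial, final, tone) stands in for the improved phonetic system that the abstract does not specify, fusion is done by simple concatenation, and the helper names (`glyph_vector`, `phonetic_vector`, `fuse`) are hypothetical.

```python
import math

# Toy lookup tables, hand-written for illustration only (not the paper's data).
COMPONENTS = {"河": ["氵", "可"], "何": ["亻", "可"], "湖": ["氵", "古", "月"]}
# Plain pinyin as (initial, final, tone); a stand-in for the paper's phonetic system.
PINYIN = {"河": ("h", "e", 2), "何": ("h", "e", 2), "湖": ("h", "u", 2)}

# Fixed orderings so every character maps to vectors of the same length.
COMPONENT_VOCAB = sorted({c for parts in COMPONENTS.values() for c in parts})
PHONE_VOCAB = sorted(
    {f"i:{ini}" for ini, _, _ in PINYIN.values()}
    | {f"f:{fin}" for _, fin, _ in PINYIN.values()}
    | {f"t:{tone}" for _, _, tone in PINYIN.values()}
)


def multi_hot(active, vocab):
    """Return a 0/1 vector marking which vocabulary entries are active."""
    return [1.0 if entry in active else 0.0 for entry in vocab]


def glyph_vector(ch):
    """Multi-hot vector over components, so shared components overlap."""
    return multi_hot(set(COMPONENTS[ch]), COMPONENT_VOCAB)


def phonetic_vector(ch):
    """Multi-hot vector over the initial, final, and tone of the character."""
    ini, fin, tone = PINYIN[ch]
    return multi_hot({f"i:{ini}", f"f:{fin}", f"t:{tone}"}, PHONE_VOCAB)


def fuse(semantic, ch):
    """Concatenate a given semantic vector with glyph and phonetic features."""
    return list(semantic) + glyph_vector(ch) + phonetic_vector(ch)


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


if __name__ == "__main__":
    # 河 and 何 share the component 可 and have the same pinyin.
    print(cosine(glyph_vector("河"), glyph_vector("何")))
    print(cosine(phonetic_vector("河"), phonetic_vector("何")))
```

In this toy setup, two characters that share a component or a pronunciation get overlapping feature vectors even when their semantic vectors differ, which is the property the abstract relies on for handling substituted characters. A real system would replace the multi-hot vectors with learned embeddings and feed the fused vector to a sequence tagger.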


research
07/12/2021

MECT: Multi-Metadata Embedding based Cross-Transformer for Chinese Named Entity Recognition

Recently, word enhancement has become very popular for Chinese Named Ent...
research
09/22/2019

Using Chinese Glyphs for Named Entity Recognition

Most Named Entity Recognition (NER) systems use additional features like...
research
08/28/2019

Exploiting Multiple Embeddings for Chinese Named Entity Recognition

Identifying the named entities mentioned in text would enrich many seman...
research
01/15/2020

FGN: Fusion Glyph Network for Chinese Named Entity Recognition

Chinese NER is a challenging task. As pictographs, Chinese characters co...
research
07/06/2022

Rethinking the Value of Gazetteer in Chinese Named Entity Recognition

Gazetteer is widely used in Chinese named entity recognition (NER) to en...
research
09/13/2018

On the Strength of Character Language Models for Multilingual Named Entity Recognition

Character-level patterns have been widely used as features in English Na...
research
12/17/2019

Chinese Named Entity Recognition Augmented with Lexicon Memory

Inspired by a concept of content-addressable retrieval from cognitive sc...
