VCWE: Visual Character-Enhanced Word Embeddings

02/23/2019
by Chi Sun, et al.

Chinese is written in a logographic system, and the shapes of Chinese characters contain rich syntactic and semantic information. In this paper, we propose a model to learn Chinese word embeddings via two-level composition: (1) a convolutional neural network to extract the intra-character compositionality from the visual shape of a character; (2) a recurrent neural network with self-attention to compose character representations into word embeddings. The word embeddings, along with the network parameters, are learned in the Skip-Gram framework. Evaluations demonstrate the superior performance of our model on four tasks: word similarity, sentiment analysis, named entity recognition, and part-of-speech tagging.
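
The second composition level and the Skip-Gram objective can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the attention form or the exact loss, so the code assumes dot-product attention pooling and a negative-sampling variant of Skip-Gram. The convolutional and recurrent networks are omitted, and random vectors stand in for the per-character features they would produce. All names (`attention_pool`, `sgns_loss`, `query`, `DIM`) are hypothetical.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attention_pool(char_vecs, query):
    """Compose character vectors into one word vector.

    Assumption: attention weights come from a dot product with a
    learned query vector; the paper may use a different scoring form.
    """
    weights = softmax([dot(c, query) for c in char_vecs])
    dim = len(char_vecs[0])
    word_vec = [
        sum(w * c[i] for w, c in zip(weights, char_vecs)) for i in range(dim)
    ]
    return word_vec, weights


def sgns_loss(word_vec, context_vec, negative_vecs):
    """Skip-Gram negative-sampling loss for one (word, context) pair.

    Assumption: negative sampling is used; the abstract only says
    'Skip-Gram framework'.
    """
    loss = -math.log(sigmoid(dot(word_vec, context_vec)))
    for neg in negative_vecs:
        loss -= math.log(sigmoid(-dot(word_vec, neg)))
    return loss


random.seed(0)
DIM = 4  # illustrative embedding size, not taken from the paper


def rand_vec():
    return [random.uniform(-0.5, 0.5) for _ in range(DIM)]


# Placeholder features for a two-character word; in the described model
# these would be produced from character images by the CNN and the RNN.
char_vecs = [rand_vec(), rand_vec()]
query = rand_vec()

word_vec, weights = attention_pool(char_vecs, query)
loss = sgns_loss(word_vec, rand_vec(), [rand_vec() for _ in range(3)])
```

In a trainable version, the gradient of this loss would flow back through the attention weights into the character-level networks, which is how the word embeddings and the network parameters could be learned jointly.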

research
08/26/2015

Component-Enhanced Chinese Character Embeddings

Distributed word representations are very useful for capturing semantic ...
research
11/13/2017

Convolutional Neural Network with Word Embeddings for Chinese Word Segmentation

Character-based sequence labeling framework is flexible and efficient fo...
research
11/18/2016

Word and Document Embeddings based on Neural Network Approaches

Data representation is a fundamental task in machine learning. The repre...
research
08/15/2018

Multiple Character Embeddings for Chinese Word Segmentation

Chinese word segmentation (CWS) is often regarded as a character-based s...
research
06/03/2019

Chinese Embedding via Stroke and Glyph Information: A Dual-channel View

Recent studies have consistently given positive hints that morphology is...
research
12/11/2018

Hyperbolic Deep Learning for Chinese Natural Language Understanding

Recently hyperbolic geometry has proven to be effective in building embe...
