Long Short-Term Memory for Japanese Word Segmentation

09/23/2017
by Yoshiaki Kitagawa, et al.

This study presents a Long Short-Term Memory (LSTM) neural network approach to Japanese word segmentation (JWS). Previous studies on Chinese word segmentation (CWS) have successfully applied recurrent neural networks such as LSTM and gated recurrent units (GRU). In contrast to Chinese, however, Japanese uses several character types (hiragana, katakana, and kanji), which produce orthographic variation and make word segmentation more difficult. In addition, JWS benefits from global context, yet traditional JWS approaches rely on local features. To address this problem, this study proposes an LSTM-based approach to JWS. The experimental results indicate that the proposed model achieves state-of-the-art accuracy on various Japanese corpora.
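To make the setting concrete, the following is a minimal sketch, not the authors' implementation. It assumes that segmentation is framed as per-character tagging (B for the first character of a word, I for the rest) and that character type is available as an input feature. The abstract confirms neither assumption. The function names `char_type`, `words_to_tags`, and `tags_to_words` are hypothetical, and the Unicode ranges cover only the basic hiragana, katakana, and CJK Unified Ideographs blocks.

```python
def char_type(ch: str) -> str:
    """Classify one character by script, using basic Unicode block ranges.

    Assumption: only the main blocks are checked; rare kanji in extension
    blocks fall through to "other".
    """
    code = ord(ch)
    if 0x3041 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:
        return "kanji"
    return "other"


def words_to_tags(words: list[str]) -> list[str]:
    """Turn a gold segmentation into one B/I tag per character."""
    tags: list[str] = []
    for word in words:
        tags.extend(["B"] + ["I"] * (len(word) - 1))
    return tags


def tags_to_words(text: str, tags: list[str]) -> list[str]:
    """Rebuild words from per-character tags (inverse of words_to_tags)."""
    words: list[str] = []
    for ch, tag in zip(text, tags):
        if tag == "B" or not words:
            words.append(ch)   # start a new word
        else:
            words[-1] += ch    # extend the current word
    return words


# Hypothetical example sentence: "東京に行く" segmented as 東京 / に / 行く.
words = ["東京", "に", "行く"]
text = "".join(words)
tags = words_to_tags(words)
types = [char_type(ch) for ch in text]
```

Under this framing, a sequence model such as an LSTM would predict the tag sequence from the characters (and optionally their types), and `tags_to_words` would recover the segmentation from its output.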


