Cross-lingual Word Segmentation and Morpheme Segmentation as Sequence Labelling

09/12/2017
by Yan Shao, et al.

This paper presents our segmentation system developed for the MLP 2017 shared tasks on cross-lingual word segmentation and morpheme segmentation. We model both word and morpheme segmentation as character-level sequence labelling tasks. We adopt the widely used bidirectional recurrent neural network with a conditional random field (CRF) output layer as the baseline system and improve it further via ensemble decoding. Our universal system is applied to and extensively evaluated on all the official data sets without any language-specific adjustment. The official evaluation results indicate that, compared with the other participating systems, the proposed model achieves high accuracy on both word and morpheme segmentation across all the languages, which are of various types.
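To make the character-level sequence labelling formulation concrete, the following is a minimal sketch of how a segmented sequence can be converted to per-character boundary tags and back. The B/I/E/S tag set (begin, inside, end, single) and the function names `units_to_tags` and `tags_to_units` are assumptions chosen for illustration; the abstract does not specify the tag scheme, and the neural tagger itself is not shown.

```python
from typing import List


def units_to_tags(units: List[str]) -> List[str]:
    """Map a list of segments (words or morphemes) to one tag per character.

    Assumed tag set: B = begin, I = inside, E = end, S = single-character unit.
    """
    tags: List[str] = []
    for unit in units:
        if not unit:
            continue  # skip empty segments, they carry no characters
        if len(unit) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["I"] * (len(unit) - 2) + ["E"])
    return tags


def tags_to_units(chars: str, tags: List[str]) -> List[str]:
    """Rebuild segments from characters and their predicted tags.

    A segment is closed whenever an E or S tag is seen; trailing
    characters without a closing tag are kept as a final segment.
    """
    units: List[str] = []
    current = ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):
            units.append(current)
            current = ""
    if current:
        units.append(current)
    return units


# Hypothetical morpheme segmentation of "unbreakable".
morphemes = ["un", "break", "able"]
tags = units_to_tags(morphemes)
restored = tags_to_units("".join(morphemes), tags)
```

Under this encoding, a tagger only has to predict one label per character, and the same conversion serves both word and morpheme segmentation, which is what allows a single model to handle both tasks.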


research
10/23/2022

Model and Data Transfer for Cross-Lingual Sequence Labelling in Zero-Resource Settings

Zero-resource cross-lingual transfer approaches aim to apply supervised ...
research
07/09/2018

Universal Word Segmentation: Implementation and Interpretation

Word segmentation is a low-level NLP task that is non-trivial for a cons...
research
03/20/2016

Multi-Task Cross-Lingual Sequence Tagging from Scratch

We present a deep hierarchical recurrent neural network for sequence tag...
research
06/14/2018

Urdu Word Segmentation using Conditional Random Fields (CRFs)

State-of-the-art Natural Language Processing algorithms rely heavily on ...
research
05/11/2018

Neural Factor Graph Models for Cross-lingual Morphological Tagging

Morphological analysis involves predicting the syntactic traits of a wor...
research
10/17/2017

CASICT Tibetan Word Segmentation System for MLWS2017

We participated in the MLWS 2017 on Tibetan word segmentation task, our ...
research
02/08/2021

SLUA: A Super Lightweight Unsupervised Word Alignment Model via Cross-Lingual Contrastive Learning

Word alignment is essential for the down-streaming cross-lingual languag...
