Improving part-of-speech tagging via multi-task learning and character-level word representations

07/02/2018
by Daniil Anastasyev, et al.

In this paper, we explore ways to improve POS tagging using various types of auxiliary losses and different word representations. As a baseline, we utilized a BiLSTM tagger, which is able to achieve state-of-the-art results on sequence labelling tasks. We developed a new method for character-level word representation using a feedforward neural network. Such a representation gave us better results in terms of both the speed and the performance of the model. We also applied a novel technique of pretraining such word representations with existing word vectors. Finally, we designed a new variant of auxiliary loss for sequence labelling tasks: an additional prediction of the neighbour labels. Such a loss forces the model to learn the dependencies inside a sequence of labels and accelerates training. We test these methods on English and Russian.
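The following is a minimal sketch, in PyTorch, of how the ideas in the abstract could be assembled: a feedforward character-level word encoder feeding a BiLSTM, with an auxiliary head that predicts the tag of the neighbouring (next) word. The class names, hyperparameters, and the 0.3 auxiliary-loss weight are illustrative assumptions, not the authors' exact architecture; the paper's pretraining of the character encoder from existing word vectors is not shown here.

import torch
import torch.nn as nn

class CharFeedForwardEncoder(nn.Module):
    # Builds a word vector from its characters with a feedforward network
    # (no character-level recurrence), which is what makes it fast.
    def __init__(self, n_chars, char_dim, max_word_len, word_dim):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        self.proj = nn.Sequential(
            nn.Linear(max_word_len * char_dim, word_dim),
            nn.ReLU(),
        )

    def forward(self, char_ids):
        # char_ids: (batch, seq_len, max_word_len), padded with 0
        b, s, L = char_ids.shape
        emb = self.char_emb(char_ids)          # (b, s, L, char_dim)
        flat = emb.view(b, s, -1)              # concatenate character embeddings
        return self.proj(flat)                 # (b, s, word_dim)

class MultiTaskTagger(nn.Module):
    # BiLSTM tagger with a main head (current tag) and an auxiliary head (next tag).
    def __init__(self, n_chars, n_tags, char_dim=24, max_word_len=20,
                 word_dim=128, hidden=128):
        super().__init__()
        self.encoder = CharFeedForwardEncoder(n_chars, char_dim, max_word_len, word_dim)
        self.bilstm = nn.LSTM(word_dim, hidden, batch_first=True, bidirectional=True)
        self.main_head = nn.Linear(2 * hidden, n_tags)
        self.aux_head = nn.Linear(2 * hidden, n_tags)

    def forward(self, char_ids):
        h, _ = self.bilstm(self.encoder(char_ids))
        return self.main_head(h), self.aux_head(h)

def loss_fn(main_logits, aux_logits, tags, aux_weight=0.3, pad_id=-100):
    # Main cross-entropy on the current tags plus a down-weighted auxiliary
    # cross-entropy on the left-shifted (next-word) tags.
    ce = nn.CrossEntropyLoss(ignore_index=pad_id)
    main = ce(main_logits.transpose(1, 2), tags)
    next_tags = torch.full_like(tags, pad_id)
    next_tags[:, :-1] = tags[:, 1:]            # neighbour labels: tag of word i+1 at position i
    aux = ce(aux_logits.transpose(1, 2), next_tags)
    return main + aux_weight * aux

Because the character encoder is a single feedforward projection over concatenated character embeddings, a word representation is computed in one matrix multiplication rather than a per-character recurrence, which accounts for the speed gains the abstract mentions; the auxiliary next-label head only adds a linear layer at training time and can be dropped at inference.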


