On the Choice of Auxiliary Languages for Improved Sequence Tagging

05/19/2020
by   Lukas Lange, et al.

Recent work has shown that embeddings from related languages can improve the performance of sequence tagging, even for monolingual models. In this analysis paper, we investigate whether the best auxiliary language can be predicted based on language distances, and we show that the most closely related language is not always the best auxiliary language. Further, we show that attention-based meta-embeddings can effectively combine pre-trained embeddings from different languages for sequence tagging, and we set new state-of-the-art results for part-of-speech tagging in five languages.
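To make the abstract's central idea concrete, here is a minimal PyTorch sketch of attention-based meta-embeddings in the spirit of dynamic meta-embeddings: each pre-trained embedding (e.g., one per language) is projected into a common space, a learned scorer assigns each a per-token attention weight, and the weighted sum is fed to the tagger. The class name, dimensions, and tanh scorer are illustrative assumptions, not the authors' exact implementation.

import torch
import torch.nn as nn

class AttentionMetaEmbedding(nn.Module):
    """Sketch of attention-based meta-embeddings (assumed architecture)."""

    def __init__(self, input_dims, common_dim=256):
        super().__init__()
        # Project each embedding type (e.g., one per language) to a shared space.
        self.projections = nn.ModuleList(
            [nn.Linear(d, common_dim) for d in input_dims]
        )
        # One scalar attention score per projected embedding.
        self.scorer = nn.Linear(common_dim, 1)

    def forward(self, embeddings):
        # embeddings: list of tensors, each of shape (batch, seq_len, input_dims[i])
        projected = torch.stack(
            [proj(e) for proj, e in zip(self.projections, embeddings)],
            dim=2,
        )  # (batch, seq_len, num_embeddings, common_dim)
        scores = self.scorer(torch.tanh(projected))  # (batch, seq_len, num_embeddings, 1)
        weights = torch.softmax(scores, dim=2)       # attention over embedding types
        return (weights * projected).sum(dim=2)      # (batch, seq_len, common_dim)

# Hypothetical usage: combine English, German, and Dutch embeddings of different sizes.
meta = AttentionMetaEmbedding(input_dims=[300, 1024, 768])
en = torch.randn(8, 20, 300)
de = torch.randn(8, 20, 1024)
nl = torch.randn(8, 20, 768)
combined = meta([en, de, nl])  # (8, 20, 256), fed into a sequence tagger

Because the softmax is computed per token, the model can weight a different auxiliary language at each position, which is consistent with the paper's finding that the most related language is not always the best auxiliary language overall.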


Related research

01/18/2021
HinFlair: pre-trained contextual string embeddings for pos tagging and text classification in the Hindi language
Recent advancements in language models based on recurrent neural network...

10/23/2020
Adversarial Learning of Feature-based Meta-Embeddings
Certain embedding types outperform others in different scenarios, e.g., ...

06/11/2020
A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages
We use the multilingual OSCAR corpus, extracted from Common Crawl via la...

12/14/2022
AsPOS: Assamese Part of Speech Tagger using Deep Learning Approach
Part of Speech (POS) tagging is crucial to Natural Language Processing (...

07/02/2018
Improving part-of-speech tagging via multi-task learning and character-level word representations
In this paper, we explore the ways to improve POS-tagging using various ...

11/28/2018
GIRNet: Interleaved Multi-Task Recurrent State Sequence Models
In several natural language tasks, labeled sequences are available in se...

02/28/2019
Better, Faster, Stronger Sequence Tagging Constituent Parsers
Sequence tagging models for constituent parsing are faster, but less acc...
