Low-Resource Text Classification using Domain-Adversarial Learning

by Daniel Grießhaber, et al.
University of Stuttgart

Deep learning techniques have recently been shown to be successful in many natural language processing tasks, forming state-of-the-art systems. They require, however, large amounts of annotated data, which are often missing. This paper explores the use of domain-adversarial learning as a regularizer to avoid overfitting when training domain-invariant features for deep, complex neural networks in low-resource and zero-resource settings in new target domains or languages. In the case of new languages, we show that monolingual word vectors can be used directly for training without pre-alignment. Their projection into a common space can be learnt ad hoc at training time, reaching the final performance of pretrained multilingual word vectors.




