A Survey on Recent Approaches for Natural Language Processing in Low-Resource Scenarios

10/23/2020
by Michael A. Hedderich, et al.
Current developments in natural language processing offer challenges and opportunities for low-resource languages and domains. Deep neural networks are known for requiring large amounts of training data, which might not be available in resource-lean scenarios. However, there is also a growing body of work to improve performance in low-resource settings. Motivated by fundamental changes towards neural models and the currently popular pre-train and fine-tune paradigm, we give an overview of promising approaches for low-resource natural language processing. After a discussion about the definition of low-resource scenarios and the different dimensions of data availability, we examine methods that enable learning when training data is sparse. This includes mechanisms to create additional labeled data, like data augmentation and distant supervision, as well as transfer learning settings that reduce the need for target supervision. The survey closes with a brief look into methods suggested in non-NLP machine learning communities, which might be beneficial for NLP in low-resource scenarios.

research
11/09/2020

Low-Resource Adaptation of Neural NLP Models

Real-world applications of natural language processing (NLP) are challen...
research
09/04/2019

Towards Realistic Practices In Low-Resource Natural Language Processing: The Development Set

Development sets are impractical to obtain for real low-resource languag...
research
10/07/2020

Transfer Learning and Distant Supervision for Multilingual Transformer Models: A Study on African Languages

Multilingual transformer models like mBERT and XLM-RoBERTa have obtained...
research
06/14/2021

An Empirical Survey of Data Augmentation for Limited Data Learning in NLP

NLP has achieved great progress in the past decade through the use of ne...
research
02/16/2022

Knowledge Extraction in Low-Resource Scenarios: Survey and Perspective

Knowledge Extraction (KE) which aims to extract structural information f...
research
11/12/2021

Exploiting all samples in low-resource sentence classification: early stopping and initialization parameters

In low resource settings, deep neural models have often shown lower perf...
research
12/07/2019

Unsung Challenges of Building and Deploying Language Technologies for Low Resource Language Communities

In this paper, we examine and analyze the challenges associated with dev...
