Effective Transfer Learning for Low-Resource Natural Language Understanding

08/19/2022
by Zihan Liu, et al.

Natural language understanding (NLU) is the task of semantic decoding of human languages by machines. NLU models rely heavily on large amounts of training data to ensure good performance. However, many languages and domains have very little data and few domain experts, so it is necessary to overcome the data scarcity challenge when only a handful of training samples, or even none, are available. In this thesis, we focus on developing cross-lingual and cross-domain methods to tackle these low-resource issues. First, we propose to improve the model's cross-lingual ability by focusing on task-related keywords, enhancing the model's robustness, and regularizing the representations. We find that the representations for low-resource languages can be easily and substantially improved by focusing on just the keywords. Second, we present Order-Reduced Modeling methods for cross-lingual adaptation, and find that modeling partial word orders instead of the whole sequence improves the model's robustness to word order differences between languages and its task knowledge transfer to low-resource languages. Third, we propose to leverage different levels of domain-related corpora and additional masking of data during pre-training for cross-domain adaptation, and discover that more challenging pre-training better addresses the domain discrepancy issue in task knowledge transfer. Finally, we introduce a coarse-to-fine framework, Coach, and a cross-lingual and cross-domain parsing framework, X2Parser. Coach decomposes the representation learning process into coarse-grained and fine-grained feature learning, and X2Parser simplifies hierarchical task structures into flattened ones. We observe that simplifying task structures makes representation learning more effective for low-resource languages and domains.
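
The abstract only outlines these techniques at a high level. As a rough, minimal sketch of the "additional masking of data" idea for cross-domain pre-training, the Python snippet below masks task-related keywords in a domain-related corpus more aggressively than ordinary tokens, making the pre-training objective harder around domain-specific terms. The function, masking probabilities, and keyword set are illustrative assumptions for this sketch, not the thesis's actual implementation.

import random

MASK = "[MASK]"

def mask_tokens(tokens, keywords, base_prob=0.15, keyword_prob=0.5, seed=None):
    """Return (masked_tokens, labels): labels hold the original token at masked
    positions and None elsewhere, as in a standard MLM-style objective."""
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        # Task-related keywords are masked with a higher probability than other tokens.
        p = keyword_prob if tok.lower() in keywords else base_prob
        if rng.random() < p:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels

if __name__ == "__main__":
    sentence = "book a table for two at the italian restaurant tonight".split()
    task_keywords = {"book", "table", "restaurant"}  # hypothetical task-related keywords
    masked, labels = mask_tokens(sentence, task_keywords, seed=0)
    print(masked)
    print(labels)

In this sketch, raising keyword_prob above base_prob forces the model to reconstruct domain-specific terms from context more often, which is one simple way to make pre-training more challenging for the target domain.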

Related research

Cross-lingual Dependency Parsing as Domain Adaptation (12/24/2020)
In natural language processing (NLP), cross-lingual transfer learning is...

X2Parser: Cross-Lingual and Cross-Domain Framework for Task-Oriented Compositional Semantic Parsing (06/07/2021)
Task-oriented compositional semantic parsing (TCSP) handles complex nest...

X-METRA-ADA: Cross-lingual Meta-Transfer Learning Adaptation to Natural Language Understanding and Question Answering (04/20/2021)
Multilingual models, such as M-BERT and XLM-R, have gained increasing po...

MetaXLR – Mixed Language Meta Representation Transformation for Low-resource Cross-lingual Learning based on Multi-Armed Bandit (05/31/2023)
Transfer learning for extremely low resource languages is a challenging ...

Languages You Know Influence Those You Learn: Impact of Language Characteristics on Multi-Lingual Text-to-Text Transfer (12/04/2022)
Multi-lingual language models (LM), such as mBERT, XLM-R, mT5, mBART, ha...

Isomorphic Cross-lingual Embeddings for Low-Resource Languages (03/28/2022)
Cross-Lingual Word Embeddings (CLWEs) are a key component to transfer li...

Learning from Multiple Noisy Augmented Data Sets for Better Cross-Lingual Spoken Language Understanding (09/03/2021)
Lack of training data presents a grand challenge to scaling out spoken l...
