Zero-Shot Cross-Lingual Transfer in Legal Domain Using Transformer Models

11/28/2021
by Zein Shaheen, et al.

Zero-shot cross-lingual transfer is an important feature of modern NLP models and architectures for supporting low-resource languages. In this work, we study zero-shot cross-lingual transfer from English to French and German in multi-label text classification: we train a classifier on an English training set and test it on French and German test sets. We extend the EURLEX57K dataset, an English dataset for topic classification of legal documents, with the official French and German translations. We investigate the effect of two training techniques, gradual unfreezing and language model finetuning, on the quality of zero-shot cross-lingual transfer. We find that language model finetuning of a multilingual pre-trained model (M-DistilBERT, M-BERT) leads to 32.0-34.94 on French and German test sets respectively. Also, gradual unfreezing of the pre-trained model's layers during training results in a relative improvement of 38-45 Training scheme using English, French and German training sets, zero-shot BERT-based classification model reaches 86 jointly-trained BERT-based classification model.
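To make the gradual unfreezing idea in the abstract concrete, here is a minimal sketch of an unfreezing schedule. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact schedule, so the sketch assumes the classifier head is always trainable, the encoder starts fully frozen, and one encoder layer is unfrozen per epoch, starting from the layer closest to the head. The function name `trainable_layers` and the parameter `layers_per_epoch` are hypothetical.

```python
from typing import List


def trainable_layers(num_layers: int, epoch: int, layers_per_epoch: int = 1) -> List[int]:
    """Return the indices of encoder layers that are trainable at `epoch`.

    Layers are indexed from 0 (closest to the input) to num_layers - 1
    (closest to the classifier head). Assumed schedule, not taken from the
    paper: at epoch 0 every encoder layer is frozen, and each later epoch
    unfreezes `layers_per_epoch` more layers from the top down.
    """
    if num_layers < 0 or epoch < 0 or layers_per_epoch < 1:
        raise ValueError("num_layers and epoch must be >= 0, layers_per_epoch >= 1")
    # Cap the count so that late epochs simply keep every layer trainable.
    unfrozen = min(num_layers, epoch * layers_per_epoch)
    return list(range(num_layers - unfrozen, num_layers))


if __name__ == "__main__":
    # Example with a 6-layer encoder (the layer count is only illustrative).
    for epoch in range(4):
        print(epoch, trainable_layers(6, epoch))
```

In a training loop, the returned indices would decide which layers have gradient updates enabled for that epoch; all other encoder layers stay fixed at their pre-trained weights.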

research
06/08/2022

Realistic Zero-Shot Cross-Lingual Transfer in Legal Topic Classification

We consider zero-shot cross-lingual transfer in legal topic classificati...
research
08/03/2022

Cross-lingual Approaches for the Detection of Adverse Drug Reactions in German from a Patient's Perspective

In this work, we present the first corpus for German Adverse Drug Reacti...
research
01/24/2023

Cross-lingual German Biomedical Information Extraction: from Zero-shot to Human-in-the-Loop

This paper presents our project proposal for extracting biomedical infor...
research
09/20/2021

BERT Cannot Align Characters

In previous work, it has been shown that BERT can adequately align cross...
research
11/27/2019

Findings of the 2016 WMT Shared Task on Cross-lingual Pronoun Prediction

We describe the design, the evaluation setup, and the results of the 201...
research
03/18/2020

X-Stance: A Multilingual Multi-Target Dataset for Stance Detection

We extract a large-scale stance detection dataset from comments written ...
research
10/09/2021

On the Relation between Syntactic Divergence and Zero-Shot Performance

We explore the link between the extent to which syntactic relations are ...
