Realistic Zero-Shot Cross-Lingual Transfer in Legal Topic Classification

06/08/2022
by Stratos Xenouleas, et al.

We consider zero-shot cross-lingual transfer in legal topic classification using the recent MultiEURLEX dataset. Since the original dataset contains parallel documents, which is unrealistic for zero-shot cross-lingual transfer, we develop a new version of the dataset without parallel documents. We use it to show that translation-based methods vastly outperform cross-lingual fine-tuning of multilingually pre-trained models, the best previous zero-shot transfer method for MultiEURLEX. We also develop a bilingual teacher-student zero-shot transfer approach, which exploits additional unlabeled documents of the target language and performs better than a model fine-tuned directly on labeled target language documents.
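The teacher-student idea can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a tiny logistic multi-label classifier in place of a pre-trained Transformer, synthetic three-dimensional "documents", and a hypothetical `translate_to_source` function standing in for machine translation. It shows only the general pattern: a teacher trained on labeled source-language data soft-labels unlabeled target-language documents, and a student is trained on those soft labels without ever seeing gold target labels.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, features):
    # One independent sigmoid per label (multi-label classification).
    return [sigmoid(sum(w * x for w, x in zip(row, features))) for row in weights]

def train(examples, num_labels, num_features, epochs=50, lr=0.5):
    # Binary cross-entropy SGD; targets may be hard 0/1 labels
    # or soft probabilities produced by a teacher.
    weights = [[0.0] * num_features for _ in range(num_labels)]
    for _ in range(epochs):
        for features, targets in examples:
            probs = predict(weights, features)
            for k in range(num_labels):
                error = probs[k] - targets[k]  # BCE gradient w.r.t. the logit
                for j in range(num_features):
                    weights[k][j] -= lr * error * features[j]
    return weights

def true_labels(source_features):
    # Toy ground truth: label k is on when source feature k is positive.
    return [1.0 if source_features[0] > 0 else 0.0,
            1.0 if source_features[1] > 0 else 0.0]

def translate_to_source(target_features):
    # Hypothetical stand-in for machine translation: the toy "target
    # language" stores the same two signals in swapped positions.
    return [target_features[1], target_features[0], target_features[2]]

rng = random.Random(0)

def random_doc():
    # Two signal features plus a constant bias term.
    return [rng.uniform(-1, 1), rng.uniform(-1, 1), 1.0]

# 1. Teacher: trained on labeled source-language documents.
source_train = [random_doc() for _ in range(60)]
teacher = train([(doc, true_labels(doc)) for doc in source_train], 2, 3)

# 2. Unlabeled target-language documents, soft-labeled by the teacher
#    after "translation" into the source language.
target_unlabeled = [random_doc() for _ in range(60)]
soft_labeled = [(doc, predict(teacher, translate_to_source(doc)))
                for doc in target_unlabeled]

# 3. Student: trained on target-language inputs with soft labels only.
student = train(soft_labeled, 2, 3)

# 4. Evaluate on held-out target documents (gold labels used only here).
target_test = [random_doc() for _ in range(200)]
correct = 0
for doc in target_test:
    gold = true_labels(translate_to_source(doc))
    guess = [1.0 if p > 0.5 else 0.0 for p in predict(student, doc)]
    correct += sum(g == y for g, y in zip(guess, gold))
student_accuracy = correct / (2 * len(target_test))
print(f"student per-label accuracy: {student_accuracy:.2f}")
```

The same `train` function serves both teacher and student because binary cross-entropy accepts probabilities as targets. In a realistic setting the classifier, the translation step, and the data would all be replaced by real components.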

