Cross-lingual Contextualized Topic Models with Zero-shot Learning

04/16/2020
by Federico Bianchi, et al.

Many data sets in a domain (reviews, forums, news, etc.) exist in parallel in several languages. They all cover the same content, but the linguistic differences make it impossible to use traditional, bag-of-words-based topic models. Models have to be either single-language or suffer from a huge but extremely sparse vocabulary. Both issues can be addressed by transfer learning. In this paper, we introduce a zero-shot cross-lingual topic model, i.e., our model learns topics in one language (here, English) and predicts them for documents in other languages. By using the text of the same document in different languages, we can evaluate the quality of the predictions. Our results show that topics are coherent and stable across languages, which suggests exciting future research directions.
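
The evaluation idea in the abstract, comparing the topics predicted for the same document in different languages, can be illustrated with a minimal sketch. The abstract does not name the metrics used, so the two functions below (agreement of the most likely topic, and KL divergence between topic distributions) are assumptions chosen for illustration, as are the function names and the toy topic distributions.

```python
from math import log

def top_topic(dist):
    """Index of the most probable topic in one document-topic distribution."""
    return max(range(len(dist)), key=dist.__getitem__)

def match_rate(source_dists, target_dists):
    """Fraction of parallel documents whose most probable topic agrees.

    Hypothetical metric: the abstract only says that parallel text is used
    to evaluate predictions, not how.
    """
    matches = sum(
        top_topic(s) == top_topic(t) for s, t in zip(source_dists, target_dists)
    )
    return matches / len(source_dists)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two topic distributions; eps guards against log(0)."""
    return sum(pi * log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Toy, made-up topic distributions for three parallel documents.
english = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]
italian = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.5, 0.3, 0.2]]

rate = match_rate(english, italian)
mean_kl = sum(kl_divergence(p, q) for p, q in zip(english, italian)) / len(english)
print(f"top-topic match rate: {rate:.2f}, mean KL: {mean_kl:.3f}")
```

A higher match rate and a lower mean divergence would both indicate that the topics predicted for the unseen language stay close to those predicted for the English version of the same document.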


