Emu: Enhancing Multilingual Sentence Embeddings with Semantic Specialization

09/15/2019
by Wataru Hirota, et al.

We present Emu, a system that semantically enhances multilingual sentence embeddings. Our framework fine-tunes pre-trained multilingual sentence embeddings using two main components: a semantic classifier and a language discriminator. The semantic classifier improves the semantic similarity of related sentences, whereas the language discriminator enhances the multilinguality of the embeddings via multilingual adversarial training. Our experimental results based on several language pairs show that our specialized embeddings outperform the state-of-the-art multilingual sentence embedding model on the task of cross-lingual intent classification using only monolingual labeled data.
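
As a rough illustration of how the two components could interact during fine-tuning, the sketch below combines a semantic classification loss with an adversarial language-discrimination term. The abstract does not state the actual loss functions, so the plain cross-entropy losses, the weight `lam`, and the name `encoder_objective` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability of the target class."""
    return -math.log(softmax(logits)[target])


def encoder_objective(intent_logits, intent_label, lang_logits, lang_label, lam=0.1):
    """Return (encoder_loss, discriminator_loss) for one sentence.

    Hypothetical objective: the discriminator is trained to minimize
    discriminator_loss, while the embedding fine-tuner minimizes
    encoder_loss, which falls when the discriminator fails to identify
    the language of the embedding.
    """
    semantic_loss = cross_entropy(intent_logits, intent_label)
    discriminator_loss = cross_entropy(lang_logits, lang_label)
    # Adversarial term (assumed form): subtract the weighted discriminator loss.
    encoder_loss = semantic_loss - lam * discriminator_loss
    return encoder_loss, discriminator_loss


# Toy scores: a confident, correct intent prediction and an undecided
# language discriminator over two languages.
enc, disc = encoder_objective([2.0, 0.0], 0, [0.0, 0.0], 1, lam=0.5)
```

In this toy setup, a discriminator that cannot tell the two languages apart lowers the encoder loss. That is the intuition behind using adversarial training to make embeddings more language-agnostic.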

research
05/23/2023

Linear Cross-Lingual Mapping of Sentence Embeddings

Semantics of a sentence is defined with much less ambiguity than semanti...
research
12/26/2018

Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond

We introduce an architecture to learn joint multilingual sentence repres...
research
10/18/2022

Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation

We introduce a new method to improve existing multilingual sentence embe...
research
06/01/2023

Exploring Anisotropy and Outliers in Multilingual Language Models for Cross-Lingual Semantic Sentence Similarity

Previous work has shown that the representations output by contextual la...
research
12/21/2022

Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval

Contrastive learning has been successfully used for retrieval of semanti...
research
05/10/2023

LACoS-BLOOM: Low-rank Adaptation with Contrastive objective on 8 bits Siamese-BLOOM

Text embeddings are useful features for several NLP applications, such a...
research
06/03/2018

Learning Semantic Sentence Embeddings using Pair-wise Discriminator

In this paper, we propose a method for obtaining sentence-level embeddin...
