Probing Multilingual Sentence Representations With X-Probe

06/12/2019
by Vinit Ravishankar, et al.

This paper extends the task of probing sentence representations for linguistic insight to a multilingual domain. In doing so, we make two contributions. First, we provide datasets for multilingual probing, derived from Wikipedia, in five languages: English, French, German, Spanish and Russian. Second, we evaluate six sentence encoders for each language, each trained by mapping sentence representations to English sentence representations using sentences in a parallel corpus. We find that cross-lingually mapped representations are often better at retaining certain linguistic information than representations derived from English encoders trained on natural language inference (NLI) as a downstream task.
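
As a rough illustration of what probing means in this setting, the sketch below fits a small classifier on frozen sentence vectors and reports how well a linguistic property can be read off them. It is a minimal sketch under stated assumptions, not the paper's setup: the vectors are synthetic stand-ins for encoder output, the binary property is hypothetical, and the helper names (`train_probe`, `toy_data`, `accuracy`) are invented for this example. The abstract does not specify the probing classifier, so plain logistic regression is assumed here.

```python
import math
import random


def train_probe(embeddings, labels, epochs=50, lr=0.1):
    """Fit a logistic-regression probe with SGD; the embeddings stay frozen."""
    weights = [0.0] * len(embeddings[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(embeddings, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            err = 1.0 / (1.0 + math.exp(-z)) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(weights, bias, x):
    """Return the predicted binary label for one sentence vector."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def accuracy(weights, bias, embeddings, labels):
    """Fraction of vectors whose property the probe predicts correctly."""
    hits = sum(predict(weights, bias, x) == y for x, y in zip(embeddings, labels))
    return hits / len(labels)


def toy_data(n, seed):
    """Synthetic stand-in for encoder output (an assumption, not real data):
    dimension 0 encodes the hypothetical property, the rest is noise."""
    rng = random.Random(seed)
    vectors, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        signal = (1.0 if y else -1.0) + rng.gauss(0, 0.3)
        vectors.append([signal] + [rng.gauss(0, 1.0) for _ in range(3)])
        labels.append(y)
    return vectors, labels


train_x, train_y = toy_data(200, seed=0)
test_x, test_y = toy_data(100, seed=1)
w, b = train_probe(train_x, train_y)
print(f"probe accuracy: {accuracy(w, b, test_x, test_y):.2f}")
```

High held-out accuracy suggests the property is recoverable from the representation. Comparing that accuracy across encoders, with the probe held fixed, is the kind of comparison the abstract describes.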
