Backretrieval: An Image-Pivoted Evaluation Metric for Cross-Lingual Text Representations Without Parallel Corpora

05/11/2021
by Mikhail Fain, et al.

Cross-lingual text representations have recently gained popularity and serve as the backbone of many tasks, such as unsupervised machine translation and cross-lingual information retrieval. However, evaluating such representations is difficult in domains beyond standard benchmarks, because it requires domain-specific parallel data across different pairs of languages. In this paper, we propose Backretrieval, an automatic metric for evaluating the quality of cross-lingual text representations that uses images as a proxy in a paired image-text evaluation dataset. Experimentally, Backretrieval correlates highly with ground-truth metrics on annotated datasets, and our analysis shows statistically significant improvements over baselines. Our experiments conclude with a case study on a recipe dataset without parallel cross-lingual data. We illustrate how to judge cross-lingual embedding quality with Backretrieval, and validate the outcome with a small human study.


