Learning Bilingual Sentence Embeddings via Autoencoding and Computing Similarities with a Multilayer Perceptron

06/05/2019
by Yunsu Kim, et al.

We propose a novel model architecture and training algorithm to learn bilingual sentence embeddings from a combination of parallel and monolingual data. Our method connects autoencoding and neural machine translation to force the source and target sentence embeddings to share the same space without the help of a pivot language or an additional transformation. We train a multilayer perceptron on top of the sentence embeddings to extract good bilingual sentence pairs from nonparallel or noisy parallel data. Our approach shows promising performance on sentence alignment recovery and the WMT 2018 parallel corpus filtering tasks with only a single model.
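The pair-scoring step described above can be pictured with a minimal sketch. It assumes the two sentence embeddings are concatenated and fed to a multilayer perceptron with one tanh hidden layer and a sigmoid output. The abstract does not specify the layer sizes, activations, or input combination, so these are illustrative assumptions. The names `mlp_similarity` and `init_params` are hypothetical, and the weights are random placeholders rather than trained values.

```python
import numpy as np


def init_params(emb_dim, hidden_dim, seed=0):
    """Random placeholder weights; a real model would learn these from data."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (2 * emb_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.1, hidden_dim),
        "b2": 0.0,
    }


def mlp_similarity(src_emb, tgt_emb, params):
    """Score (source, target) embedding pairs; returns values in (0, 1)."""
    # Assumption: the pair is represented by concatenating both embeddings.
    x = np.concatenate([src_emb, tgt_emb], axis=-1)
    # Hidden layer with tanh activation (an illustrative choice).
    h = np.tanh(x @ params["W1"] + params["b1"])
    # Single output unit squashed by a sigmoid into a similarity score.
    logit = h @ params["W2"] + params["b2"]
    return 1.0 / (1.0 + np.exp(-logit))


# Toy batch of 3 sentence pairs with 4-dimensional embeddings.
params = init_params(emb_dim=4, hidden_dim=8)
src = np.ones((3, 4))
tgt = np.zeros((3, 4))
scores = mlp_similarity(src, tgt, params)
print(scores.shape)
```

In a filtering setting, candidate pairs would be ranked or thresholded by such a score. The threshold itself would have to be tuned on held-out data.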


research
05/21/2021

Unsupervised Multilingual Sentence Embeddings for Parallel Corpus Mining

Existing models of multilingual sentence embeddings require large parall...
research
07/31/2018

Effective Parallel Corpus Mining using Bilingual Sentence Embeddings

This paper presents an effective approach for parallel corpus mining usi...
research
06/06/2017

Learning Paraphrastic Sentence Embeddings from Back-Translated Bitext

We consider the problem of learning general-purpose, paraphrastic senten...
research
04/17/2021

Sentence Alignment with Parallel Documents Helps Biomedical Machine Translation

The existing neural machine translation system has achieved near human-l...
research
01/18/2022

Improve Sentence Alignment by Divide-and-conquer

In this paper, we introduce a divide-and-conquer algorithm to improve se...
research
10/15/2020

Unsupervised Bitext Mining and Translation via Self-trained Contextual Embeddings

We describe an unsupervised method to create pseudo-parallel corpora for...
research
05/31/2023

Sentence Simplification Using Paraphrase Corpus for Initialization

Neural sentence simplification method based on sequence-to-sequence fram...
