Third-Party Aligner for Neural Word Alignments

11/08/2022
by   Jinpeng Zhang, et al.
0

Word alignment is to find translationally equivalent words between source and target sentences. Previous work has demonstrated that self-training can achieve competitive word alignment results. In this paper, we propose to use word alignments generated by a third-party word aligner to supervise the neural word alignment training. Specifically, source word and target word of each word pair aligned by the third-party aligner are trained to be close neighbors to each other in the contextualized embedding space when fine-tuning a pre-trained cross-lingual language model. Experiments on the benchmarks of various language pairs show that our approach can surprisingly do self-correction over the third-party supervision by finding more accurate word alignments and deleting wrong word alignments, leading to better performance than various third-party word aligners, including the currently best one. When we integrate all supervisions from various third-party aligners, we achieve state-of-the-art word alignment performances, with averagely more than two points lower alignment error rates than the best third-party aligner. We released our code at https://github.com/sdongchuanqi/Third-Party-Supervised-Aligner.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
01/20/2021

Word Alignment by Fine-tuning Embeddings on Parallel Corpora

Word alignment over parallel corpora has a wide variety of applications,...
research
06/11/2021

Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment

The cross-lingual language models are typically pretrained with masked l...
research
10/09/2022

Cross-Align: Modeling Deep Cross-lingual Interactions for Word Alignment

Word alignment which aims to extract lexicon translation equivalents bet...
research
10/06/2021

Using Optimal Transport as Alignment Objective for fine-tuning Multilingual Contextualized Embeddings

Recent studies have proposed different methods to improve multilingual w...
research
03/16/2022

Graph Neural Networks for Multiparallel Word Alignment

After a period of decrease, interest in word alignments is increasing ag...
research
06/09/2023

WSPAlign: Word Alignment Pre-training via Large-Scale Weakly Supervised Span Prediction

Most existing word alignment methods rely on manual alignment datasets o...
research
06/30/2021

Learning a Reversible Embedding Mapping using Bi-Directional Manifold Alignment

We propose a Bi-Directional Manifold Alignment (BDMA) that learns a non-...

Please sign up or login with your details

Forgot password? Click here to reset