A Survey on Zero Pronoun Translation

05/17/2023
by Longyue Wang, et al.

Zero pronouns (ZPs) are frequently omitted in pro-drop languages (e.g., Chinese, Hungarian, and Hindi), but must be recovered in non-pro-drop languages (e.g., English). This phenomenon has been studied extensively in machine translation (MT), as it poses a significant challenge for MT systems: the correct antecedent of the omitted pronoun is difficult to determine. This survey highlights the major work on zero pronoun translation (ZPT) since the neural revolution, so that researchers can recognise the current state and future directions of the field. We organise the literature by evolution, dataset, method, and evaluation. In addition, we compare and analyse competing models and evaluation metrics on different benchmarks. We uncover a number of findings, including: 1) ZPT is in line with the development trend of large language models; 2) data limitations cause learning bias across languages and domains; 3) performance improvements are often reported on single benchmarks, but advanced methods are still far from real-world use; 4) general-purpose metrics are not reliable on the nuances and complexities of ZPT, which emphasises the need for targeted metrics; 5) apart from commonly cited errors, ZPs introduce risks of gender bias.


