Evaluating Machine Translation Performance on Chinese Idioms with a Blacklist Method

by Yutong Shao, et al.

Idiom translation is a challenging problem in machine translation because the meaning of an idiom is non-compositional, so a literal translation is likely to be wrong. In this paper, we assess the quality of idiom translation in a modern neural MT system. We introduce a new evaluation method that uses an idiom-specific blacklist of literal translations, based on the insight that the occurrence of any blacklisted word in the translation output indicates a likely translation error. We introduce a dataset, CIBB (Chinese Idioms Blacklists Bank), and use it to evaluate a state-of-the-art Chinese-English neural MT system. Our evaluation confirms that our blacklist method is effective at identifying literal translation errors, and that a sizable number of idioms in our test set are mistranslated (36.5
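The blacklist idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `blacklisted_hits` and `literal_error_rate`, the simple lowercase word matching, and the toy idiom with its blacklist are all assumptions made for illustration, and none of the data comes from CIBB.

```python
import re


def blacklisted_hits(translation, blacklist):
    """Return the sorted blacklisted words that occur in the translation.

    Matching is a plain lowercase word comparison (an assumption of this
    sketch); it does not handle inflected forms or multi-word entries.
    """
    tokens = set(re.findall(r"[a-z']+", translation.lower()))
    return sorted(tokens & {word.lower() for word in blacklist})


def literal_error_rate(outputs, blacklists):
    """Fraction of (idiom, translation) pairs flagged by the idiom's blacklist."""
    if not outputs:
        return 0.0
    flagged = sum(
        1
        for idiom, translation in outputs
        if blacklisted_hits(translation, blacklists[idiom])
    )
    return flagged / len(outputs)


# Hypothetical toy data, not taken from CIBB. The idiom literally reads
# "draw a snake and add feet" and means to ruin something by overdoing it.
toy_blacklists = {"画蛇添足": {"snake", "feet", "foot"}}
toy_outputs = [
    ("画蛇添足", "Do not draw a snake and add feet to it."),  # literal rendering
    ("画蛇添足", "Do not overdo it."),                        # idiomatic rendering
]

if __name__ == "__main__":
    for idiom, translation in toy_outputs:
        print(idiom, translation, blacklisted_hits(translation, toy_blacklists[idiom]))
    print(literal_error_rate(toy_outputs, toy_blacklists))
```

In this sketch a translation is flagged as soon as one blacklisted word appears, which mirrors the insight stated above; a real evaluation would likely also need to handle morphological variants.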


