Translating the Unseen? Yorùbá → English MT in Low-Resource, Morphologically-Unmarked Settings

03/07/2021
by Ife Adebara, et al.

Translating between languages in which certain features are marked morphologically in one but absent, or only marked contextually, in the other is an important test case for machine translation. Ambiguities arise when translating from Yorùbá, which uses bare nouns and marks (in)definiteness contextually, into English, which marks (in)definiteness morphologically. In this work, we perform a fine-grained analysis of how an SMT system compares with two NMT systems (BiLSTM and Transformer) when translating bare nouns (BNs) in Yorùbá into English. We investigate to what extent the systems identify BNs, whether they translate them correctly, and how their output compares with human translation patterns. We also analyze the types of errors each model makes and provide a linguistic description of these errors, gleaning insights for evaluating model performance in low-resource settings. Our results show that, in translating bare nouns, the Transformer model outperforms the SMT and BiLSTM models in 4 categories, the BiLSTM outperforms the SMT model in 3 categories, and the SMT outperforms the NMT models in 1 category.
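The per-category comparison described above can be pictured as scoring each system's choice of English article against a reference, grouped by bare-noun category. The sketch below is purely illustrative: the category labels, system outputs, and references are hypothetical placeholders, not the paper's data or evaluation code.

```python
# Hypothetical sketch of a per-category bare-noun (BN) evaluation:
# for each BN occurrence, compare the article a system produced in its
# English output against the article in a human reference translation,
# then aggregate accuracy per BN category.
from collections import defaultdict

def bn_accuracy(examples):
    """examples: list of (category, system_article, reference_article).

    Returns a dict mapping each BN category to the fraction of
    occurrences where the system's article matches the reference.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for category, hyp, ref in examples:
        total[category] += 1
        if hyp == ref:
            correct[category] += 1
    return {c: correct[c] / total[c] for c in total}

# Illustrative toy data (not from the paper).
examples = [
    ("definite", "the", "the"),
    ("definite", "a", "the"),
    ("indefinite", "a", "a"),
]
print(bn_accuracy(examples))  # {'definite': 0.5, 'indefinite': 1.0}
```

Running such a tally separately for the SMT, BiLSTM, and Transformer outputs would yield the kind of category-by-category comparison the abstract reports.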

