Machine Translation of Low-Resource Spoken Dialects: Strategies for Normalizing Swiss German
The goal of this work is to design a machine translation system for a low-resource family of dialects, collectively known as Swiss German. We list the parallel resources that we collected, and present three strategies for normalizing Swiss German input in order to address the regional and spelling diversity. We show that character-based neural MT is the best solution for text normalization and that, in combination with phrase-based statistical MT, we reach a 36% BLEU score. This value, however, decreases as the testing dialect becomes more remote from the training one.