Learning Language Representations for Typology Prediction

07/29/2017
by   Chaitanya Malaviya, et al.

One central mystery of neural NLP is what neural models "know" about their subject matter. When a neural machine translation system learns to translate from one language to another, does it learn the syntax or semantics of the languages? Can this knowledge be extracted from the system to fill holes in human scientific knowledge? Existing typological databases contain relatively full feature specifications for only a few hundred languages. Exploiting the existence of parallel texts in more than a thousand languages, we build a massive many-to-one neural machine translation (NMT) system from 1017 languages into English, and use this to predict information missing from typological databases. Experiments show that the proposed method is able to infer not only syntactic, but also phonological and phonetic inventory features, and improves over a baseline that has access to information about the languages' geographic and phylogenetic neighbors.
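The prediction step described above can be illustrated with a toy sketch. It assumes each language already has a fixed-length vector (for example, one extracted from a trained NMT system) and fills in a missing typological feature by a vote among the most similar languages whose value is known. The nearest-neighbour classifier, the function names `cosine` and `predict_feature`, and all vectors and labels below are hypothetical choices made for illustration; the abstract does not specify the classifier the authors used.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict_feature(target_vec, known, k=3):
    """Predict a missing feature value by majority vote among the k most similar languages.

    `known` is a list of (language_vector, feature_value) pairs for languages
    whose value for this feature is already recorded in the database.
    """
    ranked = sorted(known, key=lambda item: cosine(target_vec, item[0]), reverse=True)
    votes = Counter(value for _, value in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up 3-dimensional "language vectors" with a word-order label (illustrative only).
known = [
    ([0.9, 0.1, 0.0], "SVO"),
    ([0.8, 0.2, 0.1], "SVO"),
    ([0.1, 0.9, 0.2], "SOV"),
    ([0.0, 0.8, 0.3], "SOV"),
    ([0.2, 0.7, 0.1], "SOV"),
]

# A language whose word-order value is missing from the database.
prediction = predict_feature([0.85, 0.15, 0.05], known, k=3)
print(prediction)
```

In a realistic setting the vectors would have many more dimensions, and a trained classifier per feature would likely replace the simple vote used here.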


research
07/29/2019

A Baseline Neural Machine Translation System for Indian Languages

We present a simple, yet effective, Neural Machine Translation system fo...
research
06/11/2019

A Focus on Neural Machine Translation for African Languages

African languages are numerous, complex and low-resourced. The datasets ...
research
03/24/2020

Towards Neural Machine Translation for Edoid Languages

Many Nigerian languages have relinquished their previous prestige and pu...
research
03/28/2021

PENELOPIE: Enabling Open Information Extraction for the Greek Language through Machine Translation

In this paper we present our submission for the EACL 2021 SRW; a methodo...
research
02/20/2021

Understanding and Enhancing the Use of Context for Machine Translation

To understand and infer meaning in language, neural models have to learn...
research
12/17/2022

Beyond the C: Retargetable Decompilation using Neural Machine Translation

The problem of reversing the compilation process, decompilation, is an i...
research
05/25/2018

Recursive Neural Network Based Preordering for English-to-Japanese Machine Translation

The word order between source and target languages significantly influen...
