Exploiting Similarities among Languages for Machine Translation

09/17/2013
by Tomas Mikolov, et al.

Dictionaries and phrase tables are the basis of modern statistical machine translation systems. This paper develops a method that can automate the process of generating and extending dictionaries and phrase tables. Our method can translate missing word and phrase entries by learning language structures from large monolingual data and a mapping between languages from small bilingual data. It uses distributed representations of words and learns a linear mapping between the vector spaces of languages. Despite its simplicity, our method is surprisingly effective: we can achieve almost 90% precision@5 for translation of words between English and Spanish. This method makes few assumptions about the languages, so it can be used to extend and refine dictionaries and translation tables for any language pair.
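The linear-mapping idea in the abstract can be illustrated with a small sketch. It assumes word vectors for both languages are already available as NumPy arrays and that a small seed dictionary pairs some of them. The abstract does not say how the mapping is fitted, so ordinary least squares is used here as a stand-in. The names `fit_linear_map` and `translate` are hypothetical, and the vectors are synthetic toy data rather than real embeddings.

```python
import numpy as np

def fit_linear_map(X, Z):
    """Fit W minimizing ||X @ W - Z||^2 over the seed dictionary.

    X: (n, d_src) source-language vectors, one row per seed word.
    Z: (n, d_tgt) vectors of the corresponding target-language words.
    Assumption: closed-form least squares, not necessarily the paper's optimizer.
    """
    W, *_ = np.linalg.lstsq(X, Z, rcond=None)
    return W

def translate(x, W, target_vectors, target_words, k=5):
    """Map one source vector into the target space and return the k
    target words whose vectors are closest by cosine similarity."""
    z = x @ W
    z = z / np.linalg.norm(z)
    normed = target_vectors / np.linalg.norm(target_vectors, axis=1, keepdims=True)
    scores = normed @ z
    top = np.argsort(-scores)[:k]
    return [target_words[i] for i in top]

# Synthetic toy data: 50 "words" whose target vectors are an exact
# linear image of their source vectors (an idealized assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))        # source-language vectors
W_true = rng.normal(size=(8, 6))    # hidden linear relation
Z = X @ W_true                      # target-language vectors
tgt_words = [f"tgt_{i}" for i in range(50)]

# Use the first 40 pairs as the small bilingual seed dictionary.
W = fit_linear_map(X[:40], Z[:40])

# Translate a held-out word: candidates are ranked over the full target vocabulary.
candidates = translate(X[45], W, Z, tgt_words, k=5)
```

Returning the top `k` candidates rather than a single word mirrors how precision@5 is counted: a translation is scored as correct if the reference word appears among the five nearest neighbours. Real embeddings are only approximately related by a linear map, so accuracy on real data would be lower than on this idealized toy set.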

research
04/09/2021

Design and Implementation of English To Yoruba Verb Phrase Machine Translation System

We aim to develop an English to Yoruba machine translation system which ...
research
05/09/2017

Word and Phrase Translation with word2vec

Word and phrase tables are key inputs to machine translations, but costl...
research
11/19/2019

A Hybrid Morpheme-Word Representation for Machine Translation of Morphologically Rich Languages

We propose a language-independent approach for improving statistical mac...
research
03/25/2019

Aligning Vector-spaces with Noisy Supervised Lexicons

The problem of learning to translate between two vector spaces given a s...
research
03/19/2015

Phrase database Approach to structural and semantic disambiguation in English-Korean Machine Translation

In machine translation it is common phenomenon that machine-readable dic...
research
09/29/2015

Polish -English Statistical Machine Translation of Medical Texts

This new research explores the effects of various training methods on a ...
research
12/28/2018

Machine Translation: A Literature Review

Machine translation (MT) plays an important role in benefiting linguists...
