Feature-based Decipherment for Large Vocabulary Machine Translation

08/10/2015
by Iftekhar Naim, et al.

Orthographic similarities across languages provide a strong signal for probabilistic decipherment, especially for closely related language pairs. Existing decipherment models, however, are not well suited to exploiting these orthographic similarities. We propose a log-linear model with latent variables that incorporates orthographic similarity features. Maximum likelihood training of this model is computationally expensive, so we perform approximate inference via MCMC sampling and contrastive divergence. Our results show that the proposed log-linear model trained with contrastive divergence scales to large vocabularies and outperforms existing generative decipherment models by exploiting orthographic features.
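The training recipe the abstract describes (a log-linear model over translations, with the intractable expectation in the maximum-likelihood gradient replaced by a few MCMC steps, i.e. contrastive divergence) can be illustrated with a toy sketch. Everything below is an assumption for illustration, not the paper's implementation: the tiny vocabularies, the two orthographic features, and the helper names (`features`, `mh_step`, `cd_update`) are made up, and the "observed" word pairings stand in for latent alignments that a real decipherment model would itself have to infer.

```python
import math
import random

# Toy sketch (not the authors' implementation): a log-linear model
# p(e | f) ∝ exp(w · φ(f, e)) over a candidate target vocabulary,
# with orthographic features, trained by a contrastive-divergence-style
# update. Vocabularies, features, and hyperparameters are illustrative.

SOURCE_VOCAB = ["nacht", "hund", "wasser"]              # hypothetical source words
TARGET_VOCAB = ["night", "hound", "water", "house", "cat"]

def edit_distance(a, b):
    # Standard Levenshtein distance, one-row dynamic programming.
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # deletion
                                     dp[j - 1] + 1,      # insertion
                                     prev + (ca != cb))  # substitution
    return dp[-1]

def features(f, e):
    # φ(f, e): orthographic similarity plus a length-difference penalty.
    m = max(len(f), len(e))
    return [1.0 - edit_distance(f, e) / m, -abs(len(f) - len(e)) / m]

def score(w, f, e):
    # Unnormalized log-probability w · φ(f, e).
    return sum(wi * xi for wi, xi in zip(w, features(f, e)))

def mh_step(w, f, e_cur):
    # One Metropolis-Hastings step over candidate translations with a
    # uniform proposal: accept with probability min(1, exp(Δscore)).
    e_prop = random.choice(TARGET_VOCAB)
    if math.log(random.random() + 1e-12) < score(w, f, e_prop) - score(w, f, e_cur):
        return e_prop
    return e_cur

def cd_update(w, f, e_obs, k=1, lr=0.5):
    # Contrastive-divergence-style gradient: features of the "positive"
    # pair minus features of a sample reached by running the chain k
    # steps away from it, instead of the exact expectation over all of
    # TARGET_VOCAB that full maximum likelihood would require.
    e_neg = e_obs
    for _ in range(k):
        e_neg = mh_step(w, f, e_neg)
    pos, neg = features(f, e_obs), features(f, e_neg)
    return [wi + lr * (p - n) for wi, p, n in zip(w, pos, neg)]

if __name__ == "__main__":
    random.seed(0)
    w = [0.0, 0.0]
    # Stand-in "current" alignments; in real decipherment these latent
    # translations would themselves be resampled during training.
    guesses = {"nacht": "night", "hund": "hound", "wasser": "water"}
    for _ in range(200):
        f = random.choice(SOURCE_VOCAB)
        w = cd_update(w, f, guesses[f], k=1)
    print("learned w:", [round(x, 2) for x in w])
    for f in SOURCE_VOCAB:
        print(f, "->", max(TARGET_VOCAB, key=lambda e: score(w, f, e)))
```

The point of the sketch is the shape of the update in `cd_update`: because each step compares the positive pair against only a short-chain sample rather than normalizing over every target word, the per-update cost is independent of vocabulary size, which is what lets this style of training scale to large vocabularies.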
