Pragmatic Neural Language Modelling in Machine Translation

12/22/2014
by Paul Baltescu, et al.

This paper presents an in-depth investigation into integrating neural language models into translation systems. Scaling neural language models is difficult but crucial for real-world applications. The paper evaluates the impact of both new and existing scaling techniques on end-to-end MT quality. We show when explicitly normalising neural models is necessary and which optimisation tricks to use in such scenarios. We also focus on scalable training algorithms and investigate noise contrastive estimation and diagonal context matrices as sources of further speed improvements. We explore the trade-offs between neural models and back-off n-gram models and find that neural models are strong candidates for natural language applications in memory-constrained environments, yet still lag behind traditional models in raw translation quality. We conclude with a set of recommendations for building a scalable neural language model for MT.
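
To make the normalisation bottleneck concrete, the sketch below contrasts the exact softmax log-probability, which must score every word in the vocabulary, with a noise contrastive estimation (NCE) loss that scores only the target word plus k noise samples. This is a minimal NumPy illustration of the general technique, not the paper's implementation: the context vector h, the uniform noise distribution, and all variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: vocabulary, hidden dimension, noise samples per word.
V, d, k = 50_000, 128, 10
h = rng.standard_normal(d)               # context representation (assumed given)
W = 0.01 * rng.standard_normal((V, d))   # output word embeddings
b = np.zeros(V)                          # output biases

def full_softmax_logprob(target):
    """Exact log P(target | h): touches all V words, O(V) per prediction."""
    logits = W @ h + b
    logits -= logits.max()               # numerical stability
    return logits[target] - np.log(np.exp(logits).sum())

def nce_loss(target, noise_dist):
    """NCE loss for one word: touches only 1 + k words, O(k) per prediction.

    The unnormalised model score exp(s_w) is treated as a probability
    (partition function fixed to 1), and a binary classifier separates
    the observed word from k samples drawn from noise_dist.
    """
    noise = rng.choice(V, size=k, p=noise_dist)
    words = np.concatenate(([target], noise))
    s = W[words] @ h + b[words]          # unnormalised log scores
    logit = s - np.log(k * noise_dist[words])
    p_data = 1.0 / (1.0 + np.exp(-logit))   # P(word came from data, not noise)
    return -(np.log(p_data[0]) + np.log(1.0 - p_data[1:]).sum())

unigram = np.full(V, 1.0 / V)            # toy noise distribution (uniform here)
print(full_softmax_logprob(42))
print(nce_loss(42, unigram))
```

At evaluation time, a model trained this way can either be normalised explicitly or rely on its raw scores as self-normalised probabilities, which is the trade-off behind the abstract's question of when explicit normalisation is necessary.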


Related research

12/11/2019: MetaMT, a MetaLearning Method Leveraging Multiple Domain Data for Low Resource Machine Translation
Manipulating training data leads to robust neural models for MT....

04/05/2023: Document-Level Machine Translation with Large Language Models
Large language models (LLMs) such as ChatGPT can produce coherent, cohe...

02/20/2018: On the scaling of polynomial features for representation matching
In many neural models, new features as polynomial functions of existing ...

08/20/2015: Auto-Sizing Neural Networks: With Applications to n-gram Language Models
Neural networks have been shown to improve performance across a range of...

10/31/2019: Naver Labs Europe's Systems for the Document-Level Generation and Translation Task at WNGT 2019
Recently, neural models led to significant improvements in both machine ...

09/22/2017: Improving Language Modelling with Noise Contrastive Estimation
Neural language models do not scale well when the vocabulary is large. N...

03/12/2023: DTT: An Example-Driven Tabular Transformer by Leveraging Large Language Models
Many organizations rely on data from government and third-party sources,...
