Scalable Machine Translation in Memory Constrained Environments

10/06/2016
by Paul Baltescu, et al.
Machine translation is the discipline concerned with developing automated tools for translating from one human language to another. Statistical machine translation (SMT) is the dominant paradigm in this field. In SMT, translations are generated by means of statistical models whose parameters are learned from bilingual data. Scalability is a key concern in SMT, as one would like to make use of as much data as possible to train better translation systems. In recent years, mobile devices with adequate computing power have become widely available. Despite being very successful, mobile applications relying on NLP systems continue to follow a client-server architecture, which is of limited use because access to the internet is often limited and expensive. The goal of this dissertation is to show how to construct a scalable machine translation system that can operate with the limited resources available on a mobile device. The main challenge in porting translation systems to mobile devices is memory usage. The amount of memory available on a mobile device is far less than what is typically available on the server side of a client-server application. In this thesis, we investigate alternatives for the two components that prevent standard translation systems from working on mobile devices due to high memory usage. We show that once these standard components are replaced with our proposed alternatives, we obtain a scalable translation system that can work on a device with limited memory.
