Machine Translation of Mathematical Text

10/11/2020
by Aditya Ohri, et al.

We have implemented a machine translation system, the PolyMath Translator, for LaTeX documents containing mathematical text. The current implementation translates English LaTeX to French LaTeX, attaining a BLEU score of 53.5 on a held-out test corpus of mathematical sentences. It produces LaTeX documents that can be compiled to PDF without further editing. The system first converts the body of an input LaTeX document into English sentences containing math tokens, using the pandoc universal document converter to parse LaTeX input. We have trained a Transformer-based translator model, using OpenNMT, on a combined corpus containing a small proportion of domain-specific sentences. Our full system uses both this Transformer model and Google Translate, the latter being used as a backup to better handle linguistic features that do not appear in our training dataset. If the Transformer model does not have confidence in its translation, as determined by a high perplexity score, then we use Google Translate with a custom glossary. This backup was used 26% of the time on our test corpus of mathematical sentences. The PolyMath Translator is available as a web service at www.polymathtrans.ai.
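The perplexity-based fallback described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the translator callables, and the threshold value of 2.0 are all hypothetical, and perplexity is computed here in the standard way, as the exponential of the negative mean token log-probability.

```python
import math
from typing import Callable, List, Tuple

# Hypothetical interfaces (assumptions, not taken from the paper):
# the primary translator returns its output together with the
# log-probability the model assigned to each output token.
PrimaryTranslator = Callable[[str], Tuple[str, List[float]]]
BackupTranslator = Callable[[str], str]


def perplexity(token_log_probs: List[float]) -> float:
    """Return exp(-mean log-probability) over the output tokens."""
    if not token_log_probs:
        return float("inf")  # no tokens: treat as no confidence at all
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def translate_with_fallback(
    sentence: str,
    primary: PrimaryTranslator,
    backup: BackupTranslator,
    max_perplexity: float = 2.0,  # assumed threshold, chosen for illustration
) -> Tuple[str, str]:
    """Use the primary model unless its perplexity exceeds the threshold.

    Returns the chosen translation and a label saying which system
    produced it ("primary" or "backup").
    """
    translation, log_probs = primary(sentence)
    if perplexity(log_probs) > max_perplexity:
        # Low confidence: hand the original sentence to the backup system.
        return backup(sentence), "backup"
    return translation, "primary"
```

In a real system the primary callable would wrap the trained Transformer model and the backup callable would wrap an external translation service with a glossary; here both are left as plain functions so that the routing logic can be read and tested in isolation.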


