THUMT: An Open Source Toolkit for Neural Machine Translation

06/20/2017
by Jiacheng Zhang, et al.

This paper introduces THUMT, an open-source toolkit for neural machine translation (NMT) developed by the Natural Language Processing Group at Tsinghua University. THUMT implements the standard attention-based encoder-decoder framework on top of Theano and supports three training criteria: maximum likelihood estimation, minimum risk training, and semi-supervised training. It features a visualization tool for displaying the relevance between hidden states in neural networks and contextual words, which helps to analyze the internal workings of NMT. Experiments on Chinese-English datasets show that THUMT using minimum risk training significantly outperforms GroundHog, a state-of-the-art toolkit for NMT.
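To make the minimum risk training criterion concrete, the sketch below shows the expected-risk objective of Shen et al. (2016) that THUMT supports alongside maximum likelihood estimation. The function name mrt_loss, the NumPy implementation, and the toy inputs are illustrative assumptions for exposition only and are not THUMT's actual API; THUMT itself is implemented on top of Theano and differentiates this objective with respect to the model parameters.

# Minimal sketch of the minimum risk training (MRT) loss for one source
# sentence, assuming a sampled candidate set. Names are illustrative, not
# THUMT's real interface.
import numpy as np

def mrt_loss(log_probs, risks, alpha=5e-3):
    # log_probs: model log-probabilities log P(y|x; theta) of each sampled
    #            candidate translation, shape [num_candidates]
    # risks:     per-candidate risk, e.g. 1 - sentence-level BLEU against
    #            the reference, shape [num_candidates]
    # alpha:     sharpness hyper-parameter of the renormalized distribution Q
    scaled = alpha * np.asarray(log_probs, dtype=np.float64)
    scaled -= scaled.max()                      # numerical stability
    q = np.exp(scaled) / np.exp(scaled).sum()   # Q(y|x; theta, alpha) over the candidate subspace
    # Expected risk under Q; minimizing this is the MRT criterion.
    return float(np.dot(q, np.asarray(risks, dtype=np.float64)))

# Toy usage: three sampled candidates with their log-probabilities and risks.
print(mrt_loss(log_probs=[-2.3, -1.1, -4.0], risks=[0.4, 0.1, 0.7]))

In words, the sampled candidates' log-probabilities are sharpened by alpha and renormalized into a distribution Q over the candidate subspace, and the loss is the expected risk under Q, so minimizing it shifts probability mass toward low-risk (high-BLEU) translations.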

Related research

01/10/2017 · OpenNMT: Open-Source Toolkit for Neural Machine Translation
02/17/2018 · CytonMT: An Efficient Neural Machine Translation Open-source Toolkit Implemented in C++
12/15/2017 · Sockeye: A Toolkit for Neural Machine Translation
04/11/2018 · VoroTop: Voronoi Cell Topology Visualization and Analysis Toolkit
12/08/2015 · Minimum Risk Training for Neural Machine Translation
08/14/2023 · SOTASTREAM: A Streaming Approach to Machine Translation Training
02/27/2023 · Inseq: An Interpretability Toolkit for Sequence Generation Models
