
Power Law Graph Transformer for Machine Translation and Representation Learning

by Burc Gokden, et al.

We present the Power Law Graph Transformer, a transformer model with well-defined deductive and inductive tasks for prediction and representation learning. The deductive task learns the dataset-level (global) and instance-level (local) graph structures in terms of learnable power-law distribution parameters. The inductive task outputs prediction probabilities using the deductive task's output, similar to a transductive model. We trained our model on Turkish-English and Portuguese-English datasets from TED talk transcripts for machine translation, and compared its performance and characteristics to a transformer model with scaled dot-product attention trained in the same experimental setup. Our model achieves BLEU scores of 17.79 and 28.33 on the Turkish-English and Portuguese-English translation tasks, respectively. We also show how a duality between a quantization set and an N-dimensional manifold representation can be leveraged to transform between local and global deductive-inductive outputs through successive application of linear and non-linear transformations, end-to-end.
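The abstract does not spell out how the learnable power-law parameters enter the attention computation. As a rough, hedged illustration only (the function `power_law_weights`, the distance-based form `d**(-alpha)`, and the fixed exponent below are assumptions for this sketch, not the paper's actual formulation), one way attention weights on a graph can follow a power law is to let the weight between two tokens decay as a power of their embedding distance, with the exponent as a learnable parameter:

```python
import numpy as np

def power_law_weights(X, alpha=1.5, eps=1e-8):
    """Illustrative power-law attention sketch: the weight between tokens
    i and j decays as w_ij ~ d_ij ** (-alpha), where d_ij is the distance
    between their embeddings. In a trainable model, alpha would be a
    learnable parameter rather than a fixed constant."""
    # Pairwise Euclidean distances between token embeddings, shape (n, n).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    w = (d + eps) ** (-alpha)                  # power-law decay in distance
    np.fill_diagonal(w, 0.0)                   # drop self-edges in this sketch
    return w / w.sum(axis=-1, keepdims=True)   # row-normalize, softmax-like

X = np.random.default_rng(0).normal(size=(5, 8))  # 5 tokens, embedding dim 8
W = power_law_weights(X)                           # (5, 5) attention matrix
```

This stands in for the scaled dot-product score used by the baseline transformer; the actual deductive/inductive formulation is given in the paper itself.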




Korean-English Machine Translation with Multiple Tokenization Strategy

This work was conducted to find out how tokenization methods affect the ...

Fully Quantized Transformer for Improved Translation

State-of-the-art neural machine translation methods employ massive amoun...

English-Twi Parallel Corpus for Machine Translation

We present a parallel machine translation training corpus for English an...

Graph Transformer for Graph-to-Sequence Learning

The dominant graph-to-sequence transduction models employ graph neural n...

Effective General-Domain Data Inclusion for the Machine Translation Task by Vanilla Transformers

One of the vital breakthroughs in the history of machine translation is ...

G-Transformer for Document-level Machine Translation

Document-level MT models are still far from satisfactory. Existing work ...

Code Repositories


Implementation of Power Law Graph Transformer for Machine Translation and Representation Learning.



A framework for generating subword vocabulary from a tensorflow dataset and building custom BERT tokenizer models.



A framework for training and evaluating a transformer with scaled dot product attention on a tensorflow dataset.
