Finnish Language Modeling with Deep Transformer Models

03/14/2020
by Abhilash Jain, et al.

Transformers have recently taken center stage in language modeling, after LSTMs had long been the dominant architecture. In this project, we investigate the performance of two Transformer architectures, BERT and Transformer-XL, on the language modeling task. We work in a sub-word setting for the Finnish language and compare against the previous state-of-the-art (SOTA) LSTM model. BERT achieves a pseudo-perplexity of 14.5, which, as far as we know, is the first such measure reported for this task. Transformer-XL improves the perplexity to 73.58, which is 27% better than the LSTM model.
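
For context, pseudo-perplexity scores a masked language model by masking each token in turn and predicting it from the rest of the sentence. The formula below is the standard definition used in the literature; whether the paper normalizes exactly this way is an assumption here.

```latex
% Pseudo-perplexity (PPPL) of a masked LM over a sentence W = (w_1, ..., w_N):
% each position t is masked and predicted from the unmasked remainder w_{\setminus t}.
\[
  \mathrm{PPPL}(W) \;=\; \exp\!\left( -\frac{1}{N} \sum_{t=1}^{N} \log P\!\left(w_t \mid w_{\setminus t}\right) \right)
\]
```

A minimal sketch of this computation with the Hugging Face transformers API follows. The checkpoint name is illustrative only (TurkuNLP's public Finnish BERT, not the model trained in the paper), and a real evaluation would batch the masked copies rather than loop over positions.

```python
import math

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Illustrative checkpoint: TurkuNLP's public Finnish BERT. The paper trains
# its own Finnish BERT, so scores from this model will not match the paper's.
MODEL_NAME = "TurkuNLP/bert-base-finnish-cased-v1"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
model.eval()

def pseudo_perplexity(sentence: str) -> float:
    """Mask each sub-word position in turn and accumulate its log-probability."""
    input_ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    positions = range(1, input_ids.size(0) - 1)  # skip [CLS] and [SEP]
    total_log_prob = 0.0
    with torch.no_grad():
        for i in positions:
            masked = input_ids.clone()
            masked[i] = tokenizer.mask_token_id
            logits = model(masked.unsqueeze(0)).logits[0, i]
            log_probs = torch.log_softmax(logits, dim=-1)
            total_log_prob += log_probs[input_ids[i]].item()
    return math.exp(-total_log_prob / len(positions))

print(pseudo_perplexity("Helsinki on Suomen pääkaupunki."))
```

Note that, unlike ordinary perplexity for a left-to-right model such as Transformer-XL or an LSTM, each token here conditions on both left and right context, so the two scores are not directly comparable.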

Related research

04/20/2019 · Language Models with Transformers
The Transformer architecture is superior to RNN-based models in computat...

10/25/2021 · Paradigm Shift in Language Modeling: Revisiting CNN for Modeling Sanskrit Originated Bengali and Hindi Language
Though there has been a large body of recent works in language modeling ...

07/10/2018 · Revisiting the Hierarchical Multiscale LSTM
Hierarchical Multiscale LSTM (Chung et al., 2016a) is a state-of-the-art...

04/18/2023 · Think Before You Act: Unified Policy for Interleaving Language Reasoning with Actions
The success of transformer models trained with a language modeling objec...

03/16/2020 · TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding
Bidirectional Encoder Representations from Transformers (BERT) has recen...

07/13/2022 · N-Grammer: Augmenting Transformers with latent n-grams
Transformer models have recently emerged as one of the foundational mode...

02/19/2020 · LAMBERT: Layout-Aware language Modeling using BERT for information extraction
In this paper we introduce a novel approach to the problem of understand...
