Investigating the effect of sub-word segmentation on the performance of transformer language models

05/09/2023
by Jue Hou et al.

We explore how morpheme-level sub-word segmentation affects the performance of a language model. We trained GPT-2 and BERT models for both Finnish and Russian using StateMorph, a morpheme segmentation algorithm. For comparison, we also trained models with BPE and Morfessor segmentation. Our preliminary results show that StateMorph helps the models converge more efficiently and achieve better validation scores.
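
To illustrate the segmentation contrast studied here, the sketch below trains a BPE tokenizer with the HuggingFace tokenizers library and shows how its splits can diverge from morpheme boundaries. This is not the authors' code or training setup; the corpus file name, vocabulary size, and the Finnish example word are illustrative assumptions.

# A minimal sketch contrasting BPE sub-word segmentation with
# morpheme-level segmentation. "corpus.txt" and "taloissa" are
# illustrative placeholders, not the paper's data.
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.trainers import BpeTrainer
from tokenizers.pre_tokenizers import Whitespace

# Train a BPE vocabulary on a raw-text corpus.
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()
trainer = BpeTrainer(vocab_size=30000, special_tokens=["[UNK]", "[PAD]"])
tokenizer.train(["corpus.txt"], trainer)

# BPE merges frequent character sequences, so its splits need not align
# with morphemes: Finnish "taloissa" ('in the houses') might come out
# as, e.g., ['tal', 'ois', 'sa'].
print(tokenizer.encode("taloissa").tokens)

# A morphological segmenter such as Morfessor or StateMorph instead
# targets linguistically motivated units: talo + i + ssa
# (stem + plural + inessive case).

The hypothesis the paper tests is that feeding a transformer units aligned with morphemes, rather than purely frequency-driven BPE merges, matters most for morphologically rich languages like Finnish and Russian.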
