Syllable-aware Neural Language Models: A Failure to Beat Character-aware Ones

07/20/2017
by Zhenisbek Assylbekov, et al.

Syllabification does not seem to improve word-level RNN language modeling quality when compared to character-based segmentation. However, our best syllable-aware language model, achieving performance comparable to the competitive character-aware model, has 18%-33% fewer parameters and is trained 1.2-2.2 times faster.
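To illustrate the two segmentation schemes being compared, the Python sketch below splits a word into characters and into syllables and composes a toy word vector from per-unit embeddings. The vowel-cluster syllabifier and the sum-of-embeddings composition are simplifying assumptions for illustration only; they are not the authors' syllabification tool or their composition model.

```python
# Minimal sketch contrasting character-based and syllable-based segmentation
# for subword-aware word representations. The crude vowel-cluster syllabifier
# and the sum-of-embeddings composition are illustrative assumptions, not the
# segmentation tool or composition architecture used in the paper.
import re
import random

random.seed(0)
DIM = 8  # toy embedding size


def char_segments(word):
    """Character-based segmentation: one unit per character."""
    return list(word)


def syllable_segments(word):
    """Very rough heuristic syllabifier: split after each vowel cluster.
    Real syllabification would use a dedicated tool or dictionary."""
    pieces = re.findall(r"[^aeiouy]*[aeiouy]+(?:[^aeiouy]*$)?", word.lower())
    return pieces if pieces else [word]


def make_embeddings(units):
    """Assign a random toy vector to every distinct unit."""
    return {u: [random.uniform(-1, 1) for _ in range(DIM)] for u in set(units)}


def compose(units, emb):
    """Compose a word vector by summing its unit embeddings
    (the paper's models use richer compositions than a plain sum)."""
    vec = [0.0] * DIM
    for u in units:
        vec = [a + b for a, b in zip(vec, emb[u])]
    return vec


if __name__ == "__main__":
    word = "language"
    chars = char_segments(word)      # ['l','a','n','g','u','a','g','e']
    sylls = syllable_segments(word)  # ['la', 'ngua', 'ge'] under this heuristic
    print("char units:    ", chars)
    print("syllable units:", sylls)
    # A word has fewer syllables than characters, so the syllable-aware model
    # processes shorter unit sequences -- one intuition behind the parameter
    # and training-speed savings reported in the abstract.
    print("char-composed vector:", compose(chars, make_embeddings(chars)))
    print("syll-composed vector:", compose(sylls, make_embeddings(sylls)))
```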


