Revisiting the Hierarchical Multiscale LSTM

07/10/2018
by   Akos Kadar, et al.
0

Hierarchical Multiscale LSTM (Chung et al., 2016a) is a state-of-the-art language model that learns interpretable structure from character-level input. Such models can provide fertile ground for (cognitive) computational linguistics studies. However, the high complexity of the architecture, training procedure and implementations might hinder its applicability. We provide a detailed reproduction and ablation study of the architecture, shedding light on some of the potential caveats of re-purposing complex deep-learning architectures. We further show that simplifying certain aspects of the architecture can in fact improve its performance. We also investigate the linguistic units (segments) learned by various levels of the model, and argue that their quality does not correlate with the overall performance of the model on language modeling.

READ FULL TEXT
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 2

page 3

page 4

03/14/2020

Finnish Language Modeling with Deep Transformer Models

Transformers have recently taken the center stage in language modeling a...
04/19/2021

When FastText Pays Attention: Efficient Estimation of Word Representations using Constrained Positional Weighting

Since the seminal work of Mikolov et al. (2013a) and Bojanowski et al. (...
09/26/2018

Language Modeling Teaches You More Syntax than Translation Does: Lessons Learned Through Auxiliary Task Analysis

Recent work using auxiliary prediction task classifiers to investigate t...
08/27/2018

Pyramidal Recurrent Unit for Language Modeling

LSTMs are powerful tools for modeling contextual information, as evidenc...
01/05/2020

Automatic Business Process Structure Discovery using Ordered Neurons LSTM: A Preliminary Study

Automatic process discovery from textual process documentations is highl...
09/01/2018

Finding the Answers with Definition Models

Inspired by a previous attempt to answer crossword questions using neura...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.