Revisiting the Hierarchical Multiscale LSTM

07/10/2018
by   Akos Kadar, et al.
0

Hierarchical Multiscale LSTM (Chung et al., 2016a) is a state-of-the-art language model that learns interpretable structure from character-level input. Such models can provide fertile ground for (cognitive) computational linguistics studies. However, the high complexity of the architecture, training procedure and implementations might hinder its applicability. We provide a detailed reproduction and ablation study of the architecture, shedding light on some of the potential caveats of re-purposing complex deep-learning architectures. We further show that simplifying certain aspects of the architecture can in fact improve its performance. We also investigate the linguistic units (segments) learned by various levels of the model, and argue that their quality does not correlate with the overall performance of the model on language modeling.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/14/2020

Finnish Language Modeling with Deep Transformer Models

Transformers have recently taken the center stage in language modeling a...
research
12/06/2021

Tele-EvalNet: A Low-cost, Teleconsultation System for Home based Rehabilitation of Stroke Survivors using Multiscale CNN-LSTM Architecture

Technology has an important role to play in the field of Rehabilitation,...
research
04/19/2021

When FastText Pays Attention: Efficient Estimation of Word Representations using Constrained Positional Weighting

Since the seminal work of Mikolov et al. (2013a) and Bojanowski et al. (...
research
09/26/2018

Language Modeling Teaches You More Syntax than Translation Does: Lessons Learned Through Auxiliary Task Analysis

Recent work using auxiliary prediction task classifiers to investigate t...
research
08/27/2018

Pyramidal Recurrent Unit for Language Modeling

LSTMs are powerful tools for modeling contextual information, as evidenc...
research
09/02/2020

Application of LSTM architectures for next frame forecasting in Sentinel-1 images time series

L'analyse prédictive permet d'estimer les tendances des évènements futur...
research
01/05/2020

Automatic Business Process Structure Discovery using Ordered Neurons LSTM: A Preliminary Study

Automatic process discovery from textual process documentations is highl...

Please sign up or login with your details

Forgot password? Click here to reset