An Exploration of Dropout with RNNs for Natural Language Inference

10/22/2018
by Amit Gajbhiye, et al.

Dropout is a crucial regularization technique for Recurrent Neural Network (RNN) models of Natural Language Inference (NLI). However, dropout has not been systematically evaluated for its effectiveness at different layers and dropout rates in NLI models. In this paper, we propose a novel RNN model for NLI and empirically evaluate the effect of applying dropout at different layers in the model. We also investigate the impact of varying dropout rates at these layers. Our empirical evaluation on a large dataset (Stanford Natural Language Inference, SNLI) and a small dataset (SciTail) suggests that dropout at each feed-forward connection severely degrades model accuracy as the dropout rate increases. We also show that regularizing the embedding layer is effective for SNLI, whereas regularizing the recurrent layer improves accuracy for SciTail. Our model achieved an accuracy of 86.14% on SNLI.
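To make the layer-wise evaluation concrete, here is a minimal NumPy sketch of inverted dropout with separate rates for the embedding, recurrent, and feed-forward placements the abstract describes. The function name, layer names, and rate values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def dropout(x, rate, training=True, rng=None):
    """Inverted dropout: zero each unit with probability `rate` and
    scale survivors by 1/(1-rate), so no rescaling is needed at test time."""
    if not training or rate == 0.0:
        return x
    rng = rng if rng is not None else np.random.default_rng(0)
    mask = rng.random(x.shape) >= rate  # keep a unit with probability 1-rate
    return x * mask / (1.0 - rate)

# Illustrative placements (rates are hypothetical, chosen per layer):
embedded = np.ones((5, 8))            # embedding-layer output
hidden = np.ones((5, 16))             # recurrent-layer output
e = dropout(embedded, rate=0.2)       # regularize the embedding layer
h = dropout(hidden, rate=0.1)         # regularize the recurrent layer
ff_in = dropout(np.concatenate([e, h], axis=1), rate=0.5)  # feed-forward connection
```

At inference (`training=False`) the input passes through unchanged, which is the point of the inverted-dropout scaling during training.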


