Better Long-Range Dependency By Bootstrapping A Mutual Information Regularizer

05/28/2019
by   Yanshuai Cao, et al.

In this work, we develop a novel regularizer to improve the learning of long-range dependency in sequence data. Applied to language modelling, our regularizer expresses the inductive bias that sequence variables should have high mutual information, even though the model might not see abundant observations of complex long-range dependency. We show how the "next sentence prediction (classification)" heuristic can be derived in a principled way from our mutual information estimation framework and further extended to maximize the mutual information of sequence variables. The proposed approach is not only effective at increasing the mutual information of segments under the learned model but, more importantly, also leads to higher likelihood on holdout data and improved generation quality.
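
As a point of reference for the quantity the abstract refers to, the sketch below computes the standard mutual information I(X;Y) = sum over x, y of p(x,y) log[p(x,y) / (p(x) p(y))] for a small discrete joint table. It is only a toy illustration of the definition, not the estimator or regularizer proposed in the paper; the function name `mutual_information` and the two 2x2 tables are assumptions made for this example, with X and Y standing in for two adjacent text segments.

```python
import math


def mutual_information(joint):
    """Return I(X;Y) in nats for a discrete joint table (rows: x, columns: y).

    Hypothetical helper for illustration only; it evaluates the textbook
    definition exactly and is not the paper's estimator.
    """
    total = sum(sum(row) for row in joint)
    probs = [[value / total for value in row] for row in joint]  # normalize
    px = [sum(row) for row in probs]            # marginal over x
    py = [sum(col) for col in zip(*probs)]      # marginal over y
    mi = 0.0
    for i, row in enumerate(probs):
        for j, p in enumerate(row):
            if p > 0:  # zero-probability cells contribute nothing
                mi += p * math.log(p / (px[i] * py[j]))
    return mi


# Two toy cases: segments that are independent, and segments that
# fully determine each other.
independent = [[0.25, 0.25], [0.25, 0.25]]
dependent = [[0.5, 0.0], [0.0, 0.5]]

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
```

In this toy setting the independent table gives zero mutual information, while the fully dependent one gives log 2 nats, the maximum for two binary variables. A model whose segments behave like the first table has learned no dependency between them, which is the failure mode the inductive bias in the abstract is meant to counter.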


