Decoupled Rationalization with Asymmetric Learning Rates: A Flexible Lipschitz Restraint

05/23/2023
by   Wei Liu, et al.

A self-explaining rationalization model is generally constructed as a cooperative game in which a generator selects the most human-intelligible pieces of the input text as rationales, and a predictor then makes predictions based only on the selected rationales. However, such a cooperative game may suffer from the degeneration problem: the predictor overfits to the uninformative pieces produced by a not yet well-trained generator and, in turn, leads the generator to converge to a sub-optimal model that tends to select senseless pieces. In this paper, we theoretically bridge degeneration with the predictor's Lipschitz continuity. We then propose a simple but effective method named DR, which naturally and flexibly restrains the Lipschitz constant of the predictor, to address the degeneration problem. The main idea of DR is to decouple the generator and the predictor and assign them asymmetric learning rates. A series of experiments conducted on two widely used benchmarks verifies the effectiveness of the proposed method. Code: https://github.com/jugechengzi/Rationalization-DR.
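
The idea of asymmetric learning rates can be illustrated with a minimal toy sketch. It is not the authors' implementation: the scalar "generator" weight `g`, the scalar "predictor" weight `p`, the data, and the function name `train` are all hypothetical, and giving the predictor the smaller learning rate is an assumption, since the abstract does not state the direction of the asymmetry. The toy relies on one fact: for a linear map f(z) = p * z, the Lipschitz constant is |p|, so the final |p| can be compared across learning-rate settings.

```python
def train(lr_generator, lr_predictor, steps=2000):
    """Gradient descent on a toy two-player model: prediction = p * (g * x).

    g stands in for the generator and p for the predictor (both hypothetical
    scalars). Each player is updated with its own learning rate.
    """
    xs = [0.5, 1.0, 1.5]
    ys = [2.0 * x for x in xs]      # target: the product p * g should reach 2
    g, p = 0.1, 0.1                 # identical small initialization
    n = len(xs)
    for _ in range(steps):
        grad_g = 0.0
        grad_p = 0.0
        for x, y in zip(xs, ys):
            err = p * g * x - y     # residual of the loss 0.5 * err**2
            grad_g += err * p * x / n
            grad_p += err * g * x / n
        # Decoupled updates: same loss, separate step sizes per player.
        g -= lr_generator * grad_g
        p -= lr_predictor * grad_p
    return g, p


# Symmetric baseline versus an asymmetric setting (slower predictor, assumed).
g_sym, p_sym = train(lr_generator=0.05, lr_predictor=0.05)
g_asym, p_asym = train(lr_generator=0.05, lr_predictor=0.005)

print("symmetric : |p| =", abs(p_sym), " p*g =", p_sym * g_sym)
print("asymmetric: |p| =", abs(p_asym), " p*g =", p_asym * g_asym)
```

In this toy, both settings fit the target product, but the slower predictor ends with a smaller |p|, that is, a smaller Lipschitz constant. In a deep-learning framework the same decoupling would typically be expressed by giving each module its own optimizer or parameter group with its own learning rate.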
