Improving Robustness and Generality of NLP Models Using Disentangled Representations

09/21/2020
by Jiawei Wu, et al.

Supervised neural networks, which first map an input x to a single representation z and then map z to an output label y, have achieved remarkable success in a wide range of natural language processing (NLP) tasks. Despite this success, neural models lack both robustness and generality: small perturbations to the input can produce completely different outputs, and the performance of a model trained on one domain drops drastically when tested on another. In this paper, we present methods to improve the robustness and generality of NLP models from the standpoint of disentangled representation learning. Instead of mapping x to a single representation z, the proposed strategy maps x to a set of representations {z_1, z_2, ..., z_K} while forcing them to be disentangled. These representations are then mapped to separate logits, the ensemble of which is used to make the final prediction y. We propose different ways to incorporate this idea into currently widely used models, including adding an L2 regularizer on the z_k or adding Total Correlation (TC) under the framework of the variational information bottleneck (VIB). We show that models trained with the proposed criteria provide better robustness and domain adaptation ability in a wide range of supervised learning tasks.
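
The abstract describes the architecture only at a high level, so the following is a minimal PyTorch sketch of one way to read it: an input x is encoded into K representations z_1, ..., z_K, each z_k is classified separately, the logits are ensembled by averaging, and a pairwise L2 penalty on the z_k discourages them from collapsing onto one another. All names here (DisentangledClassifier, l2_disentanglement_penalty, the choice of K, and averaging as the ensembling rule) are illustrative assumptions, not the authors' implementation; the TC/VIB variant mentioned in the abstract is not shown.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DisentangledClassifier(nn.Module):
    """Maps x to K representations {z_1, ..., z_K}, classifies each one,
    and ensembles the resulting logits (illustrative sketch only)."""

    def __init__(self, input_dim, hidden_dim, num_labels, K=4):
        super().__init__()
        self.K = K
        # One small encoder head per representation z_k.
        self.encoders = nn.ModuleList(
            [nn.Linear(input_dim, hidden_dim) for _ in range(K)]
        )
        # One classifier per representation, producing its own logits.
        self.classifiers = nn.ModuleList(
            [nn.Linear(hidden_dim, num_labels) for _ in range(K)]
        )

    def forward(self, x):
        zs = [torch.tanh(enc(x)) for enc in self.encoders]        # {z_1, ..., z_K}
        logits = [clf(z) for clf, z in zip(self.classifiers, zs)]  # per-head logits
        ensemble_logits = torch.stack(logits, dim=0).mean(dim=0)   # ensemble for prediction y
        return ensemble_logits, zs


def l2_disentanglement_penalty(zs):
    """Penalize pairwise similarity between the z_k so the representations
    stay distinct (one simple reading of "an L2 regularizer on the z's")."""
    penalty = 0.0
    for i in range(len(zs)):
        for j in range(i + 1, len(zs)):
            penalty = penalty + (zs[i] * zs[j]).sum(dim=-1).pow(2).mean()
    return penalty


# Usage: cross-entropy on the ensembled logits plus the weighted penalty.
model = DisentangledClassifier(input_dim=768, hidden_dim=128, num_labels=2)
x, y = torch.randn(8, 768), torch.randint(0, 2, (8,))
logits, zs = model(x)
loss = F.cross_entropy(logits, y) + 0.1 * l2_disentanglement_penalty(zs)
loss.backward()
```

The per-head encoders and classifiers keep the K logit sets independent, so the ensemble only benefits if the penalty actually pushes the z_k toward different views of the input; the weight on the penalty (0.1 above) is an arbitrary placeholder.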

Related research:

08/27/2021 · Evaluating the Robustness of Neural Language Models to Input Perturbations
High-performance neural language models have obtained state-of-the-art r...

06/05/2020 · DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Recent progress in pre-trained neural language models has significantly ...

02/07/2016 · Disentangled Representations in Neural Models
Representation learning is the foundation for the recent success of neur...

05/29/2019 · A Heuristic for Unsupervised Model Selection for Variational Disentangled Representation Learning
Disentangled representations have recently been shown to improve data ef...

10/05/2020 · InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective
Large-scale language models such as BERT have achieved state-of-the-art ...

01/21/2021 · Blocked and Hierarchical Disentangled Representation From Information Theory Perspective
We propose a novel and theoretical model, blocked and hierarchical varia...

11/21/2022 · Disentangled Representation Learning
Disentangled Representation Learning (DRL) aims to learn a model capable...
