Finding the Needle in a Haystack: Unsupervised Rationale Extraction from Long Text Classifiers

03/14/2023
by Kamil Bujel, et al.

Long-sequence transformers are designed to improve the representation of longer texts by language models, as well as their performance on downstream document-level tasks. However, little is understood about the quality of token-level predictions in long-form models. We investigate the performance of such architectures in the context of document classification with unsupervised rationale extraction. We find that standard soft attention methods perform significantly worse when combined with the Longformer language model. We propose a compositional soft attention architecture that applies RoBERTa sentence-wise to extract plausible rationales at the token level. This method significantly outperforms Longformer-driven baselines on sentiment classification datasets, while also exhibiting significantly lower runtimes.
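As a rough illustration of what such a compositional soft attention classifier might look like, the sketch below encodes each sentence of a document with RoBERTa, applies one soft attention layer over the tokens within each sentence and a second over the resulting sentence representations, and uses the product of the two attention distributions as a token-level rationale score. This is a minimal sketch built on PyTorch and Hugging Face Transformers under my own assumptions, not the authors' implementation; the class and variable names are illustrative only.

```python
# Minimal sketch of a compositional soft-attention classifier, assuming PyTorch
# and Hugging Face Transformers. Class and variable names are illustrative and
# not taken from the paper's released code.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class SentenceWiseRationaleClassifier(nn.Module):
    """Encodes each sentence with RoBERTa, scores tokens with soft attention,
    pools sentences into a document vector, and classifies the document.
    Token-level rationale scores fall out of the attention weights, so no
    token-level supervision is required."""

    def __init__(self, model_name: str = "roberta-base", num_labels: int = 2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.token_scorer = nn.Linear(hidden, 1)     # token-level soft attention
        self.sentence_scorer = nn.Linear(hidden, 1)  # sentence-level soft attention
        self.classifier = nn.Linear(hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        # input_ids / attention_mask: (num_sentences, seq_len) for one document,
        # i.e. the document is split into sentences and RoBERTa runs sentence-wise.
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        # Soft attention over tokens within each sentence (padding masked out).
        token_logits = self.token_scorer(hidden).squeeze(-1)
        token_logits = token_logits.masked_fill(attention_mask == 0, -1e9)
        token_attn = torch.softmax(token_logits, dim=-1)            # (S, T)
        sent_repr = torch.einsum("st,sth->sh", token_attn, hidden)  # (S, H)
        # Soft attention over sentences to build the document representation.
        sent_attn = torch.softmax(self.sentence_scorer(sent_repr).squeeze(-1), dim=0)
        doc_repr = torch.einsum("s,sh->h", sent_attn, sent_repr)    # (H,)
        logits = self.classifier(doc_repr)
        # Token-level rationale score: sentence weight times within-sentence weight.
        rationales = sent_attn.unsqueeze(-1) * token_attn
        return logits, rationales


# Example usage on a two-sentence "document".
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = SentenceWiseRationaleClassifier()
batch = tokenizer(["The plot was dull.", "But the acting was superb."],
                  padding=True, return_tensors="pt")
logits, rationales = model(batch["input_ids"], batch["attention_mask"])
```

Training such a model would use only the document-level classification loss, so the token-level rationales emerge without token annotations; how the attention distributions are kept plausible is the paper's contribution and is not captured by this sketch.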

Related research

04/15/2021
Hierarchical Learning for Generation with Long Source Sequences
One of the challenges for current sequence to sequence (seq2seq) models ...

04/15/2020
Document-level Representation Learning using Citation-informed Transformers
Representation learning is a critical ingredient for natural language pr...

09/07/2021
Sequential Attention Module for Natural Language Processing
Recently, large pre-trained neural language models have attained remarka...

04/16/2022
Unsupervised Attention-based Sentence-Level Meta-Embeddings from Contextualised Language Models
A variety of contextualised language models have been proposed in the NL...

03/21/2022
Language modeling via stochastic processes
Modern language models can generate high-quality short texts. However, t...

11/14/2018
Jointly Learning to Label Sentences and Tokens
Learning to construct text representations in end-to-end systems can be ...

09/02/2022
Extend and Explain: Interpreting Very Long Language Models
While Transformer language models (LMs) are state-of-the-art for informa...
