Extend and Explain: Interpreting Very Long Language Models

09/02/2022
by Joel Stremmel, et al.

While Transformer language models (LMs) are state-of-the-art for information extraction, long text introduces computational challenges that force suboptimal preprocessing steps or alternative model architectures. Sparse-attention LMs can represent longer sequences, overcoming these performance hurdles. However, it remains unclear how to explain predictions from these models, as not all tokens attend to each other in the self-attention layers, and long sequences pose computational challenges for explainability algorithms whose runtime depends on document length. These challenges are especially severe in the medical context, where documents can be very long and machine learning (ML) models must be auditable and trustworthy. We introduce a novel Masked Sampling Procedure (MSP) to identify the text blocks that contribute to a prediction, apply MSP to the task of predicting diagnoses from medical text, and validate our approach with a blind review by two clinicians. Our method identifies about 1.7x more clinically informative text blocks than the previous state-of-the-art, runs up to 100x faster, and is tractable for generating important phrase pairs. MSP is particularly well-suited to long LMs but can be applied to any text classifier. We provide a general implementation of MSP.
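For intuition, the sketch below shows one way a masked-sampling attribution loop of this flavor can be wired up. It is a minimal illustration, not the authors' released implementation: the predict_proba callable, the block_size, mask_prob, and num_samples parameters, and the scoring rule (average drop in predicted probability when a block is masked) are all assumptions made for the example.

```python
import numpy as np

def masked_sampling_attribution(text, predict_proba, block_size=10,
                                mask_prob=0.1, num_samples=1000,
                                mask_token="[MASK]", rng=None):
    """Toy masked-sampling attribution for a text classifier.

    Scores each fixed-size block of tokens by the average drop in the
    classifier's predicted probability (for one label of interest)
    when that block is masked out. Illustrative only.
    """
    rng = rng or np.random.default_rng(0)
    tokens = text.split()
    # Partition the document into contiguous, fixed-size blocks.
    blocks = [tokens[i:i + block_size]
              for i in range(0, len(tokens), block_size)]
    n = len(blocks)

    base = predict_proba(text)   # unmasked probability for the target label
    drop_sum = np.zeros(n)       # cumulative probability drop per block
    times_masked = np.zeros(n)   # how often each block was masked

    for _ in range(num_samples):
        masked = rng.random(n) < mask_prob   # mask each block independently
        if not masked.any():
            continue
        parts = []
        for j, block in enumerate(blocks):
            parts.extend([mask_token] * len(block) if masked[j] else block)
        p = predict_proba(" ".join(parts))   # one forward pass per sample
        # Credit the observed probability drop to every masked block.
        drop_sum[masked] += base - p
        times_masked[masked] += 1

    # Average drop when masked; larger values mean the block mattered more.
    return np.divide(drop_sum, times_masked,
                     out=np.zeros(n), where=times_masked > 0)
```

Note that each sampled mask costs one forward pass no matter how many blocks it covers, so total work is governed by num_samples rather than by a per-token loop over the document, which is the property that keeps this style of procedure tractable for very long inputs.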


Related research

09/19/2021 - Do Long-Range Language Models Actually Use Long-Range Context?
Language models are generally trained on short, truncated input sequence...

09/11/2023 - Long-Range Transformer Architectures for Document Understanding
Since their release, Transformers have revolutionized many fields from N...

06/01/2023 - Faster Causal Attention Over Large Sequences Through Sparse Flash Attention
Transformer-based language models have found many diverse applications r...

11/18/2021 - The Power of Selecting Key Blocks with Local Pre-ranking for Long Document Information Retrieval
On a wide range of natural language processing and information retrieval...

02/28/2023 - A Survey on Long Text Modeling with Transformers
Modeling long texts has been an essential technique in the field of natu...

03/14/2023 - Finding the Needle in a Haystack: Unsupervised Rationale Extraction from Long Text Classifiers
Long-sequence transformers are designed to improve the representation of...

06/08/2023 - Interpretable Medical Diagnostics with Structured Data Extraction by Large Language Models
Tabular data is often hidden in text, particularly in medical diagnostic...
