Context-Sensitive Visualization of Deep Learning Natural Language Processing Models

05/25/2021
by Andrew Dunn, et al.

The introduction of Transformer neural networks has changed the landscape of Natural Language Processing (NLP) in recent years. So far, no visualization system has managed to examine all the facets of Transformers, which motivated the current work. We propose a new context-sensitive visualization method for NLP Transformers that leverages existing NLP tools to find the most significant groups of tokens (words) with the greatest effect on the output, thus preserving some context from the original text. First, we use a sentence-level dependency parser to highlight promising word groups. The dependency parser creates a tree of relationships between the words in the sentence. Next, we systematically remove adjacent and non-adjacent tuples of n tokens from the input text, producing several new texts with those tokens missing. The resulting texts are then passed to a pre-trained BERT model. The classification output of each is compared with that of the full text, and the difference in activation strength is recorded. The modified texts that produce the largest difference in the target classification output neuron are selected, and the combinations of removed words are then considered the most influential on the model's output. Finally, the most influential word combinations are visualized in a heatmap.
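To make the token-removal probing step concrete, the sketch below scores every n-tuple of words by how much its removal reduces the target class logit of a BERT sequence classifier from the Hugging Face transformers library. This is a minimal illustration under stated assumptions, not the authors' exact implementation: the checkpoint name is a placeholder for any fine-tuned BERT classifier, words are split on whitespace rather than filtered through the dependency parser described above, and the choice of n, the scoring by logit difference, and the omission of the heatmap step are simplifications.

# Minimal sketch of the token-removal probing step (placeholder checkpoint; simplified scoring).
from itertools import combinations

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "path/to/fine-tuned-bert-classifier"  # placeholder: any BERT sequence-classification checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

def target_activation(text: str, target_class: int) -> float:
    """Return the raw output (logit) of the target classification neuron for a text."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return logits[0, target_class].item()

def most_influential_tuples(text: str, target_class: int, n: int = 2, top_k: int = 5):
    """Remove every n-tuple of words and rank tuples by the drop in the target logit."""
    words = text.split()
    full_score = target_activation(text, target_class)
    scored = []
    for idx_tuple in combinations(range(len(words)), n):
        kept = [w for i, w in enumerate(words) if i not in idx_tuple]
        reduced_score = target_activation(" ".join(kept), target_class)
        removed_words = tuple(words[i] for i in idx_tuple)
        scored.append((full_score - reduced_score, removed_words))
    # The tuples whose removal causes the largest drop are treated as most influential.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

if __name__ == "__main__":
    sentence = "The movie was surprisingly good despite its slow start"
    for delta, removed in most_influential_tuples(sentence, target_class=1, n=2):
        print(f"removed {removed}: activation drop {delta:.3f}")

In a full pipeline, the candidate index tuples would come from the dependency-parse word groups rather than from all combinations, and the top-scoring tuples would feed the heatmap visualization.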

