Concept backpropagation: An Explainable AI approach for visualising learned concepts in neural network models

07/24/2023
by   Patrik Hammersborg, et al.

Neural network models are widely used in a variety of domains, often as black-box solutions, since they are not directly interpretable by humans. The field of explainable artificial intelligence aims to develop explanation methods that address this challenge, and several approaches have emerged in recent years, including methods for investigating what kind of knowledge these models internalise during training. Among these, concept detection investigates which concepts neural network models learn to represent in order to complete their tasks. In this work, we present an extension of concept detection, named concept backpropagation, which provides a way of analysing how the information representing a given concept is internalised in a given neural network model. In this approach, the model input is perturbed, guided by a trained concept probe for the model in question, such that the expression of the concept of interest is maximised. This allows the detected concept to be visualised directly in the input space of the model, which in turn makes it possible to see what information the model depends on to represent the probed concept. We present results for this method applied to a variety of input modalities, and discuss how it can be used to visualise what information trained concept probes use, and the degree to which the representation of the probed concept is entangled within the neural network model itself.
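The abstract describes the core procedure: gradient-based perturbation of the input so that a trained concept probe's score for the concept of interest is maximised. The sketch below illustrates this idea under assumptions not stated in the abstract: `model` is taken to return the intermediate activations the probe was trained on, `probe` to return a scalar concept score, and a simple L2 penalty keeps the perturbed input close to the original. It is a minimal illustration, not the authors' exact implementation.

```python
# Minimal sketch of concept backpropagation: perturb the input by gradient
# ascent so that a trained concept probe's output is maximised.
# `model`, `probe`, and the regularisation term are illustrative assumptions.
import torch

def concept_backprop(x, model, probe, steps=200, lr=0.05, reg=1e-3):
    """Return a perturbed copy of `x` that maximises the probe's concept score.

    `model(x)` is assumed to yield the intermediate activations the probe was
    trained on; `probe(activations)` is assumed to return a scalar score.
    """
    x_pert = x.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([x_pert], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        activations = model(x_pert)
        concept_score = probe(activations)
        # Maximise the concept score while keeping the perturbation small,
        # so the visualisation stays close to the original input.
        loss = -concept_score + reg * (x_pert - x).pow(2).sum()
        loss.backward()
        optimizer.step()
    return x_pert.detach()

# The difference (x_pert - x) can then be visualised in the model's input
# space to see which input information the probe relies on for the concept.
```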


