A Disentangling Invertible Interpretation Network for Explaining Latent Representations

04/27/2020
by Patrick Esser, et al.

Neural networks have greatly boosted performance in computer vision by learning powerful representations of input data. The drawback of end-to-end training for maximal overall performance is that it yields black-box models whose hidden representations lack interpretability: since distributed coding is optimal for latent layers to improve their robustness, attributing meaning to parts of a hidden feature vector or to individual neurons is hindered. We formulate interpretation as a translation of hidden representations onto semantic concepts that are comprehensible to the user. The mapping between both domains has to be bijective so that semantic modifications in the target domain correctly alter the original representation. The proposed invertible interpretation network can be transparently applied on top of existing architectures with no need to modify or retrain them. Consequently, we translate an original representation into an equivalent yet interpretable one and back, without affecting the expressiveness and performance of the original. The invertible interpretation network disentangles the hidden representation into separate, semantically meaningful concepts. Moreover, we present an efficient approach to define semantic concepts by sketching only two images, as well as an unsupervised strategy. Experimental evaluation demonstrates the wide applicability of our approach to the interpretation of existing classification and image generation networks, as well as to semantically guided image manipulation.
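To make the idea concrete, below is a minimal sketch, not the authors' implementation, of how an invertible network built from affine coupling layers could sit on top of a frozen feature extractor: the forward pass translates the hidden representation into separate factors, and because every layer is bijective, an edit made to one factor can be mapped exactly back into the original feature space. All names (CouplingLayer, InterpretationINN), the factor sizes, and the dimensions are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of an invertible "interpretation" network: affine coupling layers
# mapping a frozen network's hidden features to a factored, interpretable code.
# Class names and dimensions are hypothetical.
import torch
import torch.nn as nn


class CouplingLayer(nn.Module):
    """Affine coupling: one half of the vector predicts a scale/shift for the other half."""

    def __init__(self, dim, hidden=256, flip=False):
        super().__init__()
        self.flip = flip
        half = dim // 2  # dim assumed even
        self.net = nn.Sequential(
            nn.Linear(half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * half),
        )

    def forward(self, z):
        z1, z2 = z.chunk(2, dim=1)
        if self.flip:
            z1, z2 = z2, z1
        log_s, t = self.net(z1).chunk(2, dim=1)
        log_s = torch.tanh(log_s)              # bounded scales for numerical stability
        z2 = z2 * torch.exp(log_s) + t
        if self.flip:
            z1, z2 = z2, z1
        return torch.cat([z1, z2], dim=1)

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=1)
        if self.flip:
            y1, y2 = y2, y1
        log_s, t = self.net(y1).chunk(2, dim=1)
        log_s = torch.tanh(log_s)
        y2 = (y2 - t) * torch.exp(-log_s)      # exact inverse of the affine transform
        if self.flip:
            y1, y2 = y2, y1
        return torch.cat([y1, y2], dim=1)


class InterpretationINN(nn.Module):
    """Bijective map between a hidden representation z and factors (F_1, ..., F_K)."""

    def __init__(self, dim, n_layers=4, factor_sizes=(64, 64, 128)):
        super().__init__()
        assert sum(factor_sizes) == dim, "factors must partition the feature dimension"
        self.factor_sizes = list(factor_sizes)
        self.layers = nn.ModuleList(
            CouplingLayer(dim, flip=(i % 2 == 1)) for i in range(n_layers)
        )

    def forward(self, z):
        for layer in self.layers:
            z = layer(z)
        return z.split(self.factor_sizes, dim=1)   # interpretable factors

    def inverse(self, factors):
        z = torch.cat(list(factors), dim=1)
        for layer in reversed(self.layers):
            z = layer.inverse(z)
        return z                                   # back to the original feature space
```

A hypothetical round trip, using random tensors as a stand-in for a frozen backbone's 256-dimensional features:

```python
features = torch.randn(8, 256)                        # pretend (B, 256) hidden representation
inn = InterpretationINN(dim=256, factor_sizes=(64, 64, 128))
f_a, f_b, f_rest = inn(features)                      # disentangled factors (names illustrative)
edited = inn.inverse([f_a + 0.5, f_b, f_rest])        # edit one factor, map back to feature space
print(torch.allclose(inn.inverse(inn(features)), features, atol=1e-4))  # bijectivity check
```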

Related research

07/10/2023 · Hierarchical Semantic Tree Concept Whitening for Interpretable Image Classification
With the popularity of deep neural networks (DNNs), model interpretabili...

08/04/2020 · Making Sense of CNNs: Interpreting Deep Representations & Their Invariances with INNs
To tackle increasingly complex tasks, it has become an essential ability...

01/08/2018 · Deep Supervision with Intermediate Concepts
Recent data-driven approaches to scene interpretation predominantly pose...

05/28/2019 · EDUCE: Explaining model Decisions through Unsupervised Concepts Extraction
With the advent of deep neural networks, some research focuses towards u...

04/10/2022 · Explaining Deep Convolutional Neural Networks via Latent Visual-Semantic Filter Attention
Interpretability is an important property for visual models as it helps ...

07/25/2019 · Interpretability Beyond Classification Output: Semantic Bottleneck Networks
Today's deep learning systems deliver high performance based on end-to-e...

01/11/2021 · Learning Semantically Meaningful Features for Interpretable Classifications
Learning semantically meaningful features is important for Deep Neural N...
