Making Sense of CNNs: Interpreting Deep Representations & Their Invariances with INNs

08/04/2020
by Robin Rombach, et al.

To tackle increasingly complex tasks, neural networks must learn abstract representations. These task-specific representations and, in particular, the invariances they capture turn neural networks into black-box models that lack interpretability. To open such a black box, it is therefore crucial to uncover the different semantic concepts a model has learned, as well as those it has learned to be invariant to. We present an approach based on invertible neural networks (INNs) that (i) recovers the task-specific, learned invariances by disentangling the remaining factors of variation in the data and (ii) invertibly transforms these recovered invariances, combined with the model representation, into an equally expressive representation with accessible semantic concepts. As a consequence, neural network representations become understandable by providing the means to (i) expose their semantic meaning, (ii) semantically modify a representation, and (iii) visualize individual learned semantic concepts and invariances. Our invertible approach significantly extends the ability to understand black-box models by enabling post-hoc interpretations of state-of-the-art networks without compromising their performance. Our implementation is available at https://compvis.github.io/invariances/ .
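
The core mechanism is worth making concrete: a bijective map t takes the frozen model's representation z together with the recovered invariances v and maps the pair to semantic factors e = t(z, v), so e can be inspected or edited and then decoded back without any information loss. Below is a minimal, self-contained PyTorch sketch of this idea using RealNVP-style affine coupling blocks. The class names, layer sizes, and the random tensors standing in for z and v are illustrative assumptions, not the authors' implementation; their actual architecture and training procedure are in the linked repository.

    # Illustrative sketch only: an invertible map from (z, v) to semantic
    # factors e, built from affine coupling blocks (RealNVP-style).
    import torch
    import torch.nn as nn

    class CouplingBlock(nn.Module):
        """Affine coupling layer: invertible by construction, since one half
        of the input is only rescaled/shifted conditioned on the other half."""
        def __init__(self, dim):
            super().__init__()
            self.half = dim // 2
            self.net = nn.Sequential(
                nn.Linear(self.half, 128),
                nn.ReLU(),
                nn.Linear(128, 2 * (dim - self.half)),
            )

        def forward(self, x):
            x1, x2 = x[:, :self.half], x[:, self.half:]
            s, t = self.net(x1).chunk(2, dim=1)
            # tanh bounds the log-scale for numerical stability
            return torch.cat([x1, x2 * torch.exp(torch.tanh(s)) + t], dim=1)

        def inverse(self, y):
            y1, y2 = y[:, :self.half], y[:, self.half:]
            s, t = self.net(y1).chunk(2, dim=1)
            return torch.cat([y1, (y2 - t) * torch.exp(-torch.tanh(s))], dim=1)

    class SemanticINN(nn.Module):
        """Stack of coupling blocks; fixed random permutations between blocks
        ensure every dimension eventually gets transformed."""
        def __init__(self, dim, n_blocks=4):
            super().__init__()
            self.blocks = nn.ModuleList([CouplingBlock(dim) for _ in range(n_blocks)])
            self.perms = [torch.randperm(dim) for _ in range(n_blocks)]

        def forward(self, x):
            for block, perm in zip(self.blocks, self.perms):
                x = block(x)[:, perm]
            return x

        def inverse(self, e):
            for block, perm in zip(reversed(self.blocks), reversed(self.perms)):
                e = block.inverse(e[:, torch.argsort(perm)])  # undo permutation
            return e

    torch.manual_seed(0)
    z = torch.randn(8, 512)   # stand-in for a frozen CNN's representation z = f(x)
    v = torch.randn(8, 64)    # stand-in for recovered invariances (e.g. prior samples)
    inn = SemanticINN(512 + 64)
    e = inn(torch.cat([z, v], dim=1))   # (z, v) -> accessible semantic factors e
    recon = inn.inverse(e)              # exact inverse: nothing is lost
    assert torch.allclose(recon, torch.cat([z, v], dim=1), atol=1e-4)

Because each coupling block has a closed-form inverse, the semantic representation e is exactly as expressive as the pair (z, v); this is what allows post-hoc interpretation and semantic editing without retraining or approximating the original model.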
