Input Switched Affine Networks: An RNN Architecture Designed for Interpretability

by Jakob N. Foerster et al.

There exist many problem domains where the interpretability of neural network models is essential for deployment. Here we introduce a recurrent architecture composed of input-switched affine transformations - in other words, an RNN without any explicit nonlinearities but with input-dependent recurrent weights. This simple form allows the RNN to be analyzed via straightforward linear methods: we can exactly characterize the linear contribution of each input to the model's predictions; we can use a change of basis to disentangle input, output, and computational hidden-unit subspaces; and we can fully reverse-engineer the architecture's solution to a simple task. Despite this ease of interpretation, the input-switched affine network achieves reasonable performance on a text-modeling task and allows greater computational efficiency than networks with standard nonlinearities.
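The core idea is simple enough to sketch directly. Below is a minimal illustration (not the authors' code; all sizes and helper names are assumptions) of an input-switched affine update, h_t = W[x_t] h_{t-1} + b[x_t], together with the exact per-input contribution decomposition that the affine form makes possible:

```python
# Minimal sketch of an input-switched affine network (ISAN).
# Each input symbol selects its own affine map, so the state evolves as
#   h_t = W[x_t] @ h_{t-1} + b[x_t]
# with no explicit nonlinearity. Sizes and names here are illustrative.
import numpy as np

rng = np.random.default_rng(0)
vocab, hidden = 4, 8                                      # toy sizes
W = rng.normal(scale=0.3, size=(vocab, hidden, hidden))   # one matrix per symbol
b = rng.normal(scale=0.3, size=(vocab, hidden))           # one bias per symbol

def run(seq):
    """Run the ISAN over a sequence of symbol ids, starting from h_0 = 0."""
    h = np.zeros(hidden)
    for x in seq:
        h = W[x] @ h + b[x]
    return h

def contributions(seq):
    """Exact linear contribution of each input to the final state.

    Because the updates are affine (and h_0 = 0), the final state is the sum,
    over time steps t, of b[x_t] propagated through all later weight matrices.
    """
    parts = []
    for t, x in enumerate(seq):
        v = b[x]
        for u in seq[t + 1:]:
            v = W[u] @ v
        parts.append(v)
    return parts

seq = [1, 3, 0, 2]
assert np.allclose(run(seq), sum(contributions(seq)))
```

The final assertion is the interpretability property from the abstract in miniature: the final hidden state decomposes exactly into one additive term per input symbol, something a network with a pointwise nonlinearity does not admit.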



Neural Speed Reading via Skim-RNN

Inspired by the principles of speed reading, we introduce Skim-RNN, a re...

Input-to-Output Gate to Improve RNN Language Models

This paper proposes a reinforcing method that refines the output layers ...

Training Input-Output Recurrent Neural Networks through Spectral Methods

We consider the problem of training input-output recurrent neural networ...

Attention-based Conditioning Methods for External Knowledge Integration

In this paper, we present a novel approach for incorporating external kn...

Framing RNN as a kernel method: A neural ODE approach

Building on the interpretation of a recurrent neural network (RNN) as a ...

Survey on the attention based RNN model and its applications in computer vision

The recurrent neural networks (RNN) can be used to solve the sequence to...

On Attribution of Recurrent Neural Network Predictions via Additive Decomposition

RNN models have achieved the state-of-the-art performance in a wide rang...