Neural Nets via Forward State Transformation and Backward Loss Transformation

03/25/2018 ∙ by Bart Jacobs, et al. ∙ Radboud Universiteit

This article studies (multilayer perceptron) neural networks with an emphasis on the transformations involved, both forward and backward, in order to develop a semantical/logical perspective that is in line with standard program semantics. The common two-pass neural network training algorithms make this viewpoint particularly fitting. In the forward direction, neural networks act as state transformers. In the reverse direction, however, neural networks change losses of outputs to losses of inputs, thereby acting like a (real-valued) predicate transformer. In this way, backpropagation is functorial by construction, as shown in other recent work. We illustrate this perspective by training a simple instance of a neural network.




1 Introduction

Though interest in artificial intelligence and machine learning has always been high, the public’s exposure to successful applications has markedly increased in recent years. From consumer-oriented applications like recommendation engines, speech and face recognition, and text prediction to prominent examples of superhuman performance (DeepMind’s AlphaGo, IBM’s Watson), the list of impressive machine learning results continues to grow.

Though the understandable excitement around the expanding catalog of successful applications lends a kind of mystique, neural networks and the algorithms which train them are, at their core, a special kind of computer program. One perspective on programs that is relevant in this domain is that of so-called state-and-effect triangles, which emphasize the dual nature of programs as both state and predicate transformers. This framework originated in quantum computing, but has a wide variety of applications, including deterministic and probabilistic computations [Jacobs17b].

The common two-pass training scheme in neural networks makes their dual role particularly evident. Operating in the “forward direction”, neural networks are like a function: given an input signal they behave like (a mathematical model of) a brain to produce an output signal. This is a form of state transformation. In the “backwards direction”, however, the derivative of a loss function with respect to the output of the network is backpropagated [Rumelhart86] to the derivative of the loss function with respect to the inputs to the network. This is a kind of predicate transformation, taking a real-valued predicate about the loss at the output and producing a real-valued predicate about the source of loss at the input. The main novel perspective offered by this paper is the use of such state-and-effect ‘triangles’ for neural networks. We expect that more formal approaches of this kind can be of use in trends towards explainable AI, where the goal is to extend automated decisions/classifications with human-understandable explanations.
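The two directions can be illustrated with a minimal sketch. It is not the construction developed in this paper: the one-input, one-output "network", the tanh activation, the squared-error loss, and all function names are assumptions chosen only to keep the example small. The forward function maps an input signal to an output signal, while the backward function maps the derivative of the loss at the output to the derivative of the loss at the input.

```python
import math

def forward(w, b, x):
    """State transformation: input signal x -> output signal y.
    (tanh is an assumed activation, used only for illustration.)"""
    return math.tanh(w * x + b)

def backward(w, b, x, dloss_dy):
    """Loss transformation: dLoss/dy at the output -> dLoss/dx at the input,
    obtained with the chain rule through tanh(w * x + b)."""
    y = forward(w, b, x)
    return dloss_dy * (1.0 - y * y) * w

def loss(y, target):
    """Assumed squared-error loss on the output."""
    return 0.5 * (y - target) ** 2

def dloss(y, target):
    """Derivative of the loss with respect to the output y."""
    return y - target

# Hypothetical parameter, input, and target values.
w, b, x, target = 0.7, -0.2, 1.3, 0.5
y = forward(w, b, x)                           # forward pass
grad_x = backward(w, b, x, dloss(y, target))   # backward pass
```

Note that the backward pass runs in the opposite direction to the forward pass: it starts from information about the output and produces information about the input.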

In recent years, it has become apparent that the architecture of a neural network is very important for its accuracy and trainability in particular problem domains [Goodfellow16]. This has resulted in a proliferation of specialized architectures, each adapted to its application. Our goal here is not to express the wide variety of special neural networks in a single framework, but rather to describe neural networks generally as an instance of this duality between state and predicate transformers. Therefore, we shall work with a simple, suitably generic neural network type called the multilayer perceptron (MLP).

We see this paper as one of several recent steps towards the application of modern semantical and logical techniques to neural networks, following for instance [FongST17, GhicaMCDR18].

Outline. In this paper, we begin by describing MLPs, the layers they are composed of, and their forward semantics as a state transformation (Section 2). In Section 3, we give the corresponding backwards transformation on loss functions and use that to formulate backpropagation in Section 4. Finally, in Section 5, we discuss the compositional nature of backpropagation by casting it as a functor, and compare our work in particular to [FongST17].

2 Forward state transformation

Much like ordinary programs, neural networks are often subdivided into functional units which can then be composed both in sequence and in parallel. These subnetworks are usually called layers, and the sequential composition of several layers is by definition a “deep” network¹. There are a number of common layer types, and a neural network can often be described by naming the layer types and the way these layers are composed.

¹ In contrast, the “width” of a layer typically refers to the number of input and output units, which can be thought of as the repeated parallel composition of yet another architecture.

Feedforward networks are an important class of neural networks where the composition structure of layers forms a directed acyclic graph—the layers can be put in an order so that no layer is used as the input to an earlier layer. A multilayer perceptron is a particular kind of feedforward network where all layers have the same general architecture, called a fully-connected layer, and are composed strictly in sequence. As mentioned in the introduction, the MLP is perhaps the prototypical neural network architecture, so we treat this network type as a representative example. In the sequel, we will use the phrase “neural network” to denote this particular network architecture.
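As a rough sketch of this strictly sequential composition, a fully-connected layer can be modelled as a function from a list of input values to a list of output values, and a multilayer perceptron as the composite of such functions. The sigmoid activation, the helper names `fully_connected` and `mlp`, and the concrete weights are assumptions made for illustration; they are not taken from the paper.

```python
import math
from functools import reduce

def sigmoid(z):
    """Assumed activation function, used only for illustration."""
    return 1.0 / (1.0 + math.exp(-z))

def fully_connected(weights, biases, activation=sigmoid):
    """Build one layer as a function from an input list to an output list.
    weights[j][i] is the weight on the edge from input node i to output node j."""
    def layer(xs):
        return [activation(sum(w * x for w, x in zip(row, xs)) + b)
                for row, b in zip(weights, biases)]
    return layer

def mlp(*layers):
    """Strictly sequential composition: each layer's output feeds the next layer."""
    def network(xs):
        return reduce(lambda state, layer: layer(state), layers, xs)
    return network

# Hypothetical two-layer network: 3 inputs -> 2 hidden nodes -> 1 output.
hidden = fully_connected([[0.1, -0.2, 0.3], [0.0, 0.5, -0.5]], [0.0, 0.1])
output = fully_connected([[1.0, -1.0]], [0.0])
net = mlp(hidden, output)

result = net([1.0, 2.0, 3.0])  # a list with a single output value
```

Representing layers as plain functions makes the acyclic, in-sequence structure explicit: no layer can receive the output of a later layer, because the composite is built by folding over the layers in order.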

More concretely, a layer consists of two lists of nodes with directed edges between them. For instance, a neural network with two layers may be depicted as follows.