DeepAI AI Chat
Log In Sign Up

Neural Status Registers

by   Lukas Faber, et al.

Neural networks excel at approximating functions and finding patterns in complex and challenging domains. Yet, they fail to learn simple but precise computation. Recent work addressed the ability to add, subtract, and multiply numbers but is lacking a component to drive control flow. True computer intelligence should also be able to decide when to perform what operation. In this paper, we introduce the Neural Status Register (NSR), inspired by physical Status Registers. At the heart of the NSR are arithmetic comparisons between inputs. With theoretically principled changes to physical Status Registers, the NSR allows end-to-end differentiation and learns such comparisons reliably. But the NSR also extrapolates: it generalizes to unseen data distributions. For example, the NSR trains on single digits and correctly predicts numbers that are up to 14 orders of magnitude larger. This suggests that the NSR captures the true underlying arithmetic. In follow-up experiments, we use the NSR to control the computation of a downstream arithmetic unit to learn piecewise functions. We can also learn more challenging tasks through redundancy. Finally, we use the NSR to learn an upstream convolutional neural network to compare images of MNIST digits to decide which image contains the larger digit.


page 1

page 2

page 3

page 4


Neural Power Units

Conventional Neural Networks can approximate simple arithmetic operation...

Neural Arithmetic Units

Neural networks can approximate complex functions, but they struggle to ...

Neural Arithmetic Logic Units

Neural networks can learn to represent and manipulate numerical informat...

Investigating the Limitations of Transformers with Simple Arithmetic Tasks

The ability to perform arithmetic tasks is a remarkable trait of human i...

Neural Programmer: Inducing Latent Programs with Gradient Descent

Deep neural networks have achieved impressive supervised classification ...

Visual Learning of Arithmetic Operations

A simple Neural Network model is presented for end-to-end visual learnin...