Improving the Robustness of Neural Multiplication Units with Reversible Stochasticity

11/10/2022
by Bhumika Mistry, et al.

Multilayer Perceptrons struggle to learn certain simple arithmetic tasks. Specialist neural modules for arithmetic can outperform classical architectures with gains in extrapolation, interpretability and convergence speeds, but are highly sensitive to the training range. In this paper, we show that Neural Multiplication Units (NMUs) are unable to reliably learn tasks as simple as multiplying two inputs when given different training ranges. Causes of failure are linked to inductive and input biases which encourage convergence to solutions in undesirable optima. A solution, the stochastic NMU (sNMU), is proposed to apply reversible stochasticity, encouraging avoidance of such optima whilst converging to the true solution. Empirically, we show that stochasticity provides improved robustness with the potential to improve learned representations of upstream networks for numerical and image tasks.
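
The abstract does not spell out the sNMU formulation, so the following is only a minimal sketch of what "reversible stochasticity" could look like. It assumes the NMU product form from the Neural Arithmetic Units paper, where each weight in [0, 1] gates whether an input is multiplied in. The `noisy_nmu` function, its uniform noise range, and the division step are illustrative assumptions, not the authors' confirmed method.

```python
import math
import random


def nmu(x, w):
    """NMU-style product: weight w_i in [0, 1] gates whether x_i is multiplied in."""
    return math.prod(w_i * x_i + 1.0 - w_i for x_i, w_i in zip(x, w))


def noisy_nmu(x, w, rng, low=1.0, high=5.0):
    """Hypothetical reversible-noise variant; the noise range [low, high] is an assumption."""
    # Scale every input by its own multiplicative noise sample.
    noise = [rng.uniform(low, high) for _ in x]
    noisy_inputs = [n_i * x_i for n_i, x_i in zip(noise, x)]
    # Dividing by the NMU of the noise alone cancels the noise exactly
    # whenever every weight is 0 or 1, i.e. at a discrete solution.
    return nmu(noisy_inputs, w) / nmu(noise, w)


rng = random.Random(0)
x = [2.0, 3.0, 7.0]

# Discrete weights select x[0] * x[1]; the injected noise is fully reversed.
selected = noisy_nmu(x, [1.0, 1.0, 0.0], rng)

# A non-discrete weight (0.5) leaves residual noise, so repeated calls differ.
partial = [noisy_nmu(x, [0.5, 1.0, 0.0], rng) for _ in range(3)]
```

Under these assumptions, the noise only disappears once the weights settle on 0 or 1, so weights stuck between the two keep producing a noisy output. This is one plausible reading of how stochasticity could discourage undesirable optima while leaving the true solution unaffected.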
