Learning Dynamics of Attention: Human Prior for Interpretable Machine Reasoning

05/28/2019
by   Wonjae Kim, et al.

Without relevant human priors, neural networks may learn uninterpretable features. We propose Dynamics of Attention for Focus Transition (DAFT) as a human prior for machine reasoning. DAFT is a novel method that regularizes attention-based reasoning by modelling it as a continuous dynamical system using neural ordinary differential equations. As a proof of concept, we augment a state-of-the-art visual reasoning model with DAFT. Our experiments reveal that applying DAFT yields similar performance to the original model while using fewer reasoning steps, showing that it implicitly learns to skip unnecessary steps. We also propose a new metric, Total Length of Transition (TLT), which represents the effective reasoning step size by quantifying how much a given model's focus drifts while reasoning about a question. We show that adding DAFT results in lower TLT, demonstrating that our method indeed obeys the human prior towards shorter reasoning paths in addition to producing more interpretable attention maps.
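To make the two ideas in the abstract concrete, here is a minimal, hypothetical sketch (not the authors' implementation): attention logits evolve under an ODE, integrated here with a plain Euler scheme in place of a neural ODE solver, and the proposed TLT metric is computed as the cumulative distance the attention distribution travels between consecutive reasoning steps. All function names and the toy dynamics are assumptions for illustration.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_trajectory(z0, f, t0=0.0, t1=1.0, steps=6):
    """Integrate attention logits z through dz/dt = f(z, t) with a
    simple Euler scheme; each intermediate state yields one attention
    map, i.e. one reasoning step. (Euler stands in for the adaptive
    ODE solver a neural-ODE model would use.)"""
    dt = (t1 - t0) / steps
    z, t = z0.astype(float), t0
    maps = [softmax(z)]
    for _ in range(steps):
        z = z + dt * f(z, t)   # Euler update of the logits
        t += dt
        maps.append(softmax(z))
    return maps

def total_length_of_transition(maps):
    """TLT-style metric: total L1 distance between consecutive
    attention maps, quantifying how much the focus drifts."""
    return sum(np.abs(b - a).sum() for a, b in zip(maps, maps[1:]))

# Toy dynamics: logits relax toward a fixed target focus (item 1).
target = np.array([0.0, 3.0, 0.0, 0.0])
f = lambda z, t: target - z

maps = attention_trajectory(np.zeros(4), f)
tlt = total_length_of_transition(maps)
```

Under this sketch, a model whose focus moves directly to the relevant item traces a short path (low TLT), while one that wanders between items accumulates a longer one, which is the sense in which the paper's shorter-reasoning-path prior can be measured.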


Related research

- 05/04/2022: Virtual Analog Modeling of Distortion Circuits Using Neural Ordinary Differential Equations. "Recent research in deep learning has shown that neural networks can lear..."
- 09/06/2020: OnsagerNet: Learning Stable and Interpretable Dynamics using a Generalized Onsager Principle. "We propose a systematic method for learning stable and interpretable dyn..."
- 06/05/2023: Infusing Lattice Symmetry Priors in Attention Mechanisms for Sample-Efficient Abstract Geometric Reasoning. "The Abstraction and Reasoning Corpus (ARC) (Chollet, 2019) and its most ..."
- 03/08/2018: Compositional Attention Networks for Machine Reasoning. "We present the MAC network, a novel fully differentiable neural network ..."
- 04/20/2022: Attention in Reasoning: Dataset, Analysis, and Modeling. "While attention has been an increasingly popular component in deep neura..."
- 05/31/2021: ACE-NODE: Attentive Co-Evolving Neural Ordinary Differential Equations. "Neural ordinary differential equations (NODEs) presented a new paradigm ..."
- 07/28/2020: AiR: Attention with Reasoning Capability. "While attention has been an increasingly popular component in deep neura..."
