Prototype-based interpretation of the functionality of neurons in winner-take-all neural networks

08/20/2020
by Ramin Zarei Sabzevar, et al.

Prototype-based learning (PbL) using a winner-take-all (WTA) network based on minimum Euclidean distance (ED-WTA) is an intuitive approach to multiclass classification. By constructing meaningful class centers, PbL provides better interpretability and generalization than hyperplane-based learning (HbL) methods based on maximum inner product (IP-WTA), and it can efficiently detect and reject samples that do not belong to any class. In this paper, we first prove the equivalence of IP-WTA and ED-WTA from a representational point of view. Then, we show that naively using this equivalence leads to unintuitive ED-WTA networks in which the centers are far from the data that they represent. We propose ±ED-WTA, which models each neuron with two prototypes: a positive prototype representing the samples that are modeled by this neuron, and a negative prototype representing the samples that are erroneously won by that neuron during training. We propose a novel training algorithm for the ±ED-WTA network, which switches between updating the positive and negative prototypes and is essential to the emergence of interpretable prototypes. Unexpectedly, we observed that the negative prototype of each neuron is nearly indistinguishable from the positive one. The rationale behind this observation is that the training data that are mistaken for a prototype are indeed similar to it. The main finding of this paper is this interpretation of the functionality of neurons as computing the difference between the distances to a positive and a negative prototype, which is in agreement with the BCM theory. In our experiments, we show that the proposed ±ED-WTA method constructs highly interpretable prototypes that can be successfully used for detecting outliers and adversarial examples.
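
The two representational claims in the abstract can be checked numerically with a minimal sketch. It is not the authors' construction or training algorithm. It assumes squared Euclidean distances and uses random toy prototypes, and every name in it (`ed_wta`, `ip_wta`, `pm_ed_scores`, the array shapes) is hypothetical. It relies only on the identity ||x - c||^2 = ||x||^2 - 2 c·x + ||c||^2, which shows that (a) a minimum-distance winner is also a maximum-inner-product winner with weights c and bias -||c||^2 / 2, and (b) the difference between the squared distances to a negative prototype n and a positive prototype p is the affine function 2 (p - n)·x + ||n||^2 - ||p||^2.

```python
import numpy as np

rng = np.random.default_rng(0)


def ed_wta(x, centers):
    """Winner = index of the prototype closest to x (squared Euclidean distance)."""
    return int(np.argmin(((centers - x) ** 2).sum(axis=1)))


def ip_wta(x, weights, biases):
    """Winner = index of the neuron with the largest affine response w.x + b."""
    return int(np.argmax(weights @ x + biases))


def pm_ed_scores(x, pos, neg):
    """Per-neuron score: squared distance to the negative prototype
    minus squared distance to the positive prototype (assumed form)."""
    d_pos = ((pos - x) ** 2).sum(axis=1)
    d_neg = ((neg - x) ** 2).sum(axis=1)
    return d_neg - d_pos


# Hypothetical toy setup: 4 neurons in a 3-dimensional input space.
centers = rng.normal(size=(4, 3))

# (a) ED-WTA expressed as IP-WTA: w_i = c_i, b_i = -||c_i||^2 / 2.
weights = centers
biases = -0.5 * (centers ** 2).sum(axis=1)

# (b) Positive/negative prototype pairs that are deliberately close together,
# mirroring the observation that the two prototypes look alike.
pos = rng.normal(size=(4, 3))
neg = pos + 0.1 * rng.normal(size=(4, 3))
w_pm = 2.0 * (pos - neg)
b_pm = (neg ** 2).sum(axis=1) - (pos ** 2).sum(axis=1)

samples = rng.normal(size=(100, 3))

# Both winner rules should pick the same neuron on every sample.
same_winner = all(
    ed_wta(x, centers) == ip_wta(x, weights, biases) for x in samples
)

# The distance-difference score should match the affine form up to rounding.
max_gap = max(
    np.abs(pm_ed_scores(x, pos, neg) - (w_pm @ x + b_pm)).max() for x in samples
)

print("same winner on all samples:", same_winner)
print("largest gap between score and affine form:", max_gap)
```

In this sketch the hyperplane direction of each neuron is 2 (p - n), so two prototypes that look almost identical still define a non-trivial decision boundary; only their small difference matters for the score.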


