Softmax-based Classification is k-means Clustering: Formal Proof, Consequences for Adversarial Attacks, and Improvement through Centroid Based Tailoring

01/07/2020
by   Sibylle Hess, et al.

We formally prove the connection between k-means clustering and the predictions of neural networks based on the softmax activation layer. Existing work has analyzed this connection empirically, but it has never before been derived mathematically. The softmax function partitions the transformed input space into cones, each of which encompasses a class. This is equivalent to placing a number of centroids in this transformed space at equal distance from the origin and k-means clustering the data points by proximity to these centroids. Softmax only considers which cone a data point falls in, not how far the point lies from the centroid within that cone. We formally prove that networks with a small Lipschitz modulus (which corresponds to a low susceptibility to adversarial attacks) map data points closer to the cluster centroids, which results in a mapping to a k-means-friendly space. To leverage this knowledge, we propose Centroid Based Tailoring as an alternative to the softmax function in the last layer of a neural network. The resulting Gauss network achieves predictive accuracy similar to that of traditional networks, but is less susceptible to one-pixel attacks. While the main contribution of this paper is theoretical in nature, the Gauss network provides auxiliary empirical benefits.
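The equivalence between softmax prediction and nearest-centroid assignment can be checked numerically in a simplified setting. The sketch below is an illustration under stated assumptions, not the paper's construction: it treats random vectors as the transformed data points (logits) and assumes the centroids are the standard basis vectors scaled by a hypothetical `radius`, so that all centroids lie at equal distance from the origin. In that setting, the squared distance to centroid k is ||z||^2 - 2 * radius * z_k + radius^2, so the nearest centroid is the one with the largest logit, which is also the softmax prediction.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def nearest_centroid(z, centroids):
    """Index of the closest centroid (squared Euclidean distance) per row of z."""
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


rng = np.random.default_rng(0)
n_points, n_classes = 1000, 5

# Stand-in for the transformed data points (assumption: random logits).
logits = rng.normal(size=(n_points, n_classes))

# Assumed centroids: scaled standard basis vectors, all at distance
# `radius` from the origin. This choice is illustrative only.
radius = 3.0
centroids = radius * np.eye(n_classes)

softmax_labels = softmax(logits).argmax(axis=1)
centroid_labels = nearest_centroid(logits, centroids)

# True when both rules assign every point to the same class.
agree = bool(np.array_equal(softmax_labels, centroid_labels))
print(agree)
```

Changing `radius` to any other positive value should leave the assignments unchanged, which mirrors the statement that softmax only depends on the cone a point falls in and not on its distance to the centroid.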

research
02/10/2020

On Approximation Capabilities of ReLU Activation and Softmax Output Layer in Neural Networks

In this paper, we have extended the well-established universal approxima...
research
02/22/2021

Resilience of Bayesian Layer-Wise Explanations under Adversarial Attacks

We consider the problem of the stability of saliency-based explanations ...
research
05/27/2019

Radial Prediction Layer

For a broad variety of critical applications, it is essential to know ho...
research
03/23/2022

Enhancing Classifier Conservativeness and Robustness by Polynomiality

We illustrate the detrimental effect, such as overconfident decisions, t...
research
03/24/2013

Generalizing k-means for an arbitrary distance matrix

The original k-means clustering method works only if the exact vectors r...
research
02/11/2020

Fine-grained Uncertainty Modeling in Neural Networks

Existing uncertainty modeling approaches try to detect an out-of-distrib...
research
04/27/2022

Detecting Backdoor Poisoning Attacks on Deep Neural Networks by Heatmap Clustering

Predictions made by neural networks can be fraudulently altered by so-c...
