Utility-Probability Duality of Neural Networks

05/24/2023
by Huang Bojun, et al.

It is typically understood that the training of modern neural networks is a process of fitting the probability distribution of desired outputs. However, recent paradoxical observations in a number of language generation tasks make one wonder whether this canonical probability-based explanation can really account for the empirical success of deep learning. To resolve this issue, we propose an alternative utility-based explanation of the standard supervised learning procedure in deep learning. The basic idea is to interpret the learned neural network not as a probability model but as an ordinal utility function that encodes the preferences revealed in the training data. In this perspective, training the neural network corresponds to a utility learning process. Specifically, we show that for all neural networks with softmax outputs, the SGD learning dynamic of maximum likelihood estimation (MLE) can be seen as an iterative process that optimizes the neural network toward an optimal utility function. This utility-based interpretation can explain several otherwise-paradoxical observations about the networks thus trained. Moreover, our utility-based theory entails an equation that can transform the learned utility values back into a new kind of probability estimate, with which probability-compatible decision rules enjoy dramatic (double-digit) performance improvements. These results collectively reveal a phenomenon of utility-probability duality in terms of what modern neural networks are (truly) modeling: we thought they were one thing (probabilities) until the unexplainable showed up; changing mindset and treating them as another thing (utility values) largely reconciles the theory, despite remaining subtleties regarding their original (probabilistic) identity.
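The duality described above can be illustrated with a minimal sketch (not the paper's actual transform equation, which is not reproduced here): the same pre-softmax output layer of a network can be read either as a vector of ordinal utility values or, after the softmax map, as a probability distribution. The toy linear "network" and all names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))  # toy "network": 4 input features -> 3 output classes


def logits(x):
    """Pre-softmax scores of the network for input x."""
    return x @ W


def softmax(z):
    """Map scores to a probability distribution (numerically stable)."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


x = rng.normal(size=4)
u = logits(x)   # utility reading: ordinal scores encoding a preference order
p = softmax(u)  # probability reading: the same scores, normalized

# Because softmax is strictly monotone, both readings induce the same
# greedy (argmax) decision, even though they assign different semantics:
assert np.argmax(u) == np.argmax(p)
```

The point of the sketch is only that greedy decision rules cannot distinguish the two readings; the paper's claim concerns which reading correctly describes what MLE-trained networks learn, and its transform from utilities back to calibrated probabilities is a separate result not shown here.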


