
Two "correlation games" for a nonlinear network with Hebbian excitatory neurons and anti-Hebbian inhibitory neurons

12/31/2018
by H. Sebastian Seung

A companion paper introduces a nonlinear network with Hebbian excitatory (E) neurons that are reciprocally coupled with anti-Hebbian inhibitory (I) neurons and also receive Hebbian feedforward excitation from sensory (S) afferents. The present paper derives the network from two normative principles that are mathematically equivalent but conceptually different. The first principle formulates unsupervised learning as a constrained optimization problem: maximization of S-E correlations subject to a copositivity constraint on E-E correlations. A combination of Legendre and Lagrangian duality yields a zero-sum continuous game between excitatory and inhibitory connections that is solved by the neural network. The second principle defines a zero-sum game between E and I cells. E cells want to maximize S-E correlations and minimize E-I correlations, while I cells want to maximize I-E correlations and minimize power. The conflict between I and E objectives effectively forces the E cells to decorrelate from each other, although only incompletely. Legendre duality yields the neural network.


1 Network model with disynaptic inhibition

The disynaptic inhibition network (Fig. 1, left) has the activity dynamics,

    y_i := f( y_i + dt [ (Σ_a W_ia x_a − Σ_k Q_ki z_k) / Λ_i − y_i ] )    (1)
    z_k := Σ_i Q_ki y_i    (2)

Here dt is a step size parameter, which can be set at a small constant value or adjusted adaptively. The activation function f(u) = max{u, 0} is half-wave rectification. After the activities converge to a steady state, update the connection matrices via

    ΔW_ia = η_W ( y_i x_a − W_ia )    (3)
    ΔQ_ki = η_Q ( z_k y_i − Σ_j Q_kj D_ji )    (4)

where η_W, η_Q, and η_Λ are learning rate parameters, the activities x_a, y_i, and z_k take their steady-state values, and D is the fixed matrix of Eq. (8) below. After the updates (3) and (4), any negative elements of W and Q are zeroed to maintain nonnegativity. The divisive factor Λ_i in Eq. (1) is updated via

    ΔΛ_i = η_Λ ( y_i² − D_ii )    (5)
Intuitions behind the model definitions are explained in the companion paper (Seung, 2018). The goal of the present paper is to show how the network can be interpreted as a method of solving a zero-sum game.
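To make the loop structure of Eqs. (1)-(5) concrete, here is a minimal numpy sketch of the model. The dimensions, learning rates, initialization, and input distribution are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes and constants (assumptions, not from the paper).
n_s, n_e, n_i = 20, 10, 5        # sensory (S), excitatory (E), inhibitory (I)
p, q = 1.0, 0.1                   # constraint parameters of Eq. (8)
D = q * np.ones((n_e, n_e)) + (p - q) * np.eye(n_e)
dt = 0.1                          # step size of Eq. (1)
eta_w, eta_q, eta_lam = 0.02, 0.02, 0.02   # learning rates of Eqs. (3)-(5)

W = rng.uniform(0, 0.1, (n_e, n_s))   # Hebbian S->E weights
Q = rng.uniform(0, 0.1, (n_i, n_e))   # anti-Hebbian E<->I weights
lam = np.ones(n_e)                    # divisive factors Lambda_i

def f(u):
    """Half-wave rectification."""
    return np.maximum(u, 0.0)

def steady_state(x, n_steps=200):
    """Iterate the activity dynamics, Eqs. (1)-(2), to convergence."""
    y = np.zeros(n_e)
    for _ in range(n_steps):
        z = Q @ y                                      # Eq. (2)
        y = f(y + dt * ((W @ x - Q.T @ z) / lam - y))  # Eq. (1)
    return y, Q @ y

for t in range(1000):
    x = f(rng.normal(size=n_s))            # a nonnegative input vector
    y, z = steady_state(x)
    W += eta_w * (np.outer(y, x) - W)      # Eq. (3)
    Q += eta_q * (np.outer(z, y) - Q @ D)  # Eq. (4)
    W, Q = f(W), f(Q)                      # zero any negative elements
    lam = np.maximum(lam + eta_lam * (y**2 - np.diag(D)), 1e-3)  # Eq. (5)
```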

2 Correlation game between connections

2.1 Formulation as constrained optimization

The first normative principle concerns transformation of a sequence of input vectors x_1, …, x_T into a sequence of output vectors y_1, …, y_T. Both input and output are assumed nonnegative. Define the input matrix X = [x_1 ⋯ x_T] as the matrix containing the input vectors as its columns. The element X_at is the a-th component of x_t. Similarly, define the output matrix Y = [y_1 ⋯ y_T] as containing the output vectors as its columns. Define the output-input correlation matrix

    Y Xᵀ / T.

Its ia-th element is the time average of y_i x_a, or ⟨y_i x_a⟩ = (1/T) Σ_t Y_it X_at. Similarly, define the output-output correlation matrix

    Y Yᵀ / T.

Its ij-th element is the time average of y_i y_j, or ⟨y_i y_j⟩ = (1/T) Σ_t Y_it Y_jt. Note that "correlation matrix" is used to mean second moment matrix rather than covariance matrix. In other words, the correlation matrix does not involve subtraction of mean values. This is natural for sparse nonnegative variables, but covariance matrices may be substituted in other settings.
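A short numpy illustration of these definitions on toy data (variable names are mine), emphasizing that the correlations are second moments with no mean subtraction:

```python
import numpy as np

# Toy data: T nonnegative vectors as the columns of X and Y.
T = 1000
rng = np.random.default_rng(1)
X = np.maximum(rng.normal(size=(3, T)), 0)   # input matrix, one column per time step
Y = np.maximum(rng.normal(size=(2, T)), 0)   # output matrix

C_yx = Y @ X.T / T   # output-input correlations, element <y_i x_a>
C_yy = Y @ Y.T / T   # output-output correlations, element <y_i y_j>

# Second moments, not covariances: no mean is subtracted.
assert np.allclose(C_yy[0, 0], np.mean(Y[0] ** 2))
assert np.allclose(C_yx[1, 2], np.mean(Y[1] * X[2]))
```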

Problem 1 (Constrained optimization).

Define the goal of unsupervised learning as the constrained optimization

    max_{Y ≥ 0} Φ( Y Xᵀ / T )   subject to   D − Y Yᵀ / T copositive,    (6)

where D is a fixed matrix and Φ is a scalar-valued function that is assumed monotone nondecreasing as a function of every element of its matrix-valued argument.

Monotonicity is an important assumption because it allows us to interpret the objective of Eq. (6) as maximization of input-output correlations.

2.2 Copositivity vs. nonnegativity

Seung and Zung (2017) introduced the principle

    max_{Y ≥ 0} Φ( Y Xᵀ / T )   subject to   D − Y Yᵀ / T ≥ 0,    (7)

which differs from Eq. (6) only by the substitution of "nonnegativity" for "copositivity." (Here nonnegativity of a matrix is defined to mean nonnegativity of all its elements.) While the formalisms here are valid for arbitrary D, a convenient choice is to set the diagonal elements of D to p and the off-diagonal elements of D to q,

    D_ij = p δ_ij + q (1 − δ_ij).    (8)

If q is much smaller than p, the nonnegativity constraint in Eq. (7) amounts to decorrelation.

A symmetric matrix A is said to be copositive when xᵀ A x ≥ 0 for every nonnegative vector x. This constraint is analogous to positive semidefiniteness but is more complex because it cannot be reduced to a single eigenvalue constraint. Hahnloser et al. (2003) give necessary and sufficient conditions for copositivity involving eigenvalues of submatrices.

Nonnegativity of D − Y Yᵀ / T is a sufficient condition for its copositivity, but it is not a necessary condition. In particular, copositivity of D − Y Yᵀ / T does not require nonnegativity of every element, so a solution of Problem 1 may have ⟨y_i y_j⟩ > D_ij for some i and j.
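Deciding copositivity is hard in general, but in the 2 × 2 case there is a classical closed-form criterion, and it makes the sufficient-but-not-necessary relationship to nonnegativity concrete. A sketch (function names are mine; the random search can only refute copositivity, not prove it):

```python
import numpy as np

def is_copositive_2x2(A):
    """Classical criterion for a symmetric 2x2 matrix: A is copositive
    iff A[0,0] >= 0, A[1,1] >= 0, and A[0,1] >= -sqrt(A[0,0] * A[1,1])."""
    return (A[0, 0] >= 0 and A[1, 1] >= 0
            and A[0, 1] + np.sqrt(A[0, 0] * A[1, 1]) >= 0)

def find_copositivity_violation(A, n_samples=100_000, seed=2):
    """Random search for a nonnegative x with x^T A x < 0. Finding one
    refutes copositivity; finding none is evidence, not proof, for n > 2."""
    rng = np.random.default_rng(seed)
    for _ in range(n_samples):
        x = rng.exponential(size=A.shape[0])
        if x @ A @ x < 0:
            return x
    return None

# Copositive despite a negative off-diagonal element (so nonnegativity
# is sufficient but not necessary):
A = np.array([[1.0, -0.5], [-0.5, 1.0]])
assert is_copositive_2x2(A) and find_copositivity_violation(A) is None

# Not copositive: a diagonal (power) constraint fails.
B = np.array([[-0.1, 0.5], [0.5, 1.0]])
assert not is_copositive_2x2(B) and find_copositivity_violation(B) is not None
```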

A necessary condition for copositivity of a matrix A is nonnegativity of its diagonal elements, since A_ii = e_iᵀ A e_i ≥ 0, where e_1, …, e_n denotes the standard basis for ℝⁿ. In particular, copositivity of D − Y Yᵀ / T requires that ⟨y_i²⟩ ≤ D_ii for all i. These inequalities will be called "power constraints," because they limit the power in the outputs.

If either of the diagonal elements A_ii and A_jj vanishes, then a necessary condition for copositivity of A is nonnegativity of the off-diagonal element A_ij. (Take x with component s at index i, component 1 at index j, and zeros elsewhere; if A_ii = 0 then xᵀ A x = 2 s A_ij + A_jj, which remains nonnegative as s → ∞ only if A_ij ≥ 0.) Therefore ⟨y_i y_j⟩ may exceed D_ij in a solution of Problem 1 only if the power constraints for y_i and y_j are not saturated.

2.3 Correlation game from Legendre-Lagrangian duality

The copositivity constraint in Eq. (6) can be enforced by introducing Lagrange multipliers Q and Λ,

    max_{Y ≥ 0} min_{Q ≥ 0, Λ ≥ 0} { Φ( Y Xᵀ / T ) + ½ Tr[ Q ( D − Y Yᵀ / T ) Qᵀ ] + ½ Tr[ Λ ( D − Y Yᵀ / T ) ] }.    (9)

The Lagrange multiplier Q is a nonnegative matrix, so that Tr[ Q ( D − Y Yᵀ / T ) Qᵀ ] is a sum of terms qᵀ ( D − Y Yᵀ / T ) q over the nonnegative rows q of Q. The outer maximum must choose Y so that D − Y Yᵀ / T is copositive, because otherwise the minimum with respect to Q is −∞. The Lagrange multiplier Λ is a nonnegative diagonal matrix. The outer maximum must choose Y so that the diagonal elements of D − Y Yᵀ / T are nonnegative, because otherwise the minimum with respect to Λ is −∞.

As mentioned above, copositivity of D − Y Yᵀ / T by itself already implies that its diagonal elements are nonnegative. It follows that the Lagrange multiplier Λ is redundant for the primal problem, though it does affect the dual problem. Similarly, adding extra rows to the Lagrange multiplier Q does not change the primal problem. For enforcing the copositivity constraint, it would be sufficient for Q to have a single row. However, giving Q multiple rows does affect the dual problem.

Problem 2 (Game between cells and connections).

Switching the order of min and max in Eq. (9) yields the dual problem,

    min_{Q ≥ 0, Λ ≥ 0} max_{Y ≥ 0} { Φ( Y Xᵀ / T ) + ½ Tr[ Q ( D − Y Yᵀ / T ) Qᵀ ] + ½ Tr[ Λ ( D − Y Yᵀ / T ) ] }.    (10)

This is an upper bound for Eq. (9) by the minimax inequality.

At this point, it is convenient to define the objective function Φ as the convex conjugate (Legendre-Fenchel transform) of a function R,

    Φ(C) = max_{W ≥ 0} { Tr( Wᵀ C ) − R(W) }.    (11)

The nonnegativity constraint on W in Eq. (11) guarantees that Φ is monotone nondecreasing as a function of every element of C. The function R can be interpreted as a regularizer or prior for the weight matrix W.
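For instance, with the quadratic regularizer R(W) = ½ ‖W‖_F² (the choice that yields the Section 1 network), the inner maximization of Eq. (11) is separable with maximizer W* = max(C, 0), so Φ has the closed form Φ(C) = ½ ‖max(C, 0)‖_F². A quick numerical check of the monotonicity claim:

```python
import numpy as np

def phi(C):
    """Convex conjugate of R(W) = 0.5 * ||W||_F^2 over W >= 0 (Eq. 11).
    Closed form: the elementwise maximizer is W* = max(C, 0), so
    Phi(C) = 0.5 * ||max(C, 0)||_F^2."""
    return 0.5 * np.sum(np.maximum(C, 0.0) ** 2)

C = np.array([[1.0, -2.0], [0.5, 0.0]])
# Increasing any element of C cannot decrease Phi.
C2 = C.copy(); C2[0, 1] += 1.0   # still negative, so W*[0,1] stays 0
assert phi(C2) >= phi(C)
C3 = C.copy(); C3[0, 0] += 1.0
assert phi(C3) > phi(C)
```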

With the Legendre duality of Eq. (11), a maximization with respect to W is implicit in Eq. (10). Switching the order of the W and Y maximizations yields the following equivalent problem.

Problem 3 (Game between connections).

The Lagrangian dual of the constrained optimization in Problem 1 is

    min_{Q ≥ 0, Λ ≥ 0} max_{W ≥ 0} v( W, Q, Λ )    (12)

with payoff function v defined by

    v( W, Q, Λ ) = max_{Y ≥ 0} { Tr( Wᵀ Y Xᵀ ) / T − R(W) + ½ Tr[ Q ( D − Y Yᵀ / T ) Qᵀ ] + ½ Tr[ Λ ( D − Y Yᵀ / T ) ] }.    (13)

The min-max problem can be interpreted as a zero-sum game between W on the one hand and Q and Λ on the other.

Problem 3 is closely related to the correlation game previously introduced by Seung and Zung (2017),

    min_{L ≥ 0} max_{W ≥ 0} max_{Y ≥ 0} { Tr( Wᵀ Y Xᵀ ) / T − R(W) + Tr[ L ( D − Y Yᵀ / T ) ] }.    (14)

Problem 3 constrains the Lagrange multiplier to the form L = ½ ( Qᵀ Q + Λ ) for some nonnegative Q and Λ, so it is an upper bound for Eq. (14). This is the mathematical interpretation of choosing a parametrized form for the Lagrange multiplier, as was done by Seung (2018).

The network model of Section 1 follows by setting

    R(W) = ½ Σ_ia W_ia²

in Eq. (13) and applying online projected gradient ascent to perform the maximizations in Eq. (12) and online projected gradient descent to perform the minimizations. For a more general choice of R, Eq. (3) should be replaced by

    ΔW_ia = η_W ( y_i x_a − ∂R/∂W_ia ).
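As a numerical sanity check of the gradient interpretation (assuming the quadratic R above), batch projected gradient ascent on the W-dependent terms of Eq. (13) converges to the rectified output-input correlation matrix, which is exactly the fixed point of the online update Eq. (3):

```python
import numpy as np

rng = np.random.default_rng(3)
T, n_s, n_e = 500, 4, 3
X = np.maximum(rng.normal(size=(n_s, T)), 0)
Y = np.maximum(rng.normal(size=(n_e, T)), 0)

# W-dependent terms of Eq. (13): Tr(W^T Y X^T)/T - 0.5*||W||_F^2.
# Projected gradient ascent on W >= 0:
W = np.zeros((n_e, n_s))
eta = 0.1
for _ in range(500):
    grad = Y @ X.T / T - W               # batch average of the Eq. (3) update
    W = np.maximum(W + eta * grad, 0.0)  # projection onto nonnegativity

# Fixed point: W equals the (rectified) output-input correlation matrix.
assert np.allclose(W, np.maximum(Y @ X.T / T, 0.0), atol=1e-6)
```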

3 Correlation game between cells

The second normative principle concerns transformation of a sequence of nonnegative input vectors x_1, …, x_T into two sequences of nonnegative output vectors y_1, …, y_T and z_1, …, z_T. Define the input matrix X and the two output matrices Y and Z, each containing its vectors as columns.

Problem 4 (Game between cells).

Define the goal of unsupervised learning as the zero-sum game between Y and Z,

    max_{Y ≥ 0} min_{Z ≥ 0} { Φ₁( Y Xᵀ / T ) + ½ Tr( Z Zᵀ ) / T − Φ₂( Z Yᵀ / T ) },    (15)

where Φ₁ and Φ₂ are scalar-valued functions assumed monotone nondecreasing as a function of every element of their matrix-valued arguments.

Note that only nonnegativity constraints remain in Problem 4; the copositivity constraint of Problem 1 is completely hidden. This correlation game can be interpreted as follows. The E cells would like to maximize S-E correlations (make Φ₁( Y Xᵀ / T ) large) and minimize E-I correlations (make Φ₂( Z Yᵀ / T ) small). The I cells would like to maximize I-E correlations (make Φ₂( Z Yᵀ / T ) large) and minimize power (make Tr( Z Zᵀ ) / T small). There is conflict between the E and I cells because the E cells would like to minimize E-I correlations while the I cells would like to maximize them. The compromise is that the E cells decorrelate from each other, although only incompletely.
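In code, the payoff of Eq. (15) is a single scalar that the two populations push in opposite directions. A sketch, with Φ₁ and Φ₂ standing for any monotone nondecreasing functions (for instance the conjugate Φ from the earlier sketch):

```python
import numpy as np

def payoff(Y, Z, X, phi1, phi2):
    """Scalar payoff of the zero-sum game of Eq. (15).
    E cells (rows of Y) perform gradient ascent on this value,
    I cells (rows of Z) perform gradient descent on it."""
    T = X.shape[1]
    return (phi1(Y @ X.T / T)                # S-E correlations: E wants this large
            + 0.5 * np.trace(Z @ Z.T) / T    # I-cell power: I wants this small
            - phi2(Z @ Y.T / T))             # I-E correlations: I large, E small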

Problem 4 is equivalent to Problem 1 (with Φ = Φ₁) given the definition of

    Φ₂(C) = max_{Q ≥ 0} { Tr( Qᵀ C ) − R₂(Q) }    (16)

as the Legendre transform of

    R₂(Q) = ½ Tr( Q D Qᵀ ).    (17)

Proof.

Substituting the definitions of Eqs. (16) and (17) into Eq. (15) yields

    max_{Y ≥ 0} min_{Z ≥ 0} min_{Q ≥ 0} { Φ₁( Y Xᵀ / T ) + ½ Tr( Z Zᵀ ) / T − Tr( Qᵀ Z Yᵀ ) / T + ½ Tr( Q D Qᵀ ) }.

This is minimized when Z = Q Y, which is automatically nonnegative because Q and Y are nonnegative, attaining the value

    max_{Y ≥ 0} min_{Q ≥ 0} { Φ₁( Y Xᵀ / T ) + ½ Tr[ Q ( D − Y Yᵀ / T ) Qᵀ ] }.

This is identical to Eq. (9), except for the omission of the Lagrange multiplier Λ which, as mentioned previously, is redundant in the primal problem. ∎
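The key step, that the inner minimization over Z is attained at Z = QY, is easy to confirm numerically. A sketch with random nonnegative data (variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(4)
n_e, n_i, T = 3, 2, 200
Y = np.maximum(rng.normal(size=(n_e, T)), 0)
Q = rng.uniform(0, 1, (n_i, n_e))

def inner(Z):
    """Z-dependent terms of the substituted objective:
    0.5*Tr(Z Z^T)/T - Tr(Q^T Z Y^T)/T."""
    return 0.5 * np.trace(Z @ Z.T) / T - np.trace(Q.T @ Z @ Y.T) / T

# Claimed minimizer; nonnegative automatically since Q >= 0 and Y >= 0.
Z_star = Q @ Y
for _ in range(100):
    Z = np.maximum(Z_star + 0.1 * rng.normal(size=Z_star.shape), 0)
    assert inner(Z) >= inner(Z_star) - 1e-12
```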

4 Discussion

The second normative principle is also interesting because it can be generalized to include E-E and I-I connections. This will be the subject of future work.

Acknowledgments

The author is grateful for helpful discussions with J. Zung, C. Pehlevan and D. Chklovskii. The research was supported in part by the Intelligence Advanced Research Projects Activity (IARPA) via DoI/IBC contract number D16PC0005, and by the National Institutes of Health via U19 NS104648 and U01 NS090562.

References

  • Földiák [1990] Peter Földiák. Forming sparse representations by local anti-Hebbian learning. Biological Cybernetics, 64(2):165–170, 1990.
  • Hahnloser et al. [2003] Richard H. R. Hahnloser, H. Sebastian Seung, and Jean-Jacques Slotine. Permitted and forbidden sets in symmetric threshold-linear networks. Neural Computation, 15(3):621–638, 2003.
  • Pehlevan and Chklovskii [2015] Cengiz Pehlevan and Dmitri Chklovskii. A normative theory of adaptive dimensionality reduction in neural networks. In Advances in Neural Information Processing Systems, pages 2269–2277, 2015.
  • Pehlevan et al. [2018] Cengiz Pehlevan, Anirvan M. Sengupta, and Dmitri B. Chklovskii. Why do similarity matching objectives lead to Hebbian/anti-Hebbian networks? Neural Computation, 30(1):84–124, 2018.
  • Seung [2018] H. Sebastian Seung. Unsupervised learning by a nonlinear network with Hebbian excitatory and anti-Hebbian inhibitory neurons. arXiv preprint, 2018.
  • Seung and Zung [2017] H. Sebastian Seung and Jonathan Zung. A correlation game for unsupervised learning yields computational interpretations of Hebbian excitation, anti-Hebbian inhibition, and synapse elimination. arXiv preprint arXiv:1704.00646, 2017.
  • Zylberberg et al. [2011] Joel Zylberberg, Jason Timothy Murphy, and Michael Robert DeWeese. A sparse coding model with synaptically local plasticity and spiking neurons can account for the diverse shapes of V1 simple cell receptive fields. PLoS Computational Biology, 7(10):e1002250, 2011.