# Collective stability of networks of winner-take-all circuits

The neocortex has a remarkably uniform neuronal organization, suggesting that common principles of processing are employed throughout its extent. In particular, the patterns of connectivity observed in the superficial layers of the visual cortex are consistent with the recurrent excitation and inhibitory feedback required for cooperative-competitive circuits such as the soft winner-take-all (WTA). WTA circuits offer interesting computational properties such as selective amplification, signal restoration, and decision making. But, these properties depend on the signal gain derived from positive feedback, and so there is a critical trade-off between providing feedback strong enough to support the sophisticated computations, while maintaining overall circuit stability. We consider the question of how to reason about stability in very large distributed networks of such circuits. We approach this problem by approximating the regular cortical architecture as many interconnected cooperative-competitive modules. We demonstrate that by properly understanding the behavior of this small computational module, one can reason over the stability and convergence of very large networks composed of these modules. We obtain parameter ranges in which the WTA circuit operates in a high-gain regime, is stable, and can be aggregated arbitrarily to form large stable networks. We use nonlinear Contraction Theory to establish conditions for stability in the fully nonlinear case, and verify these solutions using numerical simulations. The derived bounds allow modes of operation in which the WTA network is multi-stable and exhibits state-dependent persistent activities. Our approach is sufficiently general to reason systematically about the stability of any network, biological or technological, composed of networks of small modules that express competition through shared inhibition.

Published 01/13/2012


## 1 Introduction

Large biological and artificial systems often consist of a highly interconnected assembly of components (Fig 1). The connectivity between these elements is often densely recurrent, resulting in various loops that differ in strength and time-constant Girard et al. (2008); Slotine & Lohmiller (2001); Hopfield (1982); Amari (1977); Douglas et al. (1995); Liu et al. (2006). This organization is true of the neocortex, where the statistics of connectivity between neurons indicate that recurrent connections are a fundamental feature of the cortical networks Douglas & Martin (2004); Binzegger et al. (2004); Douglas et al. (1995). These recurrent connections are able to provide the excitatory and inhibitory feedback necessary for computations such as selective amplification, signal restoration, and decision making. But this recurrence poses a challenge for the stability of a network Slotine & Lohmiller (2001); Tegnér et al. (2002); Cohen & Grossberg (1983). Connections may be neither too strong (leading to instability) nor too weak (resulting in inactivity) if the network is to function properly Koch & Laurent (1999). In addition, connections are continually changing as a function of learning, or are accumulated semi-randomly throughout development or evolution. How, then, do these networks ensure stability? Artificial neural networks can rely on their bounded (e.g. sigmoid) activation functions, but biological neurons do not usually enter saturation. Instead, their stability depends crucially on the balance between inhibition and excitation Hahnloser et al. (2000); McCormick & Contreras (2001). In this paper we explore how the stability of such systems is achieved, not only because we wish to understand the biological case, but also because of our interest in building large neuromorphic electronic systems that emulate their biological counterparts Indiveri et al. (2009).

Reasoning about the computational ability as well as the stability of neural systems usually proceeds in a top-down fashion, by considering the entire system as a single entity able to enter many states (as, for example, in Hopfield networks Izhikevich (2007); Hopfield (1982); Hertz et al. (1991)). Unfortunately, the number of states that must be considered grows exponentially with the size of the network, and so this approach quickly becomes intractable. For this reason, stability analysis of large-scale simulations of the brain is proving difficult Izhikevich & Edelman (2008); Ananthanarayanan et al. (2009); Markram (2006).

We present an alternative approach, which uses bottom-up reasoning about the modules that constitute the network. The idea is that the stability of the modules should be conferred on the networks that they compose. Of course, simply combining several modules, each of which is stable in isolation, into a larger system does not necessarily imply that the new system is stable Slotine & Lohmiller (2001); Slotine (2003). However, we explore the possibility that when the modules employ a certain kind of stability mechanism, they are indeed able to confer stability also on the super-system in which they are embedded. We show that modules that achieve their own stability by observing constraints on their inhibitory/excitatory balance can be stable alone as well as in combination.

We have chosen to examine this problem in networks of WTA circuits Yuille & Geiger (2003), because these circuits are consistent with the observed neuroanatomical connections of cortex Douglas & Martin (2004); Binzegger et al. (2004). Moreover, the WTA is interesting because it can implement useful computational operations such as signal restoration, amplification, max-like winner selection (i.e. decision making), or filtering Maass (2000); Hahnloser et al. (1999); Douglas & Martin (2007); Yuille & Geiger (2003). And combining multiple WTAs in a systematic manner extends these possibilities further by allowing persistent activity and state-dependent operations Rutishauser & Douglas (2009); Neftci et al. (2008, 2010).

Typically, WTA networks operate in a high-gain regime in which their operation is non-linear (e.g. selective amplification). While the stability of a WTA can be analyzed by linearizing around the possible steady-states, rigorous analysis that takes the non-linearities into account is difficult using linear analysis tools Strogatz (1994); Izhikevich (2007); Hahnloser (1998); Hahnloser et al. (2003). Instead, we use nonlinear Contraction Analysis Lohmiller & Slotine (1998); Slotine (2003); Lohmiller & Slotine (2000) to investigate the stability of WTA networks. The concept of contraction is a generalization of stability analysis for linear systems, allowing Contraction Analysis Lohmiller & Slotine (1998) to be used for the analysis of circuits in the fully non-linear case, without making linearized approximations.

A nonlinear time-varying system is said to be contracting if initial conditions or temporary disturbances are forgotten exponentially fast. Thus, any two initial conditions will result in the same system trajectory after exponentially fast transients. Importantly, the properties of contracting systems are preserved when they are combined to form a larger system Slotine (2003). Also, contraction allows parameter regimes which are not unduly restrictive. For instance, it can describe strong feedback loops, and ranges of parameters can be found where the system is both contracting and operating in a highly non-linear regime. In addition, contraction analysis can deal with systems that are multi-stable (expressing several stable attractors or behaviors), where it guarantees exponentially fast convergence to one of the possible attractors. Such systems are capable of rich state-dependent computations while at the same time being contracting. We have used Contraction Analysis to reason about the permissible kinds and strengths of connectivity within and between WTA modules embedded in a network. If the individual modules are contracting, then observing our constraints is sufficient to guarantee stability (boundedness) of a system composed of such modules. Thus, Contraction Analysis permits the derivation of simple bounds on the network parameters that guarantee exponential convergence to equilibria in the fully non-linear case. This approach enables the systematic synthesis of large circuits, which are guaranteed to be stable if the set of bounds is observed. While we demonstrate the feasibility of our approach in the case of WTA-type networks, the approach is not restricted to such networks: it can be applied to any simple non-linear circuit that is capable of non-linear computational operations.

## 2 Results

Our results are organized as follows. First, we introduce the basic organization of the WTA circuit. Second, we apply contraction theory to analyze the stability of networks of WTA circuits. We derive analytically the bounds on the parameters of the network that permit it to operate properly in either a soft- or hard-WTA configuration. We conclude by performing numerical simulations to confirm that the analytical bounds are valid and not unnecessarily restrictive.

### 2.1 The winner-take-all network

Each winner-take-all (WTA) consists of N−1 excitatory units and one inhibitory unit (see Fig 2A). Each excitatory unit receives recurrent input from itself (weight α) and its neighbors (weight α2). For simplicity, only self-recurrence is considered here (α2 = 0), but similar arguments obtain when recurrence from neighboring units is included (see section 2.6). The inhibitory unit receives input from each excitatory unit with weight β2, and projects back to each excitatory unit with weight β1. The dynamics of each unit are described by Eqs 1 and 2. The firing-rate activation function is a non-saturating rectification non-linearity f(x) = max(x, 0). The dynamics of this network, and in particular the boundedness of its trajectories, depend on the balance of excitation and inhibition.

$\tau \dot{x}_i + G x_i = f\!\left(I_i + \alpha x_i - \beta_1 x_N - T_i\right)$ (1)

$\tau \dot{x}_N + G x_N = f\!\left(\beta_2 \sum_{j=1}^{N-1} x_j - T_N\right)$ (2)

where I_i is the external input to unit i. All thresholds T_i = T are constant and equal. G is a constant that represents the load (conductance) and is assumed G = 1, unless stated otherwise. All parameters are positive: α, β1, β2, T > 0. We will refer to such a system either as a WTA or a "recurrent map" throughout the paper. "Map" will denote a WTA throughout, and not a discrete dynamical system.
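To make these dynamics concrete, the following sketch Euler-integrates Eqs 1–2 for a WTA with two excitatory units (N = 3). The function name and the parameter values α = 1.5, β1 = 3, β2 = 0.3 are our own illustrative choices (they satisfy the hard-WTA bounds derived in section 2.4), not taken from the paper.

```python
import numpy as np

def simulate_wta(I, alpha=1.5, beta1=3.0, beta2=0.3, T=0.0,
                 G=1.0, tau=1.0, dt=0.01, steps=30000):
    """Euler integration of Eqs (1)-(2): excitatory units x[0..N-2],
    inhibitory unit x[N-1], rectification f(v) = max(v, 0)."""
    f = lambda v: np.maximum(v, 0.0)
    x = np.zeros(len(I) + 1)          # last entry is the inhibitory unit x_N
    for _ in range(steps):
        exc = f(I + alpha * x[:-1] - beta1 * x[-1] - T)   # Eq (1)
        inh = f(beta2 * np.sum(x[:-1]) - T)               # Eq (2)
        x += (dt / tau) * (np.concatenate([exc, [inh]]) - G * x)
    return x

x = simulate_wta(np.array([1.0, 1.2]))
print(np.round(x, 3))   # hard-WTA: unit 2 wins with gain 1/(1 - alpha + beta1*beta2),
                        # i.e. x = [0, 3.0, 0.9]; the losing unit is fully suppressed
```

At the fixed point the winner satisfies x = I/(1 − α + β1β2), so the circuit amplifies the winning input by a factor of 2.5 here, while rectification silences the loser.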

### 2.2 Combining several WTAs

A single WTA network can implement some useful computational operations (see Introduction). However, more sophisticated computational operations can be achieved by combining several WTAs Rutishauser & Douglas (2009) through sparse and selective connections between some of the excitatory units of the various WTAs. We consider two ways of combining WTAs: bidirectional and unidirectional. A bidirectional (and symmetric) connection establishes a recurrent connection between two WTAs. A unidirectional connection provides the activity of one WTA as input to a second WTA (feed-forward). The inhibitory units neither receive input from, nor project to, any other map. Thus, activity between maps is always excitatory (positive). This arrangement is motivated by the long-range connections in cortex, which are predominantly excitatory Douglas & Martin (2004); Douglas et al. (1995) (though they can contact both excitatory and inhibitory targets). While long-range inhibitory projections exist in cortex as well, we focus exclusively on excitatory long-range connectivity in this paper.

These two kinds of connectivity are motivated by our previous finding that three WTAs connected by a combination of bi- and unidirectional connections are sufficient to implement state-dependent processing in the form of an automaton Rutishauser & Douglas (2009). An automaton consists of two components: states, and transitions between states. By connecting two maps bidirectionally, the network is able to maintain one region of persistent activity in the absence of external input, and this winning region represents the current state of the automaton. (State dependence is a form of memory, and we thus refer to these localized regions of persistent activity as memory states.) Transition circuits allow the network to select a new winner, conditioned on the current state as well as an external input. The implementation of these transitions requires a third WTA (to select the most appropriate transition) as well as unidirectional connections between the maps that drive the transition (see below). In this paper we explore what constraints the presence of these additional connections poses on the stability of this and larger (more than three WTAs) networks.

First, consider two identical WTAs x and y (see Fig 2C). Each WTA consists of N = 3 units (2 excitatory, 1 inhibitory). The only connection between the two networks is γ, which symmetrically (bidirectionally) connects x2 and y2. Thus, this network can only represent one state.

The update equations for x2 and y2 thus become:

$\tau \dot{x}_2 + G x_2 = f\!\left(I_2 + \alpha x_2 + \gamma y_2 - \beta_1 x_N - T\right)$ (3)

$\tau \dot{y}_2 + G y_2 = f\!\left(\alpha y_2 + \gamma x_2 - \beta_1 y_N - T\right)$ (4)

Second, we consider unidirectional connections between WTAs. These are feed-forward connections between two maps: for example, when units on map y provide input to units on map z. However, such feed-forward connections can result in (indirect) recurrence: for example, when map z in turn provides input to x. Thus, analysis of unidirectional connections requires that we consider three maps x, y and z simultaneously. The two maps x and y are connected bidirectionally as shown above, whereas z contains units that receive external input as well as input from y, and that also provide output to x (Fig 2D). In this way, strong enough activation of units on z can bias the ongoing competition in the network and thereby induce a switch to a new winner (so changing state).

A given unit on z can either receive input from a different unit than it projects to (so providing a transition from one state to another), or it can receive from and project to the same state. In Fig 2D, z1 is an example of a unit that initiates a transition from state 1 to 2, whereas z2 receives input from and projects to state 2. Thus, z2 establishes an additional loop of recurrent feedback and is the more restrictive case when considering stability.

Following Fig 2D, the dynamics of x1 and x2 become

$\tau \dot{x}_1 + x_1 = f\!\left(I_1 + \alpha x_1 + \gamma y_1 - \beta_1 x_N - T\right)$ (5)

$\tau \dot{x}_2 + x_2 = f\!\left(I_2 + \alpha x_2 + \gamma y_2 + \phi z_1 + \phi z_2 - \beta_1 x_N - T\right)$ (6)

and similarly for y1 and y2.

The dynamics for the two new units z1 and z2 are

$\tau \dot{z}_1 + z_1 = f\!\left(I^{TN}_1 + \alpha z_1 + \phi y_1 - \beta_1 z_N - T - T^{TN}\right)$ (7)

$\tau \dot{z}_2 + z_2 = f\!\left(I^{TN}_2 + \alpha z_2 + \phi y_2 - \beta_1 z_N - T - T^{TN}\right)$ (8)

The equations for the other units of the system are equivalent to the standard WTA.

The term T^{TN} is an additional constant threshold for activation of the transition units, so that in the absence of an external input I^{TN}, a transition unit will remain inactive (z_i = 0). The external input can be used to selectively initiate a transition. An appropriate choice of the threshold will ensure that a transition unit is active only when both the external input and the input from the projecting map are present. The activation of z_i is thus state dependent, because it depends both on an external input and on the current winner of the map.

Now we will explore what constraints the presence of γ and ϕ imposes on stability. We will use Contraction Analysis to show that, if the single WTAs are contracting, γ and ϕ can be used (with an upper bound) to arbitrarily combine WTAs without compromising the stability of the aggregate system. Since we base our arguments on contraction analysis, we first introduce its basic concepts.

### 2.3 Contraction Analysis

Essentially, a nonlinear time-varying dynamic system will be called contracting if arbitrary initial conditions or temporary disturbances are forgotten exponentially fast, i.e., if trajectories of the perturbed system return to their unperturbed behavior with an exponential convergence rate. It turns out that relatively simple algebraic conditions can be given for this stability-like property to be verified, and that this property is preserved through basic system combinations and aggregations.

A nonlinear contracting system has the following properties Lohmiller & Slotine (1998, 2000); Slotine (2003); Wang & Slotine (2005):

• global exponential convergence and stability are guaranteed

• convergence rates can be explicitly computed as eigenvalues of well-defined Hermitian matrices

• combinations and aggregations of contracting systems are also contracting

• robustness to variations in dynamics can be easily quantified

Before stating the main contraction theorem, recall first the following. The symmetric part of a matrix A is ½(A + A^T). A complex square matrix A is Hermitian if A^∗ = A, where ∗ denotes matrix transposition combined with complex conjugation. The Hermitian part A_H of any complex square matrix A is the Hermitian matrix ½(A + A^∗). All eigenvalues of a Hermitian matrix are real numbers. A Hermitian matrix A is said to be positive definite if all its eigenvalues are strictly positive. This condition implies in turn that x^∗ A x > 0 for any non-zero real or complex vector x. A Hermitian matrix A is called negative definite if −A is positive definite.

A Hermitian matrix A(x, t) dependent on state or time will be called uniformly positive definite if there exists a strictly positive constant such that for all states x and all t ≥ 0 the eigenvalues of A(x, t) remain larger than that constant. A similar definition holds for uniform negative definiteness.

Consider now a general dynamical system in $\mathbb{R}^n$,

$\dot{x} = f(x, t)$ (9)

with f a smooth non-linear function. The central result of Contraction Analysis, derived in Lohmiller & Slotine (1998) in both real and complex forms, can be stated as:

Theorem. Denote by ∂f/∂x the Jacobian matrix of f with respect to x. Assume that there exists a complex square matrix Θ(x, t) such that the Hermitian matrix Θ(x, t)^∗ Θ(x, t) is uniformly positive definite, and the Hermitian part F_H of the matrix

$F = \left(\dot{\Theta} + \Theta \frac{\partial f}{\partial x}\right)\Theta^{-1}$

is uniformly negative definite. Then, all system trajectories converge exponentially to a single trajectory. The system is said to be contracting, F is called its generalized Jacobian, and Θ(x, t)^∗ Θ(x, t) its contraction metric. The contraction rate is the absolute value of the largest eigenvalue of F_H (the one closest to zero, although still negative).

In the linear time-invariant case, a system is globally contracting if and only if it is strictly stable, and F can be chosen as a normal Jordan form of the system, with Θ a real matrix defining the coordinate transformation to that form Lohmiller & Slotine (1998). Alternatively, if the system is diagonalizable, F can be chosen as the diagonal form of the system, with Θ a complex matrix diagonalizing the system. In that case, F_H is a diagonal matrix composed of the real parts of the eigenvalues of the original system matrix.

Note that the activation function f (see Eqs 1–2) is not continuously differentiable, but it is continuous in both space and time, so that contraction results can still be directly applied Lohmiller & Slotine (2000). Furthermore, the activation function is piecewise linear with a derivative of either 0 or 1. This simple property is exploited in the following by inserting dummy terms l_i, which can be either 0 or 1 according to the derivative of f: f(x_i) = l_i x_i. For a single WTA, there are a total of N dummy terms.

### 2.4 Stability of a single WTA

We begin the contraction analysis by considering a single WTA. The conditions obtained in this section guarantee that the dynamics of the single map converge exponentially to a single equilibrium point for a given set of inputs. Actually, the WTA has several equilibrium points (corresponding to each possible winner), but contraction analysis shows that for a given input a particular equilibrium will be reached exponentially fast, while all others are unstable. Thus, as long as the network does not start out exactly at one of the unstable equilibria (which is impossible in practice), it is guaranteed to converge to the unique equilibrium point (the winner) determined by the given set of inputs. Our strategy is two-fold: first we show that the WTA is contracting only if one of the excitatory units is active (the "winner" in a hard-WTA configuration). Second, we show that in the presence of multiple active excitatory units, the dynamics diverge exponentially from the non-winning states.

Following section 2.3, a system with Jacobian J is contracting if the Hermitian part of the generalized Jacobian is uniformly negative definite:

$\left(\Theta J \Theta^{-1}\right)_H < 0$ (10)

The Jacobian J has dimension 3 × 3 and describes the dynamics of a single WTA, and Θ is a transformation matrix (see section 2.3 and below). Using dummy terms as shown in the previous section, the Jacobian of the WTA is

$\tau J = \begin{bmatrix} l_1\alpha - G & 0 & -l_1\beta_1 \\ 0 & l_2\alpha - G & -l_2\beta_1 \\ l_3\beta_2 & l_3\beta_2 & -G \end{bmatrix}$ (11)

This WTA has two possible winners (x1 or x2) that are represented by l_1 = 1, l_2 = 0 or l_1 = 0, l_2 = 1, respectively (l_3 = 1 for both). Assuming the second unit is the winner, the Jacobian becomes

$\tau J_{W2} = \begin{bmatrix} -G & 0 & 0 \\ 0 & \alpha - G & -\beta_1 \\ \beta_2 & \beta_2 & -G \end{bmatrix}$ (12)

Our approach consists in first finding a constant metric transformation describing the contraction properties of the simple Jacobian (12) for appropriate parameter ranges, a process equivalent to standard linear stability analysis, and then using the same metric transformation to assess the contraction properties of the general nonlinear system.

Let us first find ranges for the parameters such that J_{W2} is contracting. This is the case if (Θ J_{W2} Θ^{-1})_H < 0, where Θ defines a coordinate transform into a suitable metric. The left-hand side is the generalized Jacobian (see section 2.3). Based on the eigendecomposition J_{W2} = Q Λ Q^{-1}, where the columns of Q correspond to the eigenvectors of J_{W2}, define Θ = Q^{-1}. This transformation represents a change of basis which diagonalizes J_{W2} Horn (1985). This choice of a constant invertible Θ also implies that Θ^∗Θ is positive definite (since x^∗Θ^∗Θx = |Θx|² > 0 for all x ≠ 0).

Using this transformation and assuming G = 1, the Hermitian part of F (Eq 10) is negative definite if (these solutions are derived by considering the eigenvalues of the Hermitian part of (10), which is diagonal and real, and then solving the resulting system of inequalities λ_i < 0):

$0 < \alpha < 2\sqrt{\beta_1\beta_2}$ (13)

$0 < \beta_2$ (14)

$0 < \beta_1\beta_2 < 1$ (15)

Note that these conditions systematically relate the excitatory gain α to the inhibitory loop gain β1β2, and also permit α > 1 (see below for discussion).
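These conditions are easy to check numerically. The sketch below (our own construction, following the recipe in the text) builds Θ from the eigendecomposition of the winner Jacobian J_{W2} (Eq 12, G = 1) and confirms that the Hermitian part of Θ J_{W2} Θ^{-1} is negative definite for an illustrative parameter choice inside the bounds (13)–(15), but not once α exceeds 2√(β1β2):

```python
import numpy as np

def winner_max_eig(alpha, beta1, beta2):
    """Largest eigenvalue of the Hermitian part of Theta J_W2 Theta^{-1},
    with Theta = Q^{-1} from the eigendecomposition J_W2 = Q Lambda Q^{-1}.
    A negative return value means the winner configuration is contracting."""
    J = np.array([[-1.0, 0.0, 0.0],
                  [0.0, alpha - 1.0, -beta1],
                  [0.0, beta2, -1.0]])
    _, Q = np.linalg.eig(J)
    F = np.linalg.inv(Q) @ J @ Q        # generalized Jacobian (diagonal here)
    FH = 0.5 * (F + F.conj().T)         # Hermitian part
    return np.linalg.eigvalsh(FH).max()

print(winner_max_eig(1.5, 3.0, 0.3))    # inside (13)-(15): negative (contracting)
print(winner_max_eig(2.0, 3.0, 0.3))    # alpha > 2*sqrt(0.9) ~ 1.897: positive
```

Because Θ diagonalizes J_{W2}, the returned value is simply the largest real part of the eigenvalues of J_{W2}, which is also the (negated) contraction rate.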

The above conditions guarantee contraction for the cases where inhibition (l_3 = 1) and one excitatory unit are active (here l_2 = 1 and l_1 = 0, but the same bounds are valid for l_1 = 1 and l_2 = 0). The next key step is to use the same metric to study arbitrary terms l_1, l_2 and l_3, so as to show that the system is contracting for all combinations of l_i except the combinations from which we want the system to be exponentially diverging. In the same metric, and using the Jacobian Eq 11, the Hermitian part of F becomes (with G = 1)

$F_H = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 + \tfrac{1}{2}\alpha l_2 & -\tfrac{\left(2i\beta_1\beta_2 + \alpha\left(-i\alpha + \sqrt{-\alpha^2 + 4\beta_1\beta_2}\right)\right)(l_2 - l_3)}{2\sqrt{-\alpha^2 + 4\beta_1\beta_2}} \\ 0 & -\tfrac{\left(-2i\beta_1\beta_2 + \alpha\left(i\alpha + \sqrt{-\alpha^2 + 4\beta_1\beta_2}\right)\right)(l_2 - l_3)}{2\sqrt{-\alpha^2 + 4\beta_1\beta_2}} & -1 + \tfrac{1}{2}\alpha l_2 \end{bmatrix}$ (16)

Note that Eq (16) was simplified assuming the bound given in Eq 13. We require F_H < 0. A matrix of the form $\begin{bmatrix} a & c \\ \bar{c} & a \end{bmatrix}$ is negative definite if a < 0 and |c|² < a² Wang & Slotine (2005). For (16), this results in

$\frac{\left(\beta_1\beta_2(l_2 - l_3)\right)^2}{-\alpha^2 + 4\beta_1\beta_2} < \left(-1 + \tfrac{1}{2}\alpha l_2\right)^2$ (17)

The bounds (13)–(15) on the parameters satisfy this condition whenever l_2 = l_3 (both 0 or both 1, making the off-diagonal terms vanish). As expected, for the case l_2 = 1, l_3 = 0 (only excitation active) the system is not contracting for α > 1. Rather, we require that in this case the system is exponentially diverging, as we detail below.

Next, we consider the full Jacobian (Eq 11) with all l_i = 1. For the network to be a hard-WTA, we require that this configuration is exponentially diverging. The dynamics of interest are those of the excitatory units, so that, following Pham & Slotine (2007), the system is exponentially diverging away from this state if

$V J V^T > 0$ (18)

where V is the projection matrix

$V = \begin{bmatrix} \alpha & 0 & -\beta_1 \\ 0 & \alpha & -\beta_1 \end{bmatrix}$ (19)

The constraint (18) assures that the system diverges from certain invariant subspaces where V x is constant. For V as shown in (19), each row represents one excitatory unit. If condition (18) is satisfied, the network is guaranteed to diverge exponentially away from this equilibrium.

Condition (18) is satisfied (for G = 1) if

$1 < \alpha$ (20)

$0 < \beta_1$ (21)

$0 < \beta_1\beta_2 < \left(1 - \frac{1}{\alpha}\right)\left(\beta_1^2 + \frac{\alpha^2}{2}\right)$ (22)

The above conditions were derived from the system of inequalities given by the eigenvalues of the Hermitian part of the left-hand side of (18). The same calculation using instead l_1 = l_2 = 1 and l_3 = 0 (excitation, but no inhibition) results in the same bounds for exponential divergence from the state of no inhibition.
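The divergence condition can be verified directly. The sketch below (our own check; the parameter values α = 1.5, β1 = 3, β2 = 0.3 are illustrative and satisfy (20)–(22)) evaluates V J V^T for the all-active configuration and confirms that its eigenvalues are positive, so the two-winner state is exponentially unstable:

```python
import numpy as np

alpha, beta1, beta2 = 1.5, 3.0, 0.3

# Full Jacobian, Eq 11 with all l_i = 1 and G = 1
J = np.array([[alpha - 1.0, 0.0, -beta1],
              [0.0, alpha - 1.0, -beta1],
              [beta2, beta2, -1.0]])

# Projection matrix V, Eq 19 (one row per excitatory unit)
V = np.array([[alpha, 0.0, -beta1],
              [0.0, alpha, -beta1]])

M = V @ J @ V.T
ev = np.linalg.eigvalsh(0.5 * (M + M.T))
print(ev)   # both eigenvalues positive: the all-active state is unstable
```

For these parameters M is symmetric with diagonal 4.275 and off-diagonal 3.15, giving eigenvalues 1.125 and 7.425, both strictly positive as condition (18) requires.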

Combining (i) conditions (13)–(15) for exponential convergence to the winner state and (ii) conditions (20)–(22) for exponential divergence from the non-winning and the excitation-only states yields

$1 < \alpha < 2\sqrt{\beta_1\beta_2}$ (23)

$\frac{1}{4} < \beta_1\beta_2 < 1$ (24)

$\beta_1\beta_2 < \left(1 - \frac{1}{\alpha}\right)\left(\beta_1^2 + \frac{\alpha^2}{2}\right)$ (25)

Note the two key components: the excitatory gain α and the inhibitory loop gain β1β2. The above conditions establish lower and upper bounds on the parameters for global exponential convergence to a unique winner for a given set of inputs.

Under these constraints (in particular on the excitatory loop strength α) the system is globally convergent yet always selects a winner. The system does not depend on saturation to acquire this stability. Also, the constraints guarantee that the system does not oscillate, apart from transient oscillations during convergence. This has been established by demonstrating that the system is either contracting or exponentially diverging for any subset of the dummy terms l_i. Note that the system is contracting in the same metric for all contracting subsets. While we defined the metric for a particular winner, the same constraints result from defining a similar Θ for any of the other possible winners. Similar conditions can be derived for configurations where the winner is represented by multiple active units, such as when a "bump of activity" is introduced by adding excitatory nearest-neighbor connections Rutishauser & Douglas (2009); Douglas & Martin (2007) (see section 2.6). Numerically, these bounds permit a wide range of parameters: for any inhibitory loop gain β1β2 in (1/4, 1), any excitatory gain 1 < α < 2√(β1β2) is admissible. Under these conditions, the system operates in a highly non-linear regime (where the loop gain can be up to ~50!).
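The combined bounds can be encoded as a simple predicate (the helper name is ours). The check below also illustrates the high-gain point: with β1β2 = 0.9, an excitatory gain of α = 1.88 is still admissible, and the steady-state amplification 1/(1 − α + β1β2) of the winner is then about 50:

```python
import numpy as np

def hard_wta_bounds_ok(alpha, beta1, beta2):
    """Combined conditions (23)-(25) for a stable hard-WTA."""
    g = beta1 * beta2                           # inhibitory loop gain
    return (1 < alpha < 2 * np.sqrt(g)
            and 0.25 < g < 1
            and g < (1 - 1 / alpha) * (beta1**2 + alpha**2 / 2))

print(hard_wta_bounds_ok(1.5, 3.0, 0.3))    # True
print(hard_wta_bounds_ok(2.5, 3.0, 0.3))    # False: alpha > 2*sqrt(0.9) ~ 1.897

# high-gain but still admissible: winner amplification 1/(1 - alpha + beta1*beta2)
alpha = 1.88
print(hard_wta_bounds_ok(alpha, 3.0, 0.3), 1 / (1 - alpha + 0.9))  # True, ~50
```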

The analysis above focused on the regime where α > 1 (with β1β2 < 1). In this mode, the system acts as a highly non-linear WTA, always selecting a binary winner. What if the system operates in 0 < α < 1? In this configuration, the winner unit is still contracting (Eq 13).

What happens when all units (l_1 = l_2 = l_3 = 1) are active and α < 1? Defining Θ based on the Jacobian with all units on and solving (10), we find that this system is contracting for α < 1. The system where all excitatory units are active is thus contracting under this condition, implying that the system is in a "soft-WTA" configuration. While the system still selects a winning unit, the activity of the losing unit is not completely suppressed. Also note that in this configuration, no persistent activity in the absence of external input is possible. A graphical illustration of both modes of operation is shown in Fig 3.
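The soft-WTA mode can be illustrated with a simple Euler integration of Eqs 1–2 (our own sketch; α = 0.8 < 1 with β1 = 3, β2 = 0.3 is an illustrative choice). Both excitatory units remain active at steady state: the winner is amplified, but the loser is only partially suppressed:

```python
import numpy as np

alpha, beta1, beta2 = 0.8, 3.0, 0.3   # alpha < 1: soft-WTA regime
f = lambda v: np.maximum(v, 0.0)
I = np.array([1.0, 1.2])

x = np.zeros(3)                        # x[0], x[1] excitatory; x[2] inhibitory
dt, steps = 0.01, 30000
for _ in range(steps):
    dx = np.array([f(I[0] + alpha * x[0] - beta1 * x[2]),
                   f(I[1] + alpha * x[1] - beta1 * x[2]),
                   f(beta2 * (x[0] + x[1]))]) - x
    x += dt * dx

print(np.round(x, 3))  # ~[0.05, 1.05, 0.33]: both units stay active
```

Solving the all-active fixed-point equations (1.1·x1 + 0.9·x2 = 1.0, 0.9·x1 + 1.1·x2 = 1.2) confirms the steady state (0.05, 1.05) with inhibitory activity 0.33, so neither rectifier is driven below threshold.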

Finally, note that the time-constant τ was assumed to be equal for all units. In this case, the numerical value of τ does not influence the bounds (since τ multiplies the entire Jacobian, see Eq 10). Similar conditions can be derived when the time-constants are not equal (see Appendix C), in which case only the ratio of the time-constants is relevant.

### 2.5 Stability of a single WTA of arbitrary size

Can this analysis be extended to maps of arbitrary size? While the approach in the previous section can be applied to maps of any size, an alternative approach is to first define contraction for a map consisting only of a single excitatory and inhibitory unit and then extend it recursively by one unit at a time, while showing that this extension does not change the contraction properties. This approach is illustrated in Fig 4.

The simplest map consists of one excitatory and one inhibitory unit (Fig 4A). While there is no competition between different inputs, this map otherwise preserves all the properties of a WTA (such as non-linear amplification of the input). The Jacobian of this map is:

$\tau A = \begin{bmatrix} l_1\alpha - G & -l_1\beta_1 \\ l_2\beta_2 & -G \end{bmatrix}$ (26)

This system (Fig 4A) is contracting if the conditions shown in Eqs 23–25 for the parameters hold. The approach used to derive the bounds is equivalent to the one described above: first, define a metric Θ = Q^{-1}, where Q is based on the eigendecomposition of A with l_1 = l_2 = 1. Then, define the valid parameter ranges based on Eq 10. The same permissible parameters result (see Eqs 13–15).

Combining two such maps, by feeding excitatory input to both inhibitory neurons from both excitatory neurons, leads to a WTA with two excitatory units (Fig 4B). This map is equivalent to the map shown previously, except that it contains two inhibitory neurons. These are, however, functionally equivalent (their activity and derivatives are the same at all points of time). Thus the behavior of both systems will be equivalent. The Jacobian of the combined system is:

$\tau J = \begin{bmatrix} A_1 & G_1 \\ G_2 & A_2 \end{bmatrix}$ (27)

where

$\tau G = \begin{bmatrix} 0 & 0 \\ l_2\beta_2 & 0 \end{bmatrix}$ (28)

after adjusting the dummy terms l_i appropriately (l_1 and l_2 for A_1, and l_3 and l_4 for A_2, respectively; similarly for the cross-coupling blocks G_1 and G_2). Note that combining the two systems in this way adds only two (strictly positive) terms to the equations describing the dynamics of the inhibitory neurons. Thus, inhibition in this new system can only be larger than in the smaller system. Thus, if the smaller system is contracting (as shown above), the combined system must also be contracting (shown in the next section).

Defining a metric Θ based on the eigendecomposition of J for either possible winner and then solving

$\left(\Theta J \Theta^{-1}\right)_H < 0$ (29)

results in the same constraints for the system to be contracting (see Eqs 13–15).

This result can be generalized so that it is valid for adding one unit to a map that is already contracting. This can be seen directly by considering the eigenvalues of the Hermitian part of Θ J Θ^{-1}, defined either for a system with N units or N + 1 units. A system with N units has Jacobian A and is contracting as shown previously. The condition for it to be stable requires (for the real part only)

$\frac{1}{2}\left(-2 + \alpha \pm \sqrt{\alpha^2 - 4\beta_1\beta_2}\right) < 0$ (30)

A system with N + 1 units has Jacobian J (Eq 27) and is stable if (29) holds. This requires

$\frac{1}{2}\left(-2 + \alpha \pm \sqrt{\alpha^2 - 4\beta_1\beta_2}\right) < 0$ (31)

Comparing Eqs 30 and 31 reveals that adding a unit to a system of N units does not change the conditions for contraction to a single winner. Thus, if the recurrent map consisting of N excitatory units is contracting, the system of N + 1 units is also contracting. By recursion, this proof can be applied to maps of arbitrary size.
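This recursion can be spot-checked numerically. The sketch below is our own construction: the 4-unit Jacobian is the block form of Eq 27 (unit order exc1, inh1, exc2, inh2, with each inhibitory unit reading both excitatory units) in the winner configuration where the first excitatory unit is inactive. Both systems share the same contraction margin:

```python
import numpy as np

def max_FH_eig(J):
    """Largest eigenvalue of the Hermitian part of Theta J Theta^{-1},
    with Theta taken from the eigendecomposition of J."""
    _, Q = np.linalg.eig(J)
    F = np.linalg.inv(Q) @ J @ Q
    return np.linalg.eigvalsh(0.5 * (F + F.conj().T)).max()

alpha, beta1, beta2 = 1.5, 3.0, 0.3

# N = 2: one excitatory + one inhibitory unit (Eq 26, l1 = l2 = 1, G = 1)
A = np.array([[alpha - 1.0, -beta1],
              [beta2, -1.0]])

# Two maps combined (Eq 27), winner configuration: first excitatory unit off
J4 = np.array([[-1.0, 0.0, 0.0, 0.0],
               [beta2, -1.0, beta2, 0.0],
               [0.0, 0.0, alpha - 1.0, -beta1],
               [beta2, 0.0, beta2, -1.0]])

mA, mJ4 = max_FH_eig(A), max_FH_eig(J4)
print(mA, mJ4)   # identical margins (-0.25): adding units preserves contraction
```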

What if multiple units on the map are active? The above conditions show that a single winner is contracting on an arbitrarily sized map. In a hard-WTA configuration, the system should always emerge with a single winner. We have previously shown that our system has this property when α > 1 (see Eq 18). Here, we extend this argument to maps of arbitrary size. Note that only the l_i of the excitatory units can be switched inactive. The l_i of all inhibitory neurons (since they all represent the same signal) are always 1.

Here, we start with a system that has two excitatory units (since a system with one excitatory unit has no competition). The goal is to find conditions that enforce a single winner among these units. For this system (Eq 27 with all l_i = 1), enforcing V J V^T > 0 (see Eq 18) with

 V = ⎡ α   −β1   0    −β1 ⎤
     ⎣ 0   −β1   α    −β1 ⎦ (32)

gives conditions for this configuration (both units on) to be exponentially unstable (thus converging to another subset of active units). Similar to (19), the system diverges from invariant subspaces where the projection is constant. For the projection (32), this constant defines the equilibrium. If condition (18) is satisfied, the network is guaranteed to diverge exponentially away from this equilibrium.

The eigenvalues of the Hermitian part of this system (same as for Eq 18) are uniformly positive if the following two conditions hold

 −α² + α³ > 0,  −α² + α³ − β1² + 2αβ1² − 4αβ1β2 > 0 (33)

Note that any solution requires α > 1 (solutions are shown in (20)–(22)). This condition thus shows that any two simultaneously active units cannot be contracting if α > 1.

For the 3 unit system, applying (27) recursively results in the Jacobian

 (34)

Applying an appropriate V constructed in analogy to (32) shows that the same holds for this system if

 −α² + α³ > 0,  −α² + α³ − 6β1² + 3αβ1² − 9αβ1β2 > 0 (35)

Note that a sufficient solution continues to require α > 1. We have thus shown that, for α > 1, a system with N = 2 as well as N = 3 units can only have one active unit. By recursion, the same argument can be used to show that no such system can have a subset of more than one unit active. Any such system thus always converges to a single winner, provided the parameters are within the ranges shown in Eqs 20, 21, 22.

For purposes of this proof, we used additional inhibitory units (one for each excitatory unit). Note that this arrangement is for mathematical convenience only: in an implemented system these units can be collapsed to a single unit (or to several, to implement local inhibition). Collapsing all inhibitory units into one does not change the dynamics of the system, because all inhibitory units have the same activity (and derivatives) at all times.

#### 2.5.1 Example

This example shows how to apply the approach outlined above to calculate the permissible range of parameters for a toy recurrent map consisting of one excitatory and one inhibitory unit (Fig 4A), whose Jacobian is given by Eq 26. Our intention is to illustrate in detail the procedural steps involved in the calculation.

First, construct Q from the eigenvectors of the Jacobian J, and then set

 Θ = Q−1 = ⎡ −β2/√(α²−4β1β2)    ½(1 + α/√(α²−4β1β2)) ⎤
           ⎣  β2/√(α²−4β1β2)    ½(1 − α/√(α²−4β1β2)) ⎦ (36)

Then, transforming J using Θ results in the generalized Jacobian

 τΘJΘ−1 = ⎡ ½(−2+α+√(α²−4β1β2))    0 ⎤
          ⎣ 0    ½(−2+α−√(α²−4β1β2)) ⎦ (37)

Due to the choice of the metric Θ, only terms on the diagonal remain. The network is contracting if the Hermitian part of (37) is negative definite. A sufficient condition for this is that both diagonal terms have negative real part. Solving this system of inequalities results in the conditions shown in (13, 14, 15).
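The diagonalization step can be reproduced mechanically. The sketch below (parameter values are illustrative assumptions, τ = G = 1) builds Q from the eigenvectors of J, sets Θ = Q−1, and verifies that ΘJΘ−1 is diagonal with the eigenvalues of J on the diagonal:

```python
import numpy as np

# Toy Jacobian of one excitatory/inhibitory pair (tau = G = 1, l = 1).
# Parameter values are illustrative assumptions.
alpha, beta1, beta2 = 1.2, 2.0, 0.3
J = np.array([[alpha - 1.0, -beta1],
              [beta2,       -1.0]])

eigvals, Q = np.linalg.eig(J)   # columns of Q are eigenvectors of J
Theta = np.linalg.inv(Q)
J_gen = Theta @ J @ Q           # generalized Jacobian Theta J Theta^-1

# Only diagonal terms remain, and they are the eigenvalues of J.
assert np.allclose(J_gen - np.diag(np.diag(J_gen)), 0.0, atol=1e-9)
assert np.allclose(np.diag(J_gen), eigvals)
# Contraction for these parameters: negative real parts.
assert np.all(eigvals.real < 0)
```

This is the numerical counterpart of Eqs 36 and 37: the metric constructed from the eigenvectors removes the off-diagonal terms, so negative definiteness reduces to sign conditions on the diagonal.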

#### 2.5.2 Comparison with numerical simulations

Do the analytical bounds derived above match the behavior of the system when it is simulated? We simulated a WTA network as described (with 2 excitatory units) and systematically tested different combinations of the parameters, varying two of them at a time while keeping the others fixed. For each simulation we determined whether all units in the network reached steady state after a sufficient amount of time. Such networks were classified as stable or unstable, respectively (Fig 5). While the analytically derived solution is slightly more conservative than necessary, it closely matches the results of the simulations (see Fig 5 and legend for details). The crucial parameter is the excitatory strength relative to the inhibitory strength. This can be seen from the general increase of the permissible excitatory weight as a function of the inhibitory weight (Fig 5). Note, however, that our analytical solution imposes an upper bound that the numerical simulations suggest is not strictly necessary. Nevertheless, strong weights lead the system to oscillate, and keeping the parameters within the analytically derived range prevents this problem.
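A minimal version of such a simulation can be sketched as follows. The network equations follow the form used throughout (τẋ + Gx = f(·) with threshold-linear f); the specific parameter values, inputs, and integration settings are illustrative assumptions, chosen to lie inside the analytically stable, high-gain regime:

```python
import numpy as np

# WTA with two excitatory units (x1, x2) and one shared inhibitory unit (xN),
# f(u) = max(u, 0), forward-Euler integration, tau = G = 1, thresholds = 0.
# Parameters are assumptions satisfying alpha > 1 and alpha < 2*sqrt(beta1*beta2).
f = lambda u: np.maximum(u, 0.0)
alpha, beta1, beta2 = 1.2, 3.0, 0.3
I1, I2 = 1.0, 0.5              # unit 1 receives the stronger input
dt, steps = 0.01, 5000

x1 = x2 = xN = 0.0
for _ in range(steps):
    dx1 = -x1 + f(I1 + alpha * x1 - beta1 * xN)
    dx2 = -x2 + f(I2 + alpha * x2 - beta1 * xN)
    dxN = -xN + f(beta2 * (x1 + x2))
    x1, x2, xN = x1 + dt * dx1, x2 + dt * dx2, xN + dt * dxN

# Winner fixed point (loser clamped at 0): x1 = I1 / (1 - alpha + beta1*beta2).
assert abs(x1 - I1 / (1 - alpha + beta1 * beta2)) < 1e-3
assert x2 < 1e-6            # loser fully suppressed (hard WTA)
assert abs(dx1) < 1e-6      # derivatives vanished -> classified as stable
```

Sweeping the weights over a grid and repeating this convergence test reproduces the kind of stable/unstable classification shown in Fig 5.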

### 2.6 Stability of single WTA - bump of activity

The previous analysis considered WTA networks with self-excitation only (Fig 2A). In this configuration, the winner of the competition is represented by a single active unit. However, cooperation between neighboring units can also be introduced, by adding excitatory connections α2 between neighbors. The winner is then represented by a more distributed "hill of activity". Our analysis can also be extended to this case.

For the simplest case of 2 units, this network has the Jacobian

 τJ = ⎡ l1α1−G   l1α2     −l1β1 ⎤
      ⎢ l2α2     l2α1−G   −l2β1 ⎥
      ⎣ l3β2     l3β2     −G    ⎦ (38)

Using the approach outlined previously, this system is stable if its generalized Jacobian is negative definite. After applying the coordinate transform, examining the eigenvalues of the Hermitian part of this system reveals that

 ½(−2 + α1 + α2 ± √((α1+α2)² − 8β1β2)) < 0 (39)

is a required condition (among others, not shown). Comparing this condition to the eigenvalues of the system without cooperation (see (31)) reveals that α was replaced by α1+α2 (plus some other minor modifications). This result confirms the intuition that the crucial factor is the total excitatory input to any one unit. A sufficient condition for this system to be contracting is (compare to Eq 13)

 0 < α1 + α2 < √(8β1β2) (40)

This condition applies as long as the accompanying conditions on β1 and β2 hold. Together, these conditions similarly permit a fairly wide range of parameters. Note the critical trade-off between the inhibitory gain and the excitatory gain that is expressed in this condition.
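Condition (39) is straightforward to evaluate numerically. The short sketch below (parameter values are illustrative assumptions) checks that, for a parameter set satisfying Eq 40, the real parts stay negative:

```python
import cmath

# Evaluate the eigenvalue condition of Eq 39 for the network with
# nearest-neighbor cooperation alpha2. Values are illustrative assumptions;
# alpha1 + alpha2 plays the role that alpha played without cooperation.
alpha1, alpha2, beta1, beta2 = 1.1, 0.3, 2.0, 0.3

a = alpha1 + alpha2                      # total excitatory input to one unit
disc = cmath.sqrt(a**2 - 8 * beta1 * beta2)
lams = [0.5 * (-2 + a + disc), 0.5 * (-2 + a - disc)]

# Eq 40 holds for these parameters ...
assert 0 < a < (8 * beta1 * beta2) ** 0.5
# ... and both real parts are negative, i.e. Eq 39 is satisfied.
assert all(l.real < 0 for l in lams)
```

Increasing α1 + α2 beyond the permissible range pushes one of these real parts through zero, which is the numerical signature of the excitation/inhibition trade-off noted above.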

### 2.7 Stability of two bidirectionally coupled WTAs

Next we consider how two WTAs can be coupled stably (by connections as shown above). The key idea is first to give sufficient conditions for stable synchronization of the two WTAs. Note that by synchronization we mean here that two variables have the same value (in contrast to other meanings of synchronization, e.g. in population coding). This allows the dimensionality of the stability analysis to be reduced. Indeed, synchronization implies that the overall system stability can then be analyzed simply by considering the stability of the individual target dynamics, i.e., of any one of the subsystems where the external coupling variables have been replaced by the corresponding (endogenous) variables in the subsystem. For instance, in the target dynamics of the first subsystem, equation (3) is replaced by

 τẋ2 + Gx2 = f(I2 + αx2 + γx2 − β1xN − T) (41)

Next, we shall see that in fact, given the form of coupling we assume, stable synchronization of the subsystems comes “for free”. That is, it is automatically satisfied as long as sufficient conditions for the stability of the individual target dynamics are satisfied.

Following Pham & Slotine (2007), synchronization occurs stably if the following holds:

 VJVT<0 (42)

where

 V=[IN  −IN] (43)

and J is the Jacobian of the entire system. Here, we define synchrony as equal activity of corresponding units on both maps. This condition is embedded in V as shown. Note that the system need not start out synchronized; rather, the condition embedded in V guarantees that the system will converge towards this solution. Other conditions of synchrony (such as only some neurons synchronizing) can similarly be specified by modifying V accordingly. V specifies a metric which is orthogonal to the linear subspace in which the system synchronizes (i.e. a flow-invariant subspace; see Theorem 3 in Pham & Slotine (2007)).

The Jacobian J of the full system is composed of the two sub-Jacobians J1 and J2 (each of the form shown in Eq 11), which describe a single WTA each, and of the Jacobians of the couplings.

Introducing the coupling term

 C = ⎡ γ 0 0 ⎤
     ⎢ 0 γ 0 ⎥
     ⎣ 0 0 0 ⎦ (44)

results in the Jacobian of the full system:

 J = ⎡ J1  C  ⎤
     ⎣ C   J2 ⎦ (45)

which can be written, using again the dummy terms lj, as

 τJ = ⎡ l1α−G   0       −l1β1   l1γ     0       0     ⎤
      ⎢ 0       l2α−G   −l2β1   0       l2γ     0     ⎥
      ⎢ l3β2    l3β2    −G      0       0       0     ⎥
      ⎢ l4γ     0       0       l4α−G   0       −l4β1 ⎥
      ⎢ 0       l5γ     0       0       l5α−G   −l5β1 ⎥
      ⎣ 0       0       0       l6β2    l6β2    −G    ⎦ (46)

The above expression yields:

 τVJVT = ⎡ (l1+l4)(α−γ)−2G   0                 −β1(l1+l4) ⎤
         ⎢ 0                 (l2+l5)(α−γ)−2G   −β1(l2+l5) ⎥
         ⎣ β2(l3+l6)         β2(l3+l6)         −2G        ⎦ (47)
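The block computation behind Eq 47 can be verified numerically: with V = [I −I], one has VJVᵀ = J1 + J2 − C12 − C21. The sketch below assembles the 6×6 Jacobian of Eq 46 (weights and the lj values are illustrative assumptions; random nonnegative lj stand in for the derivative terms):

```python
import numpy as np

# Assemble Eq 46 from Eq-11-style sub-Jacobians and coupling blocks,
# then check VJV^T against Eq 47. All numeric values are assumptions.
rng = np.random.default_rng(0)
alpha, beta1, beta2, gamma, G = 1.2, 2.0, 0.3, 0.2, 1.0
l = rng.uniform(0.0, 1.0, 6)   # l1..l6: derivative (dummy) terms

def wta_jacobian(la, lb, lc):
    return np.array([[la * alpha - G, 0.0,            -la * beta1],
                     [0.0,            lb * alpha - G, -lb * beta1],
                     [lc * beta2,     lc * beta2,     -G]])

J1 = wta_jacobian(l[0], l[1], l[2])
J2 = wta_jacobian(l[3], l[4], l[5])
C12 = np.diag([l[0] * gamma, l[1] * gamma, 0.0])  # map 2 -> map 1 coupling
C21 = np.diag([l[3] * gamma, l[4] * gamma, 0.0])  # map 1 -> map 2 coupling

J = np.block([[J1, C12], [C21, J2]])
V = np.hstack([np.eye(3), -np.eye(3)])

# V J V^T = J1 + J2 - C12 - C21, reproducing the structure of Eq 47.
assert np.allclose(V @ J @ V.T, J1 + J2 - C12 - C21)
assert np.isclose((V @ J @ V.T)[0, 0], (l[0] + l[3]) * (alpha - gamma) - 2 * G)
```

The first diagonal entry evaluates to (l1+l4)(α−γ)−2G, exactly as in Eq 47, making the sign flip of γ relative to Eq 48 explicit.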


Consider now the Jacobian of e.g. subsystem-1 once synchronized, i.e., with the coupling terms from subsystem-2 variables replaced by the same terms using subsystem-1 variables (this is what we called earlier the target subsystem-1). Given equations (11) and (41), this Jacobian can be written

 τJ1sync = ⎡ l1(α+γ)−G   0           −l1β1 ⎤
           ⎢ 0           l2(α+γ)−G   −l2β1 ⎥
           ⎣ l3β2        l3β2        −G    ⎦ (48)

Comparing (47) and (48), we see that sufficient conditions for J1sync (and similarly J2sync) to be negative definite automatically imply that VJVT is negative definite. Indeed, since α−γ < α+γ,

 ∀ lj , J1sync < 0  ⟹  ∀ lj , VJVT < 0 (49)

In other words, the basic requirement that the individual target dynamics are stable (as shown in the previous section) automatically implies stability of the synchronization mechanism itself.

Note the opposite signs of γ in Eqs (48) and (47). Intuitively, these express a key trade-off. The stronger γ is, the easier and stronger the synchrony of the memory state (Eq (47)). However, a stronger connection also makes the system less stable. This is expressed by the positive γ in Eq (48), which imposes stricter constraints on the permissible values of the other weights for J1sync to remain negative definite.

Synchronization of the two maps in this way allows reduction of the two coupled systems to a single virtual system with the additional parameter γ for the coupling strength (Eqs 47, 48). Stability of this hybrid system guarantees stability of the synchronization mechanism itself (Eq 49). The upper bound for γ is thus (based on Eq 13)

 γ < 2√(β1β2) − α (50)

As long as this condition is met, the dynamics of each map are contracting and their synchronization is stable. The lower bound on γ is determined by the minimal activity necessary to begin "charging" the second map (which gets no external input in our configuration). The minimal input that a unit on the second map receives from the first map needs to be larger than its activation threshold T, i.e. γ times the steady-state amplitude during the application of input (which is a function of the gain g) must exceed T. Thus,

 T / (g Imax(t)) < γ (51)
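The upper bound of Eq 50 can be probed numerically by evaluating the pair-eigenvalue formula with α replaced by the synchronized effective excitation α+γ (cf. Eq 48). In the sketch below the parameter values are illustrative assumptions; note that, as with the single-WTA bounds, the condition is sufficient rather than tight:

```python
import cmath

# Probe gamma < 2*sqrt(beta1*beta2) - alpha (Eq 50) by substituting
# alpha -> alpha + gamma in the closed-form eigenvalues of one
# excitatory/inhibitory pair. Parameter values are assumptions.
alpha, beta1, beta2 = 1.2, 2.0, 0.3
bound = 2 * (beta1 * beta2) ** 0.5 - alpha   # permissible gamma range (Eq 50)

def pair_eigs(a_eff):
    d = cmath.sqrt(a_eff**2 - 4 * beta1 * beta2)
    return [0.5 * (-2 + a_eff + d), 0.5 * (-2 + a_eff - d)]

# gamma = 0.2 lies below the bound: both real parts stay negative.
assert 0.2 < bound
assert all(l.real < 0 for l in pair_eigs(alpha + 0.2))
# gamma = 0.5 lies above the bound: an eigenvalue crosses into the
# right half plane, i.e. the synchronized dynamics lose contraction.
assert any(l.real > 0 for l in pair_eigs(alpha + 0.5))
```

This makes the trade-off of Eqs 47/48 concrete: increasing γ strengthens synchrony but shrinks the stability margin of the synchronized target dynamics.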

### 2.8 Stability of unidirectionally coupled WTAs

Next, we extend our analysis to networks consisting of 3 WTAs of the kind shown in Fig 2D and described in section 2.2. The first two WTAs are bidirectionally coupled to express the current state and are equivalent to the network considered in the previous sections. A further WTA is added that contains units referred to as transition neurons (TNs). In this example, there are two TNs (Fig 2D). Activation of the first leads the network to transition from state 1 to state 2, if the network is currently in state 1. Activation of the second leaves the network in state 2, if the network is currently in this state; otherwise, no activity is triggered. The first TN is thus an example of a transition from one state to another; the second is an example of a transition that starts and ends in the same state (a loop). This loop is intentionally introduced here, because it poses a limit to stability. TNs receive and project their input with weight ϕ.

The Jacobian of the full system consists of 9 variables:

 τJ = ⎡ J1  C   P2 ⎤
      ⎢ C   J2  0  ⎥
      ⎣ 0   P1  J3 ⎦ (52)

Since there are two memory states,

 C = ⎡ γ 0 0 ⎤
     ⎢ 0 γ 0 ⎥
     ⎣ 0 0 0 ⎦ (53)

describes the input and the output of the TNs. Here,

 P1 = ⎡ ϕ 0 0 ⎤
      ⎢ 0 ϕ 0 ⎥
      ⎣ 0 0 0 ⎦ (54)
 P2 = ⎡ 0 0 0 ⎤
      ⎢ ϕ ϕ 0 ⎥
      ⎣ 0 0 0 ⎦ (55)

#### 2.8.1 Case 1: loop

For purposes of worst-case analysis, assume that the TN (which receives from and projects to state 2) is permanently active, i.e. its lj term is set to 1. In this case, we require that the network remain synchronized in state 2.

The state is stable with the loop TN activated if the synchrony between the two state WTAs is not disrupted. This is the case if

 τVJVT = ⎡ (l1+l4)(α−γ)−2   0                −β1(l1+l4) ⎤
         ⎢ 0                (l2+l5)(α−γ)−2   −β1(l2+l5) ⎥
         ⎣ β2(l3+l6)        β2(l3+l6)        −2         ⎦ (56)

with V defined as before and J the Jacobian of the entire system (9 variables).

Note the similarity to equation (47). None of the nonlinearity terms of the third WTA, nor ϕ, appear in this equation. Thus, the synchrony of the states is not influenced by the presence of a consistently active loop TN. However, this combined system also needs to be contracting to be stable (i.e. to reach steady state). We thus next derive the limits on ϕ for this to be the case.

Using the insight gained in section 2.7, for purposes of stability analysis we replace the two synchronized state WTAs by a single WTA with excitatory weight α+γ. Note that the principle of showing synchronization first introduces a hierarchy (or series) of dynamic systems: the overall result converges if each step (the synchronization and the simplified system) does, with an overall convergence rate equal to the slower of the two. In our case the synchronization step is always the fastest, so the overall convergence rate is that of the reduced system.

Next, we analyze the stability of the reduced system (consisting of the reduced state WTA and the TN WTA). Here, only the loop TN is used; the transition TN is not connected. The corresponding Jacobian is:

 JTN = ⎡ l1(α+γ)−1   0            −l1β1   0       0       0     ⎤
       ⎢ 0           l2(α+γ)−1   −l2β1   0       l2ϕ     0     ⎥
       ⎢ l3β2        l3β2        −1      0       0       0     ⎥
       ⎢ 0           0           0       l7α−1   0       −l7β1 ⎥
       ⎢ 0           l8ϕ         0       0       l8α−1   −l8β1 ⎥
       ⎣ 0           0           0       l9β2    l9β2    −1    ⎦ (57)

Having JTN be negative definite in the metric Θ,

 ∀ lj , ΘJTNΘ−1< 0 (58)

guarantees that the coupled system is stable. Following Wang & Slotine (2005) and Slotine (2003) (section 3.4), if the uncoupled systems are stable with contraction rates λx and λz, then the coupled system is stable if

 ϕ² < λxλz (59)

and its contraction rate is

 λx,z = (λx+λz)/2 − √( ((λx−λz)/2)² + ϕ² ) (60)

Note that λx,z > 0 is equivalent to condition (59). One then has λx,z ≤ min(λx, λz). Note that if the connection weights are not symmetric, ϕ² in the expressions above can be replaced by the product of the two coupling weights.

The contraction rate for a single WTA is equal to the absolute value of the real part of the largest eigenvalue of its generalized Jacobian (Wang & Slotine (2005)). Following (10), the contraction rate λz of the uncoupled TN WTA can be computed accordingly. Similarly, for the symmetrically coupled system with coupling weight γ, the contraction rate is λx. These two rates establish the upper bound on the permissible weight ϕ. Since λx < λz, a good approximation is

 ϕ<λx (61)
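The combination rule of Eqs 59 and 60 can be checked on the simplest possible instance: two scalar contracting systems coupled symmetrically with weight ϕ. The rate values below are illustrative assumptions:

```python
import numpy as np

# Two scalar subsystems with contraction rates lam_x and lam_z, coupled
# symmetrically with weight phi. Eq 60 gives the contraction rate of the
# combined system; we compare it against the least-negative eigenvalue of
# the symmetric matrix [[-lam_x, phi], [phi, -lam_z]]. Values are assumptions.
lam_x, lam_z, phi = 0.4, 1.0, 0.5

A = np.array([[-lam_x, phi],
              [phi,    -lam_z]])
slowest = np.max(np.linalg.eigvalsh(A))   # least-negative eigenvalue

rate = (lam_x + lam_z) / 2 - np.sqrt(((lam_x - lam_z) / 2) ** 2 + phi ** 2)

assert np.isclose(slowest, -rate)     # Eq 60 matches the spectrum
assert phi ** 2 < lam_x * lam_z       # Eq 59 holds for these values ...
assert rate > 0                       # ... so the coupled system contracts
```

Raising ϕ until ϕ² = λxλz drives the rate to zero, which is precisely the boundary expressed by condition (59) and motivates the approximation ϕ < λx of Eq 61.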

#### 2.8.2 Case 2: transition

Here, the transition from one pattern of synchrony to another (two states) is investigated. For this purpose, both states exist and the transition TN is connected. Activating the transition TN leads to a transition from state 1 to state 2. In the following, we assume that the network is in state 1 when initialized and that the transition TN is active. We then show that the network will end up in the second synchrony pattern, representing state 2.

Defining V as above and using the appropriate Jacobian of the full system yields

 τVJVT = ⎡ (l1+l4)(α−γ)−2   0                −β1(l1+l4) ⎤
         ⎢ 0                (l2+l5)(α−γ)−2   −β1(l2+l5) ⎥
         ⎣ β2(l3+l6)        β2(l3+l6)        −2         ⎦