## I Introduction

The brain makes use of a variety of energy-efficient and robust encoding strategies to process information. Examples vary from population coding, where information is encoded as statistical properties of neural populations, to precise spike timings, where information capacity is only bounded by noise bohte2004evidence; thorpe1990spike. Moreover, studies of highly active neural systems, e.g. the vertebrate’s olfactory bulb or the fly’s antennal lobe laurent1994odorant, suggest that information may also be encoded as patterns of activity tracing complex trajectories in high-dimensional state spaces. Together, such findings have inspired the conceptualization of a number of novel dynamical mechanisms for information processing, yielding insights on central interdisciplinary issues such as pattern generation Steingrube2010pattern; ijspeert2005simulation, pre-processing to facilitate computations maass2002real; jaeger2004harnessing; Rabinovich2008processing; lukovsevivcius2009reservoir; Larger2012photonic, noise-enhanced information storage kirst2016dynamic and computing and signal encoding Ashwin2005; Mazor2005representation; Wordsworth2008; Neves2012; Neves2020Reconfigurable.

A particularly transparent example of dynamics supporting complex trajectories in high-dimensional state spaces is given by heteroclinic networks, which naturally emerge in systems of spiking Ernst1998delay; Timme2002prevalence; TWG2003 and non-spiking oscillators Krupa1997; Wordsworth2008, as well as non-oscillatory systems bick2010occurrence in artificial and, potentially, in natural systems. A heteroclinic network is a network composed of saddle states in which each connection is a heteroclinic orbit, i.e. an orbit connecting part of the unstable manifold of a saddle with the stable manifold of a second saddle. (Footnote 1: Referring to these saddles as “states” stems from a parallel with finite state machines. In fact, when describing a perturbation-driven sequence of saddle-to-saddle switching in heteroclinic networks, it is rather natural to regard saddles in the network as states, and heteroclinic connections between saddles as transitions, with the resulting sequence of visited states being a function of a starting state and applied input. For this reason, we here refer to saddles in a heteroclinic network interchangeably as “saddle states” or, simply, “states”.)
Here, we study a particular class of symmetrical heteroclinic networks, in which each saddle is locally surrounded by basins of attraction of other saddles and these basins combined locally have full measure. Due to the pulse-coupled nature of the class of systems studied, almost all trajectories perturbed away from a given saddle periodic orbit would converge to one other saddle in finite time. All basin volume of each attractor periodic orbit is not in that local volume, but remotely located, close to other saddles in the network Ernst1995Synchronization; Ernst1998delay; Timme2002prevalence; TWG2003; AT2005_2; Neves2009. Through sufficiently small perturbations applied to the system at one saddle state, state trajectories thus end up in other symmetry-related saddles. In this sense, the network of saddle states constitutes a "clean" heteroclinic network as termed by Field Field2016Patterns; Ashwin2020Almost; Podvigina2020Asymptotic.

It has been shown that heteroclinic networks can support information encoding Ashwin2005; Wordsworth2008; Neves2012 via switching dynamics Krupa1997; TWG2003; AT2005_2; KGT2009; Neves2009 among their constituent states. Furthermore, through the Heteroclinic Computing paradigm Neves2012, such networks have been shown to be capable of implementing logic gates and operators, and thus support arbitrary n-ary computations. In this paradigm, switchings are driven by external signals serving as inputs, forcing the dynamics towards specific unstable directions AT2005; Neves2009 (see Figure 1), thus generating a complex trajectory approaching saddles sequentially. At each switching event (i.e. near each saddle) the dominant signal components on the unstable manifolds are computed. Moreover, if an external signal persists for long enough, a cyclic sequence of states is established Wordsworth2008; Neves2012, resulting in the computation of a partial rank order of the signal’s components, also known as $k$-winners-take-all, where the $k$ strongest components out of a total of $N$ are identified.

Alongside computation, information transmission is a fundamental function of neural systems, which is performed in an intrinsically noisy environment via complex patterns of activity. Here, to investigate the information transmission properties of bio-inspired dynamical systems exhibiting complex trajectories in state-space, we study heteroclinic networks as input-output noisy communication channels. Previous work has shown that 1) noise adds a stochastic component to an otherwise deterministic switching process and, thereby, modifies transition probabilities between saddles in a network-of-states Stone1999Noise; Armbruster2003Noisy; Bakhtin2010Small; Ashwin2010Designing; bakhtin2011noisy; ashwin2016quantifying; Voit2019Dynamical, 2) perturbations at saddle states grow exponentially with time Neves2009, and 3) noise also accelerates the saddle-to-saddle switching, defining an upper bound for the switching time that depends on the noise level itself neves2017noise. To capture all these aspects in one measure, we characterize the heteroclinic channel by computing the mutual information rate (MIR) between external input signals and the resulting sequences of states, for varying noise levels.

Typically, noise tends to lower the performance of communication channels, as it introduces errors in the transmission of information, thus reducing confidence in the reconstruction of the original signal. In contrast, we here report that intermediate levels of noise maximize the information transmission capacity of the system, a stochastic facilitation effect. The mechanism underlying this effect relies on two factors: first, noise can reduce the time spent in the vicinity of each saddle (see neves2017noise, as well as Supplementary Material); and second, it can promote an increased yet controlled exploration of the underlying network of states. As a consequence, the MIR between input signals and the sequences of visited states depends non-monotonically on the noise level. The MIR increases for intermediate levels of noise, before monotonically decreasing, until the resulting switching direction at each saddle becomes virtually random.

We argue that our results are general to systems exhibiting heteroclinic networks, because they arise simply from the stochastic nature of the dynamics close to the saddles, for which more than one (typically many) exit options are available in their unstable manifold. These results suggest a positive role of noise for a range of natural and artificial information processing systems relying on complex state-space trajectories, as those may also rely on unstable states or similar structures, and may thus be of broad interdisciplinary interest.

## II Networks of Oscillators and Heteroclinic Dynamics

Heteroclinic networks naturally emerge in a variety of symmetrical systems composed of oscillators AT2005_2; Wordsworth2008; KGT2009; Neves2009, both phase- and pulse-coupled, providing a model-independent framework for encoding and computing. We here consider networks of pulse-coupled leaky integrate-and-fire neurons. This simple model already captures the fundamental aspects required for the emergence of heteroclinic networks, i.e. symmetrical and excitatory delayed couplings. Furthermore, this model provides a closed-form solution for the system’s time evolution between pulse events. This allows for efficient event-based simulations, when compared to more expensive time-based numerical integration.

Between pulse events, the dynamics of each node $i$ is defined by a voltage-like variable $V_i(t)$, satisfying the following differential equation and reset condition:

$$\frac{dV_i}{dt} = I - \gamma V_i + S_i(t) + \xi_i + \eta_i(t), \tag{1a}$$

$$V_i\left(t_{i,k}^{+}\right) = 0, \tag{1b}$$

where $\gamma$ is a dissipation parameter and $I$ a base driving current; the network coupling is given by $S_i(t) = \sum_{j}\sum_{k} \varepsilon\, \delta(t - t_{j,k} - \tau)$, a sum of incoming pulses at time $t$, where $\varepsilon$ is the connection strength and $\tau$ the connection delay; $\xi_i$ represents an external input source such that ; and $\eta_i(t)$ represents a Gaussian noise source. The second equation defines the reset condition that depends on the firing times $t_{i,k}$ of pulse events, which are themselves defined in terms of a threshold criterion $V_i(t_{i,k}) \geq V_\theta$. Herein, we consider and . To preserve the existence of closed-form solutions for the system’s time-evolution between events, we approximate $\eta_i(t)$ through a pair of high-frequency, low-amplitude pulse generators which, in practice, just add a large number of new events. Each oscillator in the network is connected to two independent noise sources, one producing excitatory pulses, the other one producing inhibitory ones, for a mean input current of (see Supplementary Material).
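The closed-form evolution between pulse events mentioned above can be sketched as follows. This is a minimal illustration, not the authors' simulator: it assumes a constant total drive between events, a threshold of 1, and hypothetical parameter values (`drive = 1.04`, `gamma = 1.0`) chosen only so that the threshold is reachable.

```python
import math

def evolve(v0, dt, drive, gamma):
    """Voltage after free evolution for time dt, solving dV/dt = drive - gamma * V."""
    v_inf = drive / gamma  # asymptotic voltage without threshold or pulses
    return v_inf + (v0 - v_inf) * math.exp(-gamma * dt)

def time_to_threshold(v0, drive, gamma, v_theta=1.0):
    """Time until V reaches v_theta if no pulse arrives; inf if it never does."""
    v_inf = drive / gamma
    if v0 >= v_theta:
        return 0.0
    if v_inf <= v_theta:
        return math.inf  # the drive is too weak to ever reach threshold
    return math.log((v_inf - v0) / (v_inf - v_theta)) / gamma

# Hypothetical parameter values, for illustration only.
drive, gamma = 1.04, 1.0
t_fire = time_to_threshold(0.0, drive, gamma)
v_at_fire = evolve(0.0, t_fire, drive, gamma)
```

An event-based simulation would repeatedly pick the earliest upcoming event (a threshold crossing or a delayed pulse arrival), advance all voltages with `evolve`, and apply the reset or the pulse, instead of integrating on a fixed time grid.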

Saddle states arising in such networks of oscillatory neurons are characterized by the presence of poly-synchrony Neves2009, that is, neurons synchronize in groups, here simply referred to as “clusters”. While pulse-induced simultaneous resets promote synchronization, full synchronization is prevented by the delayed connections, because the neuron(s) sending the pulse(s) typically cannot synchronize with the neurons receiving them. Some examples of poly-synchronous states were reported by Ashwin and Borresen Ashwin2004encoding with permutation symmetry , by Wordsworth and Ashwin Wordsworth2008 with permutation symmetry and by Neves and Timme Neves2012 with permutation symmetry . Here $S_n$ stands for all permutations of $n$ elements, i.e. all elements in a cluster are in an identical state. The symbol $\times$ is used to describe (compose) a state with multiple clusters. Furthermore, poly-synchronous saddle states often show instability to perturbation only over a single cluster. In this case, with the exception of one cluster, oscillators in each cluster are synchronized by incoming pulses causing their simultaneous reset, which also erases any small variation in voltage and thus the effect of any small perturbation. Neurons in the remaining cluster, here loosely called the “unstable cluster”, are not reset by pulses, but rather independently reach threshold. Small voltage differences in the unstable cluster actually keep increasing at each cycle (one pulse per neuron) because the greater the neuron’s voltage, the quicker an incoming pulse will cause it to reach its threshold, in an exponential fashion. Lastly, networks of states exhibiting persistent switching dynamics are typically formed by states with the same symmetry. That is, all states in the network are simple permutations of each other, thus exhibiting the same stability properties after permutation.

As the calculation of the Mutual Information (in the next section) requires averaging the results of many trials over all orderings of inputs and initial conditions, and both these quantities grow exponentially with the system’s size, we here choose a small but representative network of five neurons as a concrete example. For parameters , , , the system exhibits saddle orbits with a three-cluster formation and permutation symmetry $S_2 \times S_2 \times S_1$ (see Figure 2), i.e. two groups of two synchronized oscillators and one singleton. Moreover, it has been established before Neves2012; Ashwin2004encoding that these states are unstable only to perturbation to one of their clusters. Let $a$, $b$, $c$ be labels for the clusters and $b$ be the label for the unstable cluster. We denote each saddle with a symbolic vector, e.g. $(a, a, b, b, c)$, in which each component corresponds to one of the oscillators. Such a vector uniquely labels a saddle and explicitly denotes the permutation symmetry of these states, as we can simply permute the vector to express any other state. By doing so, we obtain a total of $5!/(2!\,2!\,1!) = 30$ saddle states. These states are interconnected via heteroclinic orbits following a simple transition rule Neves2009: given a general perturbation $\delta = (\delta_1, \dots, \delta_5)$ where $\delta_3 > \delta_4$, then

$$(a, a, b, b, c) \rightarrow (b, b, c, a, a), \tag{2}$$

where the arrow denotes the dynamical switch between two saddles via a heteroclinic connection (see Figure 2). In other words, the oscillator in the unstable cluster receiving the largest perturbation component becomes the new singleton, the original stable cluster loses stability and becomes the new unstable cluster, and the remaining two oscillators synchronize forming a new stable cluster (see Figure 2a for an example). By permuting this relation, we obtain the set of all heteroclinic connections forming the heteroclinic network. Notice that because all nodes have the same characteristics and the connections are symmetric, heteroclinic networks can be represented as directed graphs Ashwin2010Designing, in which the saddles are the nodes and the edges are the connections, see Figure 3.
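The transition rule described above, and the directed graph it induces, can be sketched in a few lines. The role labels used here (`'s'` for the stable pair, `'u'` for the unstable pair, `'o'` for the singleton) are our own shorthand for illustration and are not the paper's notation.

```python
from itertools import permutations

def switch(state, winner):
    """Apply the transition rule when oscillator `winner` of the unstable cluster leads."""
    if state[winner] != "u":
        raise ValueError("winner must belong to the unstable cluster")
    new_state = []
    for i, role in enumerate(state):
        if i == winner:
            new_state.append("o")  # most strongly perturbed oscillator becomes the singleton
        elif role == "s":
            new_state.append("u")  # the old stable cluster loses stability
        else:
            new_state.append("s")  # other unstable oscillator and old singleton synchronize
    return tuple(new_state)

def build_graph():
    """Map every saddle state to its two possible successor states."""
    states = sorted(set(permutations("ssuuo")))
    graph = {}
    for state in states:
        unstable = [i for i, role in enumerate(state) if role == "u"]
        graph[state] = [switch(state, w) for w in unstable]
    return graph

graph = build_graph()
```

Each node of `graph` has exactly two outgoing edges, one per oscillator of the unstable cluster, which is what makes the switching direction at each saddle a binary comparison.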

In this work, we are interested in characterizing the transmission of information through heteroclinic networks. It is thus of particular interest to understand how long-lasting signals are processed. It has been shown Neves2012 that, in the absence of noise, for every input having the same partial ordering, the sequence of approached saddle states deterministically realizes one of two possible trajectories approaching 6 specific saddles, depending on initial conditions. For example, given an input signal and an initial state , the system realizes the orbit shown in Figure 4. Permuting any two neurons between the clusters yields the other possible orbit. In the noiseless case, we ignore any potential transient towards this orbit, because once it is approached, the system is locked there for as long as the signal is present. Statistically speaking, the transient does not play any relevant role. Given the transition rule in Equation 2 and all its permutations, observing the periodic orbit in Figure 4 reveals that are all larger than . Generalizing this result for all possible orbits shows that the system computes a 3-winners-take-all function over 5 inputs, i.e. it determines the three strongest input signal components. Note that no additional information about the overall rank order is known from observing a cyclic sequence of saddle orbits.
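A small noiseless sketch illustrates how a persistent input locks the switching into a six-state cycle whose singletons are the three strongest inputs. The role labels (`'s'`, `'u'`, `'o'`), the input values and the starting state are hypothetical choices for illustration; the switching rule is the one stated in the text.

```python
def switch(state, winner):
    """Winner becomes singleton 'o'; old stable pair turns unstable; the rest turn stable."""
    return tuple(
        "o" if i == winner else ("u" if role == "s" else "s")
        for i, role in enumerate(state)
    )

def step(state, inputs):
    """Noiseless switch: the unstable oscillator with the larger input wins."""
    unstable = [i for i, role in enumerate(state) if role == "u"]
    winner = max(unstable, key=lambda i: inputs[i])
    return switch(state, winner)

def find_cycle(start, inputs):
    """Iterate the deterministic dynamics until a state repeats; return the cycle."""
    seen, path, state = {}, [], start
    while state not in seen:
        seen[state] = len(path)
        path.append(state)
        state = step(state, inputs)
    return path[seen[state]:]

inputs = (5.0, 4.0, 3.0, 2.0, 1.0)  # hypothetical input currents, strongest first
cycle = find_cycle(tuple("ssuuo"), inputs)
winners = {state.index("o") for state in cycle}  # oscillators that become singletons
```

For this input and starting state the transient has four states, after which the cycle visits six saddles and only oscillators 0, 1 and 2 ever become the singleton; the relative order among those three is not revealed by the cycle.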

The introduction of noise, which would be present in any realization of the system in the physical world, changes the input-driven switching dynamics in at least one fundamental way. Whereas the noiseless system’s dynamics is characterized by the approach of six saddle orbits, noise introduces an element of randomness, disrupting the cycle. For the rest of this work, we conceptualize our system as an input-output device, receiving an input and producing state sequences of some arbitrary length (see Figure 1c) as an output. Furthermore, because the state-to-state switching is driven by the difference between currents rather than the values themselves (see Supplementary Material for more details), in what follows we will define the input current vectors in terms of the difference between their components.

## III Quantifying Information in noisy Heteroclinic communication channels

In what follows, we show how to quantify information transmission via heteroclinic networks, studying them as noisy communication channels. Specifically, we measure the Mutual Information Rate (MIR) between input signals and the resulting outputs, subject to different noise levels. Formally, the mutual information between two random variables (r.v.) $X$ and $Y$, here respectively taking values from the set of possible input signals $\mathcal{X}$ and the set of possible output responses $\mathcal{Y}$, is defined as the difference between the marginal entropy $H(X)$ of the input r.v. and the conditional entropy $H(X|Y)$ between the input r.v. $X$ and the output r.v. $Y$, that is

$$I(X;Y) = H(X) - H(X|Y) = -\sum_{x \in \mathcal{X}} p(x) \log p(x) + \sum_{y \in \mathcal{Y}} p(y) \sum_{x \in \mathcal{X}} p(x|y) \log p(x|y), \tag{3}$$

where the marginal entropy $H(X)$ measures the uncertainty about the variable $X$, while the conditional entropy $H(X|Y)$ measures the uncertainty in $X$ given that $Y$ is known, averaged over all possible $y$’s. Here $p(x)$ is the probability of a signal $x$ being transmitted and $p(x|y)$ is the conditional probability of $x$ given that $y$ is known. Therefore, their difference measures a drop in uncertainty in $X$ by observing $Y$. For example, if $Y$ predicts $X$ with absolute certainty, $H(X|Y) = 0$ and $I(X;Y) = H(X)$. On the other hand, if $X$ and $Y$ are independent, $H(X|Y) = H(X)$ and $I(X;Y) = 0$. We define the MIR simply as $\mathrm{MIR} = r\, I(X;Y)$, where $r$ is the average number of saddle states visited per unit of time. Calculating the mutual information between the input and output of a system from Equation 3 relies upon estimating the three distributions $p(x)$, $p(y)$ and $p(x|y)$. Notice that no analytical formulation is available for $p(x|y)$ for the system studied in this paper. Thus, we approximate these distributions. To do so, we first properly define our input set $\mathcal{X}$ and output set $\mathcal{Y}$; choose our source of noise; and finally, numerically compute the probabilities.

Input Set: In this work, each input to the system (the external input source in Equation 1) is simply a vector $\xi = (\xi_1, \dots, \xi_N)$ of small and constant currents with $\xi_i$ targeting oscillator $i$.

As the computation performed by heteroclinic networks is that of a partial ordering of their input currents, and we want all orderings to be equally represented, the chosen input set must contain vectors with elements in all possible orderings in the same proportion. Therefore, we first pick a random input $\xi^{g}$ ($g$ stands for “generating”); and we then generate our input set $\mathcal{X}$, the set of all possible permutations of $\xi^{g}$. The cardinality of the resulting input set is thus $|\mathcal{X}| = 5! = 120$. Notice that the defining computation at the vicinity of each saddle is the direction of the pairwise differences between the input components, rather than their magnitude. Thus, to generate one instance of $\xi^{g}$, we randomly generate the differences between consecutive pairs of inputs. Specifically, we generate vectors of the form

$$\xi^{g} = \left(c,\; c + u_1,\; c + u_1 + u_2,\; \dots,\; c + \sum_{i=1}^{4} u_i\right), \tag{4}$$

where $c$ is some constant, and $u_1, \dots, u_4$ are independent and identically distributed random variables from a uniform distribution in the interval . Permuting each generated vector in all possible configurations provides the complete set of inputs for each instance of our simulations. Furthermore, to better characterize the system response to signals and noise, we simulate the system for a variety of randomly generated input sets.

Output Set: To define the general form of the output of heteroclinic networks, we describe a network of states as a directed graph $G = (V, E)$, in which vertices $V$ are the set of sufficiently close neighborhoods of saddle states and the edges $E$ are the heteroclinic connections between those states (see Figure 1c). Any sequence of states can be represented as a walk on this graph. A set $\mathcal{Y}$ thus has the general form of the set of all walks of some finite length $\ell$ on $G$, that is

$$\mathcal{Y} = \left\{ (v_1, \dots, v_\ell) \;:\; v_i \in V,\ (v_i, v_{i+1}) \in E \right\}, \tag{5}$$

with $y$ being an element of this set. We remark that $\ell$ is a hyperparameter that will be chosen taking into consideration arguments of convergence of measured information and the numerical computability time.
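The construction of the input set described above can be sketched as follows. The exact form of the generating vector is an assumption here (a constant offset plus cumulative sums of uniformly drawn consecutive differences), and the offset `c` and the interval bounds `low`, `high` are hypothetical values, since the ones used in the paper are not recoverable from the text.

```python
import random
from itertools import accumulate, permutations

def generating_vector(n, c, low, high, rng):
    """Assumed form: offset c plus cumulative sums of n - 1 uniform differences."""
    diffs = [rng.uniform(low, high) for _ in range(n - 1)]
    return [c + s for s in accumulate(diffs, initial=0.0)]

def input_set(gen):
    """All orderings of the generating vector, each appearing exactly once."""
    return list(permutations(gen))

rng = random.Random(0)  # fixed seed so the sketch is reproducible
gen = generating_vector(5, c=0.0, low=1e-4, high=1e-3, rng=rng)
inputs = input_set(gen)
```

Because every difference is strictly positive, the five components are distinct, so the set contains 5! = 120 different vectors and every ordering is represented once.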

Quantifying noise: The effect of noise on the MIR between input and output is proportional to its strength compared to the input amplitude. For this reason, we introduce a quantity relating the strength of input and noise. Because the dynamical response of the system depends fundamentally on the differences between signals, which are drawn from a uniform distribution, we introduce a signal-to-noise ratio (SNR) defined as follows:

$$\mathrm{SNR} = \frac{\mathbb{E}[U^2]}{\sigma^2}, \tag{6}$$

where $U$ is a uniformly distributed r.v., $\mathbb{E}[U^2]$ is the expected value of its square, and $\sigma^2$ is the variance of the noise r.v. (see Supplementary Material). Stronger noise leads to a lower SNR, and vice versa.

Numerical simulations: As discussed above, our objective is to characterize a noisy heteroclinic information channel in terms of MIR. To do so, we numerically approximate the distribution of the channel’s outputs from given inputs from the input set, as they are defined above, under varying levels of noise. Furthermore, our specific choice for the length of our outputs (here a hyperparameter) is , determining the output set , i.e. the set of all walks of length on . The value has been chosen as a trade-off between clarity of presentation of the results and increase in computation time needed to analyze the data. Results for different choices of the walk length are reported in the Supplementary Material; we observe the same qualitative results for all .
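As an illustration of the signal-to-noise ratio introduced above, the following sketch assumes that the SNR is the plain ratio between the second moment of a uniform variable on an interval and the noise variance. The interval bounds and the helper names are hypothetical.

```python
import math

def uniform_second_moment(a, b):
    """E[U^2] for U uniformly distributed on [a, b]."""
    return (a * a + a * b + b * b) / 3.0

def snr(a, b, noise_std):
    """Assumed SNR: second moment of the input differences over the noise variance."""
    return uniform_second_moment(a, b) / noise_std ** 2

def noise_std_for_snr(a, b, target_snr):
    """Noise standard deviation that yields a requested SNR."""
    return math.sqrt(uniform_second_moment(a, b) / target_snr)
```

Inverting the definition with `noise_std_for_snr` is convenient when a simulation sweeps over prescribed SNR values rather than over noise amplitudes.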

As shown in Figure 5, we generate MIR-SNR curves, allowing us to better understand the system’s properties. We start by generating input sets via , the set of all permutations of , where indicates a specific set instance. For each input set we pick only one element to serve as an input for a simulation (see Figure 5A). Each simulation is actually a set of numerical simulations using the same input, where we test different noise levels sampled logarithmically in the interval. Furthermore, we run the system starting from each of the possible states in the network, with independent noise realizations. Thus, a “simulation" actually consists of independent runs. For each level of noise and initial condition, we collect a switching sequence of length . To extract the needed -walks from each complete switching sequence , a moving window function is used. For one simulation , the total number of collected output sequences for any element of the input set and one SNR is equal to .
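The moving-window extraction of fixed-length walks from a recorded switching sequence can be sketched as below. State names and the window length are placeholders; a sequence of length L yields L - length + 1 overlapping walks.

```python
from collections import Counter

def sliding_walks(sequence, length):
    """All overlapping windows of `length` consecutive states."""
    return [tuple(sequence[i:i + length]) for i in range(len(sequence) - length + 1)]

def walk_distribution(sequences, length):
    """Empirical probability of each walk, pooled over several recorded sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(sliding_walks(seq, length))
    total = sum(counts.values())
    return {walk: count / total for walk, count in counts.items()}

seq = ["A", "B", "C", "A", "B", "C", "A"]  # placeholder state labels
walks = sliding_walks(seq, 3)
dist = walk_distribution([seq], 3)
```

Pooling the windows from all initial conditions and noise realizations at a given noise level gives the empirical output distribution for one input.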

Note that a simulation using only one element $x$ of the input set not only provides an approximation for the conditional distribution $p(y|x)$ of outputs $y$ given that input, but also an approximation for the full distribution. Due to the symmetries in the system’s network of states and oscillator connectivity, any real distribution for this system equals any other up to a simple reordering of the vectors’ elements. In this way, a single simulation run can actually provide an approximation of the full distribution. Then, by taking into consideration that we can marginalize over the inputs $x$ to obtain the output distribution $p(y)$, and that $p(x)$ is a uniform distribution, where every input is equally likely, we obtain $p(x|y)$ by Bayes’ theorem. We thereby compute the mutual information as defined in Equation 3. By multiplying this quantity by the average number of state switches per unit of time in the simulation run, we obtain the MIR of the run.

To summarize, we simulate the network for different inputs; for each, we test different noise levels; for each noise level, we start the system from all initial states. In this way we obtain curves of mutual information rate as a function of the SNR (again, see Figure 5 for an overview).
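The final computation step described above can be sketched as follows, assuming the conditional output distributions have already been estimated and that inputs are equally likely. The dictionary layout `cond[x][y] = p(y|x)` and the function names are our own choices; the logarithm base (bits here) is also an assumption.

```python
import math

def mutual_information(cond, base=2.0):
    """I(X;Y) = H(X) - H(X|Y) for uniform p(x), given cond[x][y] = p(y|x)."""
    inputs = list(cond)
    p_x = 1.0 / len(inputs)
    # Marginalize over inputs to obtain p(y).
    p_y = {}
    for x in inputs:
        for y, p in cond[x].items():
            p_y[y] = p_y.get(y, 0.0) + p_x * p
    h_x = math.log(len(inputs), base)
    # H(X|Y) = -sum_{x,y} p(x,y) log p(x|y), with p(x|y) = p(x,y) / p(y) by Bayes' theorem.
    h_x_given_y = 0.0
    for x in inputs:
        for y, p in cond[x].items():
            if p > 0.0:
                p_xy = p_x * p
                h_x_given_y -= p_xy * math.log(p_xy / p_y[y], base)
    return h_x - h_x_given_y

def mutual_information_rate(cond, switches_per_unit_time, base=2.0):
    """MIR: mutual information times the average number of switches per unit time."""
    return switches_per_unit_time * mutual_information(cond, base)
```

With two inputs that map to disjoint outputs the function returns one bit, and with two inputs that produce identical output distributions it returns zero, matching the two limiting cases discussed for Equation 3.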

## IV Results

Our computational experiments reveal that moderate levels of noise increase the MIR in the system in a predictable fashion (see Figure 6). Specifically, for intermediate levels of noise we observe an increase in MIR of up to 15% with respect to the MIR measured at the smallest noise level tested ().

This effect relies on the nature of the computation performed by heteroclinic networks: at each state, a different feature of the input is computed, i.e. which is the strongest input signal component over the unstable cluster, but only a subset of all states is ever visited in the noiseless system. Figure 7 shows how these dynamics are modified by noise, by presenting how often each state is visited and the frequency of error for each state, for different levels of noise in runs of recorded states. Here we say an error occurred during switching from state if the resulting state is different from the state predicted for a deterministic noiseless system. For low noise, a heteroclinic network only ever visits a subset of all states. In the specific case of the small leaky integrate-and-fire network we analyze, the switching dynamics are essentially confined to an orbit of six states, thus performing the same comparisons between pairs of input currents over and over again. For high noise, the switching dynamics become highly unpredictable and largely independent of the input; this is reflected by the high spread of state frequencies in Figure 7 and high error frequency for each state, approaching chance. For intermediate noise, instead, occasional errors made at some states allow the system to explore more of the network of states via short transient orbits, thus computing a greater range of input features (comparisons between pairs of input currents) and providing more information per unit of time. The noise is low enough for mostly predictable orbits to exist, but high enough for the dynamics to be varied. That is, correctly computed transient orbits leading towards correct periodic orbits arise.

Because each switch between saddles computes the largest input signal between a pair of inputs, there is a trade-off between the increased exploration due to noise, and a decreased accuracy of the computation at each switch: at intermediate noise, the system performs a more diverse range of computations on the input, although with lower accuracy at the level of the single computation. Approaching the system as a channel shifts the focus from computation to information transmission and puts the richer information content to the forefront. Furthermore, higher noise levels are associated with a faster rate of switching (see Supplementary Material), which may also positively impact the MIR. Concurrently, the switching rate is also affected by the absolute strength of the difference between input currents, i.e. greater differences are associated with a higher rate of switching, but not with a larger error rate or a larger exploration. Together, these two features account for the shifting maximum in Figure 6, where inputs with smaller mean difference between currents tend to have their maximum MIR at higher SNRs (i.e. lower noise levels). Note that the “misalignment” of the curves in panel (C) of Figure 6 is ultimately due to the formulation of SNR, whose simplicity facilitates exposition, yet does not capture the full relationship between signal and noise in the system. In fact, rescaling each simulation’s SNR by the mean difference between the three strongest currents and the two weakest, as in Panel (D) of the Figure, substantially aligns the MIR results.

To put our results into perspective, we now briefly discuss how to interpret the reported increase in MIR. The fundamental constraint on our information measure is our choice of output. By choosing sequences of saddle states, we constrain the possible knowledge about the inputs to their full rank order (when observing the output), because each sequence of two saddles encodes only the rank order between two input signals. In the noiseless case, in our example of dynamics only six comparisons are ever made and repeated cyclically, revealing a partial rank order. In particular, an increase in MI (as reported) mathematically implies an increase in the amount of knowledge that can be gathered about the input by observing the output. Given our choice of encoding, an increase in MI can only mean that more about the full rank order is known. What exactly is learned depends on details of the system, the input, and the noise level (and type) and is therefore outside the scope of this article. Our results thus simply show, in a system-agnostic way, that noise is capable of increasing the MI (knowledge about the complete rank order) and MIR in heteroclinic information channels, for this particular encoding.

Markov Chain analysis: to show that our results really hinge on a simple trade-off between local errors in state-to-state transitions and global exploration of the network of states, and thus generalize beyond the specific choice of pulse-coupled system analyzed here, we now turn our attention to the heteroclinic networks’ graphs. As previously discussed, a heteroclinic network can be described as a directed graph, and sequences of state transitions as walks on this graph. A persistent input signal induces cycles on the graph, with each transition corresponding to a comparison between input components, i.e. state-to-state switch. In the noiseless case, all switches are deterministic. When noise is present, transitions become probabilistic and the probability of "correct" switches (as prescribed in the noiseless case), decreases with increasing noise strength.

Assuming that walks on this probabilistic graph exhibit the Markov property (i.e. the probability of the next state only depends on the current one), the graph can be seen as a Discrete-Time Markov chain, where a given input and a comparison success probability vector define the transition matrix . For any such system, it is possible to derive the success probability vector (or equivalently, the error probability vector ) associated with a given noise level. Pulse-coupled systems such as the subject of our previous simulations, for example, do exhibit the Markov property, because of the memory-erasing effect of simultaneous pulse-driven resets in their stable clusters. The probability vectors for three different SNR are shown in Figure 7 (B).

For the sake of exposition, we here simplify analysis by setting a single parameter as a global comparison success probability at each state. Note that the probability parameter is tied to the noise level in the heteroclinic network the Markov chain is abstracting. Manipulating this parameter is akin to manipulating a signal-to-noise ratio (SNR) parameter in the system implementing the heteroclinic network, as we have done in our previous simulations. As the SNR decreases, the probability of a correct transition approaches chance. Similarly, in this analysis, we manipulate the parameter in the range between determinism and chance , with being the number of possible transitions at each state.

For $p < 1$, the Markov chain is ergodic, allowing for the analytical derivation of the limiting distribution $\pi$ of its states, i.e. the probability $\pi_v$ for each state $v$ to be the active state at time $t$, for $t \to \infty$. By knowing the limiting distribution $\pi$ and the transition matrix $T$, it is possible to calculate the probability of any walk $(v_1, \dots, v_\ell)$ of length $\ell$ (an $\ell$-walk) as follows:

$$P(v_1, \dots, v_\ell) = \pi_{v_1} \prod_{i=1}^{\ell-1} T_{v_i v_{i+1}}. \tag{7}$$

It is then straightforward, given any input and comparison success probability $p$, to calculate the probability distribution of output $\ell$-walks on the graph and, thus, to calculate the Mutual Information between inputs and walks.

For $p = 1$, the Markov chain is non-ergodic, thus requiring a different approach for the calculation of walk probabilities and resulting Mutual Information. In this case, given an input, all state sequences converge to one of a finite number of cycles. In the limit of $t \to \infty$, only $\ell$-walks on those cycles have probability greater than zero, because any transient walk towards any of the cycles will only ever happen once, and the corresponding probabilities will thus converge to zero as all subsequent walks are confined to one of the cycles. The initial state, however, still determines which specific cycle is approached. If we assume a uniform probability over starting states, it is possible to derive the probability of observing a cycle by simply taking the proportion of starting states eventually leading to that cycle. In turn, the probability of any given $\ell$-walk is either $0$, if the $\ell$-walk is not a walk on one of the cycles, or equal to the probability of the traced cycle divided by the number of possible $\ell$-walks on that cycle. Having derived the $\ell$-walk probability distribution for a given input, it is thus possible to calculate the Mutual Information between inputs and walks.
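For the ergodic case, the walk-probability calculation can be sketched on a toy chain. The three-state transition matrix below is a hypothetical example with one "correct" exit (probability `p`) and one "wrong" exit per state, not one of the heteroclinic graphs analyzed in the paper, and power iteration is used here as a simple stand-in for an analytical derivation of the limiting distribution.

```python
def stationary_distribution(T, iterations=5000):
    """Approximate the limiting distribution by repeatedly applying pi <- pi T."""
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
    return pi

def walk_probability(walk, pi, T):
    """Probability of a walk: pi of its first state times the transition probabilities."""
    prob = pi[walk[0]]
    for a, b in zip(walk, walk[1:]):
        prob *= T[a][b]
    return prob

def all_walks(T, length):
    """Enumerate every walk of `length` states along edges with nonzero probability."""
    n = len(T)
    walks = [(i,) for i in range(n)]
    for _ in range(length - 1):
        walks = [w + (j,) for w in walks for j in range(n) if T[w[-1]][j] > 0.0]
    return walks

p = 0.8  # hypothetical comparison success probability
T = [[0.0, p, 1 - p],
     [1 - p, 0.0, p],
     [p, 1 - p, 0.0]]
pi = stationary_distribution(T)
walk_probs = {w: walk_probability(w, pi, T) for w in all_walks(T, 3)}
```

Repeating this for the transition matrix induced by each input gives one walk distribution per input, from which the mutual information between inputs and walks follows as in Equation 3.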

In Figure 8, we show the result of this analysis performed on three graphs corresponding to four known heteroclinic networks. All of these display the same non-monotonicity emerging from the previously discussed numerical simulations. In particular, the peak MI emerging from the Markov chain analysis of the system, closely resembles the one found in numerical simulation (see Supplementary Material). Note that, for this analysis, the number of -walk distributions to be considered in order to compute the Mutual Information depends on the size of the input set. As this grows factorially with the number of oscillators in the heteroclinic system, systems with more than a few oscillators are still numerically challenging. For this reason, our analysis is here limited to smaller systems.

## V Conclusion

In this work, we studied how noise and input signals jointly affect the Mutual Information Rate (MIR) in heteroclinic communication channels, shifting the focus from heteroclinic computing to information transmission. As a concrete example, we have focused our efforts on a system of delta-pulse-coupled oscillators, studying how the signal-to-noise ratio (SNR) controls the measured input-output MIR. Specifically, we studied how the magnitude of pairwise differences between input components interacts with noise to control the MIR. Interestingly, MIR is a non-monotonic function of the SNR: for small SNR the dynamics are dominated by noise-triggered, random state-switches, thus exhibiting the lowest MIR; for large SNR, the dynamics almost exclusively exhibit deterministic switches triggered by the signal, yielding a cyclic trajectory approaching a specific sequence of states and, thus, exhibiting a well-defined MIR (here taken as baseline); for a considerable range of intermediate values of SNR, the MIR increases from its baseline value, before falling to its minimum, thus exhibiting non-monotonicity. This occurs due to a trade-off between a small amount of noise-triggered “wrong” turns and a wider exploration of the network of states, where a wrong turn can trigger a new deterministic transient trajectory returning to a cyclic trajectory. The overall result is a larger variety of comparisons between the input components by approaching a larger variety of saddles, at the cost of a small amount of computing errors. From the point of view of information transmission, this translates to more knowledge being transmitted about the input signal overall, albeit a less certain one at each switch event, due to noise.

Our choice of network of oscillators was dictated by practical considerations of numerical simulation: a large number of simulation trials is required to compute the MIR accurately, and many properties of heteroclinic networks, e.g. the number of states, grow exponentially with the system size. Notwithstanding the specific implementation of heteroclinic network considered in this study, our results are general, as they do not (qualitatively) depend on the system size, oscillator model or the specific heteroclinic network realization, but only on the existence of a heteroclinic network of unstable states and saddles’ unstable manifolds with more than one direction. For any such system, there will be a trade-off between uncertainty at each state switch and a resulting greater exploration of the network of states, leading to MIR sweet spots for given SNR ranges. To support this view, we presented a Markov chain analysis of networks of symmetrical saddle states and showed that, from a system-agnostic perspective, different networks exhibit qualitatively the same non-monotonic MI curves.

Our results may have direct implications for a variety of interdisciplinary issues concerning computation in natural and artificial systems. Notably, heteroclinic dynamics have been suggested as an underlying mechanism for the olfactory dynamics in animals laurent2001odor; huerta2004learning; afraimovich2004heteroclinic; AT2005. In this context, our results on increased MIR through noise suggest yet another role for noise in neural information processing and transmission, adding to works on: “stochastic resonance” wiesenfeld1995stochastic; magalhaes2011vibratory; zeng2000human, where noise facilitates the detection of sub-threshold inputs; “system size resonance” Pikovsky2002System, where the system size becomes an order parameter; “coherence resonance” Pikovsky1997Coherence, where noise induces a more coherent (better synchronized) state; and, more generally, “stochastic facilitation” mcdonnell2011benefits. For artificial systems, our results reveal a clear picture of how noise affects computation and information transmission in heteroclinic networks composed of symmetrical states (under index permutation) and, potentially, in systems exhibiting complex state-space trajectories, e.g. hierarchical heteroclinic networks Voit2018Hierarchical, networks of saddle states composed of states with different symmetries Stone1999Noise; Armbruster2003Noisy; Bakhtin2010Small; Ashwin2010Designing or models of specific features of the mind Afraimovich2004Origin; Afraimovich2018Mind; Rabinovich2018Discrete, because their dynamics also rely on unstable states or similar structures. Notably, our numerical results are calculated on spiking neural networks with full resets, which promote an explicit loss of memory, i.e. simultaneous resets induced by incoming spikes instantly erase voltage differences between oscillators in the same stable cluster. Thus, even though our Markov chain analysis suggests a degree of generality, whether our results generalize to systems of phase-coupled units, where memory may fade exponentially fast, is still an open question.

Overall, our results come in a timely manner, as the microprocessor industry is exploring the use of “imprecise” processors to compute and transmit information with greater speed and lower power consumption palem2005energy; chakrapani2007probabilistic.

## Supplementary Material

In the supplementary manuscript we explain in more detail how we implemented noise in our system and how it affects its dynamics, such that the results can be reproduced. Furthermore, we present the results on the mutual information and mutual information rate calculated over different path lengths. Finally, to provide some intuition on how we generate the statistics in our work, we also provide a video, see Figure 9 (Multimedia view), showing transitions between states for short intervals of time for different signal-to-noise ratios.

## Acknowledgements

We would like to acknowledge Ikerbasque (The Basque Foundation for Science). This research is supported by the Basque Government through the BERC 2018-2021 program and by the Spanish State Research Agency through BCAM Severo Ochoa excellence accreditation SEV-2017-0718 and through project RTI2018-093860-B-C21 funded by (AEI/FEDER, UE) and acronym “MathNEURO”. This work was partially supported by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG) under project number 419424741 and under Germany’s Excellence Strategy – EXC-2068 – 390729961 – Cluster of Excellence Physics of Life at TU Dresden, and by the Saxonian State Ministry for Science, Culture and Tourism under grant number 100400118.

## Data Availability

The data that support the findings of this study are available within the article and its supplementary material.