1. Introduction and Motivation
The main goal of this paper is the development of a new mathematical formalism for the modeling of networks endowed with several different levels of structure. The types of structures considered are formalized in terms of categories, whose objects represent various kinds of resources, such as computational architectures for concurrent/distributed computing, codes generated by spiking activity of neurons, probabilities and information structures, and possibly other categories describing physical resources with metabolic and thermodynamical constraints. Morphisms in these categories represent ways in which resources can be converted and computational systems can be transformed into one another. All these different levels of structure are in turn related via several functorial mappings. We model a configuration space of consistent ways of assigning such resources to a network and all its subsystems, and we introduce dynamical systems describing the evolution of the network with its resources and constraints. The main advantage of adopting this categorical viewpoint lies in the fact that the entire system, with all its levels of structure, transforms simultaneously and consistently (for example, consistently over all possible subsystems), under dynamical evolution, and in the course of interacting with and processing external stimuli. The way we incorporate these different levels of structure is based on a notion from homotopy theory, the concept of Gamma space introduced by Graeme Segal in the 1970s to realize homotopy-theoretic spectra in terms of symmetric monoidal categories. We view this as a functorial construction of a “configuration space” of all possible mappings of subsystems of a finite system to resources, in a way that is additive on independent subsystems. We enrich this formalism with both probabilistic and persistent structures, and we incorporate ways of taking into account the specific wiring of the network to which the construction is applied.
We show that a discretized form of the Hopfield network dynamics can be formulated in this categorical setting, thus providing an evolution equation for the entire system of the network with all its resources and constraints, and we show that one recovers the usual Hopfield network dynamics when specializing this to a category of weighted codes.
An important aspect of the Gamma space formalism we use as a starting point of our construction is that it generates homotopy types, through associated simplicial sets and spectra. We show that, within this formalism, we recover as particular cases interesting homotopy types that have already been seen to play an important role in neuroscience modeling, such as the nerve of the receptive fields covering associated to a binary neural code or the clique complex of an activated subnetwork. We also show that measures of informational complexity such as integrated information can be incorporated naturally within our formalism.
In the rest of this introductory section we review the main motivations behind the approach developed in this paper. The content of the paper and the main results are then summarized in §2.
1.1. Cognition and Computation
A main motivation of this survey/research paper, as well as of many others, was briefly summarised in [ManMan17]: it is the heuristic value of a comparative study of the “cognitive activity” of human beings (and more generally, other biological systems) with “computational processing” by engineered objects, such as computing devices.
In [ManMan17] it was stressed, in particular, that such a comparison should not be restricted, but rather widened, by the existence of wide spectra of space and time scales relevant for understanding both “cognition” and “computation”.
In particular, in [ManMan17] it was argued that we must not decide a priori that the brain should be compared to a computer, or a neuron to a chip. We suggested that there exist fruitful similarities between spatio–temporal activity patterns of one brain and the whole Web; or else between similar patterns on the levels of history of civilisations, and several functional neuronal circuits developing in the brain of a single human being from birth to ageing.
It was long ago noticed that various mathematical models of such processes have skeletons of a common type: partly oriented graphs describing paths of transmission and/or transformation of information. Mathematical machinery of topological nature (geometric realisation of graphs by simplicial complexes, and their topological invariants) must be in such studies connected with mathematical machinery of information theory (probability distributions, entropy, complexity …): cf. [Mar19] and [Hess].
The primary goal of the research part of this paper consists in the enrichment of the domain of useful tools: partly oriented graphs can be considered as graphs of morphisms between objects of various categories, and topological invariants of their geometric realisations might include homotopical rather than only (co)homological invariants. Accordingly, we continue studying their possible interaction with information–theoretic properties, started e.g. in [Mar19] and [Man15].
As in classical theoretical mechanics, such invariants embody configuration and phase spaces of systems that we are studying, equations of motion, conservation laws, etc.
In the setting we develop here, the main configuration space is the space of all consistent functorial mappings of a network and its subsystems to a monoidal category of resources (computational systems, codes, information structures).
As in the case of classical mechanics, this kinematic setup describing the configuration space is then enriched with dynamics, in the form of categorical Hopfield networks.
1.2. Homotopical representations of stimulus spaces
A main motivation behind the viewpoint developed in this paper comes from the idea that the neural code generates a representation of the stimulus space in the form of a homotopy type.
Indeed, it is known from [Cu17], [CuIt1], [CuIt2], [Man15], [Youngs] that the geometry of the stimulus space can be reconstructed up to homotopy from the binary structure of the neural code. The key observation behind this reconstruction result is a simple topological property: under the reasonable assumption that the place field of a neuron (the preferred region of the stimulus space that causes the neuron to respond with a high firing rate) is a convex open set, the binary code words in the neural code represent the overlaps between these regions, which determine a simplicial complex (the simplicial nerve of the open covering of the stimulus space). Under the convexity hypothesis the homotopy type of this simplicial complex is the same as the homotopy type of the stimulus space. Thus, the fact that the binary neural code captures the complete information on the intersections between the place fields of the individual neurons is sufficient to reconstruct the stimulus space, but only up to homotopy.
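The combinatorial step of this reconstruction can be illustrated by a small sketch. It is not taken from the cited works: the function names and the toy codes are hypothetical, each code word is read as the set of neurons firing together, and the complex is taken to be the collection of all nonempty subsets of such sets.

```python
from itertools import combinations

def support(word):
    """Indices of the neurons that fire in a binary code word such as '1101'."""
    return frozenset(i for i, bit in enumerate(word) if bit == "1")

def code_complex(words):
    """Simplicial complex of a code: all nonempty subsets of supports of code words."""
    simplices = set()
    for word in words:
        s = sorted(support(word))
        for k in range(1, len(s) + 1):
            simplices.update(frozenset(c) for c in combinations(s, k))
    return simplices

def euler_characteristic(simplices):
    """Alternating count of simplices by dimension (dimension = size - 1)."""
    return sum((-1) ** (len(s) - 1) for s in simplices)

# Toy code on three neurons whose place fields overlap pairwise but have no
# common triple overlap: the complex is a hollow triangle (a circle).
circle_code = ["110", "011", "101"]
# Adding the word '111' fills in the triangle, giving a contractible complex.
disk_code = circle_code + ["111"]
```

In this toy example the two codes differ by a single word, yet the associated complexes have different Euler characteristics, which is the kind of homotopy-level information about the stimulus space that the code retains.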
The homotopy equivalence relation in topology is weaker but also more flexible than the notion of homeomorphism. The most significant topological invariants, such as homotopy and homology groups, are homotopy invariants. Heuristically, homotopy describes the possibility of deforming a topological space in a one parameter family. In particular, a homotopy type is an equivalence class of topological spaces up to (weak) homotopy equivalence, which roughly means that only the information about the space that is captured by its homotopy groups is retained. There is a direct connection between the formulation of topology at the level of homotopy types and “higher categorical structures”. Homotopy theory and higher categorical structures have come to play an increasingly important role in contemporary mathematics including important applications to theoretical physics and to computer science. We will argue here that it is reasonable to expect that they will also play a role in the mathematical modeling of neuroscience. This was in fact already suggested by Mikhail Gromov in [Gromov].
This suggests that a good mathematical modeling of network architectures in the brain should also include a mechanism that generates homotopy types, through the information carried by the network via neural codes. One of the main goals in this paper is to show that, indeed, a mathematical framework that models networks with additional computational and information structure will also give rise to nontrivial homotopy types that encode information about the stimulus space.
1.3. Homology and stimulus processing
Another main motivation for the formalism developed in this paper is the detection in neuroscience experiments and simulations of a peak of non-trivial persistent homology in the clique complex of the network of neurons activated in the processing of external stimuli, and increasing evidence of a functional role of these nontrivial topological structures.
The analysis of the simulations of neocortical microcircuitry in [Hess], as well as experiments on visual attention in rhesus monkeys [Rouse], have shown the rapid formation of a peak of non-trivial homology generators in response to stimulus processing. These findings are very intriguing for two reasons: they link topological structures in the activated neural circuitry to phenomena like attention and they suggest that a sufficient amount of topological complexity serves a functional computational purpose.
This suggests a possible mathematical setting for modeling neural information networks architecture in the brain. The work of [Hess] proposes the interpretation that these topological structures are necessary for the processing of stimuli in the brain cortex, but does not offer a theoretical explanation of why topology is needed for stimulus processing. However, there is a well known context in the theory of computation where a similar situation occurs, which may provide the key for the correct interpretation, namely the theory of concurrent and distributed computing [FaRaGou06], [Herl1].
In the mathematical theory of distributed computing, one considers a collection of sequential computing entities (processes) that cooperate to solve a problem (task). The processes communicate by applying operations to objects in a shared memory, and they are asynchronous, in the sense that they run at arbitrary varying speeds. Distributed algorithms and protocols decide how and when each process communicates and shares with others. The main questions are how to design distributed algorithms that are efficient in the presence of noise, failures of communication, and delays, and how to understand when a distributed algorithm exists to solve a particular task.
Protocols for distributed computing are modeled using simplicial sets. An initial or final state of a process is a vertex, any $d+1$ mutually compatible initial or final states are a $d$-dimensional simplex, and each vertex is labelled by a different process. The complete set of all possible initial and final states is then a simplicial set. A decision task consists of two simplicial sets of initial and final states and a simplicial map (or more generally correspondence) between them. The typical structure describing a distributed algorithm consists of an input complex, a protocol complex, and an output complex, with a certain number of topology changes along the execution of the protocol, [Herl1].
There are very interesting topological obstruction results in the theory of distributed computing, [Herl1], [HeRa95], which show that a sufficient amount of non-trivial homology in the protocol complex is necessary for a decision task problem to be solvable. Thus, the theory of distributed computing shows explicitly a setting where a sufficient amount of topological complexity (measured by non-trivial homology) is necessary for computation.
As discussed below, this suggests that the mathematical modeling of network architectures in the brain should be formulated in such a way as to incorporate additional structure keeping track of associated concurrent/distributed computational systems. This is indeed one of the main aspects of the formalism described in this paper: we will show how to associate functorially to a network and its subsystems a computational architecture in a category of transition systems, which is suitable for the modeling of concurrent and distributed computing.
1.4. Informational complexity and integrated information
In recent years there has been some serious discussion in the neuroscience community around the idea of possible computational models of consciousness based on some measure of informational complexity, in particular in the form of the proposal of Tononi’s integrated information theory (also known as the $\Phi$ function), [Tono], see also [Koch], [MasTon] for a general overview. This proposal for a quantitative correlate of consciousness roughly measures the least amount of effective information in a whole system that is not accounted for by the effective information of its separate parts. The main idea is therefore that integrated information is a measure of informational complexity and causal interconnectedness of a system.
This approach to a mathematical modeling of consciousness has been criticized on the ground that it is easy to construct simple mathematical models exhibiting high values of the $\Phi$ function. Generally, one can resort to the setting of coding theory to generate many examples of sufficiently good codes (for example the algebro-geometric Reed-Solomon error-correcting codes) that indeed exhibit precisely the typical form of high causal interconnectedness that leads to large values of integrated information. Thus, it seems that it would be preferable to interpret integrated information as a consequence of a more fundamental model of how networks in the brain process and represent stimuli, leading to high informational complexity and causal interdependence as a necessary but not in itself sufficient condition.
One of the goals of this paper is to show that integrated information can be incorporated as an aspect of the model of neural information network that we develop and that many of its properties, such as the low values on feedforward architectures, are already built into the topological structures that we consider. One can then interpret the homotopy types generated by the topological model we consider as the “representations” of stimuli produced by the network through the neural codes, and the space of these homotopy types as a kind of “qualia space”, [MarTsao18].
1.5. Perception, representation, computation
At these levels of generalisation, additional challenges arise, both for researchers and students. Namely, even when we focus on some restricted set of observables, passage from one space/time scale to a larger, or smaller one, might require a drastic change of languages we use for description of these levels. The typical example is passage from classical to quantum physics. In fact, it is only one floor of the Babel Tower of imagery that humanity uses in order to keep, extend and transmit the vast body of knowledge, that makes us human beings: cf. a remarkable description of this in [HF14].
Studying neural information, we meet this challenge, for example, when we try to pass from one subgraph of the respective oriented graph to the next one by adding just one oriented arrow to each vertex. It might happen that each such step implies a change of languages, but in fact such languages themselves cannot be reconstructed before the whole process is relatively well studied.
Actually, the drastic change of languages arises already in the passage between two wide communities of readers to which this survey/research paper is addressed: that of mathematicians and that of neural scientists.
Therefore, before moving to the main part of this paper, we wanted to make our mathematician readers aware of this need for a permanent change of languages.
A very useful example of a successful approach to this problem is the book [Sto18], in particular its Chapter 5, “Encoding Colour.” Basically, this Chapter explains the mathematics of colour perception by the retina in the human eye. But to understand the underlying neural machinery, the reader will have to return to it temporarily whenever necessary. The combination of both is a good lesson in neural information theory.
Below we will give a brief sketch of this Chapter.
Physics describes light on the macroscopic level as a superposition of electromagnetic waves of various lengths, with varying intensity. Light perception establishes bounds for these wavelengths, outside of which they stop to be perceived as light. Inside these bounds, certain bands may be perceived as light of various “pure” colours: long wavelengths (red), medium wavelengths (green), and short wavelengths (blue).
The description above refers to the “point” source of light. The picture perceived by photoreceptors in eye and transmitted to neurons in brain, is a space superposition of many such “point source” pictures, which then is decoded by brain as “landscape”, or “human face”, or “several barely distinguishable objects in darkness”, etc.
We will focus here upon the first stages of this encoding/decoding of an image in human eye made by retina. There are two types of photoreceptors in retina: cones (responsible for colour perception in daylight conditions) and rods (providing images under night–time conditions).
Each photoreceptor (as other types of neurons) receives information in form of action potentials in its cell body, and then transmits it via its axon (kind of “cable”) to the next neuron in the respective neuronal network. For geometric models of such networks, see Section 3 below.
Action potentials are physically represented by a flow of ions. Communication between two neurons is mediated by synapses (small gaps, collecting ions from several presynaptic neurons and transferring the resulting action potential into cell body of the postsynaptic neuron).
Perception of visual information by human eye starts with light absorption by (a part of) retinal photoreceptors and subsequent exchange of arising action potentials in the respective part of neural network. Then retinal ganglion cells, forming the optic nerves, transmit the information from retina to brain.
Encoding colour bands into action potentials and subsequently encoding relative amplitudes of respective potentials into their superpositions, furnish the first stage of “colour vision”. Mathematical modelling of this stage in [Sto18] requires a full machinery of information theory and of chapters of statistical physics involving entropy and its role in efficient modelling of complex processes.
We now pass to the main goal of this paper: enrichment of all these models by topology, or vice versa, enrichment of topology by information formalisms.
2. Structure of the paper and main results
We start in §3 with a quick introduction to simplicial structures enriched with probabilistic data. This covers some background information for some of the constructions that will appear later in the paper, especially in §LABEL:CodesProbSec, §LABEL:ProbCodesSec, and §LABEL:InfomeasSec and in §LABEL:GammaNetInfoSec and §LABEL:HTCodesInfoSec.
In §4 we introduce the general problem of modeling networks with associated resources. We recall the various forms of resources that we will be discussing in the rest of the paper, in particular informational and metabolic constraints and computational resources. We also review in §4.2 the mathematical theory of resources and convertibility of resources developed in [CoFrSp16] and [Fr17] using symmetric monoidal categories. In §4.3 we review the construction of [WiNi95] of a category of transition systems, which models concurrent and distributed computing architecture, that we will be using in the rest of the paper, especially in §LABEL:TransSysGammaSec and in §LABEL:GammaNetCompSec. We also discuss in §4.4 how one can interpret certain classical categorical constructions such as the adjunction of functors as describing optimization processes and constraints.
In §LABEL:GammaGeneralSec we introduce Segal’s Gamma-spaces and several variations on the same theme that incorporate probabilistic and persistent structures in the notion of Gamma-space, as well as the possibility of working with cubical sets (more frequently used in the context of concurrent and distributed computing) rather than with simplicial sets. In the rest of the paper, for simplicity, we will usually adopt the standard notion of Gamma-spaces, but everything that we describe can also be adapted quite easily to all these more general versions of the same formalism, whenever required by a specific model. This section concludes the part of the paper that covers general introductory material.
In §LABEL:GammaInfoNetSec we begin with the first step of our main construction. We consider the categories of summing functors from the set of vertices or edges of a network and their subsets to a category of resources. The nerve of this category of summing functors is the simplicial set associated to the network by the Gamma-space. Thus, one can regard the category of summing functors, along with its nerve as a topological realization of it, as our main “configuration space” describing all possible consistent assignments of resources to the network and its subsystems. The idea here is similar to configuration and phase spaces in physics that account for all the possible realizations of a given system subject only to its internal geometric constraints. The dynamics then identifies specific trajectories inside this configuration space. In order to take into account in the construction the local connectivity of the network, namely incoming and outgoing edges at each vertex, we impose on this general configuration space of summing functors a constraint of conservation laws at vertices, which can be formulated in terms of categorical equalizer and coequalizer constructions. Further and more specific information on the structure of the network is discussed and incorporated in the model in §LABEL:GammaNetCompSec.
In §LABEL:GammaCodesSec we consider neural codes generated by networks of neurons and associated probabilities and information measures. We construct a simple category of codes and we show that one can think of the neural codes as determining summing functors to this category of codes, hence a corresponding Gamma-space. We show that the probabilities associated to neural codes by the firing frequencies of the neurons fit into a functorial map from this category of codes to a category of probability measures. However, we show that this construction is not fully satisfactory because it does not in general translate to a functorial assignment of information measures. This problem will be solved in §LABEL:GammaNetInfoSec using a more refined assignment of probabilities and information structures to neural codes, using the more sophisticated formalism of cohomological information theory introduced by Baudot and Bennequin, [BauBen1].
In §LABEL:HopfieldSec we introduce dynamics on our configuration spaces of summing functors. We model dynamics using the classical Hopfield network model of dynamics of networks of neurons in interaction. Starting from a discretized finite difference version of the usual Hopfield equations, we construct a categorical version of the equation, where the variables are summing functors and the dynamics is determined by an endofunctor of the target category of resources. The non-linear threshold of the Hopfield dynamics is realized in terms of a pre-ordered semigroup associated to the category of resources that describes the convertibility relation. We show that the solutions of these categorical Hopfield equations are a sequence of summing functors, the discrete time evolution is implemented by an endofunctor, and the dynamics induces a discrete dynamical system on the topological space given by the nerve of the category of summing functors. We also show that, when applied to the category of weighted codes introduced in §LABEL:GammaCodesSec this categorical Hopfield dynamics induces the usual Hopfield equations on the weights.
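To fix ideas about the classical starting point, here is a minimal numerical sketch of a discretized Hopfield-type update on real-valued activities. The specific finite-difference form used, $x_j(n+1) = x_j(n) + \max(0, \sum_k T_{jk} x_k(n) + \theta_j)$, as well as the weights `T`, the thresholds `theta` and the function names, are assumptions made for illustration only; the categorical equation replaces the numbers by summing functors and the threshold by the convertibility pre-order.

```python
def hopfield_step(x, T, theta):
    """One discrete time step: x_j <- x_j + max(0, sum_k T[j][k] * x[k] + theta[j])."""
    n = len(x)
    return [
        x[j] + max(0.0, sum(T[j][k] * x[k] for k in range(n)) + theta[j])
        for j in range(n)
    ]

def hopfield_orbit(x0, T, theta, steps):
    """Sequence x(0), x(1), ..., x(steps) obtained by iterating hopfield_step."""
    orbit = [list(x0)]
    for _ in range(steps):
        orbit.append(hopfield_step(orbit[-1], T, theta))
    return orbit

# Hypothetical two-neuron example: neuron 0 excites neuron 1,
# neuron 1 inhibits neuron 0, and neuron 1 has a negative threshold term.
T = [[0.0, -1.0],
     [0.5, 0.0]]
theta = [0.0, -0.25]
orbit = hopfield_orbit([1.0, 0.0], T, theta, steps=2)
```

The orbit is a sequence of states indexed by discrete time, which is the numerical shadow of the statement that solutions of the categorical equation are sequences of summing functors.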
In §LABEL:GammaNetsSec we return to the question of how to better represent the wiring structure of the network in the functorial assignment of resources, beyond the conservation laws at vertices introduced in §LABEL:GammaInfoNetSec by taking categorical equalizers. We focus here on the assignments of computational resources in the form of transition systems in the category described in §4.3. We describe first in §LABEL:NeuronCompSec the automata in the category of transition systems associated to single neurons. Then in §LABEL:CompArchSec we describe a functorial assignment of a transition system to a network, via a grafting operation in the category of transition systems and a decomposition of the directed graph into its strongly connected components and the resulting acyclic condensation graph. We refer to these assignments of (computational) resources to networks as Gamma-networks, generalizing the notion of Gamma-spaces by better taking into account the network structure. In §LABEL:NeuromodSec we extend this construction further by showing that it can accommodate distributed computing models of neuromodulation proposed in [Potj].
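The graph-theoretic ingredient mentioned here, the decomposition of a directed graph into strongly connected components and the resulting acyclic condensation graph, can be sketched as follows. This is only an illustration of that standard decomposition (using Kosaraju's two-pass algorithm, which is our choice and not prescribed by the text); the function names and the sample graph are hypothetical, and the grafting of transition systems itself is not modeled.

```python
def strongly_connected_components(vertices, edges):
    """Kosaraju's algorithm; returns a dict vertex -> component index."""
    succ = {v: [] for v in vertices}
    pred = {v: [] for v in vertices}
    for a, b in edges:
        succ[a].append(b)
        pred[b].append(a)

    # First pass: record vertices in order of completion of a depth-first search.
    order, seen = [], set()
    for root in vertices:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(succ[root]))]
        while stack:
            v, it = stack[-1]
            for w in it:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(succ[w])))
                    break
            else:
                # All successors of v explored: v is finished.
                order.append(v)
                stack.pop()

    # Second pass: explore the reversed graph in decreasing completion order.
    component, label = {}, 0
    for root in reversed(order):
        if root in component:
            continue
        component[root] = label
        todo = [root]
        while todo:
            v = todo.pop()
            for w in pred[v]:
                if w not in component:
                    component[w] = label
                    todo.append(w)
        label += 1
    return component

def condensation(vertices, edges):
    """Acyclic graph whose vertices are the strongly connected components."""
    component = strongly_connected_components(vertices, edges)
    dag_edges = {(component[a], component[b]) for a, b in edges
                 if component[a] != component[b]}
    return component, dag_edges

# A 3-cycle a -> b -> c -> a feeding into a 2-cycle d <-> e.
vertices = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("d", "e"), ("e", "d")]
component, dag_edges = condensation(vertices, edges)
```

In this example the condensation graph has two vertices and a single edge, so any assignment built by induction along the acyclic graph would treat the two cycles as two blocks processed in order.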
In §LABEL:GammaNetInfoSec we introduce the formalism of finite information structures and cohomological information theory developed in [BauBen1], [Vign]. We construct a functorial mapping from the category of codes introduced in §LABEL:GammaCodesSec to the category of information structures and we show that the resulting Gamma-network that maps a network and its subsystems to the respective chain complexes of information cohomology satisfies an inclusion-exclusion property.
In §LABEL:SpectraHTSec we introduce another aspect of our construction: the spectra and homotopy types associated to our Gamma-spaces and Gamma-networks. In particular, we focus on the topology carried by the clique complex of a subnetwork, which as recalled in the Introduction is known to play a role in the processing of stimuli by networks of neurons. We propose that the images under the Gamma-space functor of the simplicial set and its suspension spectrum should be regarded as a “representation” produced by the network with its assigned resources. More precisely, the simplicial set resulting from the mapping via the Gamma-space functor combines in a nontrivial way the topology of the clique complex with the topology of the spectrum defined by the Gamma-space (which carries information about the category of resources associated to the network). In particular, we show that if the homotopy type of this resulting simplicial set is non-trivial (in the sense of non-vanishing homotopy groups), this forces the topology of the clique complex to be correspondingly nontrivial. We consider two significant classes of examples: Erdős–Rényi random graphs and feedforward networks. In the case of the random graphs we show that there is a dichotomy between two extreme regimes as a function of the probability $p$, such that either the resulting image of the clique complex under the Gamma-space is almost always highly-connected or it is homotopy equivalent to the spectrum of the Gamma-space. In the case of feedforward networks we show that the topology of the same image is essentially trivial. These two scenarios show that the situations where this family of homotopy types can be nontrivial and significant are those where the network is neither a simple feedforward architecture nor a completely random wiring.
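For orientation, the clique complex itself is easy to compute on small examples. The following sketch treats the network as an undirected graph (an assumption made here for simplicity) and enumerates cliques by brute force, which is only feasible for a handful of vertices; the function names and sample graphs are hypothetical.

```python
from itertools import combinations

def clique_complex(vertices, edges):
    """All nonempty sets of pairwise adjacent vertices, viewed as simplices."""
    adjacent = {frozenset(e) for e in edges}
    simplices = []
    for k in range(1, len(vertices) + 1):
        for subset in combinations(vertices, k):
            if all(frozenset(pair) in adjacent for pair in combinations(subset, 2)):
                simplices.append(frozenset(subset))
    return simplices

def f_vector(simplices):
    """Number of simplices in each dimension, listed from dimension 0 upwards."""
    counts = {}
    for s in simplices:
        counts[len(s) - 1] = counts.get(len(s) - 1, 0) + 1
    return [counts[d] for d in sorted(counts)]

vertices = [0, 1, 2, 3]
# A 4-cycle: no triangles, so the clique complex is a circle.
square = clique_complex(vertices, [(0, 1), (1, 2), (2, 3), (3, 0)])
# Adding one chord creates two triangles glued along the chord (a disk).
chorded = clique_complex(vertices, [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)])
```

A single added edge changes the complex from a circle to a contractible space, which illustrates how sensitive the topology of the clique complex is to the wiring of the network.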
This property is very reminiscent of a similar property in the theory of integrated information, which constrains the network architectures for which the integrated information can be nontrivial. Indeed the relation between our construction and the theory of integrated information is discussed in the following sections.
In §LABEL:IntegInfoSec we introduce integrated information, in the form defined using information geometry as in [OizTsuAma]. We construct a cohomological version of integrated information in the same setting of cohomological information theory used in the previous section, and we show that it vanishes on feedforward networks as expected. We also show that it transforms functorially under the categorical Hopfield dynamics of §LABEL:HopfieldSec.
In §LABEL:HTCodesInfoSec we compare and combine the topological construction of the homotopy types associated via the Gamma-space to the clique complex of the network in §LABEL:SpectraHTSec and the cohomological information theory construction of §LABEL:GammaNetInfoSec and §LABEL:IntegInfoSec. We first show in §LABEL:CodesNerveSec that the nerve complex of the receptive fields covering of the neural code can be recovered as a special case of our functorial assignment of information structures to codes. We then show that there are compatible functorial assignments of information structures to computational architectures given by transition systems of networks as constructed in §LABEL:GammaNetsSec and the assignment of information structures to codes considered in §LABEL:GammaNetInfoSec. The functors are determined by passing to the language of the automaton of the transition system. We then show that the simplicial set given by the clique of the network can also be obtained as a special case of the cohomological information theory construction of §LABEL:GammaNetInfoSec, using the functorial maps from computational systems of networks to information structures. We conclude with a proposal and some questions about a possible enrichment of the cohomological information theory construction in the form of a sheaf of spectra and the associated generalized cohomology.
3. Simplicial Models and the Geometry of Information
In this short Section we present an approach to the geometry of information developed in [CoMa20]. It connects the differential geometry of convex cones with the intuitive image of passing information from one space scale to another one as signal transfer from a subset of vertices of a simplicial approximation of this space. The time axis here is modelled by varying probabilities of different outputs, and the geometry of this space of various probability distributions becomes the central object of study.
In this paper we will be considering several combinations of simplicial sets and probabilistic data. In §4.2.1 we recall some examples of categories of resources underlying classical probability theory. In §LABEL:ProbGammaSec we consider a way of making certain categories (including the category of simplicial sets) probabilistic using a wreath product construction with a category of finite probabilities and stochastic maps. In §LABEL:CodesProbSec, §LABEL:ProbCodesSec, and §LABEL:InfomeasSec we describe how to assign probabilities and information measures to neural codes. In §LABEL:GammaNetInfoSec we introduce the more sophisticated formalism of cohomological information theory, based on categories of finite information structures, and in the final section, §LABEL:HTCodesInfoSec, we compare the topological structures we obtain, based on simplicial sets, spectra and their homotopy types, to the setting of cohomological information theory.
Given the very rich interplay of simplicial structures in topology and probabilities and information that appears in many different forms throughout the paper, we include this introductory section that presents some general background.
3.1. Simplicial topology enriched with probabilities
We will reproduce here only the shortest list of introductory information, excerpted from [CoMa20]. The interested reader may find much more in [Mar19], to which the book [CoGw17] can serve as a useful complement.
We start with a description of simplicial sets reproduced from [GeMa03], Ch. I.
3.1.1. Category of simplices
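Denote by $[n]$, for $n\geq 0$, the subset of integers $\{0,1,\dots,n\}$. Denote by $\Delta$ the category with objects $[n]$ and morphisms the increasing (not strictly) maps $f:[m]\to[n]$, with obvious composition.
Denote by $\Delta_n$ the $n$-dimensional topological space
$$\Delta_n=\Big\{(x_0,\dots,x_n)\in\mathbb{R}^{n+1}\ :\ \sum_{i=0}^n x_i=1,\ x_i\geq 0\Big\}$$
and by $\Delta_n^{\circ}$ its maximal open subset, where all $x_i>0$.
A simplicial object of a category $\mathcal{C}$ is a covariant functor $\Delta^{op}\to\mathcal{C}$.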
The classical description of morphisms in $\Delta$ via generators (“$i$-th face maps”, “$i$-th degeneration maps”) and relations (see [GeMa03], pp. 14–15) produces an explicit description of simplicial sets and their topological realisations.
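The generators mentioned above can be made concrete in a few lines. The following is a minimal sketch, not taken from [GeMa03]: it assumes that a nondecreasing map $[m]\to[n]$ is stored as a tuple of its values, and the function names `coface`, `codegeneracy`, `compose`, and `monotone_maps` are hypothetical. It checks one of the standard relations between the generators, $\delta_j\circ\delta_i=\delta_i\circ\delta_{j-1}$ for $i<j$.

```python
from itertools import product

def coface(n, i):
    """delta_i: [n-1] -> [n], the increasing injection that skips the value i."""
    return tuple(k if k < i else k + 1 for k in range(n))

def codegeneracy(n, i):
    """sigma_i: [n+1] -> [n], the nondecreasing surjection that hits i twice."""
    return tuple(k if k <= i else k - 1 for k in range(n + 2))

def compose(g, f):
    """Composite g o f of two maps stored as tuples of values."""
    return tuple(g[k] for k in f)

def is_nondecreasing(f):
    return all(a <= b for a, b in zip(f, f[1:]))

def monotone_maps(m, n):
    """All morphisms [m] -> [n] of the simplex category (brute force)."""
    return [f for f in product(range(n + 1), repeat=m + 1) if is_nondecreasing(f)]

def check_coface_relation(n):
    """Check delta_j o delta_i == delta_i o delta_{j-1} for all 0 <= i < j <= n+1."""
    return all(
        compose(coface(n + 1, j), coface(n, i))
        == compose(coface(n + 1, i), coface(n, j - 1))
        for j in range(1, n + 2)
        for i in range(j)
    )
```

Storing maps as tuples keeps composition a one-line lookup, which is enough for small experiments with the relations; it is not meant as an efficient encoding of simplicial sets.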
Now we will pass to the information theoretic structures hidden in this simplicial formalism and made explicit in [Che65], [Mar19], [CoMa20], et al.
3.2. $\sigma$-algebras, probability distributions, and categories
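Consider a set $\Omega$ and the structure on it, given by a collection of subsets $\mathcal{A}$. Such a pair $(\Omega,\mathcal{A})$ is called a $\sigma$-algebra, if it satisfies the following axioms.
If $X\in\mathcal{A}$, then $\Omega\setminus X\in\mathcal{A}$.
The union of all elements of any countable subcollection of $\mathcal{A}$ belongs to $\mathcal{A}$.
Let $(S,+,0)$ be a commutative semigroup with zero. An $S$-valued measure on $(\Omega,\mathcal{A})$ is a map $\mu:\mathcal{A}\to S$ such that $\mu(\emptyset)=0$ and $\mu(X\cup Y)+\mu(X\cap Y)=\mu(X)+\mu(Y)$.
Finally, a probability distribution on $(\Omega,\mathcal{A})$ is an $\mathbb{R}_{\geq 0}$-valued measure $P$ such that $P(\Omega)=1$ and for any countable subfamily $\{X_i\}\subset\mathcal{A}$ with empty pairwise intersections we have $P(\cup_i X_i)=\sum_i P(X_i)$.
Denote by $\mathrm{Cap}(\Omega,\mathcal{A})$ the set of probability distributions on the $\sigma$-algebra $(\Omega,\mathcal{A})$.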
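Given two such sets $\mathrm{Cap}(\Omega_1,\mathcal{A}_1)$ and $\mathrm{Cap}(\Omega_2,\mathcal{A}_2)$, call “a transition measure” between them a function $\Pi\{\omega_1;A_2\}$ upon $\Omega_1\times\mathcal{A}_2$ such that for any fixed $A_2\in\mathcal{A}_2$, $\Pi\{\cdot\,;A_2\}$ is an $\mathcal{A}_1$-measurable function on $\Omega_1$, and for any fixed $\omega_1\in\Omega_1$, $\Pi\{\omega_1;\cdot\}$ is a probability distribution upon $\mathcal{A}_2$.
A transition measure $\Pi$ determines the map $\mathrm{Cap}(\Omega_1,\mathcal{A}_1)\to\mathrm{Cap}(\Omega_2,\mathcal{A}_2)$, $P\mapsto P\Pi$, given by
$$(P\Pi)(A_2)=\int_{\Omega_1}\Pi\{\omega_1;A_2\}\,P(d\omega_1).$$
These data define a category of probability distributions $\mathrm{Cap}$ with objects $\mathrm{Cap}(\Omega,\mathcal{A})$ and morphisms the transition measures $\Pi$.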
3.2.1. Probability distributions on finite sets
If $\Omega$ is a finite set, then the collection $\mathcal{A}$ of all subsets of $\Omega$ is a $\sigma$-algebra, and probability distributions on it are in bijective correspondence with maps $p:\Omega\to\mathbb{R}_{\geq 0}$ such that $\sum_{\omega\in\Omega}p(\omega)=1$.
In other words, such a distribution can be considered as a probabilistic enrichment of the simplex whose vertices are the coordinate points in $\mathbb{R}^{\Omega}$. We will be even closer to the basic Definition 3.1, if we consider the category of pointed finite sets $(X,x_0)$, morphisms in which are maps $X\to Y$ sending $x_0$ to $y_0$. A probabilistic enrichment of such a category is a particular case of Definition–Lemma 3.2; the transition measures are simply stochastic matrices with obvious properties. For further refinements of this formalism, cf. [Mar19], Sec. 2.
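In the finite case the action of a transition measure on a distribution reduces to a matrix-vector product. The sketch below illustrates this under stated assumptions: a distribution on a finite set is a list of nonnegative numbers summing to one, a transition measure is a column-stochastic matrix stored as a list of rows, and the helper names `is_distribution`, `is_stochastic`, and `push_forward` are hypothetical.

```python
def is_distribution(p, tol=1e-12):
    """A point of the simplex: nonnegative entries summing to 1."""
    return all(x >= 0 for x in p) and abs(sum(p) - 1.0) <= tol

def is_stochastic(S, tol=1e-12):
    """Column-stochastic matrix: S[y][x] >= 0 and every column sums to 1."""
    n_cols = len(S[0])
    nonneg = all(v >= 0 for row in S for v in row)
    cols_ok = all(abs(sum(row[x] for row in S) - 1.0) <= tol for x in range(n_cols))
    return nonneg and cols_ok

def push_forward(S, p):
    """Image distribution q with q[y] = sum_x S[y][x] * p[x]."""
    return [sum(row[x] * p[x] for x in range(len(p))) for row in S]

# A distribution on a three-point set and a transition to a two-point set.
p = [0.5, 0.25, 0.25]
S = [[1.0, 0.5, 0.0],
     [0.0, 0.5, 1.0]]
q = push_forward(S, p)  # again a point of a simplex
```

Column-stochasticity is exactly what guarantees that the image of a point of one simplex lies in the other simplex.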
In the context of concurrent/distributed computing and/or neurology, simplices with compatible orientations of all 1–faces such that any maximal oriented path leads from one and the same source vertex to one and the same sink vertex, are models of directed cliques (see §LABEL:GammaCliqueSec below).
4. Neural Information Networks and Resources
In modelling networks of neurons, one can consider three different but closely related aspects: the transmission of information with related questions of coding and optimality, the sharing of resources and related issues of metabolic efficiency, and the computational aspects. The third of these characteristics has led historically to the development of the theory of neural networks, starting with the McCulloch–Pitts model of the artificial neuron [McCPit43] in the early days of cybernetics research, all the way to the contemporary very successful theory of deep learning [GoBeCou16]. Our purpose in this paper is to focus mostly on the first two aspects mentioned above, for which a good discussion of the computational neuroscience background can be found, for instance, in [Sto18] that we have already discussed. We will also consider a formalism that assigns to a network its computational capacity, in terms of concurrent and distributed computing architectures, consistently with informational and metabolic constraints.
4.1. Networks with informational and metabolic constraints
We consider here a kind of neuronal architecture consisting of populations of neurons exchanging information via synaptic connections and action potentials, subject to a tension between two different kinds of constraints: metabolic efficiency and coding efficiency for information transmission. As discussed in §4 of [Sto18], metabolic efficiency and information rate are inversely related. The problem of optimizing both simultaneously is reminiscent of a similar problem in coding theory: the problem of simultaneous optimization, in the theory of error correcting codes, between efficient encoding (code rate) and efficient decoding (relative minimum distance). In order to model the optimization of resources as well as of information transmission, we rely on a categorical framework for a general mathematical theory of resources, developed in [CoFrSp16] and [Fr17], and on a categorical formulation of information loss [BaFrLei11], [BaFr14], [Mar19]. Before discussing the relevant categorical framework, we give a very quick overview of the main aspects of the neural information setting, for which we refer the readers to [Sto18] for a more detailed presentation.
4.1.1. Types of neural codes
There are different kinds of neural codes we will be considering. We will consider the binary codes that account only for the on/off information of which neurons in a given population/network are simultaneously firing. This type of code allows for an interesting connection to homotopy theory through a reconstruction of the homotopy type of the stimulus space from the code, see [Cu17], [Man15]. Different types of coding are given by rate codes, where the input information is encoded in the firing rate of a neuron, by spike timing codes, where the precise timing of spikes carries information, and by correlation codes that use both the probability of a spike and the probability of a specific time interval from the previous spike.
4.1.2. Spikes, coding capacity, and firing rate
Using a Poisson process to model spike generation, so that spikes are regarded as mutually independent, given a firing rate of $y$ spikes per second, all long spike trains generated at firing rate $y$ are equiprobable and the information contained in a spike train is computed by the logarithm of the number of different ways of rearranging the number $n$ of spikes in the total number $N$ of basic time intervals considered. The neural coding capacity (the maximum coding rate $R_{\max}$ for a given firing rate $y$) is given by the output entropy $H$ divided by the basic time interval $\Delta t$. This can be approximated (§3.4 of [Sto18]) by $R_{\max}\approx y\log\left(\frac{e}{y\,\Delta t}\right)$.
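The counting argument behind this approximation can be checked numerically. The sketch below is an illustration under stated assumptions, not a computation from [Sto18]: it writes $n$ for the number of spikes, $N$ for the number of basic time intervals of length $\Delta t$, and $y=n/(N\Delta t)$ for the firing rate, takes logarithms in base 2 (bits), and compares the exact count $\log_2\binom{N}{n}/(N\Delta t)$ with the expression $y\log_2(e/(y\,\Delta t))$, which follows from the bound $\binom{N}{n}\leq (eN/n)^n$. The function names and the numerical values are hypothetical.

```python
from math import comb, e, log2

def exact_rate(n_spikes, n_bins, dt):
    """Bits per second from counting placements of n spikes in N bins of width dt."""
    return log2(comb(n_bins, n_spikes)) / (n_bins * dt)

def approx_rate(y, dt):
    """Approximate coding capacity y * log2(e / (y * dt)), in bits per second."""
    return y * log2(e / (y * dt))

# Illustrative values only: 1 ms bins, 100 s of recording, 10 spikes per second.
dt = 0.001
n_bins = 100_000
n_spikes = 1_000
y = n_spikes / (n_bins * dt)

r_exact = exact_rate(n_spikes, n_bins, dt)
r_approx = approx_rate(y, dt)
relative_gap = (r_approx - r_exact) / r_approx
```

The approximation is accurate when spikes are sparse, that is when $y\,\Delta t$ is small and $n$ is large; for these values the two rates agree to within one percent.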
4.1.3. Metabolic efficiency and information rate
One defines the metabolic efficiency of a transmission channel as the ratio of the mutual information of output and input to the energy cost per unit of time, where the energy cost is a sum of the energy required to maintain the channel and the signal power. The latter represents the power required to generate spikes at a given firing rate. The energy cost of a spike depends on whether the neuron axon is myelinated or not, and in the latter case on the diameter of the axon. A discussion of optimal distribution of axon diameters is given in §4.7 of [Sto18].
4.1.4. Connection weights and mutual information
Given a population of $n$ neurons that respond to a stimulus with spikes, the output can be encoded as a matrix $X$ with $n$ rows. When this output is transmitted to a next layer of cells (for example, in the visual system, the output of a set of cones transmitted to a set of ganglion cells) an $m\times n$ weight matrix $W$ assigns weights to each connection so that the next input is computed by $Y=WX$. Noise on the transmission channel is modelled by an additional term, given by a random variable $\eta$, so that $Y=WX+\eta$. The optimization with respect to information transmission is formulated as the problem of finding the weights $W$ that maximize the mutual information of output and input.
4.2. The mathematical theory of resources
A general mathematical setting for a theory of resources was developed in [CoFrSp16] and [Fr17]. We recall here the main setting and the relevant examples we need for the context of neural information.
A theory of resources, as presented in [CoFrSp16], is a symmetric monoidal category $(\mathcal{R},\circ,\otimes,I)$, where the objects $A\in\mathrm{Obj}(\mathcal{R})$ represent resources. The product $A\otimes B$ represents the combination of resources $A$ and $B$, with the unit object $I$ representing the empty resource. The morphisms $f:A\to B$ in $\mathrm{Mor}_{\mathcal{R}}(A,B)$ represent possible conversions of resource $A$ into resource $B$. In particular, no-cost resources are objects $A$ such that $\mathrm{Mor}_{\mathcal{R}}(I,A)\neq\emptyset$ and freely disposable resources are those objects for which $\mathrm{Mor}_{\mathcal{R}}(A,I)\neq\emptyset$. The composition of morphisms represents the sequential conversion of resources.
4.2.1. Examples of resources
Among the cases relevant to us are the two examples based on classical information mentioned in [CoFrSp16], and another example of [CoFrSp16] more closely related to the setting of [Mar19].
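Resources of randomness: the category $\mathrm{FinProb}$ has objects the pairs $(X,P)$ of a finite set $X$ with a probability measure $P=(P_x)_{x\in X}$ with $P_x\geq 0$ and $\sum_{x\in X}P_x=1$, and with morphisms $f:(X,P)\to(Y,Q)$ the maps $f:X\to Y$ satisfying the measure preserving property $Q_y=\sum_{x\in f^{-1}(y)}P_x$, and with product $(X,P)\otimes(Y,Q)=(X\times Y,P\times Q)$ with unit a point set with measure $1$.
Random processes: the category $\mathrm{FinStoch}$ with objects the finite sets $X$ and maps $S:X\to Y$ given by stochastic matrices $S=(S_{yx})$ with $S_{yx}\geq 0$ for all $x\in X$ and $y\in Y$ and $\sum_{y\in Y}S_{yx}=1$ for all $x\in X$.
Partitioned process theory: the category considered in this case is the coslice category $I/\mathcal{R}$ of objects of $\mathcal{R}$ under the unit object $I$. This has objects given by the morphisms $f:I\to A$, for $A\in\mathrm{Obj}(\mathcal{R})$, and morphisms
$$\mathrm{Mor}_{I/\mathcal{R}}\big((f:I\to A),(g:I\to B)\big)=\{\xi\in\mathrm{Mor}_{\mathcal{R}}(A,B)\ :\ \xi\circ f=g\}.$$
The category of [Mar19] has objects the pairs $(X,P)$ of a finite set $X$ with a probability distribution $P$ and morphisms $(X,P)\to(Y,Q)$ given by the stochastic maps $S:X\to Y$ such that $Q=SP$. It is the coslice category $I/\mathrm{FinStoch}$ with $\mathrm{FinStoch}$ the category of stochastic processes as in the example above.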
4.2.2. Convertibility of resources
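The question of convertibility of a resource $A$ to a resource $B$ is formulated as the question of whether the set $\mathrm{Mor}_{\mathcal{R}}(A,B)\neq\emptyset$. Thus, to the symmetric monoidal category $(\mathcal{R},\circ,\otimes,I)$ of resources, one can associate a preordered abelian semigroup $(R,+,\succeq,0)$ on the set $R$ of isomorphism classes of $\mathrm{Obj}(\mathcal{R})$ (which we denote here by the same letters $A,B$ for simplicity), with $A+B$ the class of $A\otimes B$, with unit $0$ given by the unit object $I$, and with $A\succeq B$ iff $\mathrm{Mor}_{\mathcal{R}}(A,B)\neq\emptyset$. The partial ordering is compatible with the semigroup operation: if $A\succeq B$ and $C\succeq D$ then $A+C\succeq B+D$.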
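The maximal conversion rate $\rho_{A\to B}$ between resources $A$ and $B$ is given by
$$\rho_{A\to B}:=\sup\Big\{\frac{m}{n}\ :\ n\cdot A\succeq m\cdot B,\ m,n\in\mathbb{N}\Big\},$$
where $n\cdot A$ denotes the sum $A+\cdots+A$ of $n$ copies of $A$. It measures how many copies of resource $B$ can be produced on average per copy of resource $A$.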
Given an abelian semigroup $(S,*)$ with partial ordering $\geq$, an $S$-valued measuring of $R$-resources is a semigroup homomorphism $M:(R,+)\to(S,*)$ such that $M(A)\geq M(B)$ in $S$ whenever $A\succeq B$ in $R$.
For $S=\mathbb{R}$ and $M:(R,+)\to(\mathbb{R},+)$ a measuring semigroup homomorphism, we have (Theorem 5.6 of [CoFrSp16])
$$\rho_{A\to B}\cdot M(B)\leq M(A),$$
that is, the number of copies of resource $B$ that one can obtain on average using resource $A$ is not bigger than the value of $A$ relative to the value of $B$.
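A small computation may help to see the bound at work. The sketch below uses assumptions that go beyond the text: it restricts to uniform distributions in the category of finite probabilities with measure preserving maps, where $n$ copies of the uniform distribution on $a$ points convert to $m$ copies of the uniform distribution on $b$ points exactly when $b^m$ divides $a^n$ (all fibers of the map must have equal size), and it takes the Shannon entropy in bits as the real-valued measuring. The search over $m,n$ is truncated at a hypothetical cutoff `n_max`, so in general it only gives a lower bound on the supremum.

```python
from fractions import Fraction
from math import log2

def convertible(a, n, b, m):
    """Uniform on a**n points maps measure-preservingly onto uniform on b**m points."""
    return (a ** n) % (b ** m) == 0

def max_rate(a, b, n_max=12):
    """Largest m/n with n copies of A convertible to m copies of B, for m, n <= n_max."""
    best = Fraction(0)
    for n in range(1, n_max + 1):
        for m in range(1, n_max + 1):
            if convertible(a, n, b, m):
                best = max(best, Fraction(m, n))
    return best

def entropy_uniform(a):
    """Shannon entropy (bits) of the uniform distribution on a points."""
    return log2(a)

# A = uniform on 4 points, B = uniform on 8 points.
rate = max_rate(4, 8)
bound = entropy_uniform(4) / entropy_uniform(8)
```

In this example the truncated search already reaches the entropy ratio $2/3$, so the inequality above is attained with equality; for $a=2$ and $b=3$ no conversion exists and the computed rate is zero.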
4.2.3. Information loss
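A characterization of information loss is given in [BaFrLei11] as a map $F:\mathrm{Mor}(\mathrm{FinProb})\to\mathbb{R}_{\geq 0}$ satisfying
additivity under composition $F(f\circ g)=F(f)+F(g)$;
convex linearity $F(\lambda f\oplus(1-\lambda)g)=\lambda F(f)+(1-\lambda)F(g)$ for $0\leq\lambda\leq 1$ and for $\lambda f\oplus(1-\lambda)g$ the convex combination of morphisms $f$ and $g$ in $\mathrm{FinProb}$;
continuity of $F$ over $\mathrm{Mor}(\mathrm{FinProb})$.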
The Khinchin axioms for the Shannon entropy can then be used to show that an information loss functional satisfying these properties is necessarily of the form
$$F(f)=C\,(H(P)-H(Q))$$
for a morphism $f:(X,P)\to(Y,Q)$, for some $C\geq 0$, and for $H(P)=-\sum_{x\in X}P_x\log P_x$ the Shannon entropy. When working with the category of [Mar19] recalled above, a similar characterization of information loss using the Khinchin axioms for the Shannon entropy is given in §3 of [Mar19].
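The additivity property is easy to test on a toy example. The following sketch assumes that a measure preserving map between finite probability spaces is stored as a list `f` with `f[x]` the image of the point `x`, and that the information loss is the entropy difference with constant `c = 1` and logarithms in base 2; the names `entropy`, `push_forward`, and `info_loss` are hypothetical.

```python
from math import log2

def entropy(p):
    """Shannon entropy in bits, with the convention 0 * log 0 = 0."""
    return -sum(x * log2(x) for x in p if x > 0)

def push_forward(f, p, n_out):
    """Image measure q[y] = sum of p[x] over the fiber f^{-1}(y)."""
    q = [0.0] * n_out
    for x, px in enumerate(p):
        q[f[x]] += px
    return q

def info_loss(f, p, n_out, c=1.0):
    """Entropy lost along the measure preserving map f out of (X, p)."""
    return c * (entropy(p) - entropy(push_forward(f, p, n_out)))

# X = {0,1,2,3} uniform, f: X -> Y = {0,1}, g: Y -> Z = {0}.
p = [0.25, 0.25, 0.25, 0.25]
f = [0, 0, 1, 1]
g = [0, 0]
q = push_forward(f, p, 2)
g_after_f = [g[y] for y in f]

loss_f = info_loss(f, p, 2)
loss_g = info_loss(g, q, 1)
loss_total = info_loss(g_after_f, p, 1)
```

Here each of the two maps loses one bit and the composite loses two, as additivity under composition requires.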
4.3. Transition systems: a category of computational resources
We consider here, as a special case of categories of resources, in the sense of [CoFrSp16] and [Fr17] recalled above, a category of “reactive systems” in the sense of [WiNi95]. These describe models of computational architectures that involve parallel and distributed processing, including interleaving models such as synchronisation trees and concurrency models based on causal independence. Such computational systems can be described in categorical terms, formulated as a category of transition systems, [WiNi95]. The products in this category of transition systems represent parallel compositions where all possible synchronizations are allowed. More general parallel compositions are then obtained as combinations of products, restrictions and relabelling. The coproducts in the category of transition systems represent (non-deterministic) sums that produce a single process with the same computational capability of two or more separate processes.
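In the most general setting, a category $\mathcal{C}$ of transition systems has objects given by data of the form $\tau=(S,\iota,\mathcal{L},\mathcal{T})$ where $S$ is the set of possible states of the system, $\iota\in S$ is the initial state, $\mathcal{L}$ is a set of labels, and $\mathcal{T}\subseteq S\times\mathcal{L}\times S$ is the set of possible transition relations of the system (specified by initial state, label of the transition, and final state). Such a system can be represented in graphical notation as a directed graph with vertex set $S$ and with set of labelled directed edges $\mathcal{T}$. Morphisms $\mathrm{Mor}_{\mathcal{C}}(\tau,\tau')$ in the category of transition systems are given by pairs $(\sigma,\lambda)$ given by a function $\sigma:S\to S'$ with $\sigma(\iota)=\iota'$ and a (partially defined) function $\lambda:\mathcal{L}\to\mathcal{L}'$ of the labeling sets such that, for any transition $s_{in}\stackrel{\ell}{\to}s_{out}$ in $\mathcal{T}$, if $\lambda(\ell)\in\mathcal{L}'$ is defined, then $\sigma(s_{in})\stackrel{\lambda(\ell)}{\to}\sigma(s_{out})$ is a transition in $\mathcal{T}'$. As shown in [WiNi95], the category $\mathcal{C}$ has a coproduct given by
$$(S,\iota,\mathcal{L},\mathcal{T})\sqcup(S',\iota',\mathcal{L}',\mathcal{T}')=\big(S\times\{\iota'\}\cup\{\iota\}\times S',\ (\iota,\iota'),\ \mathcal{L}\sqcup\mathcal{L}',\ \mathcal{T}\sqcup\mathcal{T}'\big),$$
$$\mathcal{T}\sqcup\mathcal{T}':=\{(s_{in},\ell,s_{out})\in\mathcal{T}\}\cup\{(s'_{in},\ell',s'_{out})\in\mathcal{T}'\},$$
where both sets are seen as subsets of
$$(S\times\{\iota'\}\cup\{\iota\}\times S')\times(\mathcal{L}\sqcup\mathcal{L}')\times(S\times\{\iota'\}\cup\{\iota\}\times S').$$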
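This coproduct satisfies the universal property of a categorical sum. The zero object is given by the stationary single state system $S=\{\iota\}$ with empty labels and transitions. There is also a product structure on $\mathcal{C}$ given by
$$(S,\iota,\mathcal{L},\mathcal{T})\times(S',\iota',\mathcal{L}',\mathcal{T}')=\big(S\times S',\ (\iota,\iota'),\ \mathcal{L}\times\mathcal{L}',\ \Pi\big),$$
where the product transition relations are determined by $\Pi=\pi^{-1}(\mathcal{T})\cap\pi'^{-1}(\mathcal{T}')$, for the projections $\pi:S\times S'\to S$ and $\pi':S\times S'\to S'$ and $\pi:\mathcal{L}\times\mathcal{L}'\to\mathcal{L}$ and $\pi':\mathcal{L}\times\mathcal{L}'\to\mathcal{L}'$.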
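The data of a transition system and the morphism condition described above translate directly into code. The sketch below is an illustration under stated assumptions, not the construction of [WiNi95] verbatim: the class `TransitionSystem` and the functions `is_morphism` and `coproduct` are hypothetical names, the morphism check only tests the condition for labels on which the partial label map is defined, and the coproduct glues the two systems at their initial states, tagging labels with 0 or 1 to keep the two label sets disjoint.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TransitionSystem:
    states: frozenset
    initial: object
    labels: frozenset
    transitions: frozenset  # triples (s_in, label, s_out)

def is_morphism(src, dst, sigma, lam):
    """sigma: dict on states (total); lam: dict on labels (possibly partial)."""
    if sigma[src.initial] != dst.initial:
        return False
    for s_in, label, s_out in src.transitions:
        if label in lam:  # the condition is only imposed where lam is defined
            if (sigma[s_in], lam[label], sigma[s_out]) not in dst.transitions:
                return False
    return True

def coproduct(a, b):
    """Sum of two systems, identified at their initial states."""
    left = lambda s: (s, b.initial)
    right = lambda s: (a.initial, s)
    states = frozenset(left(s) for s in a.states) | frozenset(right(s) for s in b.states)
    labels = frozenset((0, l) for l in a.labels) | frozenset((1, l) for l in b.labels)
    transitions = (
        frozenset((left(s), (0, l), left(t)) for s, l, t in a.transitions)
        | frozenset((right(s), (1, l), right(t)) for s, l, t in b.transitions)
    )
    return TransitionSystem(states, (a.initial, b.initial), labels, transitions)

# Two one-step systems and their sum.
a = TransitionSystem(frozenset({0, 1}), 0, frozenset({"x"}), frozenset({(0, "x", 1)}))
b = TransitionSystem(frozenset({"p", "q"}), "p", frozenset({"y"}), frozenset({("p", "y", "q")}))
c = coproduct(a, b)

# Canonical inclusion of a into the sum.
sigma_a = {s: (s, b.initial) for s in a.states}
lam_a = {"x": (0, "x")}
```

In the sum `c` the two initial states are identified, so the resulting process can start with either the `x` step or the `y` step, which matches the reading of coproducts as non-deterministic sums.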
4.3.1. Probabilistic transition systems
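A probabilistic category of transition systems can be constructed as in [Mar19], by taking a wreath product of the category $\mathcal{C}$ of transition systems described above with a category of finite probabilities. The resulting category has objects given by finite combinations $\Lambda\tau=\sum_k\lambda_k\tau_k$, with $\tau_k\in\mathrm{Obj}(\mathcal{C})$ and $\Lambda=(\lambda_k)$ a finite probability distribution, and morphisms $\Phi:\Lambda\tau\to\Lambda'\tau'$ given by a stochastic map $S$ with $S\Lambda=\Lambda'$ and morphisms $F_{ab,r}$ with $F_{ab,r}\in\mathrm{Mor}_{\mathcal{C}}(\tau_b,\tau'_a)$ with probabilities $\mu^r_{ab}$ with $\sum_r\mu^r_{ab}=S_{ab}$. The objects of this category can be seen as non-deterministic automata with states set $\cup_k S_k$, for $S_k$ the set of states of $\tau_k$, which are a combination of subsystems $\tau_k$ that are activated with probabilities $\lambda_k$. A morphism in this category consists of a stochastic map affecting the probabilities of the subsystems and non-deterministic maps of the states and labelling systems and transitions, applied with probabilities $\mu^r_{ab}$.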
4.4. Adjunction and optimality of resources
Suppose then that we have a category $\mathcal{C}$ as above that models distributed/concurrent computational architecture (a category of transition systems or of higher dimensional automata). We also assume that we have a category $\mathcal{R}$ describing metabolic or informational resources. The description of the resource constraints associated to a given automaton is encoded in a (strict symmetric monoidal) functor $\rho:\mathcal{C}\to\mathcal{R}$. The property of being strict symmetric monoidal here encodes the requirement that independent systems combine with combined resources.
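A stronger property would be to require that the functor $\rho:\mathcal{C}\to\mathcal{R}$ that assigns resources to computational systems has a left adjoint, a functor $\beta:\mathcal{R}\to\mathcal{C}$ such that for all objects $C\in\mathrm{Obj}(\mathcal{C})$ and $A\in\mathrm{Obj}(\mathcal{R})$ there is a bijection
$$\mathrm{Mor}_{\mathcal{C}}(\beta(A),C)\simeq\mathrm{Mor}_{\mathcal{R}}(A,\rho(C)).\qquad(4.2)$$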
The meaning of the left-adjoint functor $\beta$ and the adjunction formula (4.2) can be understood as follows. In general an adjoint functor is a solution to an optimization problem. In this case the assignment $A\mapsto\beta(A)$ via the functor $\beta:\mathcal{R}\to\mathcal{C}$ is an optimal way of assigning a computational system $\beta(A)$ in the category $\mathcal{C}$ to given constraints on the available resources, encoded by the object $A\in\mathrm{Obj}(\mathcal{R})$. The optimization is expressed through the adjunction (4.2), which states that any possible conversion of resources from $A$ to the resources $\rho(C)$ associated to a system $C\in\mathrm{Obj}(\mathcal{C})$ determines in a unique way a corresponding modification of the system $\beta(A)$ into the system $C$. Note, moreover, that the system $\beta(A)$ is constructed from the assigned resources $A$, and since some of the resources encoded in $A$ are used for the manufacturing of $\beta(A)$ one expects that there will be a conversion from $A$ to the remaining resources available to the system $\beta(A)$, namely $\rho(\beta(A))$. The existence of the left-adjoint (hence the possibility of solving this optimization problem) is equivalent to the fact that the conversion of resources $u_A:A\to\rho(\beta(A))$ is the initial object in the category $A\downarrow\rho$. Here, for an object $A\in\mathrm{Obj}(\mathcal{R})$ the comma category $A\downarrow\rho$ of objects $\rho$-under $A$ has objects the pairs $(C,f)$ with $C\in\mathrm{Obj}(\mathcal{C})$ and $f:A\to\rho(C)$ a morphism in $\mathcal{R}$ and morphisms $\phi:(C,f)\to(C',f')$ given by morphisms $\phi:C\to C'$ in $\mathcal{C}$ such that $\rho(\phi)\circ f=f'$, that is, such that one has the commutative diagram