Communication within biological systems often involves chemical processes in which a receptor (protein, DNA or a living cell) transitions among several metastable states, with transition rates modulated by the intensity of an input signal. Examples of such input signals include: the concentration of a ligand, transcription factor, or metabolic substrate; intensity of a light source; and potential difference across a cell membrane.
A simple example of such a channel is the BIND channel, in which a receptor protein has two states: bound to the ligand, or unbound; the receptor is sensitive to the signal (ligand concentration) in the unbound state, and insensitive in the bound state. More elaborate channel schemes abound in biology: examples include the Channelrhodopsin-2 receptor (ChR2) and the acetylcholine receptor (ACh) [3, 4], with three and five states, respectively. These systems are not restricted to receptors in signal transduction: similar models have been proposed for transcription of RNA in the presence of a promoter.
Receptor proteins may be modelled as Markov processes on a finite graph, with vertices representing the state of the receptor, and a weighted directed edge , with weight representing the transition rate from state to state . In general, not all of these edges have rates that are sensitive to the input signal, and not all the state transitions are observable by the organism. We will describe these systems more formally in the next section, but from this description the system output can also be modelled as a hidden Markov chain.
Diagram (1) illustrates the structure of such a partially observed channel model, in a discrete time formulation: a sequence of input signals () modulates the transition rates governing the evolution of a channel state vector (), a subset of which form the observable channel output (). The fully observed channel refers to ; the partially observed channel to .
There are many ways to evaluate the performance and capabilities of a signalling system, including ideal observer and signal detection analysis  and fidelity of stimulus reconstruction . In this paper we consider the framework of information theory, particularly the mutual information (MI) between the input signal and the state of the receptor (either all states, , or the subset of observable states, ). In our previous work, we calculated mutual information for individual receptors including ChR2 and ACh , and the BIND receptor for multiple receptors ; elsewhere, mutual information has been used as a figure of merit in biological systems such as the sensitivity of gene expression [10, 11].
Since the partially observed system consists of the features of signal transduction that are used by the organism, a complete analysis of the partially observed system is essential. While the fully observed system MI can be obtained in closed form for IID inputs , for the partially observed system the MI is generally intractable. Explicit expressions for the entropy rates of hidden Markov chains are often difficult to obtain, and new, approximate methods are required.
Our previous investigations revealed a qualitative relationship between the structure of sensitivity (which state transitions are modulated by the input) and observability (which components of are included in ) and the MI of the fully and partially observed systems. In some cases the fully observed system has the same, or nearly the same MI as the partially observed system, so the latter can be studied using the closed form expression for the former. In other cases, such as the important neurotransmitter ACh, the fully observed and partially observed MI differ by an order of magnitude.
The key contribution of this paper is to develop a linear noise approximation [12, 13] applicable to large populations of senders and receivers, that gives a closed-form solution for the spectral efficiency (MI per bandwidth) of both the fully- and partially-observed signal transduction channel models. For our linear channel approximation, we show that high frequency information is transduced more efficiently when observable transitions are also sensitive; we also show that the spectral efficiency of receptors is dependent on whether sensitive transitions are observable. These results have biological significance, for example in predicting how an organism will observe high-frequency phenomena.
II System model
II-A Physical model
Consider a biochemical receptor that is sensitive to a given input process . We first describe a discrete-time channel model, , from which we will derive a continuous time model expressed as a stochastic differential equation (8) with . As noted in the introduction, different types of receptors are sensitive to a variety of input stimuli.
For concreteness, here we will consider a simple three-state chain with one observable state (cf. Fig. 1). (However, our framework can be extended to general signal transduction systems, such as those considered in .) We see that:
- Only some of the state transitions are sensitive to the input. In Fig. 1, only the filled arrows are sensitive to the concentration of the ligand: their transition rate is directly proportional to concentration. The unfilled arrows always occur at the same rate, independent of concentration.
- Only some of the state transitions are observable by the organism. In Fig. 1, the unfilled bubbles (representing states 1 and 2) are indistinguishable to the organism. Thus, only transitions between states 2 and 3 are observable, while transitions between 1 and 2 occur internally to the receptor, hidden from the organism. Physically, this may occur if the receptor has an ion channel, which is closed in states 1 and 2, and open in state 3.
We consider both “fully observed” systems (the channel with signal as input and channel state as output) and “partially observed” systems (the channel with as input and as output, where is a rectangular matrix giving the observable output). Intuitively, we expect most of the information transduced about the signal to come from the edges that are both observable and sensitive. But even in networks for which all sensitive edges are hidden, there can still be a positive mutual information and capacity.
We consider an input signal generated by an ensemble of independent binary sources that together produce a concentration or intensity , where , , and are the low and high concentration signals that would obtain were all sources simultaneously inactive, or active, respectively.
As a simplification, we assume the input process is uncorrelated across discrete timesteps of size (see Discussion).
We let represent the relative size of fluctuations in the input concentration signal. By the central limit theorem, for sufficiently large, we may approximate the input signal with a discrete time Gaussian white noise input, and write the input process as
is the mean concentration, the variance of the input is, is Gaussian white noise with , and . As a rule of thumb we restrict attention to parameter values for which the Gaussian approximation is appropriate, namely . For the systems we consider, , so is a suitable parameter range.
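The central-limit argument above can be checked numerically. The following is a minimal sketch, assuming that each of the sources is independently active with a fixed probability and that the concentration interpolates linearly between the low and high values according to the fraction of active sources. The names `m_sources`, `p_high`, `c_low`, and `c_high`, and all numerical values, are illustrative assumptions rather than quantities taken from the text.

```python
import numpy as np

def sample_input(m_sources, p_high, c_low, c_high, n_steps, rng):
    """Concentration at each timestep produced by m_sources IID binary sources."""
    # Number of active sources per timestep is Binomial(m_sources, p_high).
    active = rng.binomial(m_sources, p_high, size=n_steps)
    return c_low + (c_high - c_low) * active / m_sources

def gaussian_moments(m_sources, p_high, c_low, c_high):
    """Mean and variance of the Gaussian approximation to the input."""
    mean = c_low + (c_high - c_low) * p_high
    var = (c_high - c_low) ** 2 * p_high * (1.0 - p_high) / m_sources
    return mean, var

# Hypothetical parameter values, chosen only for illustration.
rng = np.random.default_rng(0)
samples = sample_input(1000, 0.3, 1.0, 2.0, 200_000, rng)
mean, var = gaussian_moments(1000, 0.3, 1.0, 2.0)
print(samples.mean(), mean)
print(samples.var(), var)
```

Under these assumptions the variance of the input shrinks in inverse proportion to the number of sources, which is the scaling the Gaussian white noise approximation relies on.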
Next, consider a population of independent receptors, each individually described by a channel state given by one of the standard unit vectors . We have shown previously that for a population of independent binary-state BIND receptors receiving common IID input, the mutual information and capacity are exactly times the single-receptor mutual information and capacity, respectively. An analogous result holds for arbitrary finite-state receptors with IID input (the proof is straightforward but lengthy; we omit it for space). We assume that each receptor acts, independently of the others, as a conditional Markov process on , with transition rates depending on the input concentration . Assuming standard first-order mass-action kinetics, the transition probability from state to state is affine-linear. Thus for each receptor we have
Recall that only some of the state transitions are sensitive to the input concentration. If , we define the transition to be insensitive; otherwise it is sensitive. For the sensitive transitions, the mean input concentration affects the mean transition rate. Thus we have where is the intrinsic transition rate. (Introducing a nonlinear dependence of the on would not qualitatively change our results.) We will require that the channel state process is irreducible for all .
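As an illustration of the distinction between sensitive and insensitive transitions, the sketch below builds a continuous-time rate matrix for a three-state chain with four directed edges and computes its stationary distribution at several input concentrations. It assumes that sensitive rates are directly proportional to the concentration, as described for Fig. 1. The function names, the edge ordering, and the numerical rates are hypothetical.

```python
import numpy as np

# Directed edges of the chain 1 <-> 2 <-> 3, written with 0-based state indices.
EDGES = [(0, 1), (1, 0), (1, 2), (2, 1)]

def rate_matrix(conc, intrinsic, sensitive):
    """Generator Q with Q[i, j] the rate of i -> j; each row sums to zero.

    Sensitive edges scale linearly with `conc`; insensitive edges ignore it.
    """
    q = np.zeros((3, 3))
    for (i, j), k, is_sensitive in zip(EDGES, intrinsic, sensitive):
        q[i, j] = k * conc if is_sensitive else k
    q -= np.diag(q.sum(axis=1))
    return q

def stationary(q):
    """Solve pi Q = 0 with sum(pi) = 1 by least squares."""
    lhs = np.vstack([q.T, np.ones(3)])
    rhs = np.array([0.0, 0.0, 0.0, 1.0])
    return np.linalg.lstsq(lhs, rhs, rcond=None)[0]

# Hypothetical rates: only the 1 -> 2 edge is sensitive to the input.
intrinsic = [2.0, 1.0, 1.0, 1.0]
sensitive = [True, False, False, False]
for conc in (0.5, 1.0, 2.0):
    pi = stationary(rate_matrix(conc, intrinsic, sensitive))
    print(conc, np.round(pi, 3))
```

The chain is irreducible for any positive concentration, so the stationary distribution is unique and shifts toward states 2 and 3 as the concentration grows.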
Just as we assume the population of independent transmitters is large enough to justify a Gaussian approximation for the input concentration signal, we also assume a large population of independent receptors exposed to the same input signal. For large , the fluctuations in number of transitions per time among states, relative to the mean number of transitions per unit time, scale as . In general, we may consider a variety of scaling relations between , the number of sources, and , the number of receptors. For this paper we assume for simplicity. Our results would not change significantly with in some other fixed proportion. Thus we consider independent receptors all exposed to an identical input concentration . Kurtz’s theorem guarantees that (over any given finite time horizon) the discrete process converges as to a Gaussian process . Thus for sufficiently large , we may approximate the receptor state via a chemical Langevin equation [17, 18]. We write the three-state fractional population vector as with and . To derive a tractable channel model based on the linear noise approximation, we begin with the exact (discrete) representation with channels, in which case . For , we may approximate the population state vector as a Gaussian random vector process in continuous time, based on several observations:
- In the absence of input fluctuations () the mean channel state obeys a linear evolution equation with stable eigenvalues (except the Perron-Frobenius eigenvalue corresponding to the steady state channel distribution).
- The input fluctuations are stationary, so has a stationary distribution, given by , with Generally both and the normalization depend on .
- For , the fluctuating part of the input concentration signal will be small (variance ).
- The variance of the channel state, , will be small when both and are large.
To obtain a linear noise approximation, we expand the joint process around the mean, and neglect terms smaller than size . We subtract the mean with a linear change of coordinates . Thus, represents the deviations of the channel population vector from its mean value (therefore ). We thereby obtain a first-order approximation with additive Gaussian noise represented by the increments of several independent Wiener processes ( for the intrinsic fluctuations in the transition):
where the matrix , the vector , and the matrix all depend on the parameter , and is a vector of standard Gaussian random variables, with independent components and uncorrelated in time. Thus we arrive at a linear channel model, with structure arising from the channel state transition graph. The input signal is, a mean-zero Gaussian discrete time process with variance , the channel noise is given by the Wiener processes comprising , and the channel memory arises from the mean transition rate matrix .
In order to obtain analytical results we will consider a continuous time limit in which the channel model takes the form of a multidimensional Ornstein-Uhlenbeck process 
Here is a vector with components summing to zero, representing the fluctuations of the channel state around the mean occupancy (with the mean occupancy normalized to unity); is a matrix (depending on ) with a single null eigenvector corresponding to the stationary distribution, and remaining eigenvalues real and negative. The independent Wiener process increments represent the input. The input fluctuations are “transduced” according to the vector , where is the stoichiometry vector for directed edge and is the collection of sensitive edges in the graph. Note that the components of sum to zero. The channel noise arises from the vector of independent Wiener process increments; note is also independent of . For the three-state model with four edges, is a matrix
Here is the mean flux across the th directed edge, with denoting the pair corresponding to edge . A similar construction results for arbitrary signal-transduction systems represented as a directed graph with vertices and edges; in general will be
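The structure of this linear model can be sketched numerically with an Euler-Maruyama discretisation. The sketch assumes a drift matrix assembled from the mean rates and stoichiometry vectors, an input vector given by the flux-weighted sum of the stoichiometry vectors of the sensitive edges, and a noise matrix whose columns are stoichiometry vectors scaled by the square root of the mean flux. These scalings, the parameter `eps`, and all numerical values are assumptions for illustration and may differ from the exact coefficients of the model in the text.

```python
import numpy as np

EDGES = [(0, 1), (1, 0), (1, 2), (2, 1)]        # chain 1 <-> 2 <-> 3, 0-based
MEAN_RATES = np.array([1.0, 1.0, 1.0, 1.0])      # assumed mean rate per edge
SENSITIVE = np.array([True, False, False, False])

def build_model():
    """Return drift A, input vector B and noise matrix C (all assumptions)."""
    src = np.array([i for i, _ in EDGES])
    zeta = np.zeros((3, len(EDGES)))             # column k is the stoichiometry of edge k
    a = np.zeros((3, 3))
    for k, (i, j) in enumerate(EDGES):
        zeta[i, k], zeta[j, k] = -1.0, 1.0
        a[:, i] += MEAN_RATES[k] * zeta[:, k]
    # Stationary occupancy: A pi = 0 with sum(pi) = 1.
    lhs = np.vstack([a, np.ones(3)])
    pi = np.linalg.lstsq(lhs, np.array([0.0, 0.0, 0.0, 1.0]), rcond=None)[0]
    flux = MEAN_RATES * pi[src]                  # mean flux across each directed edge
    b = zeta[:, SENSITIVE] @ flux[SENSITIVE]
    c = zeta * np.sqrt(flux)
    return a, b, c

def simulate(a, b, c, eps, dt, n_steps, rng):
    """Euler-Maruyama path of dX = A X dt + eps (B dW_in + C dW_noise)."""
    x = np.zeros(3)
    path = np.empty((n_steps, 3))
    for t in range(n_steps):
        dw_in = rng.normal(scale=np.sqrt(dt))
        dw_noise = rng.normal(scale=np.sqrt(dt), size=c.shape[1])
        x = x + a @ x * dt + eps * (b * dw_in + c @ dw_noise)
        path[t] = x
    return path

a, b, c = build_model()
path = simulate(a, b, c, eps=0.05, dt=0.01, n_steps=5000,
                rng=np.random.default_rng(1))
print(np.abs(path.sum(axis=1)).max())
```

Because every stoichiometry vector sums to zero, the simulated fluctuations stay on the plane of zero total deviation, up to floating-point rounding, consistent with the constraint stated above.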
In this formula denotes the pseudodeterminant, the product of the nonzero eigenvalues of a square matrix. For the matrix is full rank; the matrix has rank . The rank-one matrix is the outer product of a vector that is a linear combination of the columns of each of which is proportional to one of the stoichiometry vectors . It follows that the matrix has the same rank as . The matrices in (10) are Hermitian and have the same number of nonzero (and real) eigenvalues, thus the ratio of pseudodeterminants is well defined. Note that the factor of multiplying and in (8) cancels in the ratio . Therefore, provided is small enough (that is, are large enough) to ensure validity of the underlying Gaussian approximations, the following results hold independent of the specific value of
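For readers who wish to evaluate such ratios numerically, a pseudodeterminant can be computed from the eigenvalues of a Hermitian matrix. This is a minimal sketch: the function name and the relative tolerance `rtol` used to decide which eigenvalues count as zero are illustrative choices, not part of the text.

```python
import numpy as np

def pseudodeterminant(matrix, rtol=1e-10):
    """Product of the nonzero eigenvalues of a Hermitian matrix."""
    eigvals = np.linalg.eigvalsh(matrix)
    scale = np.abs(eigvals).max()
    if scale == 0.0:
        return 1.0  # empty product, the usual convention for the zero matrix
    # Eigenvalues below rtol * (largest magnitude) are treated as zero.
    nonzero = eigvals[np.abs(eigvals) > rtol * scale]
    return float(np.prod(nonzero))

# A rank-one example: the only nonzero eigenvalue is the squared norm of v.
v = np.array([1.0, -2.0, 1.0])
rank_one = np.outer(v, v)
print(pseudodeterminant(rank_one))
```

The threshold matters in practice because rank-deficient matrices built from stoichiometry vectors have eigenvalues that are zero only up to rounding error.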
We wish to compare the spectral efficiency for the fully-observed system (10) with that of the partially-observed system. Let the observation vector (meaning transitions are directly observable, while transitions are not). Then the spectral efficiency appearing in the expression for the mutual information rate between the input and the observed channel state satisfies
In general, the integrals and diverge as grows without bound. Rather than impose an arbitrary cutoff frequency, we study the integrands directly in their high- and low-frequency limits,
While varies over a finite range as varies , we find that is independent of and can be simplified to
Figs. 2-3 show the spectral efficiency (SE) of the fully observed system (solid black lines), together with the spectral efficiency of the partially observed system in the high-frequency (dashed red lines) and low-frequency (dashed blue lines) limits, as the high-concentration input probability ranges from 0.01 to 0.99. For all systems we use standard parameters for insensitive edges and for sensitive edges. Inset diagrams indicate which edges are sensitive (solid black arrows), resp. insensitive (open arrows), following the same scheme as Fig. 1.
In Fig. 2, a single transition is sensitive to the input. When this transition is observable, the high-frequency SE is indistinguishable from the fully-observed SE (Fig. 2A, 2B, black and red dashed lines), while the low-frequency SE is smaller (blue dashed lines). In contrast, when the only observable transitions are insensitive, the low-frequency SE is reduced, but the high-frequency SE is effectively zero (Fig. 2C, 2D). Thus, heuristically, only the slow time scale carries information about the hidden transition.
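The qualitative contrast between observable and hidden sensitive edges can be illustrated with a frequency-resolved calculation for a scalar observation. The sketch below is not the expression derived in the text. It assumes the standard Gaussian-channel form, one half of the logarithm of one plus the ratio of signal spectrum to noise spectrum, applied to the occupancy of state 3, with all mean rates set to one and with the same hypothetical construction of the drift, input, and noise terms from stoichiometry vectors and mean fluxes. The base of the logarithm (bits) is likewise an assumption.

```python
import numpy as np

EDGES = [(0, 1), (1, 0), (1, 2), (2, 1)]   # chain 1 <-> 2 <-> 3, 0-based indices
RATES = np.ones(4)                          # assumed mean rate on every edge
OBS = np.array([0.0, 0.0, 1.0])             # only the occupancy of state 3 is observed

def model(sensitive_edges):
    """Drift A, input vector B, noise matrix C for the given sensitive edges."""
    zeta = np.zeros((3, 4))
    a = np.zeros((3, 3))
    for k, (i, j) in enumerate(EDGES):
        zeta[i, k], zeta[j, k] = -1.0, 1.0
        a[:, i] += RATES[k] * zeta[:, k]
    lhs = np.vstack([a, np.ones(3)])
    pi = np.linalg.lstsq(lhs, np.array([0.0, 0.0, 0.0, 1.0]), rcond=None)[0]
    flux = RATES * pi[[i for i, _ in EDGES]]
    b = sum(flux[k] * zeta[:, k] for k in sensitive_edges)
    c = zeta * np.sqrt(flux)
    return a, b, c

def spectral_efficiency(omega, a, b, c, obs=OBS):
    """0.5 * log2(1 + signal/noise) for the scalar output obs . X at frequency omega."""
    h = obs @ np.linalg.inv(1j * omega * np.eye(3) - a)   # transfer row vector
    signal = abs(h @ b) ** 2
    noise = np.real(h @ c @ c.T @ h.conj())
    return 0.5 * np.log2(1.0 + signal / noise)

a, b_observable, c = model([2])   # sensitive edge 2 -> 3 changes the observed state
_, b_hidden, _ = model([0])       # sensitive edge 1 -> 2 is hidden from the observer
for omega in (1e-3, 1e3):
    print(omega,
          spectral_efficiency(omega, a, b_observable, c),
          spectral_efficiency(omega, a, b_hidden, c))
```

In this toy setting the hidden sensitive edge contributes a positive value at low frequency and a value near zero at high frequency, while the observable sensitive edge contributes at both, in line with the heuristic stated above.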
In each case the maximal value of SE occurs at an interior value of , ranging roughly from 0.3 to 0.4. The maximal SE for the fully observed system is 0.025 for cases A and D (which are identical, by symmetry, for the parameters we use) and is 0.022 for cases B and C.
Fig. 3 contrasts a system with two sensitive transitions directed towards the third state (the transition to which is observable) (Fig. 3A), and a system with two sensitive transitions directed away from the third state (Fig. 3B). For the fully observed system, these two cases are identical and hence have identical spectral efficiency curves. However, for the partially observed system, we find that both the low and high frequency spectral efficiency limits are higher for the system with sensitive transitions moving towards state 3 than the system with sensitive transitions moving away from state 3. In contrast to the systems with a single sensitive edge, we find that the spectral efficiency at low frequencies exceeds that at high frequencies by a significant margin, when two sensitive edges operate in concert.
Panel 3C shows the sum of the two SE curves for the systems with a single edge directed towards the third state (sum of curves in 2A and 2C). Panel 3D shows the sum of the SE curves for the systems with a single edge directed away from the third state (sum of curves in 2B and 2D).
The spectral efficiency of the fully observed system with edges sensitive is within 5% of the sum of the corresponding single-edge SE curves (compare Fig. 3A, 3C). The SE of the fully observed system with sensitive edges is within 12% of the sum of the and systems. The mechanism leading to higher SE for low- versus high-frequency inputs (Figs. 3A-B) remains to be explored.
By considering a large population of independent sources signaling via concentration fluctuations, and a large population of independent protein molecules transducing the concentration signal through intensity-mediated state changes, we derived an additive Gaussian channel based on a linear noise approximation. Although models we have studied recently are more realistic for single receptors [22, 1, 9, 8], the model in this paper is analytically tractable for large populations with hidden states, and shows how signal transduction depends on the structure of observability and sensitivity in a channel. For example, Figs. 2A and 2C suggest that directly observable transitions carry information about the input on both fast and slow timescales, while hidden transitions carry information on slow timescales. This reflects the stochastic shielding phenomenon [23, 18]; cf. Fig. 4A of ; see also .
A shortcoming of the present treatment is the assumption of unlimited input signal power. Moreover, rapid signaling via molecular concentrations is limited by the physics of diffusion, and a more realistic treatment would involve an input with a natural high frequency cutoff. The typical persistence time of a signaling molecule with diffusion constant , in a geometry of size , is roughly For acetylcholine signaling in the neuromuscular junction (NMJ), typical values are and , giving msec. For comparison, nerve cells signaling through the NMJ transmit action potentials with maximal rates in the tens of Hz . In  we considered the two-state BIND model driven both with IID inputs and with inputs having finite correlation time, and in  we considered thermodynamic constraints on diffusion-based signaling schemes. Further analysis along these lines would be a natural next step for the linear channel model explored here.
- [1] P. J. Thomas and A. W. Eckford, “Capacity of a simple intercellular signal transduction channel,” IEEE Transactions on Information Theory, vol. 62, pp. 7358–7382, Dec 2016.
- [2] G. Nagel, T. Szellas, W. Huhn, S. Kateriya, N. Adeishvili, P. Berthold, D. Ollig, P. Hegemann, and E. Bamberg, “Channelrhodopsin-2, a directly light-gated cation-selective membrane channel,” PNAS, vol. 100, no. 24, pp. 13940–13945, 2003.
- [3] B. Sakmann, J. Patlak, and E. Neher, “Single acetylcholine-activated channels show burst-kinetics in presence of desensitizing concentrations of agonist,” Nature, vol. 286, no. 5768, p. 71, 1980.
- [4] D. Colquhoun and A. G. Hawkes, Single-Channel Recording, ch. The Principles of the Stochastic Interpretation of Ion-Channel Mechanisms. Plenum Press, New York, 1983.
- [5] G. Rieckh and G. Tkačik, “Noise and information transmission in promoters with multiple internal states,” Biophys. J., vol. 106, pp. 1194–1204, 2014.
- [6] J. A. Swets, Signal detection theory and ROC analysis in psychology and diagnostics: Collected papers. Psychology Press, 2014.
- [7] T. Berger and J. D. Gibson, “Lossy source coding,” IEEE Transactions on Information Theory, vol. 44, pp. 2693–2723, 1998.
- [8] A. W. Eckford and P. J. Thomas, “The channel capacity of channelrhodopsin and other intensity-driven signal transduction receptors,” IEEE Transactions on Molecular, Biological, and Multi-Scale Communications, 2019. In press (arXiv preprint arXiv:1804.04533).
- [9] P. J. Thomas and A. W. Eckford, “Shannon capacity of signal transduction for multiple independent receptors,” in 2016 IEEE International Symposium on Information Theory (ISIT), pp. 1804–1808, July 2016.
- [10] G. Tkačik, C. G. Callan, and W. Bialek, “Information flow and optimization in transcriptional regulation,” Proceedings of the National Academy of Sciences, 2008.
- [11] G. Tkačik and A. M. Walczak, “Information transmission in genetic regulatory networks: a review,” Journal of Physics: Condensed Matter, vol. 23, no. 15, p. 153102, 2011.
- [12] J. Elf and M. Ehrenberg, “Fast evaluation of fluctuations in biochemical networks with the linear noise approximation,” Genome Research, vol. 13, no. 11, pp. 2475–2484, 2003.
- [13] E. Wallace, D. Gillespie, K. Sanft, and L. Petzold, “Linear noise approximation is valid over limited times for any chemical system that is sufficiently large,” IET Systems Biology, vol. 6, no. 4, pp. 102–115, 2012.
- [14] G. E. Briggs and J. B. S. Haldane, “A note on the kinetics of enzyme action,” Biochemical Journal, vol. 19, no. 2, p. 338, 1925.
- [15] D. R. Schmidt, R. F. Galán, and P. J. Thomas, “Stochastic shielding and edge importance for Markov chains with timescale separation,” PLoS Computational Biology, vol. 14, no. 6, p. e1006206, 2018.
- [16] T. G. Kurtz, “Limit theorems for sequences of jump Markov processes approximating ordinary differential processes,” Journal of Applied Probability, vol. 8, no. 2, pp. 344–356, 1971.
- [17] C. W. Gardiner, Handbook of Stochastic Methods for Physics, Chemistry, and the Natural Sciences. Springer Verlag, 2nd ed., 2004.
- [18] D. R. Schmidt and P. J. Thomas, “Measuring edge importance: a quantitative analysis of the stochastic shielding approximation for random processes on graphs,” The Journal of Mathematical Neuroscience, vol. 4, no. 1, p. 6, 2014.
- [19] A. Kolmogorov, “On the Shannon theory of information transmission in the case of continuous signals,” IRE Transactions on Information Theory, vol. 2, no. 4, pp. 102–108, 1956.
- [20] M. Pinsker, Information and Information Stability of Random Variables and Processes. Translated and edited by Amiel Feinstein. Holden-Day, Inc., San Francisco, Calif., 1964. Cf. Equation 10.4.1.
- [21] S. Ihara, Information theory for continuous systems, vol. 2. World Scientific, 1993.
- [22] P. J. Thomas, D. J. Spencer, S. K. Hampton, P. Park, and J. Zurkus, “The diffusion-limited biochemical signal-relay channel,” in Advances in NIPS, 2004.
- [23] N. T. Schmandt and R. F. Galán, “Stochastic-shielding approximation of Markov chains and its application to efficiently simulate random ion-channel gating,” Physical Review Letters, vol. 109, no. 11, p. 118101, 2012.
- [24] D. Bernardi and B. Lindner, “A frequency-resolved mutual information rate and its application to neural systems,” Journal of Neurophysiology, vol. 113, no. 5, pp. 1342–1357, 2014.
- [25] K. Tai, S. D. Bond, H. R. MacMillan, N. A. Baker, M. J. Holst, and J. A. McCammon, “Finite element simulations of acetylcholine diffusion in neuromuscular junctions,” Biophysical Journal, vol. 84, no. 4, pp. 2234–2241, 2003.
- [26] B. Bigland-Ritchie, R. Johansson, O. C. Lippold, S. Smith, and J. J. Woods, “Changes in motoneurone firing rates during sustained maximal voluntary contractions,” The Journal of Physiology, vol. 340, no. 1, pp. 335–346, 1983.
- [27] A. W. Eckford, B. Kuznets-Speck, M. Hinczewski, and P. J. Thomas, “Thermodynamic properties of molecular communication,” in 2018 IEEE International Symposium on Information Theory (ISIT), pp. 2545–2549, IEEE, 2018.