I. Introduction
In cognitive science it is often assumed that agents can be described as input/output systems, an idea based on traditional, computational accounts of cognition [1, 2, 3]. In these models, the emphasis is on internal models of the world, central processing and the sense-model-plan-act framework, often neglecting embodiment, situatedness and feedback from the environment [4]. More recent attempts, e.g., [5, 6, 7], have proposed closed-loop descriptions of cognitive systems using internal forward/inverse models in an attempt to provide better accounts of behaviour in living organisms. However, in both the input/output and the closed-loop architectures advocated by these approaches, the role of perceptual and motor processes is thought to be fundamentally modular [2], i.e., these functions can be described as nearly independent, (informationally) encapsulated components with minimal interactions.
In recent years, theories of estimation and control have become increasingly popular accounts of perception [8, 9, 10] and action [5, 6, 7] respectively. In this context, the Kalman-Bucy filter is used as a model of perception [9, 11], while LQR (linear quadratic regulator) control constitutes the basis of various accounts of motor control [12, 13]. In previous work [14] we claimed that the idea of modularity of action and perception can be seen as an analogue of the separation principle in control theory [15, 16, 17]. According to this principle, the problems of estimation and control of a system can be solved separately and their solutions can be optimally combined under a set of assumptions. Following this, one can sequentially combine a Kalman-Bucy filter and LQR to create the LQG (linear quadratic Gaussian) architecture, used as a general methodology for several models of sensorimotor loops, e.g., [6, 18, 13, 19]. The “classical sandwich” [3] of cognitive science thus survives, we claim, even in the forward/inverse models formulation of perception and motor control.
The fields of embodied and enactive cognitive science, on the other hand, emphasise the deep integration of perception and action, seen as fundamentally intertwined [20, 21, 22]. In [14] we proposed to use a framework based on the formulation of perception and action as estimation and control while not implementing the conditions for the separation principle, i.e., active inference. Active inference is a process theory based on the free energy principle [23], describing cognitive functions (perception and action, but also learning and attention) as processes of minimisation of sensory surprisal [23, 24]. More precisely, since this quantity is not directly accessible by an agent, it is thought that the variational free energy (an upper bound on sensory surprisal) is minimised in its place. In active inference, perceptual and motor processes are often described as entangled and inseparable [25, 26, 27], thus providing a new possible methodology combining estimation and control following embodied/enactive theories of the mind.
We previously presented a conceptual account of active inference and its role for nonmodular architectures of cognitive systems [14]. Here we introduce a minimal agent model highlighting the differences between the two implementations (LQG vs. active inference), especially in the presence of unknown external stimuli affecting an agent’s observations.
II. LQG and the separation principle
The framework provided by LQG control and based on the separation principle linearly combines two processes: 1) estimation, or inference of hidden properties of the environment, and 2) control, or regulation of variables of interest. The estimation of hidden variables is based on the presence of a Kalman (for discrete-time systems) or Kalman-Bucy (for continuous-time systems) filter, while the control of the desired variables is based on LQR [15, 16, 17]. In particular, this combination is provably optimal under a set of assumptions:

- the estimator is implemented through a state-space model where only linear process dynamics and observation laws describe the environment and its latent states;
- uncertainty or noise in both dynamics and observations is represented by white, zero-mean Gaussian variables;
- the properties of these random variables, in particular their (co)variance matrices, are known;
- the performance of the regulator can be evaluated using a quadratic cost function;
- all the inputs/forces applied to the agent are known, e.g., external disturbances and internal signals such as motor actions.
Following the separation principle, the LQG controller produces optimal estimation and optimal control for linear systems, sequentially combining two separate subsystems, a Kalman-Bucy filter and LQR, in an optimal (i.e., minimum-variance) way [16, 17]. The Kalman-Bucy filter provides the optimal state estimate of a signal and the LQR controller uses this estimate (i.e., the mean) to implement the optimal deterministic controller: LQG control makes use of the estimated mean and feeds it into an LQR controller.
A general linear system to be regulated in the presence of noise on the observed state is described by:
(1)  \dot{x} = A x + B u + w, \qquad y = C x + z
where all the variables and parameters are the same as previously defined for KalmanBucy filters and LQR. Using the separation principle, it can then be shown that minimising the expected value of the costtogo is equivalent to minimising the costtogo for the expected (estimated) state [17]
(2)  J = \int \left( \hat{x}^T Q \hat{x} + u^T R u \right) dt
where we replaced states x with their estimates \hat{x}, meaning that the optimal control can be computed using only the state estimate \hat{x} (i.e., the mean) rather than the true state x. The combined problem of estimation and control in LQG terms is then implemented by the following system combining Kalman-Bucy filter and LQR equations:
(3)  \dot{\hat{x}} = A \hat{x} + B u + K \left( y - C \hat{x} \right), \qquad u = -L \hat{x}
with K the Kalman-Bucy gain and L the LQR feedback gain.
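As an illustration, the structure of (3), an estimator and a controller designed independently and then chained, can be sketched in a few lines of Python. This is a minimal scalar, discrete-time analogue with illustrative parameters of our own choosing (the paper's simulations use the continuous-time, multivariate formulation); the gains are obtained by iterating the corresponding Riccati recursions to convergence:

```python
# Minimal discrete-time LQG sketch for a scalar system
#   x[t+1] = a*x[t] + b*u[t] + w,   y[t] = x[t] + v
# Estimation (Kalman filter) and control (LQR) are designed
# separately and combined, as the separation principle allows.
import random

a, b = 1.0, 0.5          # illustrative plant parameters
q, r = 1.0, 0.1          # LQR cost weights
sw, sv = 1e-4, 1e-4      # process/measurement noise variances

# LQR gain: iterate the discrete Riccati recursion to a fixed point
p = q
for _ in range(500):
    l_gain = (b * p * a) / (r + b * p * b)
    p = q + a * p * a - a * p * b * l_gain

# Kalman gain: iterate the dual (filtering) Riccati recursion
s = 1.0
for _ in range(500):
    k_gain = s / (s + sv)
    s = a * (1 - k_gain) * s * a + sw

random.seed(0)
x, x_hat = 5.0, 0.0      # true state and its estimate
for _ in range(200):
    y = x + random.gauss(0.0, sv ** 0.5)     # noisy observation
    x_hat = x_hat + k_gain * (y - x_hat)     # measurement update
    u = -l_gain * x_hat                      # LQR acts on the estimate only
    x = a * x + b * u + random.gauss(0.0, sw ** 0.5)  # plant step
    x_hat = a * x_hat + b * u                # prediction step
```

Note how the controller never sees the true state x, only the estimate x_hat, while the estimator is given full knowledge of the applied input u; the latter assumption is exactly what active inference drops below.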
III. Active inference
Active inference is a process theory proposed to explain brain functioning and other functions of living systems based on Bayesian inference and optimal control theory [28, 23, 24]. In this section we establish its relations to the LQG architecture, starting by building an active inference version of the regulation of a linear multivariate system and highlighting differences, limitations and possible extensions proposed for the control problem. As with LQG control, we build an estimator of the hidden states. In this case, however, we give a variational account of the estimator in generalised coordinates of motion that generalises the MLE/MAP derivation of Kalman-Bucy filters [29] using variational Bayes with a Laplace approximation [30, 24]. We start by defining a generative model for an agent, capturing the dynamics of the system to control and how these relate to observations, represented in a generalised state-space form [31, 24]:
(4)  \dot{\tilde{x}} = \hat{A} \tilde{x} + \hat{B} \tilde{v} + \tilde{w}, \qquad \tilde{y} = \hat{C} \tilde{x} + \tilde{z}
where the hat over the matrices simply represents the fact that the matrices used in the generative model do not necessarily mirror their counterparts describing the world dynamics ([32, 33]), as shown in our model later. The main difference with respect to LQG, however, is that LQG explicitly mirrors (by construction, in the linear case) the dynamics of the observed system, thus including knowledge of the inputs. In active inference, on the other hand, this vector is not explicitly modelled by an agent, assuming that such information is not available to a system, in accordance with evidence in motor neuroscience suggesting the lack of knowledge of self-produced controls (i.e., efference copy) [34, 25, 35]. It is in fact proposed that a deeper duality of estimation and control exists whereby, in the simplest case (i.e., a purely reflexive account), actions are simply responses to the presence of prediction errors at the proprioceptive level, irrespective of the cause of sensations (self-generated or external forces) [25, 33]. The vector \tilde{v} in the generative model instead encodes external or exogenous inputs in a state-space models context or, from a Bayesian perspective, priors or “desired” outcomes generated by higher layers in hierarchical (Bayesian) implementations [10, 31]. In this light, priors can be used to effectively bias the estimator to “infer” desired rather than observed states, with a controller instantiating actions on the world to fulfil the “observed” (= desired) states of an agent. The variables \tilde{w}, \tilde{z} model the real noise in the environment, making use, however, of the definition of state-space models in generalised coordinates of motion [30, 31], where they are treated as analytic noise with non-zero autocorrelation, generalising the definition of Wiener processes with the Markov property.
This state-space model can then be written down in probabilistic form, mapping the measurement equation to a likelihood (no direct influence of inputs on observations) and the dynamics to a prior [31, 24, 32]. The two multivariate Gaussian probability densities can then be combined and used in the general formulation of the Laplace-encoded variational free energy defined in
[30, 24] (without constants):
(5)  F = -\ln p(\tilde{y}, \tilde{\mu})
the free energy for a generic linear multivariate system then becomes:
(6)  F = \frac{1}{2} \left[ \varepsilon_y^T \Pi_z \varepsilon_y + \varepsilon_x^T \Pi_w \varepsilon_x - \ln |\Pi_z| - \ln |\Pi_w| + (m + n) \ln 2\pi \right]
with prediction errors \varepsilon_y = \tilde{y} - \hat{C} \tilde{\mu} and \varepsilon_x = D \tilde{\mu} - \hat{A} \tilde{\mu} - \hat{B} \tilde{v}
where we explicitly replaced the states with their expectations \tilde{\mu}, since under the Laplace assumption this represents the best estimate of the latent states (i.e., covariances of the approximate, variational density can be recovered analytically [30, 24]). Variables m and n represent the lengths of the observation and state vectors respectively. Expectations play the same role as estimates in LQG; we simply decided to use a notation consistent with some of our previous work [24, 32, 36]. We also defined precision matrices as the inverses of covariance matrices and used |\cdot| to denote the determinant of a matrix. It is important to highlight that, in general, the covariance matrices used in the generative model can be different from the ones used to describe the environment or generative process [33, 32]. To simplify the already heavy notation, we will however represent them in the same way.
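For concreteness, the quadratic (prediction-error) form of the free energy can be evaluated numerically. The sketch below uses scalar states and observations, a single order of motion, and illustrative precisions; all names and values are our own, not quantities from the simulations:

```python
from math import log, pi

def free_energy(y, mu, mu_prime, A_hat, C_hat, v, pi_z, pi_w):
    """Laplace-encoded free energy for a scalar linear model
    (up to constants): precision-weighted squared prediction errors
    minus log-precisions, as in the quadratic form above."""
    eps_y = y - C_hat * mu                 # sensory prediction error
    eps_x = mu_prime - (A_hat * mu + v)    # dynamics prediction error
    return 0.5 * (pi_z * eps_y ** 2 + pi_w * eps_x ** 2
                  - log(pi_z) - log(pi_w) + 2 * log(2 * pi))

# An estimate matching the observation yields lower free energy
# than a mismatched one, everything else being equal:
f_match = free_energy(y=1.0, mu=1.0, mu_prime=0.0, A_hat=0.0, C_hat=1.0,
                      v=0.0, pi_z=1.0, pi_w=1.0)
f_miss = free_energy(y=1.0, mu=0.0, mu_prime=0.0, A_hat=0.0, C_hat=1.0,
                     v=0.0, pi_z=1.0, pi_w=1.0)
print(f_match < f_miss)  # True
```

Raising pi_z makes sensory mismatches more costly relative to dynamics mismatches, which is the precision-weighting trade-off discussed in the final section.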
The recognition dynamics, encoding perception and action in a system minimising free energy [30, 24] and equivalent to estimation and control functions respectively, are implemented in standard active inference formulations as a gradient descent scheme minimising the free energy with respect to the expectations \tilde{\mu} for perception/estimation:
(7)  \dot{\tilde{\mu}} = D \tilde{\mu} - k_{\mu} \frac{\partial F}{\partial \tilde{\mu}}
and with respect to actions a for action/control, assuming only that actions have an effect on observations [28]:
(8)  \dot{a} = - k_a \frac{\partial F}{\partial a} = - k_a \frac{\partial \tilde{y}}{\partial a}^T \frac{\partial F}{\partial \tilde{y}}
The estimation expressed in (7) prescribes a generalisation of Kalman-Bucy filters to trajectories with arbitrary embedding orders, where random variables are not treated as Markov processes [30]. In (7), we also include an extra term D\tilde{\mu} that represents the “mode of the motion” (also the mean for Gaussian variables) for the minimisation in generalised coordinates of motion [31, 24], with D as a differential operator shifting the orders of motion, i.e., D[\mu, \mu', \mu'', \ldots]^T = [\mu', \mu'', \mu''', \ldots]^T. More intuitively, since we are now minimising the components of a generalised state representing a trajectory rather than a static variable, the expectations \tilde{\mu} live in a moving frame of reference where the minimisation is achieved for \dot{\tilde{\mu}} = D\tilde{\mu} rather than \dot{\tilde{\mu}} = 0. Action as expressed in (8) may appear similar to the traditional LQR/LQG form, but is fundamentally different since it depends explicitly on observations \tilde{y} rather than estimated hidden states.
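The action of the shift operator D on a generalised state is easy to make concrete; the toy illustration below uses our own convention of zero-padding the highest embedding order after truncation:

```python
def shift(gen_state):
    """Apply the differential operator D to a generalised state
    [mu, mu', mu'', ...]: each order is replaced by the next one,
    with the (truncated) highest order padded by zero."""
    return gen_state[1:] + [0.0]

# A trajectory represented up to second order:
# position, velocity, acceleration
mu_tilde = [2.0, -1.0, 0.5]
print(shift(mu_tilde))  # [-1.0, 0.5, 0.0]
```

In the moving frame of reference of (7), a perfect estimate of a trajectory is one whose time derivative equals this shifted version of itself.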
IV. The model
The double integrator is a canonical example used in control theory and represents one of the most fundamental problems in optimal control, modelling single degree of freedom motion of different physical systems [37, 38]. In the case presented here, this could be thought of as a block on a frictionless surface. In motor neuroscience, this is the simplest model of single-joint movement [39] and can, in some cases, be easily generalised to multiple degrees of freedom [28]. The standard double integrator is usually described as a deterministic system. The control policy is thus defined using a feedback law applied directly to the known dynamics, as the full state of the system is measured with no uncertainty [37]. For the purposes of this work, where uncertainty and noise are crucial components, we will introduce process and measurement noise into the system, making the estimation of hidden states necessary. This will then allow us to compare LQG and active inference in one of the simplest possible examples in the control theory literature with direct applications to the study of motor systems and behaviour^{1}.
^{1} The code is available at https://github.com/mbaltieri/doubleIntegrator.
The double integrator is described by the following state-space model:
(9)  \dot{x} = A x + B u + w, \qquad y = C x + z
where matrices are defined as:
A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad C = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
and covariance matrices as:
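A minimal Euler-Maruyama simulation of such a noisy double integrator might look as follows; the step size and noise variances here are illustrative placeholders of ours, not the values used in the simulations of the paper:

```python
import random

def simulate(u_fn, T=10.0, dt=0.01, sw=0.0, sv=0.0, seed=0):
    """Simulate the noisy double integrator x1'' = u + w with
    noisy observations y = x + z, under a control law u_fn(y1, y2).
    Process noise increments are scaled by sqrt(dt) (Euler-Maruyama)."""
    random.seed(seed)
    x1, x2 = 0.0, 0.0                            # position, velocity
    for _ in range(int(T / dt)):
        y1 = x1 + random.gauss(0.0, sv ** 0.5)   # observed position
        y2 = x2 + random.gauss(0.0, sv ** 0.5)   # observed velocity
        u = u_fn(y1, y2)
        x1 += x2 * dt
        x2 += u * dt + random.gauss(0.0, (sw * dt) ** 0.5)
    return x1, x2

# Sanity check with zero noise and a constant force u = 1:
# after T seconds, x2 = T and x1 ≈ T**2 / 2 (up to Euler error).
pos, vel = simulate(lambda y1, y2: 1.0)
```

Feeding different control laws into simulate is enough to reproduce the qualitative LQG vs. active inference comparison below.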
IV-A. The LQG solution to the double integrator
For LQG we implement (3) using the same matrices specified above and furthermore define:
(10) 
with no specific optimisation of these parameters since it is beyond the scope of this work. For further analysis see for instance [37].
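As a point of reference, the LQR component of the controller admits a closed-form gain for the double integrator when Q = I and R = 1 (our illustrative choice, not necessarily the weights used in the simulations): solving the continuous algebraic Riccati equation by hand gives L = [1, √3]. The sketch below verifies that this gain stabilises the block under full state feedback; the full LQG controller would apply the same law to the Kalman-Bucy estimate instead of the true state:

```python
from math import sqrt

# Analytic LQR gain for the double integrator with Q = I, R = 1
# (solution of the continuous algebraic Riccati equation).
L1, L2 = 1.0, sqrt(3.0)

# Closed-loop Euler simulation with u = -L @ x.
dt = 0.001
x1, x2 = 1.0, 0.0            # start 1 m from the target, at rest
for _ in range(20000):       # 20 seconds
    u = -L1 * x1 - L2 * x2
    x1 += x2 * dt
    x2 += u * dt
print(abs(x1) < 1e-3 and abs(x2) < 1e-3)  # True
```

The closed-loop eigenvalues are (-√3 ± i)/2, so the block spirals into the origin of the phase space, matching the converging trajectories of Fig. 1(a).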
As we can see in Fig. 1(a), the block is effectively driven to the desired position and velocity from a set of 5 randomly initialised conditions (position and velocity are sampled from zero-mean Gaussian distributions, sd = 300). In Fig. 1(b) we then show the actions over time of the same 5 example agents, all converging to zero since the agents effectively reach their desired target. The main feature of LQG, and the one from which active inference will depart, is the reliability of the estimates of both position and velocity (the red line in the phase space), obtained using a Kalman-Bucy filter. In LQG, accurate estimates are necessary to then enact the LQR component, implementing a negative feedback mechanism based on estimates rather than true hidden states. In Fig. 2 we introduced a new external force not modelled by the agents, equivalent to a disturbance from the environment (black line in Fig. 2(b)). Fig. 2(a) then shows that the agents are incapable of regulating their position/velocity against this unknown input (blue lines): after an initial convergence towards the desired state, they in fact move away from it when the unexpected force is introduced. Furthermore, these agents are incapable of correctly inferring their trajectories, providing inaccurate estimates of their sensed variables (red lines). In Fig. 2(b) we see that all of these agents attempt to counteract the effects of unexpected stimuli (they minimise their velocity after the force is introduced); however, the lack of an appropriate mechanism to track their position correctly (e.g., integral action) pushes them away from the target.
IV-B. The double integrator with active inference
To solve the same control problem, active inference relies on the generation of predictions of proprioceptive sensations (position and velocity as in LQG, and also acceleration in this case), followed by the implementation of actions in the world via (trivial) reflex arcs. The proprioceptive modality is essentially treated as other inputs (vision, audition, etc.) and estimates/predictions are generated using the same generative model, taking advantage of incoming proprioceptive sensations. This produces a considerably different control system, with state estimates and actions now created by the same model, making it hard to clearly separate processes of perception and action. The copy of motor control signals (cf. efference copy [40]), necessary in standard LQG settings to meet the observability constraints of Kalman-Bucy filters [16, 17], is not included in this formulation, as explained in section III. Active inference in fact postulates that direct representations of the causes of self-generated sensations, i.e., actions, need not be discounted during the prediction of new incoming sensory inputs. This could be seen as a limitation of active inference, but in general it speaks to the robustness of this approach in the face of unknown inputs (i.e., motor actions produced by an agent or exogenous forces from the environment), see [36]. In this framework, inputs can also be estimated using an appropriate generative model of the world dynamics [30], a feature thought to be fundamental in biological systems [41]. Simple and effective approximations are also possible, for example with integral control, thought to be the most basic heuristic dealing with the problem of uncertain inputs in biological systems down to the unicellular level [42, 41], and already shown to be consistent with formulations of active inference [36].
To derive an active inference solution to the double integrator, we start by defining a generative model for the agent, i.e., the block:
(11) 
where the matrix is defined as:
while the input matrix is diagonal, the mapping of actions is zero everywhere except for the entry where the motor action is applied (with a value of 1), and the covariance matrices are also diagonal. The agent implements beliefs of a world where it is pulled back to the desired state by an imaginary spring and slows down thanks to an imaginary piston-like damper, “designed” (in this case by us, but more in general one could imagine evolutionary processes for biological systems [28]) to favour normative behaviour.
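The believed spring-damper dynamics can be illustrated in isolation; the spring and damping constants and the target position below are arbitrary choices of ours, chosen for a critically damped response:

```python
# The agent's generative model encodes fictive spring-damper dynamics
# pulling the block towards a desired position x_star:
#   x1' = x2,   x2' = -k_spring * (x1 - x_star) - k_damper * x2
# Simulating these *believed* dynamics shows they settle on the target:
# this is the trajectory the agent expects, and acts, to realise.
k_spring, k_damper, x_star = 1.0, 2.0, 5.0   # illustrative values
dt = 0.001
x1, x2 = 0.0, 0.0
for _ in range(20000):                       # 20 seconds
    dx1 = x2
    dx2 = -k_spring * (x1 - x_star) - k_damper * x2
    x1 += dx1 * dt
    x2 += dx2 * dt
print(abs(x1 - x_star) < 1e-3 and abs(x2) < 1e-3)  # True
```

Because the generative model is biased towards the target in this way, the agent's "estimates" track its desires rather than its observations, and action is left to close the gap.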
Following (6), the variational free energy for our controller is then described by:
(12) 
where the precisions are taken from the diagonals of the precision matrices (inverses of the covariance matrices defined in the generative model). After explicitly writing out the equations derived from the matrix formulation in (7), we get the following formulation of perceptual inference:
(13) 
and
(14) 
showing the lack of the Kalman gain, and an important difference derived from its absence: if the Kalman gain is non-diagonal, as in this case (one can simply verify this claim with standard functions solving continuous Riccati equations, as in the provided code), both orders of motion are present in the optimal filter problem in (3), but only one appears in (13) since the precision matrices are assumed to be diagonal in our formulation. More generally, in active inference the Kalman gain matrix is replaced by learning rates, as in this work or [32], or by clever implementations that allow for adaptive update schemes with varying integration steps as in [30].
The action component is, however, the most significantly different one, starting from the assumption that direct knowledge of motor signals is not available and thus not modelled in the generative model (motor commands are replaced by inputs acting as priors). This entails a new approach to the problem, with active inference suggesting that the only information needed comes from observations, see (8). On this account, action reduces to
(15) 
and with the assumption that
the explicit, scalar version of action becomes
(16) 
replacing the LQR component in (3). This type of control is equivalent to a PID controller and is the “optimal” linear solution when knowledge of inputs is not available in the generative model [36]. As in the case of filtering, the feedback gain is missing in the active inference formulation, once again replaced by the learning rates of the gradient descent or by other approximations.
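To see why action driven purely by prediction errors behaves like integral control, consider a toy first-order plant with an unknown constant disturbance (all parameters below are our own illustrative choices, simpler than the double integrator): the action accumulates the proprioceptive error over time and settles at exactly the value that cancels the disturbance, without the disturbance ever being modelled.

```python
# Toy plant: x' = -x + a + d, observed as y = x, desired state 0.
# Active-inference-style action descends the (squared) proprioceptive
# prediction error: a' = -k_a * (y - 0), i.e., it integrates the error.
dt, k_a = 0.001, 1.0
d = 2.0                       # unknown constant disturbance
x, a = 0.0, 0.0
for _ in range(40000):        # 40 seconds
    y = x                     # noiseless observation, for clarity
    a += -k_a * y * dt        # gradient descent on the prediction error
    x += (-x + a + d) * dt    # plant dynamics
print(abs(x) < 1e-3, abs(a + d) < 1e-3)
```

At equilibrium a = -d: the accumulated action exactly opposes the unmodelled force, which is the mechanism behind the robustness shown in the next figures.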
In Fig. 4 we can see an example implementation of the double integrator using active inference. Five agents are initialised at random positions and velocities (zero-mean Gaussian distributed, sd = 300) and converge to the target solution, where the output actions are essentially zero (excluding some noise), as expected (Fig. 4(b)). The most striking feature is that the estimates of both position and velocity of the block are very inaccurate, but the agent nonetheless reaches the desired target in the phase space (Fig. 4(a)). These differences are due to the generative model implemented by the agent, encoding an imaginary spring-damper system that pulls it towards its desired state (Fig. 4). Fig. 5 shows the robustness of this implementation when an external force is introduced: by implementing integral control [36], active inference can in this case counteract the effects of unexpected inputs. The presence of integral action perfectly counteracts the effects of disturbances (Fig. 5(b), cf. Fig. 2(b)) and, more importantly, allows for the desired regulation of the agents’ positions (Fig. 5(a)), which is impossible in LQG accounts assuming perfect knowledge of the world (cf. Fig. 2(a)).
V. Discussion
LQG-based architectures are modular in nature, with perception and action seen as separate problems solved nearly independently. According to this view, a system should initially find accurate estimates of the hidden properties of its observations, and only once such estimates are available should an agent attempt to regulate variables that are of interest to achieve its goals, e.g., temperature, oxygen level, etc. On the other hand, we can define a framework based on mathematical formulations of control problems where the separation principle is not included or required. According to one such proposal, which we identified in active inference [28, 23], perception and action are combined in an inseparable sensorimotor loop described by the minimisation of variational free energy for an agent. In this setup, action and perception are seen as instances of a fundamentally unique process [20], using different labels for our (i.e., the observers’) convenience. In particular, the idea of precise inferences of world variables is called into question [43, 32], to the point that inaccurate perception is not only possible but becomes a prerequisite to act on the world [33, 26]. In architectures based on the separation principle, the estimated state of a system is thought of as an accurate account of real observations, e.g., their means and covariances. Conversely, in active inference it becomes clear that estimates of latent variables of the world are deeply connected to the current goal of an agent, e.g., to regulate its observations, cf. [44]. To do so, its targets are encoded as prior expectations and used to bias inferential processes toward its desires, so that prediction errors are created as the mismatch between observations and the estimates of hidden variables. These errors are then minimised by acting on the world [28], taking advantage of proprioceptive prediction errors that enact reflex arcs to make observations better accord with existing predictions [45, 26].
More generally, the active inference formulation also allows for accurate estimates of the latent variables generating observations, see for instance [30], but this modality fundamentally excludes the possibility of acting: if no prediction errors are generated for action to minimise, an agent becomes a simple mirror of its world with no strong desire or even necessity to act [46, 33, 32]. In other words, depending on different precision weights, an agent can accurately estimate its observations without acting, or potentially discard its sensations to only pursue its desires, generating all possible cases in between as a balanced mix of weighted prediction errors [47].
VI. Conclusions
In recent years, the more traditional understanding of perceptual and motor functions as nearly independent processes has been called into question by different authors, especially in neuroscience [48, 49, 50]. It is clear that many experimental setups are limited [51], requiring new and ethologically meaningful paradigms for an appropriate study of different aspects of living systems [52]. In this context, we propose some new ideas that could drive future experiments. These ideas are centred around a critical appraisal of LQG as a model architecture for cognitive systems, focusing in particular on the assumptions made by the use of Kalman-Bucy filters, central to these proposals [18, 11, 53]. One of the key requirements for Kalman-Bucy filters to generate an accurate estimate of the hidden state of a system is to have access to all the outputs (the observations) and all the inputs (forces that affect the state) of a system. The inputs, in particular, include both motor commands, which in classical forward/inverse models are identified using the idea of efference copy [40] (see for instance [5, 6, 7]), and external forces/signals from the environment that cannot, in principle, be accounted for by an organism, e.g., a sudden change in weather conditions or unexpected interactions with other agents.
In this work we focused on the latter, since the presence of external unaccounted forces is often overlooked in many experimental setups with fixed or predictable conditions (e.g., the classic and still dominant two-alternative forced choice paradigm). In more realistic and ethological scenarios, however, one should expect that external and unpredictable stimuli constantly affect the behaviour of an agent [51, 52, 50]. In this case, introducing noise or varying experimental conditions may help in testing the robustness of LQG-based architectures. In practice, if some inputs are not known, one should expect LQG to perform rather poorly until these inputs can be estimated and adaptation (e.g., learning) to new conditions can take place. However, one should then explain how such forces can be described in LQG, since Kalman-Bucy filters cannot estimate inputs [29] (cf. DEM [30]). More generally, if a system is well adapted to deal with unpredictable stimuli, simple mechanisms such as integral control could be in place, as shown formally in [41] and in experiments on chemotactic adaptation in E. coli [42], for instance. More recently, some promising results have been presented in [54], supporting the idea that integral feedback control, unlike Kalman(-Bucy) filters, is a good model for adaptation in environments with varying conditions. Integral control constitutes a linear approximation to problems of control with unknown forces affecting the observations of an agent [38, 36], providing a robust solution with fast responses to problems that otherwise would require slower learning mechanisms [42], which may be ineffective in fast-paced environments [55].
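The difference between a controller with and without integral action under an unmodelled constant disturbance can be demonstrated in a few lines; the first-order plant and gains below are our own toy choices, not the double integrator used earlier:

```python
def settle(controller, d=2.0, dt=0.001, T=40.0):
    """Drive the plant x' = -x + u + d towards 0 with the given
    controller and return the (near-)steady-state position."""
    x, integral = 0.0, 0.0
    for _ in range(int(T / dt)):
        integral += x * dt           # accumulated error, for PI control
        u = controller(x, integral)
        x += (-x + u + d) * dt
    return x

kp, ki = 4.0, 2.0
x_p = settle(lambda x, i: -kp * x)             # proportional only
x_pi = settle(lambda x, i: -kp * x - ki * i)   # with integral action

# Proportional control leaves a steady-state offset d / (1 + kp) = 0.4;
# the integral term removes it entirely.
print(round(x_p, 3), abs(x_pi) < 1e-3)
```

The proportional controller plays the role of an LQG-style law facing an input it cannot estimate, while the integral term implements the simple adaptation mechanism discussed above.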
VII. Acknowledgments
This work was supported in part by a BBSRC Grant BB/P022197/1.
References
 [1] A. Newell, H. A. Simon et al., Human problem solving. Prentice-Hall, Englewood Cliffs, NJ, 1972, vol. 104, no. 9.
 [2] J. Fodor, The Modularity of Mind. MIT Press, 1983.
 [3] S. Hurley, “Perception and action: Alternative views,” Synthese, vol. 129, no. 1, pp. 3–40, 2001.
 [4] R. A. Brooks, “New approaches to robotics,” Science, vol. 253, no. 5025, pp. 1227–1232, 1991.
 [5] M. Kawato, “Internal models for motor control and trajectory planning,” Current opinion in neurobiology, vol. 9, no. 6, pp. 718–727, 1999.
 [6] D. M. Wolpert and Z. Ghahramani, “Computational principles of movement neuroscience,” Nature neuroscience, vol. 3, no. 11s, p. 1212, 2000.
 [7] E. Todorov, “Optimality principles in sensorimotor control,” Nature neuroscience, vol. 7, no. 9, pp. 907–915, 2004.
 [8] D. C. Knill and W. Richards, Perception as Bayesian inference. Cambridge University Press, 1996.
 [9] R. P. Rao and D. H. Ballard, “Predictive coding in the visual cortex: a functional interpretation of some extraclassical receptivefield effects,” Nature neuroscience, vol. 2, no. 1, pp. 79–87, 1999.
 [10] T. S. Lee and D. Mumford, “Hierarchical Bayesian inference in the visual cortex,” JOSA A, vol. 20, no. 7, pp. 1434–1448, 2003.
 [11] D. M. Wolpert, J. Diedrichsen, and J. R. Flanagan, “Principles of sensorimotor learning,” Nature Reviews Neuroscience, vol. 12, no. 12, p. 739, 2011.
 [12] W. Li and E. Todorov, “Iterative linear quadratic regulator design for nonlinear biological movement systems.” in ICINCO (1), 2004, pp. 222–229.
 [13] I. H. Stevenson, H. L. Fernandes, I. Vilares, K. Wei, and K. P. Körding, “Bayesian integration and nonlinear feedback control in a fullbody motor task,” PLoS computational biology, vol. 5, no. 12, p. e1000629, 2009.
 [14] M. Baltieri and C. L. Buckley, “The modularity of action and perception revisited using control theory and active inference,” in The 2018 Conference on Artificial Life: A Hybrid of the European Conference on Artificial Life (ECAL) and the International Conference on the Synthesis and Simulation of Living Systems (ALIFE), T. Ikegami, N. Virgo, O. Witkowski, M. Oka, R. Suzuki, and H. Iizuka, Eds., 2018, pp. 121–128.
 [15] W. Wonham, “On the separation theorem of stochastic control,” SIAM Journal on Control, vol. 6, no. 2, pp. 312–326, 1968.
 [16] B. Anderson and J. B. Moore, Optimal control: linear quadratic methods. Prentice-Hall, Inc., 1990.
 [17] R. F. Stengel, Optimal control and estimation. Courier Corporation, 1994.
 [18] E. Todorov and M. I. Jordan, “Optimal feedback control as a theory of motor coordination,” Nature neuroscience, vol. 5, no. 11, p. 1226, 2002.
 [19] S.-H. Yeo, D. W. Franklin, and D. M. Wolpert, “When optimal feedback control is not enough: Feedforward strategies are required for optimal control with active sensing,” PLoS computational biology, vol. 12, no. 12, p. e1005190, 2016.
 [20] A. Clark, Being there: Putting brain, body, and world together again. MIT press, 1998.
 [21] M. Wilson, “Six views of embodied cognition,” Psychonomic bulletin & review, vol. 9, no. 4, pp. 625–636, 2002.
 [22] E. Di Paolo, T. Buhrmann, and X. Barandiaran, Sensorimotor Life: An Enactive Proposal. Oxford University Press, 2017.
 [23] K. Friston, “The freeenergy principle: a unified brain theory?” Nature reviews. Neuroscience, vol. 11, no. 2, pp. 127–138, 2010.
 [24] C. L. Buckley, C. S. Kim, S. McGregor, and A. K. Seth, “The free energy principle for action and perception: A mathematical review,” Journal of Mathematical Psychology, vol. 81, pp. 55–79, 2017.
 [25] K. Friston, “What is optimal about motor control?” Neuron, vol. 72, no. 3, pp. 488–498, 2011.
 [26] W. Wiese, “Action is enabled by systematic misrepresentations,” Erkenntnis, pp. 1–20, 2016.
 [27] G. Pezzulo, F. Donnarumma, P. Iodice, D. Maisto, and I. Stoianov, “Modelbased approaches to active perception and control,” Entropy, vol. 19, no. 6, p. 266, 2017.
 [28] K. J. Friston, J. Daunizeau, J. Kilner, and S. J. Kiebel, “Action and behavior: A freeenergy formulation,” Biological Cybernetics, vol. 102, no. 3, pp. 227–260, 2010.

 [29] Z. Chen, “Bayesian filtering: From Kalman filters to particle filters, and beyond,” Statistics, vol. 182, no. 1, pp. 1–69, 2003.
 [30] K. J. Friston, N. Trujillo-Barreto, and J. Daunizeau, “DEM: A variational treatment of dynamic systems,” NeuroImage, vol. 41, no. 3, pp. 849–885, 2008.
 [31] K. Friston, “Hierarchical models in the brain,” PLoS Computational Biology, vol. 4, no. 11, 2008.
 [32] M. Baltieri and C. L. Buckley, “An active inference implementation of phototaxis,” in Proc. Eur. Conf. on Artificial Life, 2017, pp. 36–43.
 [33] H. Brown, R. A. Adams, I. Parees, M. Edwards, and K. Friston, “Active inference, sensory attenuation and illusions,” Cognitive processing, vol. 14, no. 4, pp. 411–427, 2013.
 [34] A. G. Feldman, “New insights into action–perception coupling,” Experimental Brain Research, vol. 194, no. 1, pp. 39–58, 2009.
 [35] ——, “Active sensing without efference copy: referent control of perception,” Journal of neurophysiology, vol. 116, no. 3, pp. 960–976, 2016.
 [36] M. Baltieri and C. L. Buckley, “A probabilistic interpretation of PID controllers using active inference,” in From Animals to Animats 15, P. Manoonpong, J. C. Larsen, X. Xiong, J. Hallam, and J. Triesch, Eds. Springer International Publishing, 2018, pp. 15–26.
 [37] V. G. Rao and D. S. Bernstein, “Naive control of the double integrator,” IEEE Control Systems, vol. 21, no. 5, pp. 86–97, 2001.
 [38] K. J. Åström and R. M. Murray, Feedback systems: an introduction for scientists and engineers. Princeton university press, 2010.
 [39] G. L. Gottlieb, “A computational model of the simplest motor program,” Journal of Motor behavior, vol. 25, no. 3, pp. 153–161, 1993.
 [40] E. von Holst and H. Mittelstaedt, “Das Reafferenzprinzip,” Naturwissenschaften, vol. 37, no. 20, pp. 464–476, 1950.
 [41] E. D. Sontag, “Adaptation and regulation with signal detection implies internal model,” Systems & control letters, vol. 50, no. 2, pp. 119–126, 2003.
 [42] T.-M. Yi, Y. Huang, M. I. Simon, and J. Doyle, “Robust perfect adaptation in bacterial chemotaxis through integral feedback control,” Proceedings of the National Academy of Sciences, vol. 97, no. 9, pp. 4649–4653, 2000.
 [43] A. Clark, “Radical predictive processing,” The Southern Journal of Philosophy, vol. 53, no. S1, pp. 3–27, 2015.
 [44] W. T. Powers, Behavior: The control of perception. Aldine Chicago, 1973.
 [45] A. Clark, Surfing Uncertainty: Prediction, Action, and the Embodied Mind. Oxford University Press, 2015.
 [46] K. Friston, C. Thornton, and A. Clark, “Freeenergy minimization and the darkroom problem,” Frontiers in psychology, vol. 3, p. 130, 2012.
 [47] M. Allen and K. J. Friston, “From cognitivism to autopoiesis: towards a computational framework for the embodied mind,” Synthese, vol. 195, no. 6, pp. 2459–2482, 2018.
 [48] E. Ahissar and E. Assa, “Perception as a closedloop convergence process,” Elife, vol. 5, p. e12830, 2016.
 [49] L. Busse, J. A. Cardin, M. E. Chiappe, M. M. Halassa, M. J. McGinley, T. Yamashita, and A. B. Saleem, “Sensation during active behaviors,” Journal of Neuroscience, vol. 37, no. 45, pp. 10826–10834, 2017.
 [50] C. L. Buckley and T. Toyoizumi, “A theory of how active behavior stabilises neural activity: Neural gain modulation by closedloop environmental feedback,” PLoS computational biology, vol. 14, no. 1, p. e1005926, 2018.
 [51] J. W. Krakauer, A. A. Ghazanfar, A. GomezMarin, M. A. MacIver, and D. Poeppel, “Neuroscience needs behavior: correcting a reductionist bias,” Neuron, vol. 93, no. 3, pp. 480–490, 2017.
 [52] F. Najafi and A. K. Churchland, “Perceptual decisionmaking: A field in the midst of a transformation,” Neuron, vol. 100, no. 2, pp. 453–462, 2018.
 [53] D. W. Franklin and D. M. Wolpert, “Computational mechanisms of sensorimotor control,” Neuron, vol. 72, no. 3, pp. 425–442, 2011.
 [54] H. Ritz, M. R. Nassar, M. J. Frank, and A. Shenhav, “A control theoretic model of adaptive learning in dynamic environments,” Journal of cognitive neuroscience, pp. 1–17, 2018.
 [55] W. R. Ashby, An introduction to cybernetics. Chapman & Hall Ltd., 1957.