Nonmodular architectures of cognitive systems based on active inference

by Manuel Baltieri, et al.
University of Sussex

In psychology and neuroscience it is common to describe cognitive systems as input/output devices where perceptual and motor functions are implemented in a purely feedforward, open-loop fashion. On this view, perception and action are often seen as encapsulated modules with limited interaction between them. While embodied and enactive approaches to cognitive science have challenged the idealisation of the brain as an input/output device, we argue that even the more recent attempts to model systems using closed-loop architectures still heavily rely on a strong separation between motor and perceptual functions. Previously, we have suggested that the mainstream notion of modularity strongly resonates with the separation principle of control theory. In this work we present a minimal model of a sensorimotor loop implementing an architecture based on the separation principle. We link this to popular formulations of perception and action in the cognitive sciences, and show its limitations when, for instance, external forces are not modelled by an agent. These forces can be seen as variables that an agent cannot directly control, i.e., a perturbation from the environment or an interference caused by other agents. As an alternative approach inspired by embodied cognitive science, we then propose a nonmodular architecture based on the active inference framework. We demonstrate the robustness of this architecture to unknown external inputs and show that the mechanism with which this is achieved in linear models is equivalent to integral control.



I Introduction

In cognitive science it is often assumed that agents can be described as input/output systems, an idea based on traditional, computational accounts of cognition [1, 2, 3]. In these models, the emphasis is on internal models of the world, central processing and the sense-model-plan-act framework, often neglecting embodiment, situatedness and feedback from the environment [4]. More recent attempts, e.g., [5, 6, 7], have proposed closed-loop descriptions of cognitive systems using internal forward/inverse models in an attempt to provide better accounts of behaviour in living organisms. However, in both the input/output and the closed-loop architectures advocated by these approaches, the role of perceptual and motor processes is thought to be fundamentally modular [2], i.e., these functions can be described as nearly independent, (informationally) encapsulated components with minimal interactions.

In recent years, theories of estimation and control have become increasingly popular accounts of perception [8, 9, 10] and action [5, 6, 7] respectively. In this context, the Kalman-Bucy filter is used as a model of perception [9, 11], while LQR (the linear quadratic regulator) constitutes the basis of various accounts of motor control [12, 13]. In previous work [14] we claimed that the idea of modularity of action and perception can be seen as analogous to the separation principle of control theory [15, 16, 17]. According to this principle, the problems of estimation and control of a system can be solved separately, and their solutions can be optimally combined under a set of assumptions. Following this, one can sequentially combine a Kalman-Bucy filter and LQR to create the LQG (linear quadratic Gaussian) architecture, used as a general methodology for several models of sensorimotor loops, e.g., [6, 18, 13, 19]. The "classical sandwich" [3] of cognitive science thus survives, we claim, even in the forward/inverse models formulation of perception and motor control.

The fields of embodied and enactive cognitive science, on the other hand, emphasise the deep integration of perception and action, seen as fundamentally intertwined [20, 21, 22]. In [14] we proposed to use a framework based on the formulation of perception and action as estimation and control that does not implement the conditions for the separation principle, i.e., active inference. Active inference is a process theory based on the free energy principle [23], describing cognitive functions (perception and action, but also learning and attention) as processes of minimisation of sensory surprisal [23, 24]. More precisely, since this quantity is not directly accessible to an agent, it is thought that the variational free energy (an upper bound on sensory surprisal) is minimised in its place. In active inference, perceptual and motor processes are often described as entangled and inseparable [25, 26, 27], thus providing a new possible methodology combining estimation and control in line with embodied/enactive theories of the mind. We previously presented a conceptual account of active inference and its role in nonmodular architectures of cognitive systems [14]. Here we introduce a minimal agent model highlighting the differences between the two implementations (LQG vs. active inference), especially in the presence of unknown external stimuli affecting an agent's observations.

II LQG and the separation principle

The framework provided by LQG control, based on the separation principle, linearly combines two processes: 1) the estimation or inference of hidden properties of the environment and 2) the control or regulation of variables of interest. The estimation of hidden variables is based on a Kalman (for discrete-time systems) or Kalman-Bucy (for continuous-time systems) filter, while the control of the desired variables is based on LQR [15, 16, 17]. In particular, this combination is provably optimal under a set of assumptions:

  1. the estimator is implemented through a state-space model where only linear process dynamics and observation laws describe the environment and its latent states

  2. uncertainty or noise in both dynamics and observations is represented by white, zero-mean Gaussian variables

  3. the properties of these random variables, in particular their (co)variance matrices, are known

  4. the performance of the regulator can be evaluated using a quadratic cost function

  5. all the inputs/forces applied to the agent are known, e.g., external disturbances and internal signals such as motor actions.

Following the separation principle, the LQG controller produces optimal estimation and optimal control for linear systems by sequentially combining two separate sub-systems, a Kalman-Bucy filter and LQR, in an optimal (i.e., minimum-variance) way [16, 17]: the Kalman-Bucy filter provides the optimal state estimate of a signal, and the LQR controller uses this estimate (i.e., the mean) to implement the optimal deterministic control law.

A general linear system to be regulated in the presence of noise on the observed state is described by:

$$\dot{x} = A x + B u + w, \qquad y = C x + z$$

where all the variables and parameters are the same as previously defined for Kalman-Bucy filters and LQR. Using the separation principle, it can then be shown that minimising the expected value of the cost-to-go is equivalent to minimising the cost-to-go for the expected (estimated) state [17]

$$J = \int_0^T \left( \hat{x}^T Q \hat{x} + u^T R u \right) \, dt$$

where we replaced the states $x$ with their estimates $\hat{x}$, meaning that the optimal control can be computed using only the state estimate (i.e., the mean) rather than the true state $x$. The combined problem of estimation and control in LQG terms is then implemented by the following system combining Kalman-Bucy filter and LQR equations:

$$\dot{\hat{x}} = A \hat{x} + B u + K (y - C \hat{x}), \qquad u = -L \hat{x} \tag{3}$$

with $K$ the Kalman gain and $L$ the LQR feedback gain.

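To make the architecture concrete, the combined system in (3) can be sketched in discrete time: a steady-state Kalman gain and an LQR gain are computed independently, per the separation principle, and only then composed. All numerical values below (time step, cost weights, noise covariances) are illustrative assumptions, not taken from the model presented later.

```python
import numpy as np

# Discrete-time sketch of the LQG architecture in (3). The Kalman gain K and
# the LQR gain L are computed separately, per the separation principle, and
# only then combined. All numerical values are illustrative assumptions.
dt = 0.01
A = np.array([[1.0, dt], [0.0, 1.0]])        # discretised double integrator
B = np.array([[0.0], [dt]])
C = np.eye(2)
Q, R = np.eye(2), np.array([[0.1]])          # quadratic cost weights
Sw, Sv = 1e-5 * np.eye(2), 1e-3 * np.eye(2)  # process/measurement covariances

def lqr_gain(A, B, Q, R, iters=10000):
    # iterate the discrete-time Riccati equation to its fixed point
    P = Q.copy()
    for _ in range(iters):
        L = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ L)
    return L

def kalman_gain(A, C, Sw, Sv, iters=10000):
    # dual Riccati iteration for the steady-state filter gain
    P = Sw.copy()
    for _ in range(iters):
        P = A @ P @ A.T + Sw                          # predict
        K = P @ C.T @ np.linalg.inv(C @ P @ C.T + Sv)
        P = (np.eye(2) - K @ C) @ P                   # update
    return K

L_gain = lqr_gain(A, B, Q, R)
K_gain = kalman_gain(A, C, Sw, Sv)

rng = np.random.default_rng(0)
x, xhat = np.array([[1.0], [0.5]]), np.zeros((2, 1))
for _ in range(3000):
    u = -L_gain @ xhat                    # LQR acts on the estimate...
    x = A @ x + B @ u + rng.multivariate_normal(np.zeros(2), Sw).reshape(2, 1)
    y = C @ x + rng.multivariate_normal(np.zeros(2), Sv).reshape(2, 1)
    xhat = A @ xhat + B @ u               # ...which the filter produces by
    xhat = xhat + K_gain @ (y - C @ xhat) # predicting and then correcting
```

Because the plant is linear, the noise Gaussian and all inputs known, this estimate-then-control composition is optimal; the model section below examines what happens when the last assumption fails.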
III Active inference

Active inference is a process theory proposed to explain brain function and other processes of living systems based on Bayesian inference and optimal control theory [28, 23, 24]. In this section we establish its relations to the LQG architecture, starting by building an active inference version of the regulation of a linear multivariate system, and highlighting differences, limitations and possible extensions for the control problem. As with LQG control, we build an estimator of the hidden states $x$. In this case, however, we give a variational account of the estimator in generalised coordinates of motion that generalises the MLE/MAP derivation of Kalman-Bucy filters [29], using Variational Bayes with a Laplace approximation [30, 24]. We start by defining a generative model for an agent, capturing the dynamics of the system to control and how these relate to observations, represented in a generalised state-space form [31, 24]:

$$D\tilde{x} = \hat{A} \tilde{x} + \hat{B} \tilde{v} + \tilde{w}, \qquad \tilde{y} = \hat{C} \tilde{x} + \tilde{z} \tag{4}$$

where the hat over the matrices simply represents the fact that the matrices used in the generative model do not necessarily mirror their counterparts describing the world dynamics [32, 33], as shown in our model later. The main difference with respect to LQG, however, is that LQG explicitly mirrors (by construction in the linear case) the dynamics of the observed system, thus including knowledge of the inputs $u$. In active inference, on the other hand, this vector is not explicitly modelled by an agent, assuming that such information is not available to a system, in accordance with evidence in motor neuroscience suggesting the lack of knowledge of self-produced controls (i.e., efference copy) [34, 25, 35]. It is in fact proposed that a deeper duality of estimation and control exists whereby, in the simplest case (i.e., a purely reflexive account), actions are simply responses to the presence of prediction errors at the proprioceptive level, irrespective of the cause of sensations (self-generated or external forces) [25, 33]. The vector $\tilde{v}$ in the generative model instead encodes external or exogenous inputs in a state-space models context or, from a Bayesian perspective, priors or "desired" outcomes generated by higher layers in hierarchical (Bayesian) implementations [10, 31]. In this light, priors can be used to effectively bias the estimator to "infer" desired rather than observed states, with a controller instantiating actions on the world to fulfil the "observed" (= desired) states of an agent. The variables $\tilde{w}, \tilde{z}$ model the real noise in the environment, making use, however, of the definition of state-space models in generalised coordinates of motion [30, 31], where $\tilde{w}, \tilde{z}$ are treated as analytical noise with non-zero autocorrelation, generalising the definition of Wiener processes with the Markov property.

This state-space model can then be written down in probabilistic form, mapping the measurement equation to a likelihood (no direct influence of inputs on observations) and the dynamics to a prior [31, 24, 32]. The two multivariate Gaussian probability densities can then be combined and used in the general formulation of the Laplace-encoded variational free energy defined in [30, 24] (without constants):

$$F = \frac{1}{2} \left[ \tilde{\varepsilon}_y^T \Pi_z \tilde{\varepsilon}_y + \tilde{\varepsilon}_x^T \Pi_w \tilde{\varepsilon}_x - \ln |\Pi_z| - \ln |\Pi_w| \right] \tag{5}$$

with $\tilde{\varepsilon}_y, \tilde{\varepsilon}_x$ as sensory and dynamics prediction errors; the free energy for a generic linear multivariate system then becomes:

$$F = \frac{1}{2} \Big[ (\tilde{y} - \hat{C}\tilde{\mu})^T \Pi_z (\tilde{y} - \hat{C}\tilde{\mu}) + (D\tilde{\mu} - \hat{A}\tilde{\mu} - \hat{B}\tilde{v})^T \Pi_w (D\tilde{\mu} - \hat{A}\tilde{\mu} - \hat{B}\tilde{v}) - \ln |\Pi_z| - \ln |\Pi_w| + (n + m) \ln 2\pi \Big] \tag{6}$$

where we explicitly replaced the states $\tilde{x}$ with their expectations $\tilde{\mu}$, since under the Laplace assumption this represents the best estimate of $\tilde{x}$ (i.e., covariances of the approximate, variational density can be recovered analytically [30, 24]). The variables $n$ and $m$ represent the lengths of the vectors $\tilde{x}$ and $\tilde{y}$ respectively. Expectations $\tilde{\mu}$ play the same role as estimates $\hat{x}$ in LQG; we simply decided to use a notation consistent with some of our previous work [24, 32, 36]. We also defined the precision matrices $\Pi_z, \Pi_w$ as the inverses of the covariance matrices, and used $|\cdot|$ to denote the determinant of a matrix. It is important to highlight that, in general, the covariance matrices used in the generative model can be different from the ones used to describe the environment or generative process [33, 32]. To simplify the already heavy notation we will however represent them in the same way.

The recognition dynamics, encoding perception and action in a system minimising free energy [30, 24] and equivalent to estimation and control functions respectively, are implemented in standard active inference formulations as a gradient descent scheme minimising the free energy with respect to the expectations $\tilde{\mu}$ for perception/estimation:

$$\dot{\tilde{\mu}} = D\tilde{\mu} - \kappa_\mu \frac{\partial F}{\partial \tilde{\mu}} \tag{7}$$

and actions $a$ for action/control, assuming only that actions have an effect on observations [28]:

$$\dot{a} = -\kappa_a \frac{\partial F}{\partial a} = -\kappa_a \frac{\partial \tilde{y}}{\partial a}^T \frac{\partial F}{\partial \tilde{y}} \tag{8}$$

The estimation scheme expressed in (7) prescribes a generalisation of Kalman-Bucy filters to trajectories with arbitrary embedding orders, where random variables are not treated as Markov processes [30]. In (7), we also include an extra term $D\tilde{\mu}$ that represents the "mode of the motion" (also the mean for Gaussian variables) for the minimisation in generalised coordinates of motion [31, 24], with $D$ as a differential operator shifting the orders of motion, i.e., $D[\mu, \mu', \mu'']^T = [\mu', \mu'', 0]^T$. More intuitively, since we are now minimising the components of a generalised state representing a trajectory rather than a static variable, the variables $\tilde{\mu}$ are in a moving frame of reference where the minimisation is achieved for $\dot{\tilde{\mu}} = D\tilde{\mu}$ rather than $\dot{\tilde{\mu}} = 0$. Action as expressed in (8) may appear similar to the traditional LQR/LQG form, but is fundamentally different since it depends explicitly on observations $\tilde{y}$ rather than estimated hidden states $\hat{x}$.
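As a minimal illustration of the perceptual part of these recognition dynamics, consider a static scalar version of the scheme: gradient descent on a free energy with one sensory and one prior term. All values below (observation, prior, precisions) are hypothetical; the fixed point is the precision-weighted average of observation and prior, showing how a strong prior biases the estimate toward a "desired" value.

```python
# Static scalar sketch of the perceptual gradient descent in (7), for a free
# energy F = 0.5 * (pi_z * (y - mu)**2 + pi_w * (mu - v)**2) with one sensory
# and one prior term. All numerical values are hypothetical.
y, v = 2.0, 0.0          # observation and prior ("desired") value
pi_z, pi_w = 1.0, 4.0    # sensory and model precisions
mu, dt = 0.0, 0.01       # expectation and integration step

for _ in range(5000):
    dF_dmu = -pi_z * (y - mu) + pi_w * (mu - v)  # gradient of F w.r.t. mu
    mu -= dt * dF_dmu                            # recognition dynamics

# mu settles at the precision-weighted average
# (pi_z * y + pi_w * v) / (pi_z + pi_w) = 0.4 here: the strong prior pulls
# "perception" away from the observation y = 2.0 and toward the desired v = 0
```

Raising `pi_w` relative to `pi_z` drags the estimate further toward the prior, anticipating the precision-balancing point made in the discussion.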

IV The model

The double integrator is a canonical example used in control theory and represents one of the most fundamental problems in optimal control, modelling the single degree of freedom motion of different physical systems [37, 38]. In the case presented here, this could be thought of as a block on a frictionless surface. In motor neuroscience, this is the simplest model of single-joint movement [39] and can, in some cases, be easily generalised to multiple degrees of freedom [28]. The standard double integrator is usually described as a deterministic system; the control policy is thus defined using a feedback law applied directly to the known dynamics, as the full state of the system is measured with no uncertainty [37]. For the purposes of this work, where uncertainty and noise are crucial components, we will introduce process and measurement noise into the system, making the estimation of hidden states necessary. This will then allow us to compare LQG and active inference in one of the simplest possible examples in the control theory literature, with direct applications to the study of motor systems and behaviour (the code is available online). The double integrator is described by the following state-space model:

$$\dot{x} = A x + B u + w, \qquad y = C x + z$$

where the matrices are defined as:

$$A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad C = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

and the process and measurement noise covariance matrices are diagonal; their specific values are given in the accompanying code.
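An Euler–Maruyama simulation of this generative process, with no control applied, might look as follows; the noise amplitudes here are illustrative assumptions, since the specific covariance values live in the accompanying code.

```python
import numpy as np

# Euler–Maruyama simulation of the noisy double integrator plant (no control).
# The noise amplitudes are illustrative assumptions.
rng = np.random.default_rng(1)
dt, T = 0.01, 10.0
steps = int(T / dt)
A = np.array([[0.0, 1.0], [0.0, 0.0]])   # continuous-time dynamics
sw, sv = 0.05, 0.05                      # process / measurement noise std

x = np.array([1.0, 0.0])                 # initial position and velocity
xs = np.empty((steps, 2))
for t in range(steps):
    # with u = 0 the block just drifts under the process noise
    x = x + dt * (A @ x) + sw * np.sqrt(dt) * rng.standard_normal(2)
    y = x + sv * rng.standard_normal(2)  # noisy observation, C = identity
    xs[t] = x
```

Left uncontrolled, the velocity performs a random walk and the position diffuses; both LQG and active inference are then asked to steer this plant to the origin from noisy observations `y`.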
Fig. 1: The generative process, a double integrator. The double integrator models the motion of a system with a single degree of freedom, corresponding to a block of mass 1 kg placed on a surface with no friction. The block is initialised at a random position with a random velocity and needs to stop (zero velocity) at the origin.

IV-A The LQG solution to the double integrator

For LQG we implement (3) using the same matrices specified above, and furthermore define the quadratic cost matrices $Q$ and $R$ of the LQR component, with no specific optimisation of these parameters since it is beyond the scope of this work. For further analysis see for instance [37].

Fig. 2: The double integrator solved using LQG. (a) Five examples with different initial conditions showing in blue the observed trajectories of different blocks in the phase-space and in red the agent’s estimates of the same trajectories. (b) Actions taken by the five agents.
Fig. 3: The double integrator solved using LQG. (a) Five examples with different initial conditions showing in blue the observed trajectories of different blocks in the phase-space and in red the agent’s estimates of the same trajectories. (b) Actions taken by the five agents after an external force is introduced (black line).

As we can see in Fig. 2(a), the block is effectively driven to the desired position and velocity from a set of 5 randomly initialised conditions (position and velocity are sampled from zero-mean Gaussian distributions, sd = 300). In Fig. 2(b) we then show the actions over time of the same 5 example agents, all converging to zero since the agents effectively reach their desired target. The main feature of LQG, and the one from which active inference will depart, is the reliability of the estimates of both position and velocity (the red lines in the phase space), obtained using a Kalman-Bucy filter. In LQG, accurate estimates are necessary to then enact the LQR component, implementing a negative feedback mechanism based on estimates $\hat{x}$ rather than true hidden states $x$. In Fig. 3 we introduced a new external force not modelled by the agents, equivalent to a disturbance from the environment (black line in Fig. 3(b)). Fig. 3(a) then shows that the agents are incapable of regulating their position/velocity against this unknown input (blue lines): after an initial convergence towards the desired state, they in fact move away from it when the unexpected force is introduced. Furthermore, these agents are incapable of correctly inferring their trajectories, providing inaccurate estimates of their sensed variables (red lines). In Fig. 3(b) we see that all of these agents attempt to counteract the effects of the unexpected stimulus (they minimise their velocity after the force is introduced); however, the lack of an appropriate mechanism to track their position correctly (e.g., integral action) pushes them away from the target.
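This steady-state error can be reproduced with a small deterministic sketch: an observer-based linear feedback loop, with assumed (not optimised) gains, facing a constant force `d` that its internal model omits. Without integral action, the position settles at an offset of roughly `d` divided by the proportional gain rather than at the target.

```python
import numpy as np

# Deterministic sketch of the failure mode above: observer-based linear
# feedback facing a constant force d that the agent's model omits. Gains are
# assumed (stabilising, not optimised). Without integral action the position
# settles near d / L[0, 0] instead of at the origin.
dt, d = 0.01, 1.0
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
L = np.array([[3.0, 2.5]])           # state-feedback gain (assumed)
K = 0.5 * np.eye(2)                  # fixed observer gain (assumed)

x, xhat = np.array([[1.0], [0.0]]), np.zeros((2, 1))
for _ in range(6000):
    u = -L @ xhat
    x = A @ x + B @ (u + d)          # the plant feels the unmodelled force d
    xhat = A @ xhat + B @ u          # the internal model omits d, so the
    xhat = xhat + K @ (x - xhat)     # loop has no means to absorb it

steady_state_error = float(x[0, 0])  # settles near d / 3.0, not at 0
```

At equilibrium the feedback must hold `u = -d`, which requires a nonzero estimated position; the offset scales inversely with the proportional gain, mirroring the biased trajectories in the disturbed LQG figure.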

IV-B The double integrator with active inference

To solve the same control problem, active inference relies on the generation of predictions of proprioceptive sensations (position and velocity as in LQG, and also acceleration in this case), followed by the implementation of actions in the world via (trivial) reflex arcs. The proprioceptive modality is essentially treated like other inputs (vision, audition, etc.) and estimates/predictions are generated using the same generative model, taking advantage of incoming proprioceptive sensations. This produces a considerably different control system, with state estimates and actions now created by the same model, making it hard to clearly separate processes of perception and action. The copy of motor control signals (cf. efference copy [40]), necessary in standard LQG settings to meet the observability constraints of Kalman-Bucy filters [16, 17], is not included in this formulation, as explained in section III. Active inference in fact postulates that direct representations of the causes of self-generated sensations are not available, so actions need not be discounted during the prediction of new incoming sensory inputs. This could be seen as a limitation of active inference, but in general it speaks to the robustness of this approach in the face of unknown inputs (i.e., motor actions produced by an agent or exogenous forces from the environment), see [36]. In this framework, inputs can also be estimated using an appropriate generative model of the world dynamics [30], a feature thought to be fundamental in biological systems [41]. Simple and effective approximations are also possible, for example with integral control, thought to be the most basic heuristic dealing with the problem of uncertain inputs in biological systems down to the unicellular level [42, 41], and already shown to be consistent with formulations of active inference [36].

Fig. 4: The generative model. To implement the regulation of position and velocity, the agent implements a model whereby an imaginary spring pulls the block back to the origin while an imaginary damper slows it down.

To derive an active inference solution to the double integrator, we start by defining a generative model for the agent, i.e., the block:

$$D\tilde{x} = \hat{A} \tilde{x} + \hat{B} \tilde{v} + \tilde{w}, \qquad \tilde{y} = \hat{C} \tilde{x} + \tilde{z}$$

where the matrix $\hat{A}$ is:

$$\hat{A} = \begin{bmatrix} 0 & 1 \\ -k_s & -k_d \end{bmatrix}$$

with $k_s$ and $k_d$ as spring and damper coefficients, while $\hat{C}$ is diagonal, $\hat{B}$ is zero everywhere except in the entry where the motor action is applied (with a value of 1), and the covariance matrices are also diagonal (their values are given in the accompanying code). The agent implements beliefs of a world where it is pulled back to the desired state by an imaginary spring and slows down thanks to an imaginary piston-like damper, "designed" (in this case by us, but more in general one could imagine evolutionary processes for biological systems [28]) to favour normative behaviour.

Following (6), the variational free energy for our controller is then described by:


where precisions are taken from the diagonals of the precision matrices (the inverses of the covariance matrices defined in the generative model). After explicitly writing out the equations derived from the matrix formulation in (7), we get the following formulation of perceptual inference:




showing the lack of the Kalman gain, and an important difference deriving from its absence: if the Kalman gain matrix $K$ is non-diagonal, as in this case (one can simply verify this claim with standard functions solving continuous Riccati equations, as in the provided code), both orders of motion are present in the optimal filter problem in (3), but only one appears in (13), since the precision matrices are assumed to be diagonal in our formulation. More generally, in active inference the Kalman gain matrix is replaced by learning rates, as in this work or [32], or by clever implementations that allow for adaptive update schemes with varying integration steps, as in [30].

The action component is, however, the one that differs most significantly, starting from the assumption that direct knowledge of motor signals is not available and is thus not modelled in the generative model (motor commands are replaced by inputs acting as priors). This entails a new approach to the problem, with active inference suggesting that the only information needed comes from observations $\tilde{y}$, see (8). On this account, action reduces to


and with the assumption that

the explicit, scalar version of action becomes


replacing the LQR component in (3). This type of control is equivalent to a PID controller, and is the "optimal" linear solution when knowledge of inputs is not available in the generative model [36]. As in the case of filtering, the feedback gain is missing in the active inference formulation, once again replaced by the learning rates of the gradient descent or by other approximations.
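The equivalence to integral control can be illustrated directly: a PID-style law on the double integrator, with hypothetical gains, rejects a constant force unknown to the controller, which pure estimate-based state feedback cannot do.

```python
# Sketch of the integral-control equivalence: a PID-style law on the double
# integrator rejects a constant force d unknown to the controller. Gains are
# hypothetical; the closed loop s**3 + kd*s**2 + kp*s + ki is stable here
# since kd * kp > ki (Routh criterion).
dt, d = 0.01, 1.0
kp, kd, ki = 2.0, 3.0, 1.0
x1, x2, acc_err = 1.0, 0.0, 0.0   # position, velocity, integrated error

for _ in range(8000):             # 80 s of simulated time
    u = -kp * x1 - kd * x2 - ki * acc_err
    x1 += dt * x2
    x2 += dt * (u + d)            # the constant force is not modelled
    acc_err += dt * x1            # integral action accumulates the residual

# at equilibrium ki * acc_err = d: the integral term exactly cancels the
# unmodelled force and x1 returns to the origin
```

The integral term plays the role that the slow accumulation of action plays in the active inference agent: it absorbs the unknown input without ever representing it explicitly.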

Fig. 5: The double integrator solved using active inference (). Same layout as Fig. 2. (a) Five examples with different initial conditions showing in blue the observed trajectories of different blocks in the phase-space and in red the agent’s estimates of the same trajectories. (b) Actions taken by the five agents.
Fig. 6: The double integrator solved using active inference (). Same layout as Fig. 2. (a) Five examples with different initial conditions showing in blue the observed trajectories of different blocks in the phase-space and in red the agent’s estimates of the same trajectories. (b) Actions taken by the five agents after an external force is introduced (black line).

In Fig. 5 we can see an example implementation of the double integrator using active inference. Five agents are initialised at random positions and velocities (zero-mean Gaussian distributed, sd = 300) and converge to the target solution where the output actions are essentially zero (excluding some noise), as expected, Fig. 5(b). The most striking feature is that the estimates of both position and velocity of the block are very inaccurate, but the agents nonetheless reach the desired target in the phase space, Fig. 5(a). These differences are given by the generative model implemented by the agent, encoding an imaginary spring-damper system that pulls it towards its desired state, Fig. 4. Fig. 6 shows the robustness of this implementation when an external force is introduced: by implementing integral control [36], active inference can in this case counteract the effects of unexpected inputs. The presence of integral action perfectly counteracts the effects of disturbances, Fig. 6(b) (cf. Fig. 3(b)), and more importantly allows for the desired regulation of the agents' positions, Fig. 6(a), which is impossible in LQG accounts assuming perfect knowledge of the world (cf. Fig. 3(a)).

V Discussion

LQG-based architectures are modular in nature, with perception and action seen as separate problems solved nearly independently. According to this view, a system should initially find accurate estimates of the hidden properties of its observations, and only once such estimates are available should an agent attempt to regulate the variables of interest to achieve its goals, e.g., temperature, oxygen level, etc. On the other hand, we can define a framework based on mathematical formulations of control problems where the separation principle is not included or required. According to one such proposal, which we identified in active inference [28, 23], perception and action are combined in an inseparable sensorimotor loop described by the minimisation of variational free energy for an agent. In this setup, action and perception are seen as instances of a fundamentally unique process [20], with different labels used for our (i.e., the observers') convenience. In particular, the idea of precise inferences of world variables is called into question [43, 32], to the point that inaccurate perception is not only possible but becomes a prerequisite to act on the world [33, 26]. In architectures based on the separation principle, the estimated state of a system is thought of as a faithful account of real observations, e.g., their means and covariances. Conversely, in active inference it becomes clear that estimates of latent variables of the world are deeply connected to the current goal of an agent, e.g., to regulate its observations, cf. [44]. To do so, its targets are encoded as prior expectations and used to bias inferential processes toward its desires, so that prediction errors are created as the mismatch between observations and the estimates of hidden variables. These errors are then minimised by acting on the world [28], taking advantage of proprioceptive prediction errors that enact reflex arcs to make observations better accord with existing predictions [45, 26].
More generally, the active inference formulation also allows for accurate estimates of the latent variables generating observations, see for instance [30], but this modality fundamentally excludes the possibility of acting: if no prediction errors are generated for action to minimise, an agent becomes a simple mirror of its world with no strong desire or even necessity to act [46, 33, 32]. In other words, depending on the different precision weights, an agent can accurately estimate its observations without acting, or potentially discard its sensations to only pursue its desires, generating all possible cases in between as a balanced mix of weighted prediction errors [47].

VI Conclusions

In recent years the more traditional understanding of perceptual and motor functions as nearly independent processes has been called into question by different authors, especially in neuroscience [48, 49, 50]. It is clear that many experimental setups are limited [51], requiring new and ethologically meaningful paradigms for an appropriate study of different aspects of living systems [52]. In this context, we propose some new ideas that could drive future experiments. These ideas are centred around a critical appraisal of LQG as a model architecture for cognitive systems, focusing in particular on the assumptions made by the use of Kalman-Bucy filters, central to these proposals [18, 11, 53]. One of the key requirements for Kalman-Bucy filters to generate an accurate estimate of the hidden state of a system is to have access to all the outputs (the observations) and all the inputs (forces that affect the state) of a system. The inputs, in particular, include both motor commands, which in classical forward/inverse models are identified using the idea of efference copy [40] (see for instance [5, 6, 7]), and external forces/signals from the environment that cannot in principle be accounted for by an organism, e.g., a sudden change in weather conditions or unexpected interactions with other agents.

In this work we focused on the latter, since the presence of external, unaccounted-for forces is often overlooked in many experimental setups with fixed or predictable conditions (e.g., the classic and still dominant two-alternative forced choice paradigm). In more realistic and ethological scenarios, however, one should expect external and unpredictable stimuli to constantly affect the behaviour of an agent [51, 52, 50]. In this case, introducing noise or varying experimental conditions may help in testing the robustness of LQG-based architectures. In practice, if some inputs are not known, one should expect LQG to perform rather poorly until these inputs can be estimated and adaptation (e.g., learning) to the new conditions can take place. However, one should then explain how such forces can be described in LQG, since Kalman-Bucy filters cannot estimate inputs [29] (cf. DEM [30]). More generally, if a system is well adapted to deal with unpredictable stimuli, simple mechanisms such as integral control could be in place, as shown formally in [41] and in experiments on chemotactic adaptation in E. coli [42], for instance. More recently, some promising results have been presented in [54], supporting the idea that integral feedback control, unlike Kalman(-Bucy) filters, is a good model for adaptation in environments with varying conditions. Integral control constitutes a linear approximation to problems of control with unknown forces affecting the observations of an agent [38, 36], providing a robust solution with fast responses to problems that would otherwise require slower learning mechanisms [42], which may be ineffective in fast-paced environments [55].

VII Acknowledgments

This work was supported in part by a BBSRC Grant BB/P022197/1.


  • [1] A. Newell, H. A. Simon et al., Human problem solving.    Prentice-Hall Englewood Cliffs, NJ, 1972, vol. 104, no. 9.
  • [2] J. Fodor, The Modularity of Mind.    MIT Press, 1983.
  • [3] S. Hurley, “Perception and action: Alternative views,” Synthese, vol. 129, no. 1, pp. 3–40, 2001.
  • [4] R. A. Brooks, “New approaches to robotics,” Science, vol. 253, no. 5025, pp. 1227–1232, 1991.
  • [5] M. Kawato, “Internal models for motor control and trajectory planning,” Current opinion in neurobiology, vol. 9, no. 6, pp. 718–727, 1999.
  • [6] D. M. Wolpert and Z. Ghahramani, “Computational principles of movement neuroscience,” Nature neuroscience, vol. 3, no. 11s, p. 1212, 2000.
  • [7] E. Todorov, “Optimality principles in sensorimotor control,” Nature neuroscience, vol. 7, no. 9, pp. 907–915, 2004.
  • [8] D. C. Knill and W. Richards, Perception as Bayesian inference.    Cambridge University Press, 1996.
  • [9] R. P. Rao and D. H. Ballard, “Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects,” Nature neuroscience, vol. 2, no. 1, pp. 79–87, 1999.
  • [10] T. S. Lee and D. Mumford, “Hierarchical bayesian inference in the visual cortex,” JOSA A, vol. 20, no. 7, pp. 1434–1448, 2003.
  • [11] D. M. Wolpert, J. Diedrichsen, and J. R. Flanagan, “Principles of sensorimotor learning,” Nature Reviews Neuroscience, vol. 12, no. 12, p. 739, 2011.
  • [12] W. Li and E. Todorov, “Iterative linear quadratic regulator design for nonlinear biological movement systems.” in ICINCO (1), 2004, pp. 222–229.
  • [13] I. H. Stevenson, H. L. Fernandes, I. Vilares, K. Wei, and K. P. Körding, “Bayesian integration and non-linear feedback control in a full-body motor task,” PLoS computational biology, vol. 5, no. 12, p. e1000629, 2009.
  • [14] M. Baltieri and C. L. Buckley, “The modularity of action and perception revisited using control theory and active inference,” in The 2018 Conference on Artificial Life: A Hybrid of the European Conference on Artificial Life (ECAL) and the International Conference on the Synthesis and Simulation of Living Systems (ALIFE), T. Ikegami, N. Virgo, O. Witkowski, M. Oka, R. Suzuki, and H. Iizuka, Eds., 2018, pp. 121–128.
  • [15] W. Wonham, “On the separation theorem of stochastic control,” SIAM Journal on Control, vol. 6, no. 2, pp. 312–326, 1968.
  • [16] B. Anderson and J. B. Moore, Optimal control: linear quadratic methods.    Prentice-Hall, Inc., 1990.
  • [17] R. F. Stengel, Optimal control and estimation.    Courier Corporation, 1994.
  • [18] E. Todorov and M. I. Jordan, “Optimal feedback control as a theory of motor coordination,” Nature neuroscience, vol. 5, no. 11, p. 1226, 2002.
  • [19] S.-H. Yeo, D. W. Franklin, and D. M. Wolpert, “When optimal feedback control is not enough: Feedforward strategies are required for optimal control with active sensing,” PLoS computational biology, vol. 12, no. 12, p. e1005190, 2016.
  • [20] A. Clark, Being there: Putting brain, body, and world together again.    MIT press, 1998.
  • [21] M. Wilson, “Six views of embodied cognition,” Psychonomic bulletin & review, vol. 9, no. 4, pp. 625–636, 2002.
  • [22] E. Di Paolo, T. Buhrmann, and X. Barandiaran, Sensorimotor Life: An Enactive Proposal.    Oxford University Press, 2017.
  • [23] K. Friston, “The free-energy principle: a unified brain theory?” Nature reviews. Neuroscience, vol. 11, no. 2, pp. 127–138, 2010.
  • [24] C. L. Buckley, C. S. Kim, S. McGregor, and A. K. Seth, “The free energy principle for action and perception: A mathematical review,” Journal of Mathematical Psychology, vol. 81, pp. 55–79, 2017.
  • [25] K. Friston, “What is optimal about motor control?” Neuron, vol. 72, no. 3, pp. 488–498, 2011.
  • [26] W. Wiese, “Action is enabled by systematic misrepresentations,” Erkenntnis, pp. 1–20, 2016.
  • [27] G. Pezzulo, F. Donnarumma, P. Iodice, D. Maisto, and I. Stoianov, “Model-based approaches to active perception and control,” Entropy, vol. 19, no. 6, p. 266, 2017.
  • [28] K. J. Friston, J. Daunizeau, J. Kilner, and S. J. Kiebel, “Action and behavior: A free-energy formulation,” Biological Cybernetics, vol. 102, no. 3, pp. 227–260, 2010.
  • [29] Z. Chen, “Bayesian filtering: From Kalman filters to particle filters, and beyond,” Statistics, vol. 182, no. 1, pp. 1–69, 2003.
  • [30] K. J. Friston, N. Trujillo-Barreto, and J. Daunizeau, “DEM: A variational treatment of dynamic systems,” NeuroImage, vol. 41, no. 3, pp. 849–885, 2008.
  • [31] K. Friston, “Hierarchical models in the brain,” PLoS Computational Biology, vol. 4, no. 11, 2008.
  • [32] M. Baltieri and C. L. Buckley, “An active inference implementation of phototaxis,” in Proc. Eur. Conf. on Artificial Life, 2017, pp. 36–43.
  • [33] H. Brown, R. A. Adams, I. Parees, M. Edwards, and K. Friston, “Active inference, sensory attenuation and illusions,” Cognitive processing, vol. 14, no. 4, pp. 411–427, 2013.
  • [34] A. G. Feldman, “New insights into action–perception coupling,” Experimental Brain Research, vol. 194, no. 1, pp. 39–58, 2009.
  • [35] ——, “Active sensing without efference copy: referent control of perception,” Journal of neurophysiology, vol. 116, no. 3, pp. 960–976, 2016.
  • [36] M. Baltieri and C. L. Buckley, “A probabilistic interpretation of pid controllers using active inference,” in From Animals to Animats 15, P. Manoonpong, J. C. Larsen, X. Xiong, J. Hallam, and J. Triesch, Eds.    Springer International Publishing, 2018, pp. 15–26.
  • [37] V. G. Rao and D. S. Bernstein, “Naive control of the double integrator,” IEEE Control Systems, vol. 21, no. 5, pp. 86–97, 2001.
  • [38] K. J. Åström and R. M. Murray, Feedback systems: an introduction for scientists and engineers.    Princeton university press, 2010.
  • [39] G. L. Gottlieb, “A computational model of the simplest motor program,” Journal of Motor behavior, vol. 25, no. 3, pp. 153–161, 1993.
  • [40] E. von Holst and H. Mittelstaedt, “Das reafferenzprinzip,” Naturwissenschaften, vol. 37, no. 20, pp. 464–476, 1950.
  • [41] E. D. Sontag, “Adaptation and regulation with signal detection implies internal model,” Systems & control letters, vol. 50, no. 2, pp. 119–126, 2003.
  • [42] T.-M. Yi, Y. Huang, M. I. Simon, and J. Doyle, “Robust perfect adaptation in bacterial chemotaxis through integral feedback control,” Proceedings of the National Academy of Sciences, vol. 97, no. 9, pp. 4649–4653, 2000.
  • [43] A. Clark, “Radical predictive processing,” The Southern Journal of Philosophy, vol. 53, no. S1, pp. 3–27, 2015.
  • [44] W. T. Powers, Behavior: The control of perception.    Aldine Chicago, 1973.
  • [45] A. Clark, Surfing Uncertainty: Prediction, Action, and the Embodied Mind.    Oxford University Press, 2015.
  • [46] K. Friston, C. Thornton, and A. Clark, “Free-energy minimization and the dark-room problem,” Frontiers in psychology, vol. 3, p. 130, 2012.
  • [47] M. Allen and K. J. Friston, “From cognitivism to autopoiesis: towards a computational framework for the embodied mind,” Synthese, vol. 195, no. 6, pp. 2459–2482, 2018.
  • [48] E. Ahissar and E. Assa, “Perception as a closed-loop convergence process,” Elife, vol. 5, p. e12830, 2016.
  • [49] L. Busse, J. A. Cardin, M. E. Chiappe, M. M. Halassa, M. J. McGinley, T. Yamashita, and A. B. Saleem, “Sensation during active behaviors,” Journal of Neuroscience, vol. 37, no. 45, pp. 10826–10834, 2017.
  • [50] C. L. Buckley and T. Toyoizumi, “A theory of how active behavior stabilises neural activity: Neural gain modulation by closed-loop environmental feedback,” PLoS computational biology, vol. 14, no. 1, p. e1005926, 2018.
  • [51] J. W. Krakauer, A. A. Ghazanfar, A. Gomez-Marin, M. A. MacIver, and D. Poeppel, “Neuroscience needs behavior: correcting a reductionist bias,” Neuron, vol. 93, no. 3, pp. 480–490, 2017.
  • [52] F. Najafi and A. K. Churchland, “Perceptual decision-making: A field in the midst of a transformation,” Neuron, vol. 100, no. 2, pp. 453–462, 2018.
  • [53] D. W. Franklin and D. M. Wolpert, “Computational mechanisms of sensorimotor control,” Neuron, vol. 72, no. 3, pp. 425–442, 2011.
  • [54] H. Ritz, M. R. Nassar, M. J. Frank, and A. Shenhav, “A control theoretic model of adaptive learning in dynamic environments,” Journal of cognitive neuroscience, pp. 1–17, 2018.
  • [55] W. R. Ashby, An introduction to cybernetics.    Chapman & Hall Ltd., 1957.