I Introduction
Providing a robot with predictive functions of its body, the environment and others is a critical aspect of complex interaction. Accordingly, in order to generate safe and robust interaction, the artificial agent must take into account uncertainties in the sensory input as well as unexpected events that can occur. Unfortunately, a perfect model of the body, the environment and others is almost impossible to design. We present an adaptive robot body learning and estimation algorithm able to deal with noisy sensory inputs and to integrate multiple sources of information (touch, visual and proprioceptive sensors). This model is framed within the predictive processing theory proposed by Friston [1] and biologically grounded in the predictive coding evidence observed by Rao and Ballard in the visual cortex [2]. This approach has been extensively studied in computational biology and psychology but has not been properly tested in robotics [3].
The main idea behind this embodied approach to robot body perception is that the only available information is the sensory input [4]. By learning predictors of the sensory outcome given its current body latent variables and the actions exerted, the robot is able to properly infer its real body configuration. The error between the expected sensory signal and the real input helps refine the most plausible hypothesis that the robot holds about its body, as depicted in Fig. 1. This simplifies the complexity of online estimation of the body internal variables and increases the ability of the robot to adapt to uncertain situations.
I-A Motivation and method
To produce safe interaction the robot should robustly predict its body and other agents at every instant using all the sensory information available. This requires either systematically designing the body model or endowing the robot with an accurate perception of its body [5]. Here, we define body perception from the probabilistic perspective [6] as inferring the body variables $x$ given only the sensory information $s$: $p(x|s)$ [4]. (We have intentionally left the action out to be coherent with the active inference perceptual theory from Friston; at the end of the paper we remark on the role of the action within the proposed scheme.) When the robot does not have access to its body variables we can infer the body configuration from the sensory input through Bayes' rule:
$p(x|s) = \frac{p(s|x)\, p(x)}{p(s)}$    (1)
where $p(s|x)$ is the sensory consequence of being in state $x$ and $p(x)$ is the prior belief over the internal variables. We could estimate this posterior using Bayesian recursive filters [6]. However, instead of computing this posterior directly, we approximate an auxiliary distribution $q(x)$ over the unobserved latent variables to the real posterior $p(x|s)$. Moreover, according to predictive processing theory [7], the robot belief, and the variables that represent it, differs from the real world process (we describe the robot mind as a system whose belief has its own dynamics and internal variables, and the inference process tries to fit the robot internal representation to the real world). The robot also has incomplete knowledge about the real generative process of its body. Thus, we have to approximate, at every instant, not only the state variables but a particular density from a family of functions. In other words, we approximate the distribution $q(x;\mu)$, where $\mu$ is the internal state of the robot. For this paper, as the body configuration has been simplified to the joint angles, we overload the notation by using $x$ for the body state and $\mu$ for the internal state of the robot.
In order to approximate both distributions (real and believed) we can minimise the Kullback-Leibler divergence $KL(q(x)\,\|\,p(x|s))$ through the free-energy bound, expressed as [7, 8]:
$F = \int q(x) \ln \frac{q(x)}{p(s,x)}\, dx = KL(q(x)\,\|\,p(x|s)) - \ln p(s)$    (2)
When $q(x)$ converges to $p(x|s)$, only the sensory surprise $-\ln p(s)$ remains and the posterior is properly approximated. Hence, in theory, the inference of the body internal configuration given the sensory input can be tractably approximated by minimizing the variational free energy. One way to minimize it is through a gradient-descent scheme: $\dot\mu = -\partial F / \partial \mu$.
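As a concrete illustration, this gradient-descent minimization can be sketched for a hypothetical one-dimensional case with a Gaussian prior and a single sensor $s = g(x) + \text{noise}$, in the spirit of the tutorial treatment in [15]; the sensor model and every numeric value below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

# Hypothetical 1-D example: latent body state x, prior belief N(mu_p, var_p),
# one sensor s = g(x) + noise with variance var_s. All values illustrative.
g = lambda x: x ** 2          # assumed nonlinear sensor model
dg = lambda x: 2.0 * x        # its derivative

mu_p, var_p = 1.0, 1.0        # prior mean and variance of the body state
s, var_s = 4.0, 0.1           # observed sensor value and sensor variance

mu, dt = 1.0, 0.01            # initial belief and integration step
for _ in range(5000):
    # gradient of the negative free energy w.r.t. the belief mu:
    # prior error plus precision-weighted sensory prediction error
    dF = (mu_p - mu) / var_p + (s - g(mu)) * dg(mu) / var_s
    mu += dt * dF             # 1st-order Euler update

print(round(mu, 2))           # belief settles near sqrt(4) = 2, pulled
                              # slightly below 2 by the prior (about 1.99)
```

The belief is dominated by the precise sensor (small var_s) and only weakly by the vague prior, which is exactly the precision weighting the free-energy formulation encodes.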
From the generative process point of view, one function governs the dynamics of the environment and another governs the sensory information. However, the robot only has approximations of them. Thus, the agent is continuously adapting its belief about its body and the world just with the sensory input. This is performed by dynamically updating its internal variables by means of the error between the expected and the real sensory input: the prediction error.
In contrast to previous works on predictive coding, instead of assuming that the sensor generative functions are known, here we provide a method to learn them and transparently integrate them into a predictive coding scheme.
I-B Related works
Predictive coding [2] and predictive processing [1] have mainly been studied for human perception and control. Just a few works have applied them to robots. For instance, in [3] the viability of predictive processing for robot control on a simulated robotic arm is discussed. However, the generative functions and the parameters were known in advance. Besides, implementing predictive coding with deep neural networks has gained popularity for modelling multisensory perception [9] and, in the computer vision community, for video prediction [10]. Finally, this approach shares some conceptual background with sensorimotor contingencies approaches [4, 11] and predictive learning [12], where the robot develops its perception as infants do.
The variational approach presented in this paper is related to expectation-maximization algorithms formalized as a maximization-maximization problem of the free-energy function [13]. In fact, using Friston's terminology, predictive processing is a dynamic expectation-maximization algorithm [7]. Furthermore, it is important to highlight the strong similarities with the ensemble Kalman filter [14]. We have adopted the free-energy mathematical framework for the following reasons: it provides indirect minimization of the Kullback-Leibler divergence [7]; it supports multisensory nonlinear integration; it is scalable in its Laplace approximation [8]; it permits unsupervised parameter tuning [15]; and it is biologically plausible [2].
In terms of body model learning there is a vast literature on regressors such as locally weighted projection regression [16], local Gaussian processes [17] or the infinite-experts algorithm [18]. These methods are able to compute the mapping between the sensory input and the configuration of the body and are used to learn forward and inverse kinematics and dynamics. Feedforward and recurrent networks can also be used for learning body schemas and the needed predictors, but they rely on supervised information and the optimization of hundreds of parameters [19, 20]. Unsupervised and self-exploration learning of the body has also been addressed in works like [21] using temporal contingencies. Moreover, biologically plausible sensorimotor learning has been investigated in works like [22] by means of Hebbian-based methods, where body calibration can be learnt through sensorimotor mapping. Dynamic Hebbian learning has also been proposed for obtaining intermodal forward models in [23]. Model-free visual body detection [24] has been approached as an intermodal inference problem, but it is restricted to the camera view of the robot.
I-C Contribution and organization
This work introduces predictive processing for robot body perception [7], where the robot first learns the forward generative models of its sensors or features and is then able to dynamically provide the most plausible body configuration and the location of the end-effector, incorporating several noisy sources of sensory information in a scalable way. We address model-free body learning and estimation, where the sensor generative/forward model is learnt using Gaussian process regression. The body configuration and end-effector location are obtained by means of online free-energy minimization using the prediction error.
The computational model is presented in Sec. II, where we propose a way to learn the generative sensor model and its derivative by exploration (sampling), as well as the differential equations that solve body estimation through free-energy minimization. The experimental setup on a real multisensory robot is presented in Sec. III. In Sec. IV we analyse the proposed approach, evaluating the body estimation with different sensor modalities and inducing visuo-tactile perturbations. Finally, in Sec. V and VI we discuss the advantages and drawbacks of the proposed approach, and stress the applicability of the method to improve self-localization and interaction.
II Mathematical model
Notation
$x$ — distribution and value of the body variables
$\mu$ — most plausible hypothesis of the body variables
$\rho$ — prior belief of the body variables
$s_v, s_p, s_t$ — sensor values: visual, proprioceptive, tactile
$e_v, e_p, e_t$ — error values: visual, proprioceptive, tactile
$F$ — free energy
$\partial F / \partial \mu$ — derivative of the free energy w.r.t. the internal state
$\mathcal{N}(\mu, \sigma)$ — Normal distribution with mean $\mu$ and variance $\sigma$
We first describe the proposed mathematical model for visual and proprioceptive information and then extend it with a more complex visuo-tactile input. The model is based on works on predictive processing [7] and free-energy approaches to perception [15, 8]. For this model, and without loss of generality, we assume that the robot cannot perceive the gradient of the sensor signal and that the state transition model (the generative function of the dynamics) is believed to be static. In Sec. IV we discuss the drawbacks of these simplifications. For the sake of clarity, we adopt the free-energy derivation presented in [15], although the original one uses the KL-divergence as the starting point.
The robot is defined as a set of sensors $s$ and body internal unobserved variables $x$. The proprioceptive sensors output a value $s_p$ depending on the body configuration, following a Normal distribution with linear or nonlinear mean $g_p(x)$: $s_p \sim \mathcal{N}(g_p(x), \sigma_p)$. The visual sensor provides the location of the end-effector in the visual field, also following a Normal distribution with linear or nonlinear mean $g_v(x)$: $s_v \sim \mathcal{N}(g_v(x), \sigma_v)$. Finally, the robot is equipped with artificial skin sensors on the end-effector limb and is able to detect the other's hand in the visual field (see Fig. 1).
II-A Perception model for visual and proprioceptive sensors
The body configuration can be inferred via visual and proprioceptive sensory information through Bayes' rule. Assuming that the visual and proprioceptive sensing are independent, the distribution of $x$ is:
$p(x|s_v, s_p) = \frac{p(s_v|x)\, p(s_p|x)\, p(x)}{p(s_v, s_p)}$    (3)
The denominator involves integrals that make exact computation intractable for large distributions.
As explained previously, we want to approximate the belief distribution $q(x)$ to the posterior using Eq. 2. Mimicking predictive processing theory, which states that the brain works with the most plausible model of the world to perform predictions, instead of working with the whole distribution $q(x)$ we use its most plausible value $\mu$. This has an important implication, as the denominator no longer depends on $x$ [15] (in ensemble Bayesian filtering terminology this is similar to maintaining a sample drawn from the latent space distribution), and hence we get:
$p(\mu|s_v, s_p) \propto p(s_v|\mu)\, p(s_p|\mu)\, p(\mu)$    (4)
Applying logarithms we obtain the negative free-energy formulation:
$-F = \ln p(s_v|\mu) + \ln p(s_p|\mu) + \ln p(\mu)$    (5)
Substituting the probability distributions by their functional forms, under the Laplace approximation [7, 8] and assuming normally distributed noise, we can compute the negative free energy as:
$-F = -\frac{(s_v - g_v(\mu))^2}{2\sigma_v} - \frac{(s_p - g_p(\mu))^2}{2\sigma_p} - \frac{(\mu - \rho)^2}{2\sigma_\mu} + C$    (6)
To approximate the posterior distribution we minimize $F$, following a gradient-descent scheme:
$\dot\mu = -\frac{\partial F}{\partial \mu}$    (7)
$\dot\mu = \frac{\rho - \mu}{\sigma_\mu} + \frac{\partial g_p(\mu)}{\partial \mu}^T \frac{s_p - g_p(\mu)}{\sigma_p} + \frac{\partial g_v(\mu)}{\partial \mu}^T \frac{s_v - g_v(\mu)}{\sigma_v}$    (8)
Note that the first term is the error between the most plausible value of the body configuration and its prior belief ($\rho - \mu$), the second term is the error between the observed proprioceptive value and the expected one ($s_p - g_p(\mu)$), and the third term is the prediction error between the visually sensed position of the end-effector and the expected location ($s_v - g_v(\mu)$). In order to use Eq. 8 we need to know or learn the sensor forward/generative functions.
For the sake of simplification, we encode the internal state directly in the proprioceptive sensing space, thus defining the body just by means of the proprioceptive state. For that purpose, we substitute $g_p(\mu)$ by $\mu$ and set its partial derivative to 1. In other words, if the body configuration is defined by the joint angles, the state will represent the joint sensor (encoder) outputs. For notational convenience we keep $x$ as the body configuration, but it represents $s_p$. Defining the prediction errors as:
$e_\rho = \rho - \mu$    (9)
$e_p = s_p - \mu$    (10)
$e_v = s_v - g_v(\mu)$    (11)
we construct the differential equation that infers the body latent variables as:
$\dot\mu = \frac{e_\rho}{\sigma_\mu} + \frac{e_p}{\sigma_p} + \frac{\partial g_v(\mu)}{\partial \mu}^T \frac{e_v}{\sigma_v}$    (12)
(Note that this computation of $\dot\mu$ is a simplification of the predictive processing approach for passive static perception [15], as we are omitting the generative model of the world. Accordingly, to certainly reduce the difference between the believed distribution and the observed one, $e_\rho$ should describe the error between the world generative function and the internal belief. We leave this extension for further work, and in Sec. VI we point out the challenges of obtaining the full construct without knowing the world dynamics.)
According to Eq. 12, the update of the internal state is driven by the observed and the expected value of the state and by the prediction error. The gradient, or Jacobian, of the sensor with respect to the latent variables maps the contribution of each sensor modality to each body configuration variable, in the same way as in the extended or ensemble Kalman filter.
Generalizing the free-energy minimization for $n$ sensors, the body configuration is driven by:
$\dot\mu = \frac{e_\rho}{\sigma_\mu} + \sum_{i=1}^{n} \frac{\partial g_i(\mu)}{\partial \mu}^T \frac{e_i}{\sigma_i}$    (13)
Then, the full dynamics of our body estimation model is given by:
$\mu(t+1) = \mu(t) + \Delta t\, \dot\mu, \qquad \rho(t+1) = \rho(t) + \Delta t\, \beta\, (\mu - \rho)$    (14)
where $\beta$ is the learning-ratio parameter that specifies how fast the prior of the body configuration is adjusted to the prediction error.
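The full dynamics of Eq. 14 can be sketched numerically for a hypothetical two-joint arm. Here the visual generative model $g_v$ and its Jacobian are toy closed-form stand-ins for the GP-learnt functions, every constant is an illustrative assumption, and for simplicity the prior is kept fixed rather than updated:

```python
import numpy as np

# Toy stand-ins for the learnt visual forward model and its Jacobian:
# one pixel coordinate as a function of two joint angles (illustrative).
g_v = lambda mu: np.array([10.0 * np.cos(mu[0] + mu[1])])
J_v = lambda mu: np.array([[-10.0 * np.sin(mu[0] + mu[1]),
                            -10.0 * np.sin(mu[0] + mu[1])]])

var_mu, var_p, var_v = 1.0, 0.1, 1.0     # error variances (illustrative)
rho = np.array([0.4, 0.6])               # prior belief of the joint angles
s_p = np.array([0.5, 0.5])               # joint encoder reading
s_v = g_v(np.array([0.5, 0.5]))          # visual reading of the end-effector

mu, dt = np.zeros(2), 0.001              # initial belief, integration step
for _ in range(20000):
    e_rho = rho - mu                     # error w.r.t. the prior
    e_p = s_p - mu                       # proprioceptive prediction error
    e_v = s_v - g_v(mu)                  # visual prediction error
    dmu = e_rho / var_mu + e_p / var_p + J_v(mu).T @ (e_v / var_v)
    mu = mu + dt * dmu                   # 1st-order Euler integration

print(np.round(mu, 2))                   # near the encoder reading [0.5, 0.5],
                                         # slightly biased toward the prior
```

Because the proprioceptive variance is the smallest, the estimate is pulled hardest toward the encoders, while the visual term constrains the sum of the two angles through the Jacobian, mirroring the Kalman-like role of the sensor gradients noted above.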
II-B Body learning: learning the sensory states caused by the body configuration
We define body learning as obtaining the unknown forward/observation model $g$ and its derivative/Jacobian $\partial g / \partial x$, which relate the sensor values to the body state. This is a consequence of describing body estimation by means of Eq. 14. To learn both functions we use Gaussian process (GP) regression on data collected through body exploration: we obtain sensor samples $y$ from the robot in several body configurations $X$. For instance, for the visual generative process the input is the proprioceptive state and the output is the visual information.
The training is performed by computing the covariance matrix $K = K(X,X) + \sigma_n^2 I$ on the collected data with noise $\sigma_n$, where the covariance function is defined as:
$k(x, x') = \sigma_f^2 \exp\!\left(-\tfrac{1}{2}(x - x')^T \Lambda^{-1} (x - x')\right)$    (15)
The prediction of the sensory outcome given a query point $x_*$ is then computed as [25]:
$g(x_*) = k(x_*, X)\,(K + \epsilon I)^{-1} y = k(x_*, X)\,\alpha$    (16)
where $\epsilon$ is a small jitter term added for numerical stability.
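A minimal sketch of this prediction step, using an isotropic squared-exponential kernel and synthetic training data in place of the robot's recorded samples; the sine sensor model and all hyperparameter values are illustrative:

```python
import numpy as np

# Squared-exponential kernel between two sets of points (isotropic version).
def se_kernel(A, B, sigma_f=1.0, length=1.0):
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return sigma_f ** 2 * np.exp(-0.5 * d2 / length ** 2)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(40, 1))               # sampled body configurations
y = np.sin(X[:, 0]) + 0.05 * rng.normal(size=40)   # noisy sensor readings

sigma_n, eps = 0.05, 1e-8                          # sensor noise and jitter
K = se_kernel(X, X) + (sigma_n ** 2 + eps) * np.eye(len(X))
alpha = np.linalg.solve(K, y)                      # precomputed once, reused

x_star = np.array([[1.0]])                         # query body configuration
mean = float(se_kernel(x_star, X) @ alpha)         # predictive mean, Eq. 16
print(mean)                                        # close to sin(1.0) ~ 0.84
```

Precomputing alpha once means each new query costs only one kernel row and a dot product, which is what makes the online estimation loop cheap.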
Finally, in order to compute the gradient of the posterior we differentiate the kernel [26] and obtain its prediction analogously to Eq. 16:
$\frac{\partial g(x_*)}{\partial x_*} = \frac{\partial k(x_*, X)}{\partial x_*}\,\alpha$    (17)
Using the squared-exponential kernel with the Mahalanobis-distance covariance function, the derivative becomes:
$\frac{\partial k(x_*, X)}{\partial x_*} = -\Lambda^{-1}(x_* - X) \circ k(x_*, X)$    (18)
where $\Lambda$ is a matrix whose diagonal is populated with the squared length scale for each dimension ($\ell_d^2$) and $\circ$ is element-wise multiplication.
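This analytic derivative is easy to verify against finite differences; below is a sketch for a two-dimensional input with per-dimension (ARD) length scales, where all values are illustrative:

```python
import numpy as np

# ARD squared-exponential kernel and its gradient w.r.t. the query point a:
# dk(a, b)/da = -Lambda^{-1} (a - b) k(a, b), Lambda = diag(squared lengths).
lengths = np.array([1.0, 2.0])          # illustrative per-dimension scales
Lam_inv = np.diag(1.0 / lengths ** 2)

def k(a, b):
    d = a - b
    return np.exp(-0.5 * d @ Lam_inv @ d)

def dk(a, b):                           # analytic gradient w.r.t. a
    return -(Lam_inv @ (a - b)) * k(a, b)

a, b = np.array([0.3, -0.1]), np.array([1.0, 0.5])
eps = 1e-6                              # central finite-difference check
fd = np.array([(k(a + eps * e, b) - k(a - eps * e, b)) / (2 * eps)
               for e in np.eye(2)])
print(np.allclose(dk(a, b), fd, atol=1e-5))  # True
```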
II-C Adding tactile feedback and other's interaction
We exploit the artificial skin of the robot to refine the body-configuration estimation. For that purpose, we model the intermodal relation between visual and tactile sensing [4]. When somebody touches the robot end-effector, the robot should adjust its body configuration to fit the end-effector location to the point in the visual field where the other agent is touching. In other words: if another agent is touching the robot end-effector at location $o$ in the visual field, then the robot end-effector is there, and the body configuration is adjusted accordingly.
First, we assume that the robot is able to discern that its end-effector limb is being touched, and that it knows the relation between the touch signal and the location on the body. We define the likelihood function of being touched by the other by means of spatial and temporal coherence. We could learn this function by touching the limb in different end-effector locations. Alternatively, in this paper we reuse the learnt visual forward model to compute the expected end-effector location and define the visuo-tactile sensory likelihood as:
(19)
where $a, b$ are parameters that shape the likelihood and have been tuned in concordance with the data acquired in [27] from human participants; $\Delta t$ is the level of synchrony of the event (e.g., the time difference between the visual and the tactile event); and $o$ is the other agent's end-effector location in the visual field.
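Since the exact functional form of Eq. 19 is not reproduced here, the following is only a hypothetical sketch of such a visuo-tactile likelihood: a Gaussian spatial term on the distance between the expected end-effector location $g_v(\mu)$ and the other's hand, gated by a logistic term on the (a)synchrony of the events. Both the form and every parameter value are illustrative assumptions, not the fitted values from the paper:

```python
import numpy as np

# Hypothetical visuo-tactile likelihood: spatial closeness (Gaussian) times
# temporal synchrony (logistic gate). Form and parameters are illustrative.
def visuotactile_likelihood(g_v_mu, o, dt_sync, sigma=20.0, a=10.0, b=0.3):
    spatial = np.exp(-0.5 * np.sum((g_v_mu - o) ** 2) / sigma ** 2)
    temporal = 1.0 / (1.0 + np.exp(a * (dt_sync - b)))  # high when synchronous
    return spatial * temporal

near_sync = visuotactile_likelihood(np.array([120.0, 80.0]),
                                    np.array([125.0, 82.0]), dt_sync=0.05)
far_async = visuotactile_likelihood(np.array([120.0, 80.0]),
                                    np.array([200.0, 150.0]), dt_sync=1.0)
print(near_sync > far_async)   # True: a synchronous nearby touch is most likely
```

Any likelihood with this qualitative shape (peaked at spatially close, temporally synchronous events) plays the same role in the free-energy scheme.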
We directly introduce this generative function into the free-energy scheme as follows (under the predictive processing framework we might instead include another internal variable that defines being touched, and a second layer of the hierarchy able to infer the similarity, temporal and spatial, between the patterns generated in the visual field by the other agent and the patterns perceived in the skin):
(20)
When a synchronous tactile and visual pattern occurs, the body configuration is adjusted depending on the expected end-effector visual location $g_v(\mu)$ and the other's visual location $o$.
II-D Adaptive body estimation and learning through predictive processing
Algorithm 1 summarizes the learning and estimation stages that dynamically compute the internal body configuration based on the sensory prediction error, using $n$ independent sources of sensory information or features and the body internal variables $\mu$. The learning stage uses the GP regression described in [25] for each sensor modality's contribution to the body configuration. Using the precomputed solution $\alpha$ we reduce the complexity of the prediction calculation. The estimation stage computes the prediction error for every sensor and solves the differential equations by variational free-energy minimization. Note that we apply a 1st-order Euler integration method; more accurate approaches are out of the scope of this paper.
III Experimental setup on a robotic arm
We test the model on the multisensory UR5 arm of the robot TOMM [28], as depicted in Fig. 2. Although the methodology is intended for robots that are difficult to calibrate and have imprecise sensors, we use this platform as a proof of concept, as we can easily compare against ground-truth values. Without loss of generality, the body (latent variables) is defined as the joint angles, and its perception comes from multiple modalities: (1) the proprioceptive input is three joint angles with added Gaussian noise (two shoulder joints and the elbow; Fig. 3(a)); (2) the visual input is an RGB camera mounted on the head of the robot; and (3) the tactile input is generated by multimodal skin cells distributed on the arm [29].
III-A Learning from visual and proprioceptive data
In order to learn the sensory forward/observation model we programmed random trajectories in the joint space that resemble horizontal displacements of the arm. Figure 3(a) shows the data extracted: noisy joint angles and the visual location of the end-effector, obtained by colour segmentation. To learn the visual forward model $g_v$, each sample is defined by the input joint-angle sensor values and the output pixel coordinates. As an example, Fig. 3(b) shows the visual forward model learnt by GP regression with 46 samples (red dots): the mean horizontal displacement (in pixels) with respect to two joint angles, and its variance.
III-B Extracting visuo-tactile data
We use proximity sensing information from the infrared sensors located in every skin cell to discern when the arm is being touched. The infrared sensor outputs a filtered signal. The likelihood of a cell being touched is given by the function of Eq. 19, whose parameters have been obtained by fitting the function to distance-sensor output measurements. Figure 3(c) shows the raw skin proximity sensing data during the experiment (each colour represents one of the 117 different skin cells). From the other's hand visual trajectory and the skin proximity activation we compute the level of synchrony between the two patterns (Fig. 3(d)). Timings for tactile stimuli are obtained by setting a threshold over the proximity activation value. Timings for the other's trajectory events are obtained through the velocity components. The detected initial and ending positions of the visual touching are depicted in Fig. 3(d) (right, green circles).
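The threshold-based timing extraction can be sketched on a synthetic proximity trace; the Gaussian pulse and the threshold value below are illustrative, not the fitted skin parameters:

```python
import numpy as np

# Synthetic proximity signal of one skin cell: a touch event around t = 4 s.
t = np.linspace(0.0, 10.0, 1001)
prox = np.exp(-0.5 * ((t - 4.0) / 0.3) ** 2)   # illustrative filtered signal
touched = prox > 0.5                            # illustrative threshold

# Onset/offset = rising/falling edges of the thresholded signal.
edges = np.flatnonzero(np.diff(touched.astype(int)))
onset, offset = t[edges[0] + 1], t[edges[1] + 1]
print(onset, offset)                            # approximately 3.65 and 4.36 s
```

Comparing these onsets against the other's hand trajectory events then gives the time difference used as the synchrony measure.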
IV Results
For comparison purposes, all experiment parameters are set to fixed values: the GP learning hyperparameters (signal variance $\sigma_f^2$ and kernel length scales $\ell$), the integration step $\Delta t$, the error variances $\sigma_\mu$, $\sigma_p$, $\sigma_v$, and the learning rate $\beta$ of the prior.
IV-A Robust multisensory integration
We present three different experiments to study visual and proprioceptive body estimation. The first one, described in Fig. 4, shows the proposed body estimation algorithm while deploying a trajectory similar to the one presented in Fig. 3(a). We analyse the error between the estimated body configuration and the ground-truth joint angles for different sensor contributions. The algorithm is able to correctly estimate the joint angles but presents slow dynamics when large changes occur, due to the static nature of the generative model used. It also shows that with only visual input it is not able to estimate the elbow angle. This happens because the learning trajectory was designed not to provide information about the elbow. However, we can see how combining visual information and two joint sensors reduces the estimation error. This shows the ability of the proposed method to deal with missing information. We have also validated the method against a standard Kalman filter [30] with only the joint angles as input (proprioception), with a given process noise covariance, the same measurement noise as the proposed approach, and a static transition model for a fair comparison (yellow dotted line). As expected, the error and behaviour are practically equivalent to the proprioception-only version of the proposed approach (red dotted line).
In the second experiment, presented in Fig. 5(a), we test the model with nonlinear proprioceptive sensors. The body configuration values plotted are in the sensor space. We initialized the robot body belief with a wrong configuration. During the first 5 seconds, the plot shows how the system converges to the "embodied" configuration, and then the arm starts moving. The estimation reaction time is slightly slower than in the previous experiment. Furthermore, we observe an interesting effect: the joint angles vary over both positive and negative values, but with this sensor function the robot cannot distinguish between positive and negative angles. Thus, when inverting the sign of one joint the robot thinks that it is in the right configuration, but it is not.
In the last experiment, depicted in Fig. 5(b), we study how the model deals with damaged or uncalibrated sensors. After the visual learning stage, we added a drift error to one shoulder proprioceptive sensor. The visual prediction error should correct this anomaly. The plot shows how the system nicely reduces the proprioceptive drift in that shoulder joint. However, it induces a wrong bias on the other shoulder joint: since the visual information, with the current learning, evidences a coupling between the two shoulder joints, a spurious visual correction appears.
IV-B Adaptation with visual, proprioceptive and tactile sensors
We further test the proposed model's adaptation with proprioceptive and visuo-tactile stimulation. Fig. 6(a) describes the body estimation refinement depending on different sensor modalities. Every sensor or feature contributes independently to improve the robot arm localization. In essence, the method provides scalable data association; e.g., the robot can learn more than one visual feature and incorporate them into the prediction-error formulation as additive terms. Besides, the experiment in Fig. 6(b) shows the potential of the proposed method to adapt its body inference to incoherent new situations, as a human would do. We introduced a strong perturbation on the visuo-tactile input, inspired by the rubber-hand illusion experiments in humans [31]. The new visual location induced by synchronous tactile stimulation makes the robot infer the most plausible situation given the sensory information, which in this case is to drift the location of the arm towards the new location. In the first 5 seconds there is no tactile stimulation and the estimation is refined towards the ground truth (black dotted line). Then we inject visuo-tactile stimulation while the other agent pretends to touch another location. When it becomes synchronous, a horizontal drift appears and the inferred body configuration is altered.
IV-C A note on scalability
The learning using Gaussian process regression has a computational complexity of $O(N^3)$ in the number of samples $N$, and the prediction of the sensor forward model depends on the covariance-kernel complexity ($O(N)$ per query once $\alpha$ is precomputed). For $n$ independent sensor contributions and $m$ internal variables, predicting the forward models scales linearly with $n$ and $m$. Finally, the free-energy optimization adds only a constant number of operations per variable and time step using the Euler integration method.
V Discussion: I sense, therefore I am?
We have stressed that robot body estimation can be computed just by means of sensory information. Every sensing modality or feature, when available, contributes to the final body estimation through the prediction error, and the variance of each error describes the precision of each sensor with respect to the body internal variables. For instance, outside the field of view, proprioceptive and tactile sensors define the arm configuration; when the arm appears in the visual field, other features are included in the inference. We have also shown that when the robot has a broken proprioceptive sensor it can rely on visual features to compensate for the lack of information. Finally, we have underscored embodiment by showing how the sensor function influences body estimation. Hence, we have defined adaptive body learning and estimation as providing the most plausible solution according to the information currently available from the sensors. As a side effect, the model has been shown to be prone to visuo-tactile illusions, something that has also been evidenced in humans.
Nevertheless, we have only focused on passive perception and deliberately omitted the generative model of the body dynamics. Moreover, where is the action? We have not considered it in the model, although it is core to interacting with the body. In order to obtain the full construct, which properly reduces the KL-divergence between the robot belief and the posterior probability of the body configuration given the sensors, we need to include the robot dynamics. However, this is a hard task from the learning perspective. The advantage of this approach is that we only need an approximation of the dynamics, because free-energy minimization should resolve the discrepancy. With the full construct we expect to improve prediction accuracy and to incorporate the action into the body estimation framework.
VI Conclusion
We have presented an adaptive robot body learning and estimation algorithm based on predictive processing, able to integrate information from visual, proprioceptive and tactile sensors. The robot independently learns the sensor forward generative functions and then uses them to refine its body estimation through a free-energy minimization scheme.
The model has been tested on a robot with a standard industrial arm to facilitate ground-truth comparison. Results have shown how the model deals with missing and noisy sensory information, reducing the effect of sensor failures. The algorithm has also displayed adaptability to wrong body prior initialization and to unexpected situations. In addition, we have shown how the other's touch can refine robot body estimation, opening interesting questions about improving localization and mapping by means of tactile interaction. Altogether, this reflects the potential of the proposed approach for complex robots, where estimating the body location is a hard task and a requirement for safe interaction.
References
[1] K. Friston, "A theory of cortical responses," Philosophical Transactions of the Royal Society of London B: Biological Sciences, vol. 360, no. 1456, pp. 815–836, 2005.
[2] R. P. Rao and D. H. Ballard, "Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects," Nature Neuroscience, vol. 2, no. 1, pp. 79–87, 1999.
[3] L. Pio-Lopez, A. Nizard, K. Friston, and G. Pezzulo, "Active inference and robot control: a case study," Journal of The Royal Society Interface, vol. 13, no. 122, p. 20160616, 2016.
[4] P. Lanillos, E. Dean-Leon, and G. Cheng, "Yielding self-perception in robots through sensorimotor contingencies," IEEE Trans. on Cognitive and Developmental Systems, no. 99, pp. 1–1, 2016.
[5] P. Lanillos, E. Dean-Leon, and G. Cheng, "Enactive self: a study of engineering perspectives to obtain the sensorimotor self through enaction," in Developmental Learning and Epigenetic Robotics, Joint IEEE Int. Conf. on, 2017.
[6] S. Thrun, W. Burgard, and D. Fox, Probabilistic Robotics. MIT Press, 2005.
[7] K. Friston, "Hierarchical models in the brain," PLoS Computational Biology, vol. 4, no. 11, p. e1000211, 2008.
[8] C. L. Buckley, C. S. Kim, S. McGregor, and A. K. Seth, "The free energy principle for action and perception: A mathematical review," arXiv preprint arXiv:1705.09156, 2017.
[9] A. Ahmadi and J. Tani, "Bridging the gap between probabilistic and deterministic models: a simulation study on a variational Bayes predictive coding recurrent neural network model," in Int. Conf. on Neural Information Processing. Springer, 2017, pp. 760–769.
[10] W. Lotter, G. Kreiman, and D. Cox, "Deep predictive coding networks for video prediction and unsupervised learning," arXiv preprint arXiv:1605.08104, 2016.
[11] C. Angulo and J. M. Acevedo-Valle, "On dynamical systems for sensorimotor contingencies. A first approach from control engineering," in Recent Advances in Artificial Intelligence Research and Development, Proc. Int. Conf. of the Catalan Association for Artificial Intelligence, vol. 300. IOS Press, 2017, p. 46.
[12] Y. Nagai and M. Asada, "Predictive learning of sensorimotor information as a key for cognitive development," in Proc. of the IROS 2015 Workshop on Sensorimotor Contingencies for Robotics, 2015.
[13] R. M. Neal and G. E. Hinton, "A view of the EM algorithm that justifies incremental, sparse, and other variants," in Learning in Graphical Models. Springer, 1998, pp. 355–368.
[14] G. Evensen, "Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics," Journal of Geophysical Research: Oceans, vol. 99, no. C5, pp. 10143–10162, 1994.
[15] R. Bogacz, "A tutorial on the free-energy framework for modelling perception and learning," Journal of Mathematical Psychology, 2015.
[16] S. Vijayakumar, A. D'souza, and S. Schaal, "Incremental online learning in high dimensions," Neural Computation, vol. 17, no. 12, pp. 2602–2634, 2005.
[17] D. Nguyen-Tuong, J. R. Peters, and M. Seeger, "Local Gaussian process regression for real time online model learning," in Advances in Neural Information Processing Systems, 2009, pp. 1193–1200.
[18] B. Damas and J. Santos-Victor, "An online algorithm for simultaneously learning forward and inverse kinematics," in Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ Int. Conf. on, 2012, pp. 1499–1506.
[19] C. Nabeshima, Y. Kuniyoshi, and M. Lungarella, "Adaptive body schema for robotic tool-use," Advanced Robotics, vol. 20, no. 10, pp. 1105–1126, 2006.
[20] E. Wieser and G. Cheng, "Progressive learning of sensory-motor maps through spatiotemporal predictors," in Developmental Learning and Epigenetic Robotics (ICDL-EpiRob), IEEE Int. Conf. on, 2016.
[21] A. Stoytchev, "Self-detection in robots: a method based on detecting temporal contingencies," Robotica, vol. 29, no. 01, pp. 1–21, 2011.
[22] H. Mori and Y. Kuniyoshi, "A human fetus development simulation: Self-organization of behaviors through tactile sensation," in Development and Learning (ICDL), IEEE 9th Int. Conf. on, 2010, pp. 82–87.
[23] G. Schillaci, V. V. Hafner, and B. Lara, "Exploration behaviors, body representations, and simulation processes for the development of cognition in artificial agents," Frontiers in Robotics and AI, vol. 3, p. 39, 2016.
[24] P. Lanillos, E. Dean-Leon, and G. Cheng, "Multisensory object discovery via self-detection and artificial attention," in Developmental Learning and Epigenetic Robotics, Joint IEEE Int. Conf. on, 2016.

[25] C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning). The MIT Press, 2005.
[26] A. McHutchon, "Differentiating Gaussian processes," 2013.
[27] M. Samad, A. J. Chung, and L. Shams, "Perception of body ownership is driven by Bayesian sensory inference," PLoS ONE, vol. 10, no. 2, p. e0117178, 2015.
[28] E. Dean-Leon, B. Pierce, F. Bergner, P. Mittendorfer, K. Ramirez-Amaro, W. Burger, and G. Cheng, "TOMM: Tactile omnidirectional mobile manipulator," in Robotics and Automation (ICRA), IEEE Int. Conf. on, 2017, pp. 2441–2447.
[29] P. Mittendorfer and G. Cheng, "Humanoid multimodal tactile-sensing modules," IEEE Trans. on Robotics, vol. 27, no. 3, pp. 401–410, 2011.
[30] E. Besada-Portas, J. A. Lopez-Orozco, P. Lanillos, and J. M. de la Cruz, "Localization of nonlinearly modeled autonomous mobile robots using out-of-sequence measurements," Sensors, vol. 12, no. 3, pp. 2487–2518, 2012.
[31] N.-A. Hinz, P. Lanillos, H. Mueller, and G. Cheng, "Drifting perceptual patterns suggest prediction errors fusion rather than hypothesis selection: replicating the rubber-hand illusion on a robot," arXiv preprint arXiv:1806.06809, 2018.