The homunculus for proprioception: Toward learning the representation of a humanoid robot's joint space using self-organizing maps

09/05/2019 ∙ by Filipe Gama, et al. ∙ Czech Technical University in Prague

In primate brains, tactile and proprioceptive inputs are relayed to the somatosensory cortex, which is known for its somatotopic representations, or "homunculi". Our research centers on understanding the mechanisms behind the formation of these and higher-level body representations (body schema), using humanoid robots and neural networks to construct models. We specifically focus on how a spatial representation of the body may be learned from somatosensory information in self-touch configurations. In this work, we target the representation of proprioceptive inputs, which we take to be joint angles in the robot. The joint angles collected in different body postures serve as inputs to a Self-Organizing Map (SOM) with a 2D lattice on the output. With unrestricted, all-to-all connections, the map is not capable of representing the input space while preserving the topological relationships, because the intrinsic dimensionality of the body posture space is too large. Hence, we use a method we developed previously for tactile inputs (Hoffmann, Straka et al. 2018) called MRF-SOM, where the Maximum Receptive Field of output neurons is restricted so they only learn to represent specific parts of the input space. This is in line with the receptive fields of neurons in somatosensory areas representing proprioception, which often respond to combinations of a few joints (e.g. wrist and elbow).




I Introduction

Somatosensory inputs comprise tactile and proprioceptive signals and are first processed by the primary somatosensory cortex (SI) in the primate brain, in particular by Brodmann areas 3b and 3a, respectively. The representations are somatotopic, resembling the structure of the human body (hence the term homunculus or “little man”), yet distorted. These map-like representations are a result of genetic predispositions and experience both before and after birth. For the learning of such maps, self-organizing (Kohonen) maps (SOMs) seem a good candidate because of their topology preservation property. In [1] we used tactile stimulation on the body of a humanoid robot and introduced a modification of the SOM that we called MRF-SOM, where the Maximum Receptive Field of output neurons is restricted so they only learn to represent specific parts of the input space. The MRF settings were chosen to mimic genetic predispositions and to ensure that the gross layout of the output layer is similar to that of area 3b. Pugach et al. [2] have also employed a SOM to learn the representation of a tactile surface.

In [3] we focused on the representation of proprioceptive inputs, with the additional simplification that only joint angles were considered (under the assumption that muscle spindles giving muscle length and speed are the “primary proprioceptors”). Population coding was used to encode the robot's arm and head joint angles while it observed its hand in front of its face, and the encoded signals were fed into a standard SOM. A number of limitations of this approach are discussed in [3]. One of them is that the problem seems ill-posed: the intrinsic dimensionality (topology) of the body configuration space is too high to be preserved on a 2-dimensional output sheet.
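As a rough illustration of what population coding means in this context, the sketch below encodes a single joint angle as the activations of several Gaussian-tuned units. The tuning-curve shape, the number of units, the width, the joint range, and the function name `population_code` are all assumptions made for illustration; the scheme used in [3] may differ.

```python
import math

def population_code(angle, lo, hi, n_units=5, sigma=None):
    """Encode one joint angle as activations of n_units Gaussian-tuned units.

    Preferred angles are spaced evenly over [lo, hi]. By default sigma equals
    the spacing between neighbouring preferred angles (an illustrative choice).
    """
    step = (hi - lo) / (n_units - 1)
    if sigma is None:
        sigma = step
    centers = [lo + i * step for i in range(n_units)]
    return [math.exp(-((angle - c) ** 2) / (2 * sigma ** 2)) for c in centers]

# Hypothetical joint with a normalised range of [-1, 1], currently at angle 0.
code = population_code(0.0, -1.0, 1.0, n_units=5)
print([round(a, 3) for a in code])
```

Under such a scheme, every joint contributes `n_units` inputs to the SOM, so 7 joints would expand to 7 × `n_units` input dimensions.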

In this work, we take a different approach. First, we collected data where the humanoid robot touches itself on the face. This serves the bigger goal of developing models that test the hypothesis that self-touch configurations may give rise to representations of the body (and skin surface) in space and support reaching to stimuli on one's own body [4]. Second, population coding of inputs was not used here. Third, the MRF setting was applied.

II Methods

II-A Humanoid robot and training dataset

The training dataset was created by recording the joint angles from 3216 body configurations where the Nao humanoid robot touches its face with its right hand (Fig. 1 left). In every configuration, seven joint angles are recorded: head yaw, head pitch, shoulder roll, shoulder pitch, elbow roll, elbow yaw, and wrist.

Fig. 1: (Left) Nao robot with skin performing self-touch. (Right) Overlapping receptive fields setting for MRF-SOM.

II-B Self-organizing map and MRF-SOM

If the dimension of the SOM output layer does not correspond to the intrinsic dimensionality of the input space, distortions of the mapping are inevitable. In [1], we mapped the artificial skin surface of a humanoid robot (1928 “taxels”) onto a SOM with 7x24 output neurons. The skin space is locally 2D but enclosed in 3D and there are numerous possible mappings onto a 2D output lattice. By using MRF, we steered the learning into a layout resembling area 3b of the cortex.

For proprioception, the number of inputs is smaller (here 7 joints, unless population coding is used [3]), but the intrinsic dimensionality of the body configuration space seems high: a 2D output sheet cannot preserve all the topological relationships in which postures of the body are “similar”. Neural data also show that receptive fields in area 3a typically cover one or two neighbouring joints [5]. Therefore, we decided to employ the MRF-SOM with a 4x4 hexagonal lattice on the output and 4 partially overlapping receptive fields (Fig. 1 right). Distance between neurons is computed using the Manhattan distance, and the weights and initial positions of the neurons are initialised randomly.
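The idea of restricting receptive fields can be sketched as a SOM in which every output neuron carries a binary mask over the 7 joint inputs. This is a minimal sketch in the spirit of MRF-SOM, not the implementation from [1]. The assignment of joints to lattice quadrants (`FIELDS`), the disjoint (non-overlapping) fields, the rectangular grid coordinates standing in for the hexagonal lattice, the normalised masked distance, the learning schedule, and the synthetic random postures are all assumptions chosen only so that the example runs.

```python
import math
import random

JOINTS = ["head_yaw", "head_pitch", "shoulder_roll", "shoulder_pitch",
          "elbow_roll", "elbow_yaw", "wrist"]
ROWS, COLS = 4, 4

# Hypothetical receptive fields: one joint group per 2x2 quadrant of the lattice.
# The paper uses partially overlapping fields; overlap is omitted here for brevity.
FIELDS = {(0, 0): [0, 1], (0, 1): [2, 3], (1, 0): [4, 5], (1, 1): [6]}

def make_masks():
    """Binary mask per output neuron; 1 marks an input inside its field."""
    masks = {}
    for r in range(ROWS):
        for c in range(COLS):
            field = FIELDS[(r // 2, c // 2)]
            masks[(r, c)] = [1 if j in field else 0 for j in range(len(JOINTS))]
    return masks

def masked_dist(x, w, m):
    """Mean squared difference over the inputs inside the receptive field."""
    return sum(mi * (xi - wi) ** 2 for xi, wi, mi in zip(x, w, m)) / sum(m)

def train(data, masks, epochs=20, lr0=0.5, radius0=2.0, seed=0):
    rng = random.Random(seed)
    # Random initial weights inside the field, zero outside it.
    weights = {n: [rng.uniform(-1, 1) * mi for mi in m] for n, m in masks.items()}
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # linear decay of rate and radius
        lr, radius = lr0 * frac, max(radius0 * frac, 0.5)
        for x in data:
            winner = min(weights, key=lambda n: masked_dist(x, weights[n], masks[n]))
            for n, w in weights.items():
                d = abs(n[0] - winner[0]) + abs(n[1] - winner[1])  # Manhattan
                h = math.exp(-(d ** 2) / (2 * radius ** 2))
                for j, mj in enumerate(masks[n]):
                    if mj:                   # weights outside the field never change
                        w[j] += lr * h * (x[j] - w[j])
    return weights

# Synthetic postures in a normalised range; NOT the robot dataset.
rng = random.Random(1)
data = [[rng.uniform(-1, 1) for _ in JOINTS] for _ in range(200)]
masks = make_masks()
weights = train(data, masks)
print(weights[(0, 0)])
```

Because the update is skipped for masked inputs, each neuron can only ever represent the joints in its own field, which is what steers the layout of the map.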

III Preliminary Results

The network is evaluated by its ability to encode the input joints at two levels: individual activation, where a single neuron encodes one specific joint or a combination of two or three joints; and topological or pattern encoding, where the activity of a group of neurons encodes a specific input joint or combination.

Results after learning are shown in Fig. 2. The heatmaps show the value of the weight between each neuron of the output layer and a corresponding joint. For each cluster (except the wrist, as there is only one joint), each neuron in the cluster prefers one or the other joint, making it possible to differentiate the encoded joints from the state of the cluster. However, despite preferring one joint, most neurons actually encode a combination of the two joints, with high weight values for both. The wrist is also learned by two neurons in its cluster, while another neuron treats it as inhibitory (negative weight). Concerning the neurons with overlapping joints from different body parts, it seems that all joints are learned equally: the neuron does not show any preference for a specific joint or body part, but only encodes a combination of all the joints in its receptive field.

Fig. 2: Weight values between input joints and the output layer

Another visualization of the network (not shown) displays the distance between neurons. The distance between the clusters coding the shoulder and elbow joints is half of the distance from either of them to the head or wrist clusters. The wrist cluster is separated from the other joints, while neurons inside the head cluster are extremely close to each other. This indicates that elbow and shoulder joints are more closely related and dependent in the dataset than other joints, while the wrist joint is more independent and the head joints are highly correlated with each other.
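The text does not specify how the distance between neurons is computed. One common choice for SOMs is a U-matrix, which stores for each neuron the mean distance between its weight vector and those of its lattice neighbours. The sketch below assumes this reading, uses a rectangular grid with 4 neighbours instead of the hexagonal lattice, and runs on toy weights that are not results from the paper; the name `u_matrix` is hypothetical.

```python
import math

def u_matrix(weights, rows, cols):
    """Mean Euclidean distance from each neuron's weights to its 4-neighbours."""
    u = {}
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nb = (r + dr, c + dc)
                if nb in weights:            # skip neighbours outside the lattice
                    dists.append(math.dist(weights[(r, c)], weights[nb]))
            u[(r, c)] = sum(dists) / len(dists)
    return u

# Toy 2x2 map: the two rows hold tight pairs of neurons far from each other.
toy = {(0, 0): [0.0, 0.0], (0, 1): [0.1, 0.0],
       (1, 0): [1.0, 1.0], (1, 1): [1.1, 1.0]}
u = u_matrix(toy, 2, 2)
print({n: round(v, 3) for n, v in u.items()})
```

Small values mark neurons whose weights are close to their neighbours (as reported for the head cluster), while large values mark boundaries between clusters.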

IV Conclusion

Following up on our previous work [3, 1], we continue to investigate how a robot may learn a representation of its joint space resembling the representations of proprioception found in the somatosensory cortex. A variant of the SOM, the MRF-SOM, can be a useful tool to steer the learning in the desired direction. However, the results presented are highly preliminary. Even if MRF-SOM can be used to form receptive fields somewhat resembling those in area 3a, it is still not clear what information about the joints is encoded in these areas (see the position-scaled vs. posture-selective neuron types reported in [6]) and how it can contribute to the formation of spatial body representations discussed in [4].


  • [1] M. Hoffmann, Z. Straka, I. Farkas, M. Vavrecka, and G. Metta, “Robotic homunculus: Learning of artificial skin representation in a humanoid robot motivated by primary somatosensory cortex,” IEEE Transactions on Cognitive and Developmental Systems, vol. 10, no. 2, pp. 163–176, June 2018.
  • [2] G. Pugach, A. Pitti, and P. Gaussier, “Neural learning of the topographic tactile sensory information of an artificial skin through a self-organizing map,” Advanced Robotics, pp. 1–17, 2015.
  • [3] M. Hoffmann and N. Bednarova, “The encoding of proprioceptive inputs in the brain: knowns and unknowns from a robotic perspective,” in Kognice a umělý život XVI [Cognition and Artificial Life XVI], M. Vavrecka, O. Becev, M. Hoffmann, and K. Stepanova, Eds., 2016, pp. 55–66.
  • [4] M. Hoffmann, L. K. Chinn, E. Somogyi, T. Heed, J. Fagard, J. J. Lockman, and J. K. O’Regan, “Development of reaching to the body in early infancy: From experiments to robotic models,” in Joint IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob), 2017, pp. 112–119.
  • [5] L. Krubitzer, K. J. Huffman, E. Disbrow, and G. Recanzone, “Organization of area 3a in macaque monkeys: contributions to the cortical phenotype,” Journal of Comparative Neurology, vol. 471, no. 1, pp. 97–111, 2004.
  • [6] S. S. Kim, M. Gomez-Ramirez, P. H. Thakur, and S. S. Hsiao, “Multimodal interactions between proprioceptive and cutaneous signals in primary somatosensory cortex,” Neuron, vol. 86, no. 2, pp. 555–566, 2015.