The goal of the RoboCup Humanoid League is to develop a team of humanoid robots that can compete against the human World Soccer Champion in 2050. In recent years, many rule changes were introduced to the league to bring the level of complexity closer to human soccer. In the RoboCup 2018 competitions, drop-in games were introduced to the AdultSize class, in which two teams of two robots each competed against each other, and several teams performed very well.
For RoboCup 2018, we used two open-source 3D-printed robots and an upgraded version of one of our classic robots. Each of our 3D-printed robots is equipped with a fast onboard computer and a GPU to perform parallel computations. We extended our open-source software with a deep-learning-based perception system and gait parameter optimization. All of the AdultSize robots are shown in Fig. 1, along with the human members of our team NimbRo.
2 Robot Hardware
One of the main contributions to our team’s performance at RoboCup 2018 was the hardware capabilities of our design. At the competition in Montréal, we participated with three robots: Copedo, NimbRo-OP2, and NimbRo-OP2X (see Fig. 1). Copedo has a light weight of , and spring-loaded legs with parallel kinematics make it a dynamically capable robot, which we utilize in, e.g., the jumping technical challenge.
In contrast to the aluminum and carbon-based build of Copedo, the structure of our newest robot, the NimbRo-OP2X, is completely 3D-printed and is a substantial upgrade to the NimbRo-OP2. The core design principles that made the NimbRo-OP2 a reliable and capable platform remained the same. Both robots share the same kinematic structure, external gearing for increased torque, multiple master-slave actuation pairs, and minimal complexity in assembly, diagnostics, and maintenance. Although the two robots look similar, the NimbRo-OP2X is a complete redesign that introduces multiple upgrades over the NimbRo-OP2. The main component of the redesign was a new type of actuator, the Robotis Dynamixel XM-540, which has a heat-dissipating metal casing and outputs more torque than the previously used MX-106. This design choice led to the implementation of other features. With a single knee housing eight actuators, a substantial amount of heat is produced during operation. To reduce the risk of overheating and thermal malfunction, we installed cooling fans, which helped to reduce the temperature in the knee by approximately . We also reduced the weight of the 3D-printed parts by making them slightly narrower and rounder, and added dedicated cable pathways, all of which contributed to increased rigidity. The external gearing necessary to exert enough torque in the ankle and hip roll joints was a bottleneck in the production process of the NimbRo-OP2. We mitigated this issue by designing low-friction and low-backlash double helical gears, which can be quickly 3D-printed. The SLS (Selective Laser Sintering) printing technology was essential to the robustness of our robots: no part ever broke, even after several collisions with robots that were twice as heavy and had a metal exoskeleton with sharp edges. The features mentioned above, and a comparison of them between the NimbRo-OP2 and NimbRo-OP2X, can be observed in Fig. 2.
3 Software Design
Our open-source software, based on the ROS middleware, has become a well-established framework in the research and RoboCup community since its initial release. Many soccer teams have used our code and ideas in RoboCup. We continue to develop the repository, with the hope that other research groups can benefit from it.
3.1 Visual Perception
Each of our robots perceives the environment using a Logitech C905 camera, which is equipped with a wide-angle lens. We supersede our previous approach to vision by utilizing a deep convolutional neural network followed by post-processing. The presented perception system can work with different brightnesses, viewing angles, and even lens distortions. Using a recurrent deep neural network, we are also able to track and identify our robots.
Due to computational limitations, we used a decoder that is shorter than the encoder. Although this design choice minimizes the number of parameters and helps us achieve real-time perception, some fine-grained spatial information is lost. We alleviate this loss by using a subpixel centroid-finding method in the post-processing steps. To minimize the effort of data annotation, we used transfer learning in the encoder by utilizing a pre-trained ResNet-18 model. Since our task differs from classification, we removed the global average pooling (GAP) and fully connected layers of the ResNet-18 model. In the decoder, we used four transposed-convolutional layers. We followed the U-Net model and added lateral connections between the encoder and decoder to preserve spatial information in the decoder. The proposed visual perception architecture, which has 23 convolutional layers in total, is illustrated in Fig. 3.
The network detects the following object classes: goal posts, ball, and robots. For our soccer behavior, we only need to perceive predefined center locations of the objects of interest. Similar to SweatyNet, we used a mean squared error loss instead of a full segmentation loss. The desired output consists of Gaussian blobs around the ball center and around the bottom-middle points of the goal posts and robots.
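The target construction described above can be sketched as follows. This is a minimal illustration, not the released training code: the function names, the blob width `sigma`, and the use of a per-pixel maximum for overlapping blobs are assumptions made for the example.

```python
import math

def gaussian_target(height, width, centers, sigma=2.0):
    """Build one target channel with a Gaussian blob at each (x, y) center.

    sigma (in pixels) is an assumed value, not taken from the paper.
    """
    target = [[0.0] * width for _ in range(height)]
    for cx, cy in centers:
        for y in range(height):
            for x in range(width):
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                value = math.exp(-d2 / (2.0 * sigma ** 2))
                # Keep the maximum so that overlapping blobs never exceed 1.
                target[y][x] = max(target[y][x], value)
    return target

def mse(prediction, target):
    """Mean squared error between two equally sized 2D maps."""
    total = 0.0
    count = 0
    for row_p, row_t in zip(prediction, target):
        for p, t in zip(row_p, row_t):
            total += (p - t) ** 2
            count += 1
    return total / count
```

One such channel would be built per object class (ball, goal posts, robots), and the loss is then the mean squared error between the network output and these channels.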
Although we use the Adam optimizer, which adapts the step size per parameter, finding a good learning rate is still a challenging prerequisite to training. To find a suitable learning rate, we followed the approach presented by Smith.
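A learning-rate range test of the kind associated with this line of work can be sketched as below. This is an illustrative toy using plain gradient descent on a scalar parameter, not our training setup; the sweep bounds, the divergence threshold, and the "steepest loss drop" selection heuristic are all assumptions made for the example.

```python
import math

def lr_range_test(loss_and_grad, w0, lr_min=1e-4, lr_max=10.0, steps=50):
    """Sweep the learning rate exponentially and record (lr, loss) per step."""
    w = w0
    history = []
    for i in range(steps):
        lr = lr_min * (lr_max / lr_min) ** (i / (steps - 1))
        loss, grad = loss_and_grad(w)
        history.append((lr, loss))
        # Stop once the loss blows up relative to the starting loss.
        if not math.isfinite(loss) or loss > 4.0 * history[0][1]:
            break
        w = w - lr * grad
    return history

def suggest_lr(history):
    """Return the rate whose update produced the largest loss decrease."""
    best_lr, best_drop = history[0][0], 0.0
    for (lr_a, loss_a), (_, loss_b) in zip(history, history[1:]):
        drop = loss_a - loss_b
        if drop > best_drop:
            best_lr, best_drop = lr_a, drop
    return best_lr
```

In practice, the recorded curve is usually inspected by eye, and a rate somewhat below the point of divergence is chosen.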
We used progressive image resizing, which starts training on small images and gradually increases their size as training progresses, a technique inspired by Brock et al. and by Yosinski et al. In early iterations, the inaccurate, randomly initialized model can make rapid progress by learning from large batches of small images. In the first 50 epochs, we used downsampled training images while freezing the weights of the encoder. During the next 50 epochs, all parts of the model are trained jointly. In the last 50 epochs, full-sized images are used to learn fine-grained details. With the intuition that the pre-trained model needs less training, a lower learning rate is used for the encoder. Using these methods, the whole training process with around 3000 samples takes less than 40 minutes on a single Titan Black GPU with 6 GB memory. Two samples from the test set are depicted in Fig. 4. Part of the dataset was taken from the ImageTagger library, which has annotated samples from different angles, cameras, and brightness levels.

We extract the object coordinates by post-processing the blob-shaped network outputs. We apply morphological erosion and dilation to eliminate negligible responses on the thresholded output channels. Finally, we compute the object center coordinates. The output of the network has a lower resolution and less spatial information than the input image. To account for this effect, we calculate sub-pixel coordinates based on the center of mass of each detected contour. To find the contours, we use connected component analysis on each of the output channels.
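The post-processing steps can be illustrated with the following minimal sketch. It is not our implementation: it uses a simple 4-connected flood fill in place of contour-based connected component analysis, and it drops components below an assumed `min_pixels` size as a stand-in for morphological erosion and dilation. The threshold value is likewise an assumption.

```python
from collections import deque

def blob_centers(heatmap, threshold=0.5, min_pixels=2):
    """Return sub-pixel (x, y) centers of mass of thresholded blobs."""
    height, width = len(heatmap), len(heatmap[0])
    visited = [[False] * width for _ in range(height)]
    centers = []
    for y0 in range(height):
        for x0 in range(width):
            if visited[y0][x0] or heatmap[y0][x0] < threshold:
                continue
            # Flood-fill one 4-connected component of above-threshold pixels.
            queue = deque([(x0, y0)])
            visited[y0][x0] = True
            pixels = []
            while queue:
                x, y = queue.popleft()
                pixels.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not visited[ny][nx]
                            and heatmap[ny][nx] >= threshold):
                        visited[ny][nx] = True
                        queue.append((nx, ny))
            # Discard tiny responses (stand-in for erosion and dilation).
            if len(pixels) < min_pixels:
                continue
            # Intensity-weighted center of mass gives sub-pixel coordinates.
            mass = sum(heatmap[y][x] for x, y in pixels)
            cx = sum(x * heatmap[y][x] for x, y in pixels) / mass
            cy = sum(y * heatmap[y][x] for x, y in pixels) / mass
            centers.append((cx, cy))
    return centers
```

The resulting coordinates are in output-map pixels and would still need to be scaled by the ratio between input and output resolution.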
We filter detected objects and project each object location into egocentric world coordinates. To minimize projection errors due to differences between the designed model and the real hardware, we calibrate the camera extrinsic parameters using the Nelder-Mead simplex method.
In the competition, the robots were able to perceive the AdultSize ball up to a distance of with an accuracy of 99% and a false detection rate of less than 1%. White goal posts are detected up to with 98% accuracy and 3% false detections. Opponent robots are detected up to with a success rate of 90% and a false detection rate of 8%. We are still using non-deep-learning approaches for field and line detection. In the future, we will add two more channels to the network output and use a single unified network for all detections. The complete perception pipeline, including a forward pass of the network, takes approximately 20 ms on the robot hardware.
3.1.1 Localization and Breaking the Symmetry:
Our localization method relies on having a source of global yaw rotation of the robot. Instead of a compass, we use integrated gyroscope measurements as the source of yaw orientation. Gyroscope integration is a reliable means of orientation tracking, but it needs a global reference. To set the initial heading, we could use either manual initialization or automatic initial orientation estimation. Manual heading initialization can fail during a match, since restarting the operating system of the robot is sometimes unavoidable and forces a reinitialization of the heading. Hence, we reformulated the global heading initialization as a classification task. There are four predefined distinct positions and orientations that the robot can start in or enter the game from. In two of these spots, the robot starts facing the opponent goal, located either near the center circle or near the goal area. The other two locations are beside the sidelines in the robot’s own half, facing the field. To choose from these predefined locations and orientations, we employ a multi-hypothesis version of our localization module, which is initialized with four different hypotheses. In the beginning, the robot attempts to discern the most likely hypothesis among all running instances. This process terminates when either the method times out or the robot finds a clearly most probable hypothesis. Ultimately, the vision module keeps the valid instance and rejects the rest. To verify the decision, we double-check the result against recognized landmarks such as the center circle and the goalposts.
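The hypothesis selection logic can be sketched as below. This is a simplified illustration under stated assumptions: the pose names, the per-cycle likelihood dictionaries, the dominance ratio used to decide that one hypothesis is "clearly most probable", and the cycle-based timeout are all hypothetical and chosen for the example.

```python
# Hypothetical labels for the four predefined start poses.
START_POSES = ("center_circle", "goal_area", "left_sideline", "right_sideline")

def select_start_pose(likelihood_stream, dominance=2.0, max_cycles=50):
    """Pick a start pose from per-cycle hypothesis likelihoods.

    likelihood_stream yields one dict {pose_name: likelihood} per cycle.
    Returns (best_pose, confident). confident is False if the search
    timed out before one hypothesis clearly dominated the others.
    """
    best = None
    for cycle, likelihoods in enumerate(likelihood_stream, start=1):
        ranked = sorted(likelihoods.items(), key=lambda item: item[1], reverse=True)
        (best, top), (_, runner_up) = ranked[0], ranked[1]
        # Accept once the best hypothesis clearly beats the second best.
        if top >= dominance * runner_up:
            return best, True
        if cycle >= max_cycles:
            break
    return best, False
```

When the result is not confident, a landmark-based check of the kind mentioned above becomes especially important before committing to a heading.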
3.2 Soccer Behaviors
Over the past 2–3 years, we have refined our soccer behaviors to become more robust, flexible, and easier to tune [22, 14]. The behaviors are implemented as a highly modularized multi-layer hierarchical state machine and packaged into a ROS module that communicates with other parts of the software, such as the vision node and the gait motion module, via ROS topics. In this paper, we describe the current state of this architecture, which was originally described in earlier work.
The flow of information and control starts with the ROS topics to which the behavior node subscribes, covering predominantly the game state perception, localization, and game controller information coming from other nodes. This is captured and read by a ROS interface layer, which abstracts away all ROS-specific knowledge and code. The information is then distilled into a standardized SensorVars structure, which is updated at the beginning of each cycle with the latest direct and derived information about the state of the robot and the soccer game. These so-called sensor variables are then used by the upper main layer of the state machine, referred to as the ‘Game FSM’. This layer includes a range of behaviors that determine the soccer gameplay, including ball handling, goalie, and positioning skills, which are required at different times of the game. The game behaviors provide a standardized set of outputs that specify parameters such as walking targets, ball targets (where to kick or dribble to), and whether kicking and/or dribbling should be allowed in the current situation. These outputs are in turn the inputs to the lower main layer of the state machine, referred to as the ‘Behavior FSM’. In this layer, low-level skills are implemented, such as searching for the ball, walking to the ball, kicking and/or dribbling it, and diving for the ball (enabled only for goalkeepers). The Behavior FSM, in turn, provides a standardized set of outputs that determine where the robot should look, whether the robot should walk and, if so, with which velocity and in what direction, as well as whether the robot should dive or kick and, if so, in which direction to dive or which type of kick to use. This information is then passed back to the ROS interface layer, which ensures that the other nodes are notified of the required actions of the robot.
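The layering can be illustrated with a small sketch. Only the names SensorVars, Game FSM, and Behavior FSM come from the description above; all fields, thresholds, and action labels are hypothetical simplifications, and the ROS interface layer is omitted entirely.

```python
from dataclasses import dataclass

@dataclass
class SensorVars:
    """Per-cycle snapshot of robot and game state (fields are illustrative)."""
    ball_visible: bool
    ball_distance: float  # metres, egocentric
    is_goalie: bool

@dataclass
class GameOutputs:
    """Standardized outputs of the upper layer (illustrative subset)."""
    approach_ball: bool
    allow_kick: bool
    allow_dribble: bool

def game_fsm(sv: SensorVars) -> GameOutputs:
    # Upper layer: decides gameplay intent, not how it is executed.
    if sv.is_goalie:
        # Assumed rule: the goalie only leaves its position for nearby balls.
        return GameOutputs(sv.ball_distance < 1.5, True, False)
    return GameOutputs(True, True, True)

def behavior_fsm(sv: SensorVars, game: GameOutputs) -> str:
    # Lower layer: picks a low-level skill given the game-level intent.
    if not sv.ball_visible:
        return "search"
    if not game.approach_ball:
        return "stand"
    if sv.ball_distance > 0.3:
        return "walk"
    if game.allow_kick:
        return "kick"
    return "dribble" if game.allow_dribble else "stand"

def step(sv: SensorVars) -> str:
    """One behavior cycle: sensor variables -> Game FSM -> Behavior FSM."""
    return behavior_fsm(sv, game_fsm(sv))
```

Because each layer only sees the standardized outputs of the layer above, game-level strategies can be changed without touching the low-level skills.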
3.2.1 Ball Approach:
Walking to the ball, or more specifically, behind the ball while orienting to the correct direction for the ball target, is a Behavior FSM-level skill. It is performed by calculating an orientation-specific halo around the ball and constructing a path plan out of linear and circular arc segments that avoids entering the halo. Further away from the ball, the priority is to turn and walk directly in the direction that the robot needs to go, as forward walking is the fastest and most reliable, but as the robot approaches the ball, it smoothly transitions towards using more omnidirectional walking to approach the desired final position, while also starting to turn to face the direction that the robot wishes to kick or dribble the ball. The ball is aligned with the foot that is closest to the required position for the required action.
3.2.2 Kicking and Dribbling:
If during the ball approach the ball is detected to be in a suitable region relative to the robot for a suitable amount of time, the kicking and/or dribbling skill behavior is activated. Kicking can only be activated when the robot is standing close to the ball in a suitable position and orientation to kick, but dribbling can sometimes activate up to away from the ball, so that the robot can follow a dribble approach trajectory and walk right through the ball at speed, leading to smoother, faster and more effective dribbling performance.
3.2.3 Obstacle Avoidance:
It was a greatly simplifying design choice to implement obstacle avoidance in a completely generic manner, independent of which behavior skill is currently active. The output gait velocity of the Behavior FSM is a combination of a 2D walking vector and a rotational velocity. In the presence of an obstacle within a relevant distance of the robot, the walking vector is rotated away from the obstacle in a way that limits the maximum radial inward walking velocity towards the obstacle. Far away from the obstacle, the radial velocity limit is high, so there is little change to the robot’s walking intent, but very close to the obstacle the limit even becomes negative to ensure that the robot distances itself from the obstacle. A proportional turning component is also added to the commanded rotational velocity to make the robot turn away from the obstacle, which helps it, for example, to walk past an obstacle that is blocking the way.
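The rotation of the walking vector can be sketched as follows. This is a geometric illustration only: the piecewise-linear `radial_limit` profile and all of its constants are assumptions, the additional turning component is omitted, and a zero walking command is left unchanged since a pure rotation cannot create motion.

```python
import math

def radial_limit(distance, near=0.4, far=1.2, v_max=0.6, v_back=0.2):
    """Allowed velocity towards the obstacle (assumed linear profile).

    Negative when very close (the robot must move away), v_max when far.
    """
    t = min(max((distance - near) / (far - near), 0.0), 1.0)
    return -v_back + t * (v_max + v_back)

def avoid_obstacle(walk, obstacle):
    """Rotate the 2D walking vector so its radial inward part obeys the limit.

    walk and obstacle are (x, y) tuples in the robot's egocentric frame.
    """
    vx, vy = walk
    ox, oy = obstacle
    distance = math.hypot(ox, oy)
    speed = math.hypot(vx, vy)
    if distance == 0.0 or speed == 0.0:
        return walk
    ux, uy = ox / distance, oy / distance  # unit vector towards the obstacle
    radial = vx * ux + vy * uy             # inward velocity component
    limit = radial_limit(distance)
    if radial <= limit:
        return walk
    # Angle from the obstacle direction at which the radial part equals the limit.
    angle = math.acos(max(-1.0, min(1.0, limit / speed)))
    # Rotate towards the side the walking vector already leans to.
    side = 1.0 if ux * vy - uy * vx >= 0.0 else -1.0
    heading = math.atan2(uy, ux) + side * angle
    return (speed * math.cos(heading), speed * math.sin(heading))
```

The magnitude of the walking vector is preserved; only its direction changes, which matches the description of rotating rather than scaling the command.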
3.2.4 Obstacle Ball Handling:
The obstacle ball handling was similarly implemented in a completely generic way, but one layer higher in the Game FSM. Given the situation that there is a ball and a ball target, i.e. where the ball should be kicked or dribbled to, then if there is an obstacle that is blocking this possibility, the ball-target is rotated out to avoid the obstacle, more so for closer and more relevant obstacles, and less so for further out obstacles. This enables the robot to identify and kick past a goalkeeper to score a goal. If the obstacle is too close to the robot, or the ball-target has to be rotated more than the amount for example by which a goal can still be safely scored, then kicking is disabled and dribbling is forced to try to take the ball off the opponent, which ideally makes space to then kick the ball towards its intended target.
3.3 Bayesian Gait Optimization
The gait is based on an open-loop Central Pattern Generator, which calculates a nominal state for the joints using the gait phase angle. The phase angle advances at a rate proportional to the step frequency and controls the movement of the arms and legs. This approach has been improved by the use of fused angle feedback mechanisms, which introduce corrective actions to counteract disturbances [2, 3]. These fused angle feedback controllers introduce new parameters, which need to be tuned. To ensure a high standard of performance, robot-specific parameters have to be tuned for each robot. Moreover, since the robot wears with extensive use, the parameters become suboptimal over time, for instance over the course of a RoboCup competition.
As walking is one of the most crucial skills of a humanoid robot, it has to be robust and reliable at all times. To achieve this goal, we optimize the parametrization of the aforementioned fused angle feedback controller autonomously. Using Bayesian optimization, we rely not only on real-world experiments but also on simulated experiments to gain useful information without wearing out the hardware of the robot. This approach has already been successfully applied to the igus® Humanoid Open Platform and the NimbRo-OP2X.
Our approach is able to optimize the parameter set in a sample-efficient manner, trading off exploration and exploitation efficiently. This trade-off depends on a kernel function and the parametrization of the underlying Gaussian Process (GP). The latter encodes problem-specific values like signal noise and can be measured by a series of initial experiments. The proposed kernel, on the other hand, is composed of two components, where the first term encodes simulation performance and the second term functions as an error term resembling the difference between simulation and real-world performance:
$$k(a_i, a_j) = k_{sim}(x_i, x_j) + k_{\delta}(\delta_i, \delta_j)\, k_{\epsilon}(x_i, x_j),$$

where $a_i = (x_i, \delta_i)$ is an augmented parameter vector and $\delta_i$ is a flag signalizing whether an evaluation has been performed in the simulator or on the real system. If, and only if, both experiments have been performed in the real world, $k_{\delta}$ is defined to be 1, resulting in a high correlation. Due to the error term $k_{\epsilon}$, it is possible to model complex, non-linear mappings between the simulator and real-world evaluations. For both terms of the composite kernel, we chose the Rational Quadratic kernel, since it has proven to be appropriate in previous work. This composite kernel is then used to perform Gaussian Process regression on the data points.
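A composite kernel of this kind can be sketched as follows. The sketch assumes the common structure in which a simulation kernel is always active and an error kernel is switched on by the product of the two real-world flags (1 for a real evaluation, 0 for a simulated one). The function names and all hyperparameter values are assumptions for illustration, not the tuned values used on the robot.

```python
def rational_quadratic(x, y, variance=1.0, length_scale=1.0, alpha=1.0):
    """Rational Quadratic kernel between two parameter vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return variance * (1.0 + sq_dist / (2.0 * alpha * length_scale ** 2)) ** (-alpha)

def composite_kernel(a, b):
    """Kernel over augmented points a = (x, delta), b = (y, delta').

    delta is 1 for a real-world evaluation and 0 for a simulated one.
    """
    (x, delta_x), (y, delta_y) = a, b
    # Term shared by simulated and real evaluations.
    k_sim = rational_quadratic(x, y, variance=1.0, length_scale=0.5)
    # Error term, active only when both evaluations are from the real robot.
    k_err = rational_quadratic(x, y, variance=0.2, length_scale=0.5)
    return k_sim + (delta_x * delta_y) * k_err
```

With this structure, simulated evaluations inform the prediction for the real robot through the shared term, while the error term absorbs the systematic difference between simulation and reality.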
Since real-world experiments are expensive, we utilize entropy as a measure of information content to sample data points efficiently. In this manner, the next point of evaluation is chosen with respect to the maximal change of entropy, weighted by a factor that trades off the cost of simulated and real-world evaluations.
The cost function is a combination of aggregated fused angle feedback, as a stability measure, and a logistic function which penalizes parameters of large magnitude. Furthermore, we consider the sagittal () and lateral () planes separately to reduce the complexity of the cost function. This results in the final cost functions:
which depend on the parameters of the fused angle feedback controller. To reduce the impact of simulation noise, we average the cost of evaluations. Each evaluation is a predefined sequence of movements in the forward, sideways, and backward directions. In the presented example, we optimize the P and D gains of the arm angle corrective actions in the sagittal direction, but the method can similarly be applied to other controllers. We limit the number of real-world evaluations to . This limit was reached after evaluating simulations, thus resulting in a total number of iterations. The resulting optimized parameters were validated by comparison with the performance of the old gait parameters over five gait sequence evaluations each. The optimized parameters not only reduce the fused angle feedback deviation by about , but also lead to a qualitatively more convincing gait.
The resulting Gaussian Process posterior is depicted in Fig. 5. Note that simulations are especially important in early iterations, even though their impact might not be directly visible in the final posterior. This is supported by the fact that the robot did not fall during optimization, which indicates that the model is able to utilize information from the simulator effectively.
In RoboCup 2018, AdultSize robots autonomously competed in one vs. one soccer games, two vs. two drop-in games, and four technical challenges that tested different abilities. The soccer games were played on an artificial grass field, which made locomotion challenging. Due to the dynamic lighting conditions, perceiving the environment and localization were also challenging. Our robots performed outstandingly, winning all four possible awards, including the Best Humanoid Award. In the main tournament, our robots played a total of six games, including the quarter-finals, semi-finals, and finals. Five additional drop-in games were played, in which two vs. two mixed teams were formed and robots collaborated during the game. Our robots officially played 220 minutes with a total score of 66:5.
4.1 Technical Challenges
In the following sections, we discuss four technical challenges at RoboCup 2018: Push Recovery, High Jump, High Kick, and Goal Kick from Moving Ball.
4.1.1 Push Recovery:
The goal of this challenge is to withstand a strong push, which is applied to the robot at the height of the CoM by a pendulum. To define the impulse, a 3 kg weight is retracted by a distance d from the point of contact with the robot. The push is applied both from the front and from the back while the robot is walking on the spot. NimbRo-OP2X was able to successfully withstand a push from the front and from the back with d = 90 cm.
4.1.2 High Jump:
The goal of the high jump is to remain airborne as long as possible during an upward jump. In order to successfully complete the challenge, the robot has to reach a stable standing or sitting posture upon landing. The challenge was performed using a predesigned jump motion, which was constructed with our keyframe editor. Copedo has successfully completed the challenge, remaining airborne for 0.147 s.
4.1.3 High Kick:
This challenge poses the task of scoring a goal over an obstacle positioned on the goal line. The ranking for this challenge is based on the height of the kick. The ball starts at the penalty mark, and multiple kicks are allowed during one trial. We used the following strategy: first move the ball closer to the obstacle with a kick of reduced power, and then perform a specially designed kick to clear the obstacle. The kick was manually designed such that the foot hits the ball significantly below its CoM and then moves upwards, which lifts the ball into the air instead of rolling it along the ground. We managed to perform a high kick over an obstacle of 21.5 cm. The whole trial took 14.4 s. NimbRo-OP2 performing the challenge is shown in Fig. 6.
4.1.4 Goal Kick from Moving Ball:
The task of this challenge is to score a goal by kicking a moving ball into the goal. The robot stands at the penalty mark. At RoboCup 2017, a special ramp was used to direct the ball towards the robot. In contrast, at RoboCup 2018 a human player passed the ball to the robot from a corner, resembling a situation from a real soccer game. Our approach to this task was as follows: once positioned at the penalty mark, the robot lifts one foot to be ready for kicking and stands on the other foot; a human player then kicks the ball towards the robot; using ball detection and pose estimation, we estimate the velocity of the ball and its approximate time of arrival in the area of a potentially successful kick; given this time, we execute the kicking motion at the appropriate moment. Since the robot is initially standing on one foot, with the other lifted upwards, the kick can be performed quickly, which allows for a higher pass speed and, hence, faster scoring of the goal, which was the primary criterion in the team rankings. Standing on one foot, which many other teams also do during this challenge, has two major drawbacks: the robot is not stable in that posture, and it cannot adjust if the pass is not accurate enough. In the future, we will work on a more general approach to this challenge. NimbRo-OP2X was able to score a goal 2.78 s after a human player touched the ball (see Fig. 7).
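The timing logic can be sketched as follows. This is a simplified illustration, not the code used at the competition: it assumes a constant ball velocity estimated from the first and last observation, a fixed hypothetical kick point in the robot frame, and a known kick motion duration. All names and values are assumptions.

```python
def estimate_velocity(track):
    """Estimate (vx, vy) from timestamped ball positions.

    track is a list of (t, x, y) observations in the robot frame with at
    least two entries and strictly increasing timestamps.
    """
    (t0, x0, y0), (t1, x1, y1) = track[0], track[-1]
    dt = t1 - t0
    return ((x1 - x0) / dt, (y1 - y0) / dt)

def time_of_arrival(track, kick_point):
    """Time until the ball is closest to kick_point, or None if receding."""
    vx, vy = estimate_velocity(track)
    _, x, y = track[-1]
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        return None
    # Project the remaining offset onto the direction of travel.
    eta = ((kick_point[0] - x) * vx + (kick_point[1] - y) * vy) / speed_sq
    return eta if eta >= 0.0 else None

def should_kick(track, kick_point, kick_duration):
    """Trigger the kick once the ball arrives within one kick duration."""
    eta = time_of_arrival(track, kick_point)
    return eta is not None and eta <= kick_duration
```

A more robust variant would fit the velocity over all observations and account for the ball slowing down on artificial grass.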
The recorded parameters describing our performance at technical challenges are summarized in Table 1.
| Challenge | Parameter | Value |
|---|---|---|
| Push Recovery | Pendulum weight [kg] | 3 |
| Push Recovery | Pendulum swing [cm] | 90 |
| High Kick | Obstacle height [cm] | 21.5 |
| High Kick | Time for completion [s] | 14.4 |
| High Jump | Time airborne [s] | 0.147 |
| Kick from Moving Ball | Time for completion [s] | 2.78 |
In this paper, we presented the hardware and software design that led us to win all possible competitions in the AdultSize class of the RoboCup 2018 Humanoid League in Montréal: the soccer tournament, the drop-in games, the technical challenges, and the Best Humanoid Award. We presented individual skills regarding perception, bipedal gait tuning, and behavior, as well as their application in the technical challenges. A video showing the competition highlights is available online (RoboCup 2018 NimbRo AdultSize highlights: https://www.youtube.com/watch?v=tPktQyFrMuw). The hardware of the NimbRo-OP2 generation (https://github.com/NimbRo/nimbro-op2) as well as our software (https://github.com/AIS-Bonn/humanoid_op_ros) were released as open source on GitHub, with the hope that other teams and research groups benefit from our work.
This work was partially funded by grant BE 2556/13 of the German Research Foundation (DFG).
- (2013) Hierarchical and state-based architectures for robot behavior planning and control. In 8th Workshop on Humanoid Soccer Robots, International Conference on Humanoid Robots (Humanoids).
- (2015) Fused Angles: A representation of body orientation for balance. In IROS.
- (2016) Omnidirectional Bipedal Walking with Direct Fused Angle Feedback Mechanisms. In Humanoids.
- (2015) SegNet: a deep convolutional encoder-decoder architecture for image segmentation. arXiv:1511.00561.
- (2006) Online trajectory generation for omnidirectional biped walking. In ICRA.
- (2017) FreezeOut: accelerate training by progressively freezing layers. arXiv preprint arXiv:1706.04983.
- (2017) RoboCup Rescue Team Description Paper NuBot. Technical report, University of Newcastle.
- (2018) Unbounded Designers Teen & Kid Size Team Description Paper. Technical report, Azad University of Isfahan.
- (2015) A Monocular Vision System for Playing Soccer in Low Color Information Environments. In 10th Workshop on Humanoid Soccer Robots (Humanoids).
- (2017) RoboCup 2016 Humanoid TeenSize winner NimbRo: Robust visual perception and soccer behaviors. In Robot World Cup XX.
- (2017) Online visual robot tracking and identification using deep LSTM networks. In Int. Conf. on Intelligent Robots and Systems (IROS).
- (2017) NimbRo-OP2: Grown-up 3D Printed Open Humanoid Platform for Research. In Humanoids.
- (2018) NimbRo-OP2X: Adult-sized Open-source 3D Printed Humanoid Robot. In Humanoids.
- (2018) Grown-up NimbRo Robots Winning RoboCup 2017 Humanoid AdultSize Soccer Competitions. Robot World Cup XXI.
- (2018) ImageTagger: an open source online platform for collaborative image labeling. In Robot World Cup XXII.
- Entropy search for information-efficient global optimization. Journal of Machine Learning Research.
- Virtual vs. real: Trading off simulations and physical experiments in reinforcement learning with Bayesian optimization. In ICRA.
- (1965) A simplex method for function minimization. The Computer Journal.
- (2009) ROS: An open-source robot operating system. In ICRA.
- (2018) ICHIRO Team Description Paper Humanoid Teensize League. Technical report, Institut Teknologi Sepuluh Nopember.
- (2018) Combining Simulations and Real-robot Experiments for Bayesian Optimization of Bipedal Gait Stabilization. RoboCup International Symposium.
- (2018) Advanced Soccer Skills and Team Play of RoboCup 2017 TeenSize Winner NimbRo. RoboCup 2017: Robot World Cup XXI.
- (2015) U-Net: convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-assisted Intervention.
- (2017) Detection and Localization of Features on a Soccer Field with Feedforward Fully Convolutional Neural Networks (FCNN) for the Adult-size Humanoid Robot Sweaty. In 12th Workshop on Humanoid Soccer Robots (Humanoids).
- Cyclical learning rates for training neural networks. Applications of Computer Vision (WACV).
- (1985) Topological structural analysis of digitized binary images by border following. Computer Vision, Graphics, and Image Processing.
- (2014) How transferable are features in deep neural networks? In Advances in Neural Information Processing Systems, pp. 3320–3328.