
Markerless Visual Robot Programming by Demonstration

In this paper we present an approach for learning to imitate human behavior on a semantic level by markerless visual observation. We analyze a set of spatial constraints on human pose data extracted using convolutional pose machines and object information extracted from 2D image sequences. A scene analysis, based on an ontology of objects and affordances, is combined with continuous human pose estimation and spatial object relations. Using a set of constraints, we associate the observed human actions with a set of executable robot commands. We demonstrate our approach in a kitchen task, where the robot learns to prepare a meal.


I Introduction

Programming by Demonstration (PbD) [1] is an alternative method for teaching robot tasks where no expert programmer is needed. Instead, an expert demonstrator shows a robot a task to imitate. Robots can be programmed textually [3], through graphical user interfaces [4, 5], by touch guidance [6], by teleoperation, or by actual demonstration of tasks [7, 8]. We believe that common programming methods for robots will be too complex to handle specialized tasks. In the future, robots can be expected to be employed in almost all fields of human life. This raises the need for alternative programming methods for complex tasks.

PbD is already used by many industrial service robots [9, 10]. Other approaches mostly focus on adding markers [11, 7, 8] or external sensors to the demonstrator [12], usually interfering with the demonstrator during the task. We propose an approach that neither relies on changes to the environment nor on additional sensors beyond a common RGB camera.

Current robotic systems that lack a certain desired behavior commonly need an expert programmer to add the missing functionality. In contrast, we introduce an approach related to programming robots by visual demonstration [13] that can be applied by common users. Provided a basic scene understanding, the robot observes a person demonstrating a task and is then able to reproduce the observed action sequence using its semantic knowledge base.

Fig. 1: Approach overview for extracting action information from 2D image sequences in order to execute the actions on a mobile robot. Exemplary object detections (yellow, pink) and human pose estimates (green) are observed. Actions are recognized using a set of constraints. For replicating the observed actions we used two mobile robots equipped with an arm.

We envision different use cases for our approach. The observed action can be associated with a command given in natural language, where parameters are exchangeable due to the semantic representation. Further, robots observing (and interpreting) tasks can create a task representation. This representation can be transferred to other robots [14] in order to execute the task without ever having seen it. Trajectory-level learning combined with task-level representation learning could deprecate currently common robot programming methods and enable non-expert programmers to teach new tasks from the ground up [15].

The core contribution is an approach for markerless action recognition based on Convolutional Pose Machines (CPM) [16], object observations [17] and continuous spatial relations. We show that the recognized actions are executable on a robot that provides a set of common actions. The initial scene analysis allows semantic reasoning in case a required object is not present. Further, it allows executing the same action sequence with different objects, which is a major benefit over action sequencing approaches that rely on positional data only. Even though we demonstrate our approach on 2D observations, the formulations are also adaptable to 3D. Fig. 1 gives an overview of our approach.

This paper is structured as follows. In Section II we discuss similar approaches. Section III describes our approach in depth. Experimental results are shown in Section IV. Section V concludes the paper with an outlook.

II Related Work

Many approaches for programming robot systems have been proposed in the past. Most common are approaches based on guidance by force sensors [18, 14], teleoperation, or external sensors [19, 7, 11]. Sensor-based approaches, like motion capturing camera systems, rely on active changes to the environment. Most commonly, these changes consist of attaching either reflective or non-reflective markers to the demonstrator and/or the interacting objects [7, 11]. Also widespread are teaching-by-guidance approaches [18] that allow guiding a robot arm through force sensors. These approaches are subsumed under the term learning by demonstration, which usually describes a demonstrator moving a robot arm to perform a task. Using force-torque sensors, the arm recognizes the forces applied by the demonstrator. Throughout the movements, the joint angles are recorded and used for later replay.

Wächter et al. [7] have shown an approach for action sequencing that analyzes the demonstrated task using motion capture measurements and executes the observed behavior on a humanoid robot. More advanced approaches take the pose of the robot arm and combine it with object estimates to act not just on a trajectory level, but to associate the interacting objects. Koskinopoulou et al. [20] formulate a latent representation of task observations demonstrated by a human and employ it for human-robot collaborative tasks. Magnanimo et al. [21] proposed an approach to recognize tasks executed by a human and to predict the next actions and objects to manipulate using a Dynamic Bayesian Network. Schneider et al. [6] integrated object estimates and adapt touch-guided trajectories to new object positions with Gaussian processes.

As we have pointed out, most of these approaches have in common that sensors or markers are actively attached to the demonstrator. Only a small number of approaches [13] deal with robotic systems that employ only the onboard sensor setup as an observer. Further, approaches like [22, 23] support users by providing graphical user interfaces for programming and giving visual feedback during the programming procedure. Other approaches allow programming a robot by natural language [24] with a low-level set of commands guiding the arm movements.

III Approach

Fig. 2: Overview of our approach to action sequencing
Fig. 3: Ontology of object affordances in a manipulation task

In this section, we describe our proposed approach in detail. Fig. 2 gives an overview of our approach. First, the robot analyzes the scene from its point of view using only on-board sensors. The goal is to detect objects and a demonstrator. Then, the robot starts observing the demonstrator's joints, with special focus on the hands, and the objects' positions in the scene. Note that our approach requires prior knowledge about the objects used. For this purpose, the scene overview is forwarded to a neural network (YOLO [17]), which assigns a class label to every detected object. Based on RGB images only, spatial relations between hands and objects, including semantic relations between objects and actions, are segmented. As we work only with 2D data, we need an additional mechanism which allows representing complex contextual knowledge about objects and possible actions. For this purpose, we use a semantic knowledge base where object affordances are modeled with an ontology. The ontology represents semantic relations between objects and possible actions, as depicted in Fig. 3. Finally, the robot matches the sequence of observed sub-actions with a set of predefined reusable actions.

III-A Initial Scene Analysis

We assume that the robot is facing a table where a person is about to start a demonstration. First, the robot gets an initial view of the scene by checking if a human demonstrator is visible. Then the objects on the table are analyzed. During the initialization step we assign every object $o$ from the set of trained objects $O = \{o_1, \dots, o_N\}$, where $N$ is the number of trained objects, to one of two possible classes, active and passive, for manipulable (cup, watering can) and non-manipulable (bowl, plant) objects. Note that only objects that can be safely manipulated with one hand are manipulable in this context, since the experiments are performed on a robot with only one arm. We store the initial positions of passive objects and, for active objects, both the position and the approximated local changes of their bounding boxes (see Fig. 1). We use the local changes in order to determine complex actions such as pouring.
The object ontology is used in the current and next step of the proposed approach to build constraints for observation-action mapping.

We denote the observed object as $o$ and its position as the estimated centroid point $p_o$. In every RGB frame we estimate the hand positions and express them as $p_h$, with $h \in \{left, right\}$. Data extracted from the RGB images is synchronized over time and stored in a common data frame $d_t$. Moreover, we track local changes of the object in order to detect a pouring action. For this purpose, in every frame $t$ we calculate a vector $\vec{v}_{o,t}$ using the object bounding box coordinates

$\vec{v}_{o,t} = (x_{br,t} - x_{tl,t},\; y_{br,t} - y_{tl,t})$ (1)

where $(x_{tl,t}, y_{tl,t})$ are the pixel coordinates of the top left corner and $(x_{br,t}, y_{br,t})$ are the pixel coordinates of the bottom right corner of the object's bounding box in 2D image space. From the initial scene analysis we store the initial centroid of every detected object as $p_{o,0}$. Further, for looking up an object in the object ontology and assigning it to the active_object or passive_object class, we define a function

$class(o) = \begin{cases} active\_object & \text{if } o \text{ is manipulable} \\ passive\_object & \text{otherwise.} \end{cases}$

Furthermore, we define the set of affordances $A$ and a function for retrieving all possible affordances of a given object from the knowledge base:

$aff(o) = \{a_1, \dots, a_k\} \subseteq A$ (2)

A third function indicates whether a specific affordance $a$ is a valid affordance of object $o$:

$valid(o, a) = \begin{cases} true & \text{if } a \in aff(o) \\ false & \text{otherwise.} \end{cases}$

We additionally store the table bounding box using the extracted top left and bottom right corners of the table plane in 2D pixel space during the initial scene analysis:

$B_{table} = \left((x_{tl}^{table}, y_{tl}^{table}),\ (x_{br}^{table}, y_{br}^{table})\right)$ (3)
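To make the scene-analysis bookkeeping concrete, the following is a minimal Python sketch of the class look-up, affordance retrieval and validity functions together with the bounding-box vector of Eq. (1). All names, the dataclass layout and the tiny in-memory ontology are illustrative assumptions, not the implementation used in the paper.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple

# Hypothetical in-memory stand-in for the ontology-backed knowledge base.
ONTOLOGY: Dict[str, Dict] = {
    "cup":          {"class": "active_object",  "affordances": {"pick", "place", "pour"}},
    "watering_can": {"class": "active_object",  "affordances": {"pick", "place", "pour"}},
    "bowl":         {"class": "passive_object", "affordances": {"accept_pouring"}},
    "plant":        {"class": "passive_object", "affordances": {"accept_pouring"}},
}

@dataclass
class Detection:
    """A single 2D object detection in pixel coordinates."""
    label: str
    top_left: Tuple[float, float]       # (x_tl, y_tl)
    bottom_right: Tuple[float, float]   # (x_br, y_br)

    @property
    def centroid(self) -> Tuple[float, float]:
        (x1, y1), (x2, y2) = self.top_left, self.bottom_right
        return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

    @property
    def box_vector(self) -> Tuple[float, float]:
        # Eq. (1): vector spanned by the bounding box corners; its change over
        # consecutive frames approximates local rotation (e.g. during pouring).
        (x1, y1), (x2, y2) = self.top_left, self.bottom_right
        return (x2 - x1, y2 - y1)

def object_class(label: str) -> str:
    """class(o): active_object or passive_object."""
    return ONTOLOGY[label]["class"]

def affordances(label: str) -> Set[str]:
    """aff(o): all affordances modeled for the object, Eq. (2)."""
    return ONTOLOGY[label]["affordances"]

def valid_affordance(label: str, affordance: str) -> bool:
    """valid(o, a): whether affordance a is modeled for object o."""
    return affordance in affordances(label)
```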

III-B Action Analysis

After the scene initialization the human demonstrator starts the manipulation task. During the demonstration we estimate the human's joint positions with CPM [16], which provides human skeleton data containing 18 keypoints, as shown in Fig. 1. The wrist positions are retrieved in image coordinates using the RGB camera. Moreover, we extract a list of detected objects from the RGB data and build a common synchronized data frame. This frame is used for resolving spatial relations [25] in order to determine potential contacts between hands and objects, similar to the method presented by Wächter et al. in [7]. However, we do not utilize any marker-based capturing system for observing the task. As mentioned above, all observations during the demonstration occur in 2D image space. We assign detected objects to possible actions considering the affordances modeled in the ontology.
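As an illustration of how such a synchronized data frame could be assembled from a pose estimate and the object detections, the sketch below reuses the hypothetical Detection type from the previous sketch and assumes an OpenPose-style 18-keypoint COCO layout, in which indices 4 and 7 correspond to the right and left wrist; the frame type and field names are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

R_WRIST, L_WRIST = 4, 7  # wrist indices in the assumed 18-keypoint COCO layout

@dataclass
class DataFrame:
    """One synchronized observation d_t: hand positions and object detections."""
    t: float
    hands: Dict[str, Tuple[float, float]] = field(default_factory=dict)
    objects: Dict[str, "Detection"] = field(default_factory=dict)

def build_frame(t: float,
                keypoints: List[Optional[Tuple[float, float]]],
                detections: List["Detection"]) -> DataFrame:
    """Pair the wrist keypoints of one pose estimate with the detections of the
    same timestamp into a single data frame."""
    hands: Dict[str, Tuple[float, float]] = {}
    if keypoints[R_WRIST] is not None:
        hands["right"] = keypoints[R_WRIST]
    if keypoints[L_WRIST] is not None:
        hands["left"] = keypoints[L_WRIST]
    return DataFrame(t=t, hands=hands,
                     objects={d.label: d for d in detections})
```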
In the following, we show in detail how actions are associated with the robot's observations based on sequences of pre-defined constraints. Every constraint can be evaluated to true or false. Consider the following set of constraints that we apply to every data frame $d_t$:

  1. $c_1$: The object $o_a$ is classified as an active object:

     $class(o_a) = active\_object$ (4)
  2. $c_2$: The object $o_p$ is classified as a passive object:

     $class(o_p) = passive\_object$ (5)
  3. $c_3$: The set of contextual properties modeled in the ontology for the active object $o_a$ contains the affordance $a$:

     $a \in aff(o_a)$ (6)
  4. $c_4$: The set of contextual properties modeled in the ontology for the passive object $o_p$ contains the affordance $a$:

     $a \in aff(o_p)$ (7)
  5. $c_5$: The distance between a hand position $p_h$ and the active object's position $p_{o_a}$ is smaller than the distance threshold $\delta_d$:

     $\lVert p_h - p_{o_a} \rVert < \delta_d$ (8)
  6. $c_6$: The distance between a hand position $p_h$ and the active object's position $p_{o_a}$ is greater than the distance threshold $\delta_d$:

     $\lVert p_h - p_{o_a} \rVert > \delta_d$ (9)
  7. $c_7$: The number of data frames with valid condition $c_5$ is greater than or equal to the frame threshold $\delta_f$:

     $|\{d_t \mid c_5(d_t)\}| \geq \delta_f$ (10)
  8. $c_8$: The number of data frames with valid condition $c_6$ is greater than or equal to the frame threshold $\delta_f$:

     $|\{d_t \mid c_6(d_t)\}| \geq \delta_f$ (11)
  9. $c_9$: Position changes for the active object as well as for the hand have been detected in two consecutive frames:

     $p_{o_a,t} \neq p_{o_a,t-1} \;\wedge\; p_{h,t} \neq p_{h,t-1}$ (12)
  10. $c_{10}$: A rotation of the active object in consecutive frames is detected:

     $\vec{v}_{o_a,t} \neq \vec{v}_{o_a,t-1}$ (13)
  11. $c_{11}$: The active object is on the table:

     $p_{o_a,t} \in B_{table}$ (14)
  12. $c_{12}$: The active object $o_a$ is located over the passive object $o_p$:

     $|x_{o_a,t} - x_{o_p,t}| < \epsilon \;\wedge\; y_{o_a,t} < y_{o_p,t}$ (15)

Note that in practice all equalities used in the constraints are modeled with a small $\epsilon$ to allow approximate equality.
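The following is a minimal sketch of how some of these per-frame constraints could be evaluated, reusing the hypothetical DataFrame and Detection types from the sketches above; the threshold values are assumptions that would have to be tuned to the camera setup.

```python
import math

DIST_THRESHOLD = 40.0   # assumed pixel threshold delta_d

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def c5_hand_near(frame, obj_label):
    """c5: some hand is closer to the active object's centroid than delta_d."""
    obj = frame.objects.get(obj_label)
    if obj is None or not frame.hands:
        return False
    return min(dist(h, obj.centroid) for h in frame.hands.values()) < DIST_THRESHOLD

def c11_on_table(frame, obj_label, table_box):
    """c11: the object's centroid lies inside the table bounding box B_table."""
    obj = frame.objects.get(obj_label)
    if obj is None:
        return False
    (x1, y1), (x2, y2) = table_box
    cx, cy = obj.centroid
    return x1 <= cx <= x2 and y1 <= cy <= y2

def c12_over(frame, active_label, passive_label, eps=30.0):
    """c12: active object is located above the passive object in image space."""
    a, p = frame.objects.get(active_label), frame.objects.get(passive_label)
    if a is None or p is None:
        return False
    (ax, ay), (px, py) = a.centroid, p.centroid
    return abs(ax - px) < eps and ay < py   # smaller y is higher in the image
```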

We can represent a number of basic actions, like object picking, placing, pouring and putting, using these generalized constraints. For instance, a picking action is defined as a sequence over a subset of the constraint set $C = \{c_1, \dots, c_{12}\}$:

$pick(o_a) = \langle c_1 \wedge c_3,\; c_5,\; c_7,\; c_9 \rangle$ (16)

For association of the picking action with a set of observations, the list of affordances retrieved from the semantic knowledge base (constraint $c_3$) must contain the corresponding pick property (Fig. 3).

For the placing action we define the following sequence of constraints:

$place(o_a) = \langle c_1 \wedge c_3,\; c_9,\; c_{11},\; c_6,\; c_8 \rangle$ (17)

Analogously to the picking action, we can detect the placing action only if a corresponding property exists in the knowledge base for the active object being placed.

Once an action is defined as a sequence of constraints, we store it similarly to the approach presented by Wächter et al. [7]. We can then reuse stored actions for building sub-sequences of constraints for the association of more complex actions. In the following, we present the association of a pouring action. The pouring action involves two types of objects during execution: a passive object with the property accept_pouring, retrieved through constraint $c_4$, and an active object with the property pour from constraint $c_3$. We denote the active object as $o_a$ and the passive object as $o_p$. Further, for the association of the pouring action we calculate the approximate local changes of the active object over time, as defined in Section III-A. Consider the sequence of constraints for the pouring action:

$pour(o_a, o_p) = \langle pick(o_a),\; c_2 \wedge c_4,\; c_{12},\; c_{10},\; place(o_a) \rangle$ (18)

Using this sequence of constraints we can infer more complex activities, for instance watering a plant:

$water\_plant = pour(o_a, o_p)$ (19)

where $o_a$ is a watering can and $o_p$ a plant.
Once the robot has associated observations with an action, it stores the sequence for later re-execution.
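Conceptually, associating an action with the observations amounts to checking whether its constraints become true in the prescribed order over the stream of data frames. The sketch below shows one simple way such a matcher could look; the constraint names, the picking sequence and the per-frame evaluation dictionaries are illustrative assumptions rather than the paper's implementation.

```python
from typing import Dict, Iterable, List

def matches_sequence(frames: Iterable[Dict[str, bool]],
                     sequence: List[str]) -> bool:
    """Return True if the constraints in `sequence` become true in the given
    order over a stream of per-frame evaluations (constraint name -> bool)."""
    idx = 0
    for evaluation in frames:
        # Advance through the sequence as long as the next expected
        # constraint holds in the current frame.
        while idx < len(sequence) and evaluation.get(sequence[idx], False):
            idx += 1
        if idx == len(sequence):
            return True
    return False

# Usage with a hypothetical picking sequence over the constraints of Sec. III-B:
PICK_SEQUENCE = ["c1", "c3", "c5", "c7", "c9"]
# matches_sequence(per_frame_evaluations, PICK_SEQUENCE) -> True / False
```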

III-C Action Execution

The actions resulting from the analysis are executed sequentially. The reusable actions have been modeled in advance and are based on a subset of the actions available on our service robot [26].

For grasping an object, the robot performs an initial scene analysis. Object position estimates are then transformed into a common coordinate system with the robot manipulator. For manipulating the object we use a full-body trajectory execution, meaning that the trajectory contains movements of the robot's torso to adjust its height, as well as of the robot's arm. The trajectory is calculated by a single motion path query [27] provided by a motion planning library [28].
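For the coordinate transformation step, a minimal sketch with NumPy is given below; it assumes the camera-to-manipulator-base transform is known (e.g. from calibration) as a 4x4 homogeneous matrix, and all names and values are illustrative.

```python
import numpy as np

def to_manipulator_frame(p_camera: np.ndarray, T_base_camera: np.ndarray) -> np.ndarray:
    """Transform a 3D point from the camera frame into the manipulator base
    frame using a 4x4 homogeneous transform."""
    p_h = np.append(p_camera, 1.0)      # homogeneous coordinates
    return (T_base_camera @ p_h)[:3]

# Example with an assumed, purely illustrative camera pose:
T = np.eye(4)
T[:3, 3] = [0.2, 0.0, 1.1]              # camera 0.2 m forward, 1.1 m up
print(to_manipulator_frame(np.array([0.0, 0.1, 0.8]), T))
```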

In order to avoid obstacles, we use the segmented plane of the tabletop segmentation as input for a local 3D gridmap [29] representation of the scene. We found this to be less error-prone than using the full point cloud as input, due to noisy measurements.

IV Experiments

IV-1 Robot Description

As experimental platforms we use two robots: the service robot Lisa [30] and the service robot TIAGo [31]. Both robots are equipped with a two degree of freedom (DoF) pan-tilt unit with an RGB-D camera mounted on top. For voice interaction we mounted a directional microphone, and the robot reports its current action via text-to-speech synthesis. Lisa performs manipulation tasks using a 6 DoF Kinova Mico arm; TIAGo has a 7 DoF arm.

Fig. 4: Experimental setup (left) where the robot observes a human demonstrator and executes the observed task (right)
Fig. 5: Different human demonstrators performing actions from the robot's perspective in two setups: watering a plant and preparing a meal.

IV-2 Setup Description

For the experiments we use two different setups. The first scenario was built within a kitchen environment with common objects used for the preparation of a meal. For the second scenario, in which we associate the action sequence with a plant watering action, we used a simple setup with a plant, a watering can and a mug located on a table. Note that for both setups we refer to the same action sequence (18), using different sets of objects. The experimental setups are shown in Fig. 4. Exemplary images of the different scenarios seen from the robot's perspective are shown in Fig. 5.

In both cases the robot was placed at the opposite side of the table, facing the human demonstrator. The objects used had been trained in advance in order to be recognized by the robot. The object locations as well as the demonstrator's location are not known to the robot in advance. In addition to the weights for object and human body detection, we modeled the possible affordances of each object, which map to robot actions for later execution, as described in Section III.

IV-3 Description of Experiments

In order to evaluate our approach we ran 16 experiments with six different demonstrators and different object sets. The demonstrators were asked to perform the following actions in a natural way: pick a manipulable object, pour with it into a non-manipulable object, and then place the object back on the table. As discussed before, we modeled such objects in our ontology as active_objects and passive_objects, where active objects are those that can be manipulated with one hand. The demonstrators used only one hand while performing the actions. Note that for the experiments with spaghetti we adapted the action sequence (18). We placed the spaghetti in a cup for better manipulation and "poured" it into a pot. Still, our object recognition system was able to recognize both the spaghetti and the blue cup. We applied the constraints to the bounding boxes classified as spaghetti in order to associate the "pouring" action with the observations.

After the actions had been demonstrated, we asked the robots to execute the detected actions. We consider an action association successful if the robot recognized the basic actions correctly, in a reasonable amount of time, based on the given sequence of constraints. We do not focus on action execution in this paper, because the robots use pre-defined execution modules. Instead, we focus on the correct interpretation of observed action sequences based on the constraints. Videos recorded throughout the experiments are available on our project page (https://userpages.uni-koblenz.de/~raphael/project/imitation_learning/).

IV-A Results

We analyze the success rate of the action association. The results of our experiments are presented in three tables. The success rates of associating action sequences with the activities from the experimental setups are shown in Table I. The results indicate that our approach performs well on action assignment within the meal preparation scenario, but achieves worse performance for the plant watering activity. Further analysis indicates that the success of action sequence association decreases when large objects are manipulated by the demonstrator (watering can). Table II shows the performance of the system with respect to the individual manipulable objects involved in the experiments. The results in Table II show that our approach works well on objects that are slightly bigger than the human hand (blue cup, yellow cup, mug), achieving an overall performance of 97.5% for picking and pouring and 100% for placing. Action associations that involved the watering can tended to be less accurate, reaching only 50% for picking and pouring and 40% for placing. This might be overcome by adding grasping poses to the ontology as well.

Action | Preparing meal | Watering a plant
Pick   | 1.0            | 0.625
Place  | 0.875          | 0.75
Pour   | 0.875          | 0.625
TABLE I: Success rate of associating action sequences with activities

Object            | Pick | Pour | Place
Blue cup          | 1.0  | 1.0  | 1.0
Mug               | 1.0  | 1.0  | 1.0
Pink watering can | 0.5  | 0.5  | 0.4
Yellow cup        | 1.0  | 1.0  | 1.0
Spaghetti         | 0.9  | 0.9  |
TABLE II: Success rate of action association considering the active objects involved during manipulation

We now consider the constraints defined in Section III in detail. Table III shows the success rate of applying the constraints to the objects used in the experiments. Note that we skipped the first four constraints ($c_1$–$c_4$) as they are not dynamic and do not change during the demonstrations. Almost all constraints are recognized well for manipulable and non-manipulable objects, achieving a success rate of 100%. In the case of the watering can, the distance from the object centroid to the wrist was often not small enough to evaluate the corresponding constraints as true. During some demonstrations the distance between the hand and the watering can was small enough only in a few frames, which we considered as noise. Depending on the orientation of the watering can, the distance to the hand was greater than the hand-object threshold $\delta_d$ during the demonstrations. Therefore, the placing action could not be associated correctly.

Object / Constraint | c5  | c6  | c7  | c8  | c9  | c10 | c11 | c12
Blue cup            | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
Mug                 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
Watering can        | 0.6 | 0.4 | 0.4 | 0.4 | 1.0 | 1.0 | 1.0 | 1.0
Yellow cup          | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
Spaghetti           | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
TABLE III: Success rate of action association considering the constraints

V Conclusion

We presented an approach to imitate human actions with a domestic service robot. The imitation is based on observations using only on-board sensors, without any augmentation of the environment by markers. Convolutional pose machines (CPM) are used for estimating the demonstrator's joints during the demonstration. Bounding boxes of objects are detected over the course of the demonstration. The spatial relations between the demonstrator's hand and the observed objects in 2D pixel space, together with affordances modeled in a knowledge base, are used to define constraints for action association. The associated actions are then executed using a set of available actions on a domestic service robot. Results were demonstrated on a real robot by observing actions to prepare a meal and to water a plant.

For future development we plan to add continuous object segmentation [32] to augment the current approach, which is based on an initial object analysis of the scene. An interesting extension to the proposed approach would be to formulate the constraints in the 3D robot coordinate system, as we observed some limitations during action association in 2D image space. Another extension would be to incorporate finger detection into the observation. Furthermore, extending the current approach with pose estimates of the object or hand would allow further reasoning about more detailed action parameters, e.g. for pouring.

Acknowledgement

We want to thank PAL Robotics for supporting us with the award of a TIAGo robot as part of the European Robotics League.


References

  • [1] A. Billard, S. Calinon, R. Dillmann, and S. Schaal, “Robot programming by demonstration,” in Springer handbook of robotics.   Springer, 2008, pp. 1371–1394; .
  • [2]
  • [3] W. A. Gruver, B. I. Soroka, J. J. Craig, and T. L. Turner, “Industrial robot programming languages: a comparative evaluation,” IEEE transactions on systems, man, and cybernetics, no. 4, pp. 565–570, 1984; .
  • [4] U. Thomas, G. Hirzinger, B. Rumpe, C. Schulze, and A. Wortmann, “A new skill based robot programming language using uml/p statecharts,” in Robotics and Automation (ICRA), 2013 IEEE International Conference on.   IEEE, 2013, pp. 461–466; .
  • [5] D. C. MacKenzie and R. C. Arkin, “Evaluating the usability of robot programming toolsets,” The International Journal of Robotics Research, vol. 17, no. 4, pp. 381–401, 1998; .
  • [6] M. Schneider and W. Ertel, “Robot learning by demonstration with local gaussian process regression,” in Intelligent Robots and Systems (IROS), 2010 IEEE/RSJ International Conference on.   IEEE, 2010, pp. 255–260; .
  • [7] M. Wächter, S. Schulz, T. Asfour, E. Aksoy, F. Wörgötter, and R. Dillmann, “Action sequence reproduction based on automatic segmentation and object-action complexes,” in 2013 13th IEEE-RAS International Conference on Humanoid Robots (Humanoids), Oct 2013, pp. 189–195; .
  • [8] P. Pastor, H. Hoffmann, T. Asfour, and S. Schaal, “Learning and generalization of motor skills by learning from demonstration,” in Robotics and Automation, 2009. ICRA’09. IEEE International Conference on.   IEEE, 2009, pp. 763–768; .
  • [9] H. Friedrich, S. Münch, R. Dillmann, S. Bocionek, and M. Sassin, “Robot programming by demonstration (rpd): Supporting the induction by human interaction,” Machine Learning, vol. 23, no. 2-3, pp. 163–189, 1996; .
  • [10] J. Aleotti, S. Caselli, and M. Reggiani, “Toward programming of assembly tasks by demonstration in virtual environments,” in Robot and Human Interactive Communication, 2003. Proceedings. ROMAN 2003. The 12th IEEE International Workshop on.   IEEE, 2003, pp. 309–314; .
  • [11] J. Aleotti and S. Caselli, “Robust trajectory learning and approximation for robot programming by demonstration,” Robotics and Autonomous Systems, vol. 54, no. 5, pp. 409 – 413, 2006, the Social Mechanisms of Robot Programming from Demonstration. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0921889006000212 ;
  • [12] S. Calinon and A. Billard, “Active teaching in robot programming by demonstration,” in Robot and Human interactive Communication, 2007. RO-MAN 2007. The 16th IEEE International Symposium on.   IEEE, 2007, pp. 702–707; .
  • [13] P. Azad, Visual perception for manipulation and imitation in humanoid robots.   Springer Science & Business Media, 2009, vol. 4; .
  • [14] N. Figueroa, A. L. Pais Ureche, and A. Billard, “Learning complex sequential tasks from demonstration: A pizza dough rolling case study,” in The Eleventh ACM/IEEE International Conference on Human Robot Interaction.   IEEE Press, 2016, pp. 611–612; .
  • [15] R. Caccavale, M. Saveriano, A. Finzi, and D. Lee, “Kinesthetic teaching and attentional supervision of structured tasks in human–robot interaction,” Autonomous Robots, pp. 1–17, 2018; .
  • [16] S. Wei, V. Ramakrishna, T. Kanade, and Y. Sheikh, “Convolutional pose machines,” CoRR, vol. abs/1602.00134, 2016. [Online]. Available: http://arxiv.org/abs/1602.00134 ;
  • [17] J. Redmon, S. K. Divvala, R. B. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” CoRR, vol. abs/1506.02640, 2015. [Online]. Available: http://arxiv.org/abs/1506.02640 ;
  • [18] D. Seidel, C. Emmerich, and J. J. Steil, “Model-free path planning for redundant robots using sparse data from kinesthetic teaching,” in Intelligent Robots and Systems (IROS 2014), 2014 IEEE/RSJ International Conference on.   IEEE, 2014, pp. 4381–4388; .
  • [19] A. Vakanski and F. Janabi-Sharifi, Robot Learning by Visual Observation.   John Wiley & Sons, 2017; .
  • [20] M. Koskinopoulou, S. Piperakis, and P. Trahanias, “Learning from demonstration facilitates human-robot collaborative task execution,” in 2016 11th ACM/IEEE International Conference on Human-Robot Interaction (HRI), March 2016, pp. 59–66; .
  • [21] V. Magnanimo, M. Saveriano, S. Rossi, and D. Lee, “A bayesian approach for task recognition and future human activity prediction,” in The 23rd IEEE International Symposium on Robot and Human Interactive Communication, Aug 2014, pp. 726–731; .
  • [22] S. Alexandrova, Z. Tatlock, and M. Cakmak, “Roboflow: A flow-based visual programming language for mobile manipulation tasks,” in Robotics and Automation (ICRA), 2015 IEEE International Conference on.   IEEE, 2015, pp. 5537–5544; .
  • [23] S. Alexandrova, M. Cakmak, K. Hsiao, and L. Takayama, “Robot programming by demonstration with interactive action visualizations.” in Robotics: science and systems, 2014; .
  • [24] M. Forbes, R. P. Rao, L. Zettlemoyer, and M. Cakmak, “Robot programming by demonstration with situated spatial language understanding,” in Robotics and Automation (ICRA), 2015 IEEE International Conference on.   IEEE, 2015, pp. 2014–2020; .
  • [25] S. J. Rey and L. Anselin, “Pysal: A python library of spatial analytical methods,” Handbook of applied spatial analysis, pp. 175–193, 2010; .
  • [26] V. Seib, R. Memmesheimer, S. Manthe, F. Polster, and D. Paulus, “Team homer@unikoblenz approaches and contributions to the robocup@home competition,” RoboCup 2015: Robot World Cup XIX, vol. 9513, pp. 83–94, 2016; .
  • [27] J. J. Kuffner and S. M. LaValle, “Rrt-connect: An efficient approach to single-query path planning,” in Robotics and Automation, 2000. Proceedings. ICRA’00. IEEE International Conference on, vol. 2.   IEEE, 2000, pp. 995–1001; .
  • [28] I. A. Sucan, M. Moll, and L. E. Kavraki, “The open motion planning library,” IEEE Robotics & Automation Magazine, vol. 19, no. 4, pp. 72–82, 2012; .
  • [29] A. Hornung, K. M. Wurm, M. Bennewitz, C. Stachniss, and W. Burgard, “Octomap: An efficient probabilistic 3d mapping framework based on octrees,” Autonomous Robots, vol. 34, no. 3, pp. 189–206, 2013; .
  • [30] R. Memmesheimer, V. Seib, N. Yann Wettengel, F. Polster, D. Müller, M. Löhne, M. Roosen, I. Mykhalchyshyna, L. Buchhold, M. Schnorr, and D. Paulus, “Robocup 2017 - homer@unikoblenz (germany),” 2017, technical Report; .
  • [31] J. Pages, L. Marchionni, and F. Ferro. (2016, Jun.) TIAGo: the modular robot that adapts to different research needs. [Online]. Available: https://clawar.org/wp-content/uploads/2016/10/P2.pdf ;
  • [32] V. Badrinarayanan, A. Kendall, and R. Cipolla, “Segnet: A deep convolutional encoder-decoder architecture for image segmentation,” arXiv preprint arXiv:1511.00561, 2015; .