Robotic Hierarchical Graph Neurons: a novel implementation of HGN for swarm robotic behaviour control

by Phillip Smith et al.

This paper explores the use of a novel form of Hierarchical Graph Neurons (HGN) for in-operation behaviour selection in a swarm of robotic agents. This new HGN is called Robotic-HGN (R-HGN), as it matches robot environment observations to environment labels via fusion of match probabilities from both temporal and intra-swarm collections. This approach is novel for HGN as it addresses robotic observations being pseudo-continuous numbers rather than categorical values. Additionally, the proposed approach is conservative in memory and computation power and is thus suitable for mobile devices such as the single-board computers often used in mobile robotic agents. This R-HGN approach is validated against individual behaviour implementations and random behaviour selection. The contrast is made in two sets of simulated environments: environments designed to challenge the held behaviours of the R-HGN, and randomly generated environments that are more challenging for the robotic swarm than the R-HGN training conditions. R-HGN is found to enable appropriate behaviour selection in both sets, yielding strong swarm performance in pre-trained and unexpected environment conditions alike.






1 Introduction

Recent advancements in swarm robotic behaviour creation have seen an increase in task effectiveness for swarm robotics in non-trivial tasks, such as data-transfer via MANET creation (Smith et al., 2018b). However, this behaviour evolution approach suffers from poor transferability between environments and problem instances. That is, significant variation between the evolution environment(s) and the operation environment(s) causes considerable reductions in operation task performance.

To overcome this limitation, this paper explores swarm behaviour macro-adjustments in the form of environment identification for behaviour switching. We achieve this switch by training an HGN pattern classifier with environment observations and associating each pattern with a behaviour from a pre-defined repertoire. HGN has been selected due to its noted ability to work on computationally limited devices, such as the single-board computers used for swarm robotics (Nasution and Khan, 2008). However, as environmental data often contains mixed inputs, which may include pseudo-continuous numbers rather than uniform categorical values, and as local environment observations may relate to multiple global environments, this paper presents a novel extension to HGN: R-HGN.

The exploration of this R-HGN in swarm behaviour switching is conducted via simulations of the data-transfer task seen in Smith et al. (2018b). This task is partially observable and requires behaviour heterogeneity both across the swarm and across the task duration. As such, it is seen as a significant test-bed for the R-HGN swarm implementation.

Two experiments are presented in this paper to explore the R-HGN swarm: pre-trained environment operations and untrained environment operations. The former validates the R-HGN implementation by training the R-HGN in a collection of environments specifically developed for the behaviour repertoire, and evaluating in these same environments or in concatenations of them. The latter implements the R-HGN-driven swarm in randomly generated environments without further training, emulating the swarm being deployed in an unpredictable operation with no specific prior preparation. The R-HGN is evaluated in terms of resulting swarm performance compared to each behaviour being utilised on its own, and against a random behaviour selector. Additionally, the environment-matching accuracy of R-HGN is assessed.

The primary contribution of this paper is R-HGN, a novel implementation of HGN for environment matching in robotic agents. This contribution is achieved by altering the pattern memory structure of the GN and by altering the output to probabilistic environment matches, which utilise temporal prediction fusion for associated behaviour selection. Additionally, as this R-HGN is implemented in a swarm of robots, prediction fusion is also conducted via intra-swarm sharing. The value of this novel HGN implementation is determined via: exploration of the performance of an R-HGN-equipped robotic swarm in environments both known a priori and unknown; analysis of R-HGN pattern-matching accuracy; and comparison of the above qualities against a swarm randomly selecting behaviours.

2 Related work

2.1 Behaviour Selection

In this study, robot behaviour selection takes heavy inspiration from the growing research field of hyper-heuristics (HH) (Burke et al., 2013; Glover and Kochenberger, 2006). This inspiration draws directly from HH, as 'heuristics' and 'behaviours' are seen as interchangeable ideas across the two disciplines.

In HH, a wide array of meta-heuristic, data-mining and meta-learning algorithms have been implemented to have a system utilise an appropriate heuristic for each problem instance. This heuristic selection has been either a single decision during start-up (Tabataba and Mousavi, 2012; Nagavalli et al., 2017; Leng et al., 2017; Burke et al., 2006; Thabtah and Cowling, 2008; Terashima-Marín et al., 2008; Smith-Miles, 2008; Hagenauer and Helbich, 2017) or a periodic decision during operation (Misir et al., 2009; Tavares et al., 2018; Soria-Alcaraz et al., 2014).

For small-scale heuristic selection, a trial-and-error approach was presented in Tabataba and Mousavi (2012). Each heuristic was tested in a limited-run simulation and the best known heuristic utilised in a complete implementation. This approach was limited to a single selection during start-up and had computation requirements that scaled with the number of heuristics stored. Additionally, behaviours with deceptive early-operation performance may have impacted the selection quality. This simulation trialling was also seen in the field of swarm robotics in Nagavalli et al. (2017), with heuristic selection being used for A* swarm behaviour sequence searching. This approach allowed complete behaviour planning in homogeneous swarm tasks. However, it was limited to deterministic tasks, such as targeted locomotion or area coverage. Additionally, this work relied on a centralised controller that was supplied all swarm agent observations and commanded all swarm agent behaviour switches.

To avoid the drawbacks of the trial-and-error approaches, permanent knowledge may be created which associates novel problem instances with pseudo-optimal heuristics (Burke et al., 2006; Thabtah and Cowling, 2008; Terashima-Marín et al., 2008; Smith-Miles, 2008; Hagenauer and Helbich, 2017). In Leng et al. (2017) a centralised swarm distribution system utilised such knowledge, holding associations between required tasks and swarm behaviours. However, the agents of this system did not autonomously derive the required task from environment observations but rather relied on operator commands. In contrast, Burke et al. (2006) presented a selection method which derived an optimal feature vector for problem-heuristic associations. Similarly, in Smith-Miles (2008) a feed-forward MLP neural network was trained for optimal meta-heuristic prediction via problem feature inputs. Finally, in Thabtah and Cowling (2008) and Hagenauer and Helbich (2017), problems were categorised via decision trees created from past experiences, with Hagenauer and Helbich (2017) finding these trees to be more accurate than both feature vectors and the MLP. However, all these approaches were limited to a single behaviour/heuristic selection process during start-up. Such an approach is known to limit overall accuracy when operating in partially observable environments, as initial environmental observations may not be indicative of the environment as a whole (Smallwood and Sondik, 1973).

The variance in the algorithms used in HH selection techniques, along with the similarities drawn between heuristic selection and all meta-algorithms in Pappa et al. (2013), shows that heuristic selection may be achieved by any pattern-matching technique. This is particularly true for large, pseudo-continuous-value pattern matching, which, for many of the above algorithms, is known to reduce recall accuracy (Kim, 2008) and to require larger training sets (Baum and Haussler, 1989) and more training time (Hettiarachchige et al., 2018). Additionally, these algorithms are seen to suffer from poor recall accuracy as pattern similarities increase (Nasution and Khan, 2008). To address these shortcomings, this research looks to an alternative pattern-matching algorithm: HGN.

2.2 Hierarchical Graph Neurons

In 2008, the one-shot learning algorithm HGN was developed by Nasution and Khan (2008). This algorithm expanded on the original, single-layer graph neuron (GN) algorithm (Khan and Ramachandran, 2002), overcoming its crosstalk issue. In this debut paper, HGN was seen to achieve pseudo-real-time recall speeds, to accurately recall patterns with discrete identifications even under significant pattern distortion, and was noted to be applicable to small, single-board computers such as wireless sensor modules. Additionally, HGN was compared to a single-cycle back-propagation MLP, with HGN demonstrating significantly better scaling with both pattern size and the quantity of memorised patterns.

As a brief overview, HGN is a layered architecture of GN for pattern classification. The base layer consists of a GN row for each pattern component and a GN column for each value the components may take. Each subsequent layer of the HGN has two fewer rows than the previous, with the outer-most rows being removed. This layered reduction continues until a one-row layer forms the top of the triangular hierarchy. When a pattern is passed to the HGN, one GN of each row is activated in the base layer, with the activation being determined by the component value and the associated column index. Each activated GN broadcasts the pattern value to all GN in the neighbouring rows and in turn receives two value broadcasts. The combination of the received values is then associated with a sub-pattern identification in each active GN and passed up to the GN with the same row and column position in the layer above. This sub-pattern recalling is repeated in each layer until a GN in the top layer is activated. This top-layer recall is the combination of all prior sub-pattern recalls and is thus representative of the entire pattern.
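The layered recall described above can be sketched as follows. This is an illustrative simplification (a single hierarchy with dictionary-backed sub-pattern memory standing in for the GN arrays), not the authors' implementation:

```python
# Illustrative sketch of HGN recall: each layer combines a row's value with
# the broadcasts of its two neighbouring rows into a sub-pattern
# identification, and the hierarchy narrows by two rows per layer until a
# single top-layer GN remains. Dictionary-backed memory is a hypothetical
# stand-in for the GN value columns.

def hgn_recall(pattern, memory):
    """pattern: list of component values (one per base-layer row).
    memory: per-layer dicts mapping a (left, centre, right) value
    combination to a sub-pattern identification."""
    row_values = list(pattern)
    layer = 0
    while len(row_values) > 1:
        next_values = []
        # The outer-most rows are dropped; each surviving row combines
        # its own value with the broadcasts of its two neighbours.
        for i in range(1, len(row_values) - 1):
            combo = (row_values[i - 1], row_values[i], row_values[i + 1])
            layer_mem = memory.setdefault(layer, {})
            # One-shot learning: an unseen combination gets a fresh id.
            sub_id = layer_mem.setdefault(combo, len(layer_mem))
            next_values.append(sub_id)
        row_values = next_values
        layer += 1
    return row_values[0]  # the top-layer recall represents the whole pattern

memory = {}
id_a = hgn_recall([1, 2, 3, 2, 1], memory)   # memorise pattern A
id_b = hgn_recall([1, 2, 9, 2, 1], memory)   # a distorted pattern
assert id_a != id_b
assert hgn_recall([1, 2, 3, 2, 1], memory) == id_a  # recall is stable
```

Note how re-presenting pattern A recalls the same top-layer identification without any iterative retraining, which is the one-shot property the paper relies on.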

In the last decade, HGN has been specialised and refined for a range of problems. In Mahmood et al. (2008), HGN was refined into DHGN. This improvement separated the input pattern into sub-patterns, each being individually classified via an HGN, with the overall pattern classified via a majority vote. DHGN permitted performance equivalent to HGN with fewer computation neurons. As such, it was tested with distorted image identification, showing greater classification speed than a state-of-the-art algorithm of the time. Finally, within the domain of swarm robotics, Hettiarachchige et al. (2018) used a sequential HGN for anti-swarm motion tracking, in which HGN outperformed a recurrent neural network.

From this review, it can be concluded that HGN, and more specifically DHGN, is a light-weight, accurate and scalable pattern-matching approach. HGN shows potential for use in low-complexity agents, such as those used as swarm robots.

3 System design

In this section, the structure and novelties of the proposed R-HGN are presented, and the training process for the explored swarm application is discussed. The R-HGN of this study utilises the distributed structure of DHGN for the aforementioned computation reductions. However, the novel contributions of this algorithm are equally applicable to standard HGN.

3.1 Robotic hgn

The novel contributions of R-HGN target two identified issues with HGN (and DHGN). Firstly, the mixed inputs taken from environment observations, which may contain pseudo-continuous numbers, result in numerous unused pattern components; standard (D)HGN therefore unnecessarily consumes considerable memory. Secondly, standard (D)HGN outputs discrete pattern classifications; however, a localised environment observation may be seen across multiple environments and thus requires a fuzzy pattern match.

3.1.1 Mixed, pseudo-continuous inputs

To further describe the first issue, let us explore a simple environment pattern with three inputs, A, B and C, each with a different value range. Traditional HGN classify patterns consisting of uniform-ranged categorical components, and thus each pattern component is allocated a set of GN of size r. Additionally, the triangular structure of HGN means the total number of GN grows with both the pattern length n and the value range r.

For mixed-range inputs, r must be large enough to accommodate the input ranges of all inputs; for a truly continuous input the HGN would therefore be of theoretically infinite size. However, as these observations are made by robotic devices, only pseudo-continuous ranges are possible. If we assume the agent represents values as unsigned 32-bit integers, r becomes 2^32 and a vast number of GN are created during start-up, though only a fraction of these GN are utilised. That is, input C is allocated 2^32 GN, with only two being activatable. A diagram of this traditional HGN is shown in Figure 1(a), with black squares being non-activatable GN.

A naive solution to this scaling issue is to size each HGN row via its individual variable range. However, such a solution only partially reduces the GN count in this example. Instead, R-HGN overcomes this issue by dynamically generating GN as each variable value is observed in training. During this generation process, a new GN is added at the allocated row in each layer. Additionally, to prevent invalid neighbour communication after row additions, each GN row is isolated behind a 'gate'. These gates collect outbound messages from the active GN to pass to the neighbouring row's gate. The gates then send inbound messages only to these active GN, rather than broadcasting to all GN in the row. An example of this dynamic GN creation and gated row connection is shown in Figure 1(b). Returning to the sample environment pattern of A, B and C, we see a linear relation between the number of observed component values and the GN count. Even if we assume training consists of 2 million patterns, all with a unique value for B (an unrealistic extreme), R-HGN will hold only as many GN per row as there are distinct observed values.

(a) Standard HGN: multiple redundant GN and connections.
(b) R-HGN: only required GN created; one connection per row.
Figure 1: Green/light boxes are utilised GN, black boxes are non-activatable GN. White circles are the row gates connected to neighbouring gates.
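The dynamic, gated base layer can be sketched as below. The class names and dictionary-backed columns are illustrative assumptions, not the paper's implementation; the point is that memory scales with distinct observed values rather than with the full pseudo-continuous range:

```python
# Sketch of the R-HGN base layer: each row allocates a GN column only when a
# value is first observed in training, and a per-row gate routes messages
# between the *active* GN of neighbouring rows only. Names are hypothetical.

class GatedRow:
    def __init__(self):
        self.columns = {}  # observed value -> GN (column) index

    def activate(self, value):
        # Dynamically create a GN the first time this value is seen.
        if value not in self.columns:
            self.columns[value] = len(self.columns)
        return self.columns[value]

class GatedBaseLayer:
    """One gated row per pattern component; gates forward messages only
    between the active GN of neighbouring rows."""
    def __init__(self, n_components):
        self.rows = [GatedRow() for _ in range(n_components)]

    def activate(self, pattern):
        active = [row.activate(v) for row, v in zip(self.rows, pattern)]
        # Gate-mediated exchange: each active GN receives only its
        # neighbours' broadcasts, never a whole-row broadcast.
        return [(active[i - 1] if i > 0 else None,
                 active[i],
                 active[i + 1] if i < len(active) - 1 else None)
                for i in range(len(active))]

layer = GatedBaseLayer(3)
layer.activate((0, 7, 4_000_000_000))   # a huge value costs one GN, not 2^32
layer.activate((1, 7, 4_000_000_000))
sizes = [len(r.columns) for r in layer.rows]
assert sizes == [2, 1, 1]  # GN count tracks distinct observed values
```

Contrast this with the static layout of Figure 1(a), where the third row would pre-allocate one GN per possible 32-bit value.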

3.1.2 Probability environment matches

The second issue with (D)HGN is that an agent's localised observation patterns may correlate with many environments, while HGN traditionally outputs a single pattern identification. To overcome this issue, R-HGN associates each discrete pattern classification with a probability tuple, where each probability relates to a possible environment. To form these probability outcomes, the training process of HGN is extended to record the occurrence of each pattern identification in each environment. After all training data has been examined, these environment counts are normalised and stored in hash-tables using the unique pattern classification as the recall key. This process does not interfere with the underlying classification of HGN, and thus a one-to-one relation is assured between each pattern and its probability tuple.
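The count-then-normalise step can be sketched as follows; the environment labels and counting structure are illustrative assumptions:

```python
# Sketch of extending HGN training with per-environment probability tuples:
# occurrences of each discrete pattern identification are counted per
# environment, then normalised into a hash-table keyed by the pattern id.
# Labels and structure are hypothetical, not the authors' code.
from collections import defaultdict

ENVIRONMENTS = ["1.1", "1.2", "1.3"]            # illustrative labels
counts = defaultdict(lambda: defaultdict(int))  # pattern id -> env -> count

def record(pattern_id, environment):
    counts[pattern_id][environment] += 1

def build_probability_table():
    table = {}
    for pid, env_counts in counts.items():
        total = sum(env_counts.values())
        # One tuple entry per possible environment, normalised to sum to 1.
        table[pid] = tuple(env_counts[e] / total for e in ENVIRONMENTS)
    return table

# A pattern id seen three times in Env. 1.1 and once in Env. 1.2:
for env in ["1.1", "1.1", "1.1", "1.2"]:
    record(42, env)
table = build_probability_table()
assert table[42] == (0.75, 0.25, 0.0)
```

Because the table is keyed by the pattern identification that HGN already produces, the underlying one-shot classification is left untouched.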

In addition to these probabilities, R-HGN adjusts the upper sub-pattern concatenation of DHGN. In standard DHGN, after each sub-HGN classifies a sub-pattern, a majority vote determines the overall pattern. This approach introduces inaccuracy, as significant variation in a single sub-pattern is overlooked during voting. In contrast, R-HGN implements an upper HGN, taking the argmax of each lower HGN's probabilities as inputs. This allows variation in any one sub-pattern to be acknowledged during final pattern classification. Should this upper HGN fail to match an observed pattern, an averaging decision fusion of the lower HGN probability tuples is taken.
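The upper-layer decision with its averaging fallback can be sketched as below; the trained combination table is a hypothetical stand-in for the upper HGN:

```python
# Sketch of the R-HGN upper-layer decision: the argmax of each lower
# sub-HGN's probability tuple feeds an upper HGN; if the resulting argmax
# combination has never been seen, the tuples are averaged instead.
# Illustrative only, not the authors' implementation.

def classify(sub_tuples, upper_memory):
    """sub_tuples: one probability tuple per lower sub-HGN.
    upper_memory: maps known argmax combinations to an environment index."""
    argmaxes = tuple(t.index(max(t)) for t in sub_tuples)
    if argmaxes in upper_memory:          # upper HGN match
        return upper_memory[argmaxes]
    # Fallback: averaging decision fusion over the lower tuples.
    avg = [sum(vals) / len(sub_tuples) for vals in zip(*sub_tuples)]
    return avg.index(max(avg))

upper = {(0, 0, 1): 0}                    # trained combination -> Env. index
known = [(0.9, 0.1, 0.0), (0.6, 0.3, 0.1), (0.2, 0.7, 0.1)]
# argmaxes = (0, 0, 1) -> matched by the upper HGN
assert classify(known, upper) == 0
novel = [(0.1, 0.8, 0.1), (0.4, 0.5, 0.1), (0.3, 0.3, 0.4)]
# argmaxes = (1, 1, 2) unseen -> tuples averaged, environment 1 wins
assert classify(novel, upper) == 1
```

Note how the first case is decided by the upper match even though one sub-HGN disagrees, while the second falls back to averaging because the combination was never trained.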

3.2 Swarm Environment Matching

In this paper, we equip each agent of a swarm with an R-HGN for the purpose of classifying the environment state and switching the active behaviour to the one listed as most appropriate. Figure 2 depicts the proposed process of creating and training the R-HGN and how the agents utilise it for behaviour switching. This process is broken into three main sections: behaviour design, R-HGN training, and R-HGN swarm execution.

Figure 2: System process: 1) behaviour creation and feature extraction; 2) R-HGN and probability training; 3) swarm implementation with behaviour selection. Green denotes simulation environments. Yellow denotes HGN processes.

3.2.1 Behaviour Creation

In this first process, a number of behaviours are developed by a human designer and stored in the swarm agent repertoire. Each agent behaviour is developed such that, when implemented across the full swarm, an overall emergent behaviour is achieved. In this study, these behaviours are developed via a sequence of conditional executions combined with the dynamic neighbourhood targeting system discussed in Smith et al. (2018a). However, the R-HGN system is not restricted to such a behaviour mechanism, and any robotic controller may be used. The behaviours developed explicitly for this study are further defined in Section 4.3.

After development, each behaviour is implemented in a number of environments, and the agents' observations during operation are extracted for later environment matching. These observations are made by each agent in each time-step. The completed pattern database thus consists of observations from every behaviour in each example environment. Supplying the HGN with such a volume of observations improves the pattern-matching range, and recurring observations allow greater accuracy in environment probability prediction.

3.2.2 rhgn Training

When a significant behaviour repertoire has been created for the swarm, the R-HGN is trained with the aforementioned observations. During this training, the dynamic GN rows are populated, and the probability tuples are created and linked to the R-HGN outputs, as discussed in Section 3.1.2.

3.2.3 rhgn Implementation

During operation, each swarm agent is supplied with all behaviours, though only one is active at any given time-step. The active behaviour controls the agent, as discussed in the behaviour creation phase, with incoming environment data determining agent actions.

In addition to action control, the environment data is fed to the R-HGN each time-step as a linear pattern. The HGN output value is matched to the associated environment probability tuple. Each prediction tuple is stored in a collection, and after a set number of time-steps the predictions are averaged for behaviour (re)selection. This fusion prevents behaviour thrashing and improves environment-match accuracy by predicting over a time series rather than a single snapshot of the environment, which may be limited or misleading due to sensing range and the stochastic environment. Furthermore, swarm agents share environment predictions with one another while in communication range, similar to the swarm belief propagation of Trianni et al. (2016); Reina et al. (2015) and Smith et al. (2018a). These external prediction sets are added to the agents' collections and are incorporated equally into the fusion process.
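The temporal and intra-swarm fusion step can be sketched as a simple equal-weight average over the collected tuples; the buffer class and behaviour mapping are illustrative assumptions:

```python
# Sketch of the fusion step: an agent accumulates its own per-time-step
# probability tuples plus any tuples received from neighbours, averages them
# with equal weighting at the end of the selection interval, and activates
# the behaviour mapped to the most probable environment. Illustrative only.

class FusionBuffer:
    def __init__(self, n_envs):
        self.n_envs = n_envs
        self.collection = []

    def add(self, prob_tuple):            # local or neighbour prediction
        self.collection.append(prob_tuple)

    def select(self, env_to_behaviour):
        # Equal-weight average over the whole collection, then argmax.
        fused = [sum(t[i] for t in self.collection) / len(self.collection)
                 for i in range(self.n_envs)]
        self.collection.clear()           # reset for the next interval
        return env_to_behaviour[fused.index(max(fused))]

buf = FusionBuffer(3)
buf.add((0.6, 0.3, 0.1))                  # local observations lean Env. 0
buf.add((0.5, 0.4, 0.1))
buf.add((0.1, 0.8, 0.1))                  # a neighbour's shared prediction
mapping = {0: "MB-1", 1: "MB-2", 2: "MB-3"}
# The neighbour's confident prediction flips the fused decision to Env. 1.
assert buf.select(mapping) == "MB-2"
```

The example also illustrates why the shared predictions matter: a neighbour that has observed a distinguishing network characteristic can override an agent's limited local view.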

After all prediction fusion, a single environment is matched by each agent via the highest probability. The corresponding behaviour for the selected environment is implemented until the next environment match. It may be observed that, as each agent makes an independent environment match, the swarm becomes behaviour-heterogeneous. Such a feature, combined with the prediction sharing, allows the swarm to diversify temporally and spatially. That is, a portion of the swarm may select one behaviour to overcome a local challenge, and then select another behaviour when this local challenge changes; meanwhile, a separated portion of the swarm may utilise a third behaviour to overcome a different challenge.

4 Experiment Design

This section presents: the networking data-transfer task of the swarm; the R-HGN settings for this study; the three behaviours provided to the R-HGN, referred to as MB; the two experiments of this study, R-HGN in manually designed environments and in randomly generated environments; and finally the result representation and analytical tools used to validate R-HGN.

To validate the performance of R-HGN in the two experiments of this study, a random behaviour selector is additionally implemented for comparison. This algorithm, denoted Rand(MB), randomly selects an MB for the entire swarm during start-up and does not change this behaviour for the full operation duration. The comparison between R-HGN and Rand(MB) is conducted on both resulting swarm fitness and environment identification accuracy.

4.1 Networking Task

The swarm is tasked with facilitating a simplified data-transfer in which data-packets are transferred between network-nodes. This transfer consists of 1,000 data-packets and the accompanying acknowledgement-packets. The task is to be completed in a hostile environment, with obstacles restricting agent mobility and communication, and jamming devices heavily restricting communication within an area. Each agent is capable of storing ten packets in a buffer but is limited to transmitting or receiving one packet per time-step. Additionally, agents may move at up to a fixed maximum speed, with a time-step being 0.1 seconds. Agents are equipped with a simple LIDAR for obstacle detection, and simulate time-of-flight and signal triangulation to detect neighbouring devices (fellow swarm agents, network-nodes and jammers). The data-transfer process of these experiments uses the Gaussian-shadowed log-distance path loss (LDPL) model (Rappaport et al., 1996) with a signal-loss exponent of 2.5, a transmission strength of 12 dBm and a Gaussian standard deviation of 3. This model estimates the signal power loss over distance, determining whether a communication attempt is successful. For a more in-depth discussion of the swarm agent operations, the reader is referred to Smith et al. (2017) and Smith et al. (2018b).
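The link model can be sketched as below, using the exponent, transmit power and shadowing deviation stated above. The reference loss and receiver sensitivity are illustrative assumptions, not values from the paper:

```python
# Sketch of the Gaussian-shadowed log-distance path-loss model used to
# decide link success. GAMMA, TX_DBM and SIGMA come from the paper; PL_D0
# and SENSITIVITY are assumed values for illustration only.
import math
import random

GAMMA = 2.5          # path-loss exponent (from the paper)
TX_DBM = 12.0        # transmission strength in dBm (from the paper)
SIGMA = 3.0          # shadowing standard deviation in dB (from the paper)
PL_D0 = 40.0         # assumed reference loss at d0 = 1 m
SENSITIVITY = -85.0  # assumed receiver sensitivity in dBm

def received_power_dbm(distance_m, rng=random):
    # Log-distance loss plus a zero-mean Gaussian shadowing term.
    shadowing = rng.gauss(0.0, SIGMA)
    path_loss = PL_D0 + 10.0 * GAMMA * math.log10(max(distance_m, 1.0))
    return TX_DBM - path_loss - shadowing

def link_succeeds(distance_m, rng=random):
    return received_power_dbm(distance_m, rng) >= SENSITIVITY

rng = random.Random(0)
short = sum(link_succeeds(10.0, rng) for _ in range(1000))
long = sum(link_succeeds(500.0, rng) for _ in range(1000))
assert short > long  # success probability falls with distance
```

The shadowing term makes link success stochastic near the sensitivity threshold, which is part of what makes single-snapshot environment observations unreliable.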

The data-transfer task is terminated when a time-limit is reached or when all packets reach their destinations. As this networking task aims to achieve high data throughput, the swarm's fitness is measured from the number of packets that reach their destination within the time-limit and the time taken for all packets to be transferred. The termination criteria of this task result in either a negative fitness, when the time-limit is reached before all packets arrive, or a positive fitness, when the transfer completes within the limit. In this study, the time-limit is set to 50,000 time-steps, allowing ample environment traversal time.

4.2 rhgn Structure

For the above data-transfer application, the R-HGN of this study distributes a pattern of 48 components into three sub-R-HGN, as done in DHGN. These sub-R-HGN each focus on an aspect of the observations: network conditions, packet statuses and neighbourhood conditions. The input patterns of these three sub-R-HGN are supplied in the appendix. It can be noted that some of these inputs may be superfluous for this study; however, the HGN pattern-matching process tolerates such redundancy.

R-HGN pattern matching is conducted each time-step, and probabilities are fused for behaviour selection every 500 time-steps. For intra-swarm belief sharing, agents broadcast every 10 time-steps. During belief fusion, agents combine locally collected probabilities and neighbour probabilities with equal weighting. After behaviour selection, an agent's probability collection is cleared.

4.3 Designed Behaviours

The experiments of this study equip the R-HGN and Rand(MB) agents with a behaviour repertoire of three MB. Each of these behaviours has noted strengths and weaknesses and allows the swarm to operate in different conditions.

The first MB evenly spaces agents between the packet destination, the packet origin, and other swarm agents. When in communication range, agents send packets to neighbours closer to said packet's destination. Obstacles are avoided via a repulsive virtual force (Chang et al., 2003), provided this does not prevent even agent spacing; no jamming avoidance is implemented.

The second MB focuses on wall circumvention via a ferrying behaviour (Zhao et al., 2004). Agents travel towards other swarm members or packet destinations without the even spacing of MB-1. Upon obstacle detection, the movement profile is dominated by repulsive and orbital virtual forces (Rezaee and Abdollahi, 2014; Chang et al., 2003) relative to the obstacle, as previously seen in Smith et al. (2018b). This motion profile continues until a closer obstacle is detected, the target is in communication range, or the agents are unable to move without collision; in the latter case, the agents reverse their orbiting direction. This behaviour allows effective circumvention of signal-blocking obstacles but may overly manoeuvre around minor, avoidable obstacles.
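The repulsive-plus-orbital motion can be sketched as a pair of virtual-force components; the gains and fall-off terms are illustrative assumptions, not the paper's tuning:

```python
# Sketch of the repulsive and orbital virtual forces used for wall
# circumvention: the repulsive component pushes the agent away from the
# obstacle while the orbital component moves it tangentially around it.
# Gains (k_rep, k_orb) and fall-off terms are hypothetical.
import math

def virtual_force(agent_xy, obstacle_xy, k_rep=50.0, k_orb=10.0):
    dx = agent_xy[0] - obstacle_xy[0]
    dy = agent_xy[1] - obstacle_xy[1]
    d = math.hypot(dx, dy)
    ux, uy = dx / d, dy / d                  # unit vector away from obstacle
    # Repulsion falls off with distance squared; the orbital component is
    # the perpendicular of the repulsive direction.
    rep = (k_rep * ux / d**2, k_rep * uy / d**2)
    orb = (-k_orb * uy / d, k_orb * ux / d)  # rotated 90 degrees for orbiting
    return (rep[0] + orb[0], rep[1] + orb[1])

fx, fy = virtual_force((1.0, 0.0), (0.0, 0.0))
assert fx > 0           # pushed away from the obstacle
assert fy > 0           # and tangentially around it
```

Reversing the orbiting direction, as described for the blocked-agent case, amounts to negating `k_orb` so that the tangential component points the other way.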

The third MB focuses on jammer avoidance. As with the first MB, agents attempt an even spatial distribution. However, if jamming noise is detected, agents move away from the noise source and avoid the area until 500 time-steps pass with no noise detection. This behaviour effectively overcomes jamming devices; however, false-positive jammer identification (which may occur when detecting other communication) causes unnecessary agent re-positioning and area avoidance.

These three MB are validated in six manually designed environments: Env. 1.1, 1.2, 1.3, 2.1, 2.2 and 2.3. This validation has each MB operate in each environment and reports the swarm fitness. The results are used to create the environment-behaviour mapping shown in Table 1, and the pattern-environment labels for R-HGN training.

Environment  1.1  1.2  1.3  2.1  2.2  2.3
Behaviour    1    2    1    1    1    3
Table 1: Environment-behaviour associations for R-HGN.

4.4 Experiment one: Designed Environments

For the manually designed environments, swarms of eight agents are implemented in eight environments with each of the three MB, with the trained R-HGN, and with Rand(MB). These environments, designed to challenge the MB, consist of the six listed in Table 1 and two additional environments, Env. 3.1 and 3.2, for which the R-HGN has not been explicitly trained.

The first three environments (1.1, 1.2 and 1.3), shown in Figure 3, focus on obstacle configuration. In these operations, the swarm is tasked with mono-directional data-transfer between two network-nodes, labelled source and sink. Env. 1.1 has the source and sink 60 m apart, though separated by several thin walls, each with a small attenuation factor. The swarm agents may communicate through these walls, so the best performance is seen by MB-1, which positions the swarm directly between the source and sink. In contrast, Env. 1.2 has thicker walls whose signal attenuation prevents data-transfer through them. As such, only MB-2 can complete this operation, by forming an arc around the walls. Finally, Env. 1.3 has the source and the swarm start in a small walled area. To connect to the sink, the agents must pass through a small opening. This opening is sized such that the wall-avoiding actions of MB-2 will be triggered and the swarm cannot pass. Therefore, the best performance is seen with any MB other than MB-2.

In relation to R-HGN challenges, these three environments present similar local obstacle information to each agent. The R-HGN must therefore recognise the neighbourhood and network conditions to identify the environment. Furthermore, as not all agents may be positioned such that the distinguishing network characteristics are observable, the intra-swarm belief propagation must be utilised for all agents to identify the environment correctly.

(a) Env. 1.1
(b) Env. 1.2
(c) Env. 1.3
Figure 3: Three obstacle configurations for the swarm to navigate. Lines are walls, with attenuation represented by wall thickness; grey circles are the source and sink, as labelled.

Env. 2.1, 2.2 and 2.3 each have three network-nodes, A, B and C, in an equilateral triangle with 100 m edges. Additionally, a jamming device is placed in the centre of this triangle. Topologically, these environments are identical; however, the distribution of packet origins and destinations, along with whether the jammer is active, is unique to each environment. These configurations are listed in Table 2. Env. 2.1 and 2.2 are similar settings, both seeing the best performance from the non-jamming behaviours (MB-1,2). Distinguishing these two environments comes down to packet distribution alone and is thus seen as a challenge for the R-HGN pattern matching. Env. 2.3 has the jammer active and thus requires the anti-jamming abilities of MB-3.

Env.   A Out (%)  A In (%)  B Out (%)  B In (%)  C Out (%)  C In (%)  Jammer
2.1    100        0         0          50        0          50        Inactive
2.2    33         33        33         33        33         33        Inactive
2.3    33         33        33         33        33         33        Active
Table 2: Network configuration of devices A, B and C for environments 2.1, 2.2 and 2.3.
Figure 4: Env. 3.1: a combination of Env. 1.2 and Env. 1.3.

Finally, for the additional environments for which the R-HGN is not trained, components of the prior environments are combined to make spatial hybridisations. Env. 3.1 combines the network-restricting walls of Env. 1.2 and the narrow passage of Env. 1.3, as shown in Figure 4. This environment requires agents to use MB-1/MB-3 to pass through the passage and MB-2 to circumvent the walls. Similarly, Env. 3.2 combines Env. 2.2 and 2.3. The packet distribution for 3.2 has all network-nodes send to one another; however, the jammer is relocated from the centroid of A, B and C to the midpoint of A and B. Thus MB-1 or MB-2 are optimal for connections A-C and C-B, and MB-3 is required for connecting A-B.

To validate the abilities of R-HGN in these environments, a comparative study is conducted between the MB, R-HGN and Rand(MB) by implementing each swarm in the eight designed environments 50 times. Each implementation instance has a unique network randomisation seed and unique swarm member starting locations. Comparisons in the former six environments validate the environment-matching ability of the R-HGN. Comparisons in the latter two environments explore R-HGN allowing spatially induced behaviour heterogeneity in the swarm.

4.5 Experiment two: Generated Environments

The second experiment of this study examines the ability of R-HGN to utilise the behaviour repertoire in previously unseen conditions. This emulates the swarm having been trained with controlled examples and then being deployed for real-world operation.

This experiment implements the mb, rhgn and Rand(MB) swarms in a further 50 environments which are randomly created. These environments are more challenging than the designed cases as the number of network-nodes is selected randomly from the set {2,3,4}, these nodes are mobile, and the data-transfer requirements between the nodes change during the operation. These requirement changes are discretised into data-transfer sub-operations, each referred to as a network stage. As an example of these stages, consider a network with two network-nodes, A and B, which undergoes two network stages. In the first stage, packets are sent from A to B; the swarm must make a mono-directional connection. After these packets are transferred, the network requirement changes to stage two, which has A and B sending packets to each other; the swarm must reconfigure to facilitate this bi-directional connection. These network stages, and the network-node movement, require the swarm to utilise multiple behaviours to facilitate all network conditions.
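The two-stage A/B example above can be sketched as a minimal data structure. This is purely illustrative: `NetworkStage` and `required_links` are our own names, not the paper's implementation.

```python
from dataclasses import dataclass

@dataclass
class NetworkStage:
    """One data-transfer sub-operation. `flows` maps (origin, destination)
    pairs to the fraction of the stage's packets sent along that pair.
    (Hypothetical structure; the paper does not specify its representation.)"""
    flows: dict

# Stage 1: mono-directional connection, all packets sent A -> B.
stage1 = NetworkStage(flows={("A", "B"): 1.0})

# Stage 2: bi-directional connection, A and B send to each other.
stage2 = NetworkStage(flows={("A", "B"): 0.5, ("B", "A"): 0.5})

def required_links(stage):
    """Directed links the swarm must provide during this stage."""
    return {pair for pair, frac in stage.flows.items() if frac > 0}

# The swarm must reconfigure between stages to cover the new link set.
assert required_links(stage1) == {("A", "B")}
assert required_links(stage2) == {("A", "B"), ("B", "A")}
```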

For the mobile network-nodes, waypoints are defined for each networking stage during environment generation. Waypoints are traversed by the mobile nodes at a speed of , with the path between waypoints being a straight line. Upon reaching the final waypoint of the stage, the node becomes stationary.
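A minimal sketch of this straight-line waypoint traversal, assuming a per-time-step position update and a positive speed (the speed value itself is missing from the extracted text; `step_towards` and `traverse` are hypothetical names):

```python
import math

def step_towards(pos, waypoint, speed, dt=1.0):
    """Advance `pos` in a straight line towards `waypoint` at `speed`.
    Snaps exactly onto the waypoint rather than overshooting."""
    dx, dy = waypoint[0] - pos[0], waypoint[1] - pos[1]
    dist = math.hypot(dx, dy)
    step = speed * dt
    if dist <= step:           # waypoint reached this time-step
        return waypoint
    return (pos[0] + dx / dist * step, pos[1] + dy / dist * step)

def traverse(pos, waypoints, speed):
    """Move through each stage waypoint in turn; the node is stationary
    once the final waypoint is reached. Assumes speed > 0."""
    path = [pos]
    for wp in waypoints:
        # Exact equality is safe here because step_towards snaps onto wp.
        while path[-1] != wp:
            path.append(step_towards(path[-1], wp, speed))
    return path

path = traverse((0.0, 0.0), [(3.0, 0.0), (3.0, 4.0)], speed=1.0)
assert path[-1] == (3.0, 4.0)  # node ends stationary at the final waypoint
```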

For the network requirements of a stage, each network-node is assigned a percentage of the packets to be sent in the stage, and a percentage of these packets which should end at the node. During the creation of these values, it is enforced that all nodes are either a packet origin or destination in at least one stage. That is, no network-node is idle for the full operation.
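One way this requirement generation could look, as a hedged sketch: the paper does not give its sampling procedure, so `generate_requirements`, the uniform sampling, and the rejection loop are all our assumptions; only the no-idle-node constraint comes from the text.

```python
import random

def generate_requirements(nodes, n_stages, rng=random):
    """Assign each network-node a percentage of packets sent (out%) and
    received (in%) per stage, re-sampling until every node is an origin
    or destination in at least one stage (no node idle for the full
    operation). Illustrative sketch only."""
    while True:
        stages = []
        for _ in range(n_stages):
            out = [rng.random() for _ in nodes]
            dest = [rng.random() for _ in nodes]
            stages.append({
                n: {"out%": 100 * o / sum(out), "in%": 100 * d / sum(dest)}
                for n, o, d in zip(nodes, out, dest)
            })
        # Accept only if every node sends or receives in some stage.
        if all(any(s[n]["out%"] > 0 or s[n]["in%"] > 0 for s in stages)
               for n in nodes):
            return stages

rng = random.Random(0)
stages = generate_requirements(["A", "B", "C"], n_stages=2, rng=rng)
assert len(stages) == 2
```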

In this study, the number of network stages is limited to a random value from the set {1,2,3}. Each generated environment has the swarm transfer 1,000 packets, as in the prior experiment and these packets are evenly divided between the network stages.

In addition to the network-nodes, each environment generation also creates numerous obstacles and a jammer. Each obstacle is semi-randomly positioned in the environment, with positioning limited so the network-nodes' motions are not blocked. Similarly, the jammer is placed in the centre of the network-nodes, with the restriction that all network-nodes are outside jamming range. The jammer is therefore located in an area which the swarm agents are expected to enter, and thus will impact the swarm but not the network-nodes. Finally, the environment generator assigns an active or inactive state to the jammer for each network stage.
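The centroid placement with the out-of-range check might be sketched as follows. `place_jammer` and the rejection behaviour (returning `None` when a node would fall inside jamming range) are our assumptions, not the paper's generator:

```python
import math

def place_jammer(node_positions, jam_radius):
    """Place the jammer at the centroid of the network-nodes, but only
    if every node lies outside jamming range; otherwise report failure
    so the environment can be regenerated. (Hypothetical sketch.)"""
    cx = sum(x for x, _ in node_positions) / len(node_positions)
    cy = sum(y for _, y in node_positions) / len(node_positions)
    if all(math.hypot(x - cx, y - cy) > jam_radius
           for x, y in node_positions):
        return (cx, cy)
    return None  # a node is within jamming range: placement rejected

nodes = [(0.0, 0.0), (100.0, 0.0), (50.0, 86.6)]  # ~equilateral triangle
jammer = place_jammer(nodes, jam_radius=30.0)
assert jammer is not None  # each node is ~57.7 m from the centroid
```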

4.6 Result representation

For both environment experiments, the rhgn pattern matching is validated by comparing the fitness of the three mb swarms, the rhgn swarm and the Rand(mb) swarm. This comparison presents the median and quartiles of each swarm's fitness, and performs a Mann-Whitney U-test between the rhgn fitness results and the results of all other swarms. Additionally, this analysis examines the percentage of implementations in which rhgn and Rand(mb) achieve a fitness of at least 95% of each mb, and 95% of the optimal mb of that instance. This leniency of 5% is given for rhgn as environment matching cannot be expected in the initial time-steps, due to agents not having adequate environment interaction and thus limited pattern-based belief. This initial learning time leads to a small reduction in swarm fitness.
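These comparison statistics can be illustrated with a short sketch. The fitness values below are invented for illustration, the Mann-Whitney U statistic is computed directly (the paper presumably uses a statistics package), and the 95% threshold follows the leniency described above:

```python
import statistics

def mann_whitney_u(sample_a, sample_b):
    """Mann-Whitney U statistic of sample_a versus sample_b
    (each tie contributes 0.5, the usual convention)."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in sample_a for y in sample_b)

def match_rate(selector_fit, baseline_fit, leniency=0.95):
    """Fraction of paired runs where the selector reaches at least 95%
    of the baseline fitness (5% leniency for initial learning time)."""
    hits = sum(s >= leniency * b for s, b in zip(selector_fit, baseline_fit))
    return hits / len(selector_fit)

# Hypothetical fitnesses over five paired runs (illustrative only).
rhgn_fit = [9.8, 10.1, 8.9, 10.0, 9.9]
mb1_fit = [10.0, 10.2, 9.6, 10.1, 9.7]

q1, med, q3 = statistics.quantiles(rhgn_fit, n=4)  # quartiles for the box plot
u = mann_whitney_u(rhgn_fit, mb1_fit)  # near n*m/2 = 12.5 => heavy overlap
assert match_rate(rhgn_fit, mb1_fit) == 0.8  # 4 of 5 runs within 95%
```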

In addition to this fitness comparison, the accuracy of rhgn environment prediction is explored for the first six designed environments. The accuracy is only tested in these cases as correct values are known and are not dependent on agent location. This analysis presents the error-rate over simulation time for each environment, averaged over the eight swarm agents and 50 environment instances. Additionally, the one-versus-all accuracies and F1 scores for each environment prediction are presented. These latter statistics are found across the eight swarm agents in all 300 swarm implementations (50 implementations of 6 environments). The accuracy, $A_e$, and $F_1$ score of each environment, $e$, are respectively measured via

$$A_e = \frac{TP_e + TN_e}{TP_e + TN_e + FP_e + FN_e}, \qquad F_{1,e} = \frac{2\,TP_e}{2\,TP_e + FP_e + FN_e},$$

where $TP_e$ is the true positive selection of environment $e$, $TN_e$ is the true negative selection, and $FP_e$ and $FN_e$ are the respective false counterparts.
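The one-versus-all accuracy and F1 definitions translate directly to code; the counts below are hypothetical, chosen only to exercise the formulas:

```python
def one_vs_all_scores(tp, tn, fp, fn):
    """Accuracy and F1 for one environment label, treating all other
    labels as negatives (standard one-versus-all definitions)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return accuracy, f1

# Hypothetical confusion counts for a single environment label.
acc, f1 = one_vs_all_scores(tp=90, tn=880, fp=20, fn=10)
assert abs(acc - 0.97) < 1e-9          # (90 + 880) / 1000
assert abs(f1 - 180 / 210) < 1e-12     # 2*90 / (2*90 + 20 + 10)
```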

To analyse the spatial heterogeneity in Env. 3.1, 3.2 and the random environments, the environment maps are presented with markings denoting the agents' locations when a behaviour was selected and which behaviour was chosen. Each figure is the temporal concatenation of all time-steps over the operation or stage. For Env. 3.1 and 3.2, these figures are also concatenations of all 50 implementations.

5 Results and Discussion

5.1 Designed Environments

Figure 5: Box plot of swarm fitnesses using the mb, rhgn and Rand(MB) swarms in the designed environments. rhgn is seen to achieve near-equivalent performance to the optimal mb in each environment.

To begin the exploration of rhgn, Figure 5 shows the performance of the three mb, the trained rhgn and Rand(MB) in the eight designed environments. In the six environments which rhgn was explicitly trained for (Env. 1.1-2.3), minimal performance difference between rhgn and the optimal mb is seen. Additionally, in three of these environments, rhgn shows significant improvement over Rand(MB). In the two unseen environments (Env. 3.1 and 3.2), rhgn outperforms all mb and Rand(MB). These performance-matching achievements by rhgn are further supported in Table 4, which shows that in all environments rhgn closely matches or outperforms the optimal mb in at least 52% of implementations, and in some environments this match rate is 100%.

Env. mb-1 mb-2 mb-3 Rand(MB)
1.1 0.12 0.84 0.24
1.2 0.88
1.3 0.92 0.22
2.1 0.51
2.2 0.02 0.12 0.8
2.3 0.65
3.2 0.04 0.13
Table 3: Mann-Whitney U-Test between each mb and rhgn, and between Rand(MB) and rhgn, in the designed environments. In all but 2.1, rhgn and the optimal mb have significant overlap. This shows rhgn is not only selecting, but correctly utilising, the behaviours.
Env. rhgn: mb-1 rhgn: mb-2 rhgn: mb-3 rhgn: Max(mb) Rand(mb): Max(mb)
1.1 98% 100% 98% 98% 88%
1.2 96% 52% 96% 52% 28%
1.3 86% 100% 82% 80% 46%
2.1 98% 68% 100% 66% 62%
2.2 100% 100% 100% 100% 98%
2.3 100% 60% 60% 60% 28%
3.1 80% 100% 92% 80% 40%
3.2 100% 100% 96% 100% 86%
Table 4: Percentage of designed environment instances in which rhgn and Rand(MB) achieve at least 95% of the maximum mb fitness.

For the behaviour matching in the designed environments which rhgn is directly trained for, the most prominent success is seen in Env. 1.2, 1.3 and 2.3. In these cases, some mb are unable to complete the task and thus have median fitnesses below 0. In contrast, rhgn utilises the optimal behaviour and achieves median fitnesses above 0 in all three cases. Furthermore, due to the poor-performing mb, Rand(MB) sees considerably low median and quartile values and, in Env. 1.3, a considerable interquartile range. This shows rhgn can effectively select the correct behaviour and overcome some mb being invalid, while Rand(MB) performance is significantly reduced by a poor mb in the repertoire. Additionally, the superiority of rhgn over the failing mb and Rand(MB) is seen to be statistically significant in Table 3, with low p-values shown for these results. Also, Table 3 shows high p-values for rhgn and the optimal mb in these cases. This supports rhgn correctly matching the environment to an mb and achieving equivalent results.

In relation to rhgn matching or outperforming the optimal mb in Table 4, these results can be separated into three tiers of performance: high (Env. 1.1, 2.2), medium (Env. 1.3), and acceptable (Env. 1.2, 2.1, 2.3). This divide shows a relation between the performance of rhgn in an environment and the percentage of repertoire mb which are functional in that environment. That is, high performance is seen when all three mb are effective in the environment, medium performance is seen with being effective, and acceptable performance when rhgn is limited to one of the three behaviours to solve the task. This relationship is due to misinformed agents (agents observing patterns which are associated with other environments) utilising incorrect behaviours for the given environment, and the impact such behaviour usage has on the swarm-wide operation. The high-performance cases can still utilise these incorrect behaviours, with the agents operating within expectation. The low-performance cases see these behaviours negatively impact the swarm, with agents moving out of position for neighbour interaction or attempting actions known to fail in the environment. That being said, all environments see rhgn able to match or outperform the mb in at least half the implementations, and the rhgn percentages are higher than the Rand(MB) percentages in all environments. In the worst performance match for both behaviour selectors, Env. 1.2, rhgn achieves 52%, which is 24% higher than Rand(MB). Additionally, the overall match rate to the optimal mb in these six Env. is 79.5% for rhgn and only 59.5% for Rand(MB). This shows that rhgn has achieved relatively accurate performance.

For Env. 3.1, no mb (and thus no Rand(MB)) is seen to solve the data-transfer task within the time-limit in any of the 50 instances. In contrast, rhgn achieves 36% operation success. Additionally, 80% of operations see rhgn achieve at least 95% of the optimal mb performance in Table 4, and over these 46 operations rhgn has an average fitness improvement of 860% over the optimal mb. These results are significantly higher than Rand(MB), which has a match rate of only 40%. This higher performance, due to rhgn correctly matching the components of the environment, is displayed in Figure 6 (a): the passageway is identified as Env. 1.3, activating mb-1 (red), and the wall sections are identified as Env. 1.2, causing mb-2 (blue) to be used.

For Env. 3.2, both the median and upper quartile of rhgn surpass all mb in Figure 5, though this improvement is less pronounced than in Env. 3.1 and Table 3 shows the difference to be not statistically significant. On the other hand, Table 4 shows 100% of operations have rhgn reach at least 95% of the optimal mb fitness. In Figure 6 (b), this higher performance is seen to be due to agents primarily utilising mb-3 when close to the jammer and primarily utilising mb-1 when the jammer is sufficiently distant, as was predicted to be the case. Thus rhgn is correctly achieving a spatially heterogeneous behaviour selection. In relation to Rand(MB), the low performance of mb-1 and mb-2 again reduces the median performance, resulting in several failing instances. In contrast, rhgn has no instances in which the swarm cannot transfer all data within the time-limit.

Figure 6: Environment-Behaviour matching mapped over operation map for Env. 3.1 (a) and Env. 3.2 (b). Red selecting mb-1; blue is mb-2; green is mb-3; white dots are network-node and jammers (as labelled), white squares are unreachable walled area. For Env. 3.1, strong use of mb-1 around passage, mb-2 around other walls. For Env. 3.2, strong use of mb-3 around jammer (between A and B), noticeable use of mb-1 on A-C edge. Each figure is the combined mapping of all operation runs.

In addition to exploring the swarm fitness, Figure 7 shows the underlying hgn environment matching mean error-rate over swarm operations for Env. 1.1-2.3. From this graph, it can be seen that most environments have some significant error-rates during early operation. However, as the agents distribute about the environment, and thus collect more informative pattern observations, these error-rates quickly decline. This is especially true with Env. 1.1-1.3, with error-rates dropping to within time-steps. For Env. 2.1-2.3, some errors continue throughout the operation. However, these error-rates are acceptably small, given how similar the environments appear from the agents’ local observations; 2.2 is only distinguishable from 2.1 within the agents’ local observations by a changing packet source; 2.3 is mistaken for 2.2 when the agents are not impacted by the jamming device.

In addition to the error-rate over time, Table 5 shows the environment matching accuracy of the swarm is between 92.4% and 99.1%, and the F1 scores reach a top of 97.15% and a bottom of 57.81%. These values show that rhgn has a considerably high environment match rate over all 300 implementations. Furthermore, these values are considerably higher than the random environment matching algorithm, which is predicted to achieve a {TP, FP, TN, FN} tuple of {}, giving an accuracy of and F1 scores of for all environments.

Figure 7: rhgn error-rate (%) of each designed Env. over time-steps. Error-rate/time-step vectors recorded for all 8 agents in 50 implementations. Presented values are the mean of these 400 vectors.
Env. 1.1 1.2 1.3 2.1 2.2 2.3
F1 (%) 93.3 97.15 95.7 65.88 57.81 88.12
Accuracy (%) 98.5 98.5 99.1 92.4 92.4 92.5
Table 5: Accuracy and F1 scores for the designed environments which rhgn is explicitly trained for.

To conclude this exploration of rhgn in the designed environments, it is shown that rhgn can aptly identify the environment, or partial environment, and utilise the associated behaviour for optimal swarm operation. This correct identification is confirmed by high accuracy and F1 scores. rhgn allowed the swarm to hold a median fitness above 0 in all environments, and a median fitness higher than any mb in the composite environments. This shows that although the mb are capable solutions to the data-transfer task in their intended environments, they have limited flexibility and cannot be used in every implementation. rhgn overcomes this flexibility limitation, allowing the swarm to achieve high fitness in a wider range of implementations, including more complex environments.

5.2 Generated Environments

Figure 8: Fitness of three mb, rhgn and Rand(MB) in 50 generated environments. Environments ordered by maximum mb fitness for clarity. rhgn holds close fitness to optimal mb in all environments.

To further explore the ability of rhgn, Figure 8 shows the fitnesses of the three mb, rhgn and Rand(MB) for 50 randomly generated environments, ordered by the optimal mb fitness. Table 6 presents the associated Mann-Whitney U-Test results. As can be seen, rhgn continues to achieve swarm fitnesses similar to the optimal mb in the majority of environments. Additionally, in several environments, the spatial and temporal behaviour diversification of rhgn allows fitnesses greater than any single mb. However, this fitness improvement is limited, and a notable correlation exists between rhgn performance and the potential of the behaviour repertoire. That is, in all environments, rhgn is only able to produce swarm fitnesses slightly higher than the best mb. In relation to Rand(MB), this correlation exists with both the optimal and sub-optimal behaviours; several environments see Rand(MB) select an mb with lower fitness than the other mb, leading to considerably lower performance than rhgn. In relation to statistical similarity, Table 6 shows rhgn again has low overlap with the failing mb-2 and Rand(MB), but very high overlap with mb-1 and mb-3, which are optimal in most environments.

mb-1 mb-2 mb-3 Rand(MB)
0.91 0.01 0.66 0.05
Table 6: Mann-Whitney U-Test between mb and rhgn, and between Rand(MB) and rhgn, in generated environments. rhgn has high statistical overlap with mb-1 and mb-3 and low overlap with mb-2 and Rand(MB).

This appropriate behaviour usage by rhgn is further demonstrated in Table 7, with the percentage of environments in which rhgn achieves at least 95% of the mb fitness again shown. These results show that rhgn achieves close to optimal behaviour in 78% of environments, a notably high value given the underlying hgn has no prior experience with the 50 generated environments. Additionally, as the optimal behaviour changes for each environment, rhgn achieves significantly higher performance than each mb used in isolation. This is seen from the fitness match rate reaching up to 92%. Finally, rhgn achieves far higher match rates than Rand(MB), which reaches only 56%.

Finally, to demonstrate the rhgn performance in these challenging generated environments, Figure 9 again maps the behaviours over the environment landscape for a noteworthy environment. This environment has two network stages, so each is depicted separately. In stage 1 (left), the jammer is active and the swarm is seen to identify this pattern component and primarily utilise mb-3. Additionally, agents trapped by the obstacles in the lower-right area switch to mb-2, circumventing the obstacle. In stage 2 (right), the jammer is disabled and the swarm is positioned away from obstacles. As such, mb-1 sees primary usage. This diverse and dynamic environment shows rhgn allows the swarm to heterogeneously utilise the behaviours as required across both time and space.

From this exploration, it is concluded that rhgn can effectively guide the swarm agents’ behaviour usage, even when the environments are unknown and distorted pattern matching is required. Additionally, rhgn continues to function when the environments are more complex than training conditions.

Figure 9: Environment-behaviour matching map over a generated environment with two communication stages. Colouring as in Figure 6; white arrows depict network-node motion with start and end locations. Environment matches are shown to allow both spatial and temporal behaviour heterogeneity to solve the dynamic task.

rhgn: mb-1 rhgn: mb-2 rhgn: mb-3 rhgn: Max(mb) Rand(mb): Max(mb)
92% 90% 84% 78% 56%

Table 7: Percentage of generated environments in which rhgn and Rand(MB) achieve at least 95% of the mb fitness. The strong match against all mb and Max(mb) shows the matching performance of rhgn.

Experiment data-sets have been made available at Monash FigShare, doi:

6 Conclusion

This paper presents an extension of hgn for online behaviour selection in a robotic swarm tasked with data-transfer between networking devices. This hgn extension, named rhgn, allows pattern matching of real-value inputs without costly memory or computation consumption and outputs match probabilities for more effective temporal and intra-swarm prediction fusion.

Using the proposed rhgn with three manually designed behaviours, the swarm was implemented in a number of human-designed and randomly generated environments. These rhgn swarm performances were compared against the individual behaviours and a random behaviour selector.

In relation to the designed environments, it was found that rhgn closely matched the performance of the optimal behaviour when directly trained in the environment. Additionally, rhgn achieved an environment-matching accuracy of up to 99.1% and a top F1 score of 97.15%, far higher than the random behaviour selector. For environments which were a concatenation of the training conditions, rhgn outperformed all individual behaviours and the random behaviour selector.

In relation to the generated environments, which introduced more challenging versions of the swarm operation, the rhgn continued to match or outperform the supplied behaviours and the random behaviour selector. However, the rhgn driven swarm performance was limited by the potential of the supplied behaviours.

This limitation leads to our future work, which will combine our prior studies in swarm behaviour creation (Smith et al., 2018b) with this study of behaviour selection. Such a combination aims to produce a swarm behaviour controller in which the swarm selects from the behaviour repertoire during operation and then autonomously creates a new, more appropriate behaviour for the environment post-operation. This will allow continued repertoire extension as the swarm is deployed in more environments, enabling life-long behaviour learning.


Funding for this research was provided by Cyber and Electronic Warfare Division, Defence Science and Technology Group, Commonwealth of Australia.

Conflict of Interest


Credit Authorship Contribution Statement

Phillip Smith: Software, Conducted Simulations, Interpreted Results, Writing - Original draft preparation. Aldeida Aleti: Project Supervisor, Writer - feedback. Cheng-Siong Lee: Project Supervisor, Writer - feedback. Robert Hunjet: Project Supervisor, Writer - feedback, Conceptualization of this study. Asad Khan: Project Supervisor, Writer - feedback, Conceptualization of this study.


  • Baum and Haussler (1989) Baum, E.B., Haussler, D., 1989. What size net gives valid generalization?, in: Advances in neural information processing systems, pp. 81–90.
  • Burke et al. (2013) Burke, E.K., Gendreau, M., Hyde, M., Kendall, G., Ochoa, G., Ozcan, E., Qu, R., 2013. Hyper-heuristics: a survey of the state of the art. Journal of the Operational Research Society 64, 1695–1724. doi:10.1057/jors.2013.71.
  • Burke et al. (2006) Burke, E.K., Petrovic, S., Qu, R., 2006. Case-based heuristic selection for timetabling problems. Journal of Scheduling 9, 115–132.
  • Chang et al. (2003) Chang, D.E., Shadden, S.C., Marsden, J.E., Olfati-Saber, R., 2003. Collision avoidance for multiple agent systems, in: 2003 IEEE Conference on Decision and Control, IEEE. pp. 539–543.
  • Glover and Kochenberger (2006) Glover, F.W., Kochenberger, G.A., 2006. Handbook of metaheuristics. volume 57. Springer Science & Business Media.
  • Hagenauer and Helbich (2017) Hagenauer, J., Helbich, M., 2017. A comparative study of machine learning classifiers for modeling travel mode choice. Expert Systems with Applications 78, 273–282.
  • Hettiarachchige et al. (2018) Hettiarachchige, Y., Khan, A., Barca, J.C., 2018. Multi-object tracking of swarms with active target avoidance, in: 2018 15th International Conference on Control, Automation, Robotics and Vision (ICARCV), IEEE. pp. 1204–1209.
  • Khan and Ramachandran (2002) Khan, A., Ramachandran, V., 2002. A peer-to-peer associative memory network for intelligent information systems, in: ACIS 2002 Proceedings, pp. 6–17.
  • Kim (2008) Kim, Y.S., 2008. Comparison of the decision tree, artificial neural network, and linear regression methods based on the number and types of independent variables and sample size. Expert Systems with Applications 34, 1227–1234.
  • Leng et al. (2017) Leng, Y., Yu, C., Zhang, W., Zhang, Y., He, X., Zhou, W., 2017. Task-oriented hierarchical control architecture for swarm robotic system. Natural Computing 16, 579–596.
  • Mahmood et al. (2008) Mahmood, R., Muhamad Amin, A.H., Khan, A., 2008. A lightweight, fast and efficient distributed hierarchical graph neuron-based pattern classifier. International Journal of Intelligent Engineering and Systems 1, 9–17. doi:10.22266/ijies2008.1231.02.
  • Misir et al. (2009) Misir, M., Wauters, T., Verbeeck, K., Berghe, G.V., 2009. A new learning hyper-heuristic for the traveling tournament problem, in: Proceedings of the 8th Metaheuristic International Conference (MIC09). Hamburg: Germany, Citeseer.
  • Nagavalli et al. (2017) Nagavalli, S., Chakraborty, N., Sycara, K., 2017. Automated sequencing of swarm behaviors for supervisory control of robotic swarms, in: 2017 IEEE International Conference on Robotics and Automation (ICRA), IEEE. pp. 2674–2681.
  • Nasution and Khan (2008) Nasution, B.B., Khan, A.I., 2008. A hierarchical graph neuron scheme for real-time pattern recognition. IEEE Transactions on Neural Networks 19, 212–229.
  • Pappa et al. (2013) Pappa, G.L., Ochoa, G., Hyde, M.R., Freitas, A.A., Woodward, J., Swan, J., 2013. Contrasting meta-learning and hyper-heuristic research: the role of evolutionary algorithms. Genetic Programming and Evolvable Machines 15, 3–35. doi:10.1007/s10710-013-9186-9.
  • Rappaport et al. (1996) Rappaport, T.S., et al., 1996. Wireless communications: principles and practice. volume 2. Prentice Hall PTR, New Jersey.
  • Reina et al. (2015) Reina, A., Valentini, G., Fernández-Oto, C., Dorigo, M., Trianni, V., 2015. A design pattern for decentralised decision making. PLOS ONE 10, 1–18. doi:10.1371/journal.pone.0140950.
  • Rezaee and Abdollahi (2014) Rezaee, H., Abdollahi, F., 2014. A decentralized cooperative control scheme with obstacle avoidance for a team of mobile robots. IEEE Transactions on Industrial Electronics 61, 347–354.
  • Smallwood and Sondik (1973) Smallwood, R.D., Sondik, E.J., 1973. The optimal control of partially observable markov processes over a finite horizon. Operations research 21, 1071–1088.
  • Smith et al. (2017) Smith, P., Hunjet, R., Aleti, A., Barca, J.C., 2017. Adaptive data transfer methods via policy evolution for uav swarms, in: 2017 27th International Telecommunication Networks and Applications Conference (ITNAC), IEEE. pp. 1–8.
  • Smith et al. (2018a) Smith, P., Hunjet, R., Aleti, A., Barca, J.C., et al., 2018a. Data transfer via uav swarm behaviours: Rule generation, evolution and learning. Australian Journal of Telecommunications and the Digital Economy 6, 35–58.
  • Smith et al. (2018b) Smith, P., Hunjet, R., Khan, A., 2018b. Swarm learning in restricted environments: an examination of semi-stochastic action selection, in: 2018 15th International Conference on Control, Automation, Robotics and Vision (ICARCV), IEEE. pp. 848–855.
  • Smith-Miles (2008) Smith-Miles, K.A., 2008. Towards insightful algorithm selection for optimisation using meta-learning concepts, in: Neural Networks, 2008. IJCNN 2008.(IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on, IEEE. pp. 4118–4124.
  • Soria-Alcaraz et al. (2014) Soria-Alcaraz, J.A., Ochoa, G., Swan, J., Carpio, M., Puga, H., Burke, E.K., 2014. Effective learning hyper-heuristics for the course timetabling problem. European Journal of Operational Research 238, 77–86.
  • Tabataba and Mousavi (2012) Tabataba, F.S., Mousavi, S.R., 2012. A hyper-heuristic for the longest common subsequence problem. Computational biology and chemistry 36, 42–54.
  • Tavares et al. (2018) Tavares, A.R., Anbalagan, S., Marcolino, L.S., Chaimowicz, L., 2018. Algorithms or actions? A study in large-scale reinforcement learning, in: IJCAI, pp. 2717–2723.
  • Terashima-Marín et al. (2008) Terashima-Marín, H., Ortiz-Bayliss, J.C., Ross, P., Valenzuela-Rendón, M., 2008. Hyper-heuristics for the dynamic variable ordering in constraint satisfaction problems, in: Proceedings of the 10th annual conference on Genetic and evolutionary computation, ACM. pp. 571–578.
  • Thabtah and Cowling (2008) Thabtah, F., Cowling, P., 2008. Mining the data from a hyperheuristic approach using associative classification. Expert Systems with Applications 34, 1093–1101. doi:10.1016/j.eswa.2006.12.018.
  • Trianni et al. (2016) Trianni, V., De Simone, D., Reina, A., Baronchelli, A., 2016. Emergence of consensus in a multi-robot network: From abstract models to empirical validation. IEEE Robotics and Automation Letters 1, 348–353.
  • Zhao et al. (2004) Zhao, W., Ammar, M., Zegura, E., 2004. A message ferrying approach for data delivery in sparse mobile ad hoc networks, in: Proceedings of the 5th ACM international symposium on Mobile ad hoc networking and computing, ACM. pp. 187–198.

Appendix A RHGN Pattern


  • neighbourhood size

  • sink ID

  • source ID

  • unique sinks in 10 steps

  • sink changes in 10 steps

  • unique sources in 10 steps

  • source changes in 10 steps

  • unique sinks in 100 steps

  • sink changes in 100 steps

  • unique sources in 100 steps

  • source changes in 100 steps

  • unique sinks in 500 steps

  • sink changes in 500 steps

  • unique sources in 500 steps

  • source changes in 500 steps

  • unique sinks in 1000 steps

  • sink changes in 1000 steps

  • unique sources in 1000 steps

  • source changes in 1000 steps

  • jamming strength

  • network noise state (, , , , , )


  • agent packets held

  • closest swarm neighbour has packets

  • closest swarm neighbour is packet full

  • sink-ward closest swarm neighbour has packets

  • sink-ward closest swarm neighbour is packet full

  • source-ward closest swarm neighbour has packets

  • source-ward closest swarm neighbour is packet full

  • closest non-swarm neighbour has packets

  • closest non-swarm neighbour is packet full

  • sink-ward closest non-swarm neighbour has packets

  • sink-ward closest non-swarm neighbour is packet full

  • source-ward closest non-swarm neighbour has packets

  • source-ward closest non-swarm neighbour is packet full


  • assumed source to sink distance (rounded to 1 dec.)

  • distance of closest swarm neighbour (rounded to 1 dec.)

  • signal strength of closest swarm neighbour (rounded to 1 dec.)

  • distance of sink-ward closest swarm neighbour (rounded to 1 dec.)

  • signal strength of sink-ward closest swarm neighbour (rounded to 1 dec.)

  • distance of source-ward closest swarm neighbour (rounded to 1 dec.)

  • signal strength of source-ward closest swarm neighbour (rounded to 1 dec.)

  • distance of roughly sink-ward closest swarm neighbour (rounded to 1 dec.)

  • distance of closest non-swarm neighbour (rounded to 1 dec.)

  • signal strength of closest non-swarm neighbour (rounded to 1 dec.)

  • distance of sink-ward closest non-swarm neighbour (rounded to 1 dec.)

  • signal strength of sink-ward closest non-swarm neighbour (rounded to 1 dec.)

  • distance of source-ward closest non-swarm neighbour (rounded to 1 dec.)

  • signal strength of source-ward closest non-swarm neighbour (rounded to 1 dec.)

  • closest wall distance (rounded to 1 dec.)