Methods for Combining and Representing Non-Contextual Autonomy Scores for Unmanned Aerial Systems

by Brendan Hertel, et al.
UMass Lowell

Measuring an overall autonomy score for a robotic system requires combining a set of relevant aspects and features of the system that may be measured in different units, be qualitative, and/or be discordant. In this paper, we build upon an existing non-contextual autonomy framework that measures the Autonomy Level and the Component Performance of a system and combines them into an overall autonomy score. We examine several methods of combining features, showing how some methods find different rankings of the same data, and we employ the weighted product method to resolve this issue. Furthermore, we introduce the non-contextual autonomy coordinate and represent the overall autonomy of a system with an autonomy distance. We apply our method to a set of seven Unmanned Aerial Systems (UAS) and obtain their absolute autonomy score as well as their relative score with respect to the best system.






1 Introduction

Autonomy is a long-standing aspiration in robotics. As robots become more present in the world, many operations require them to become more human-independent, or autonomous. Determining an actual autonomy score for a robotic system, however, has been counterintuitive and equivocal because the relevant factors are measured in different units. Combining these factors in a meaningful way has proven difficult, and many existing frameworks for measuring autonomy apply only partially or in special cases [2]. In this paper, we build upon an existing framework and propose a novel method for determining the overall potential autonomy of a system.

A framework that can adequately measure autonomy must satisfy certain requirements. First, the autonomy score must be reproducible, which means measurements must be quantitative, not qualitative [7]. Additionally, the autonomy score must have absolute meaning, not relative meaning [2]. Scales that have only relative meaning can be used for comparing the autonomy of systems but do not convey any information about the absolute autonomy of a system. Finally, the autonomy score must be a meaningful combination of the relevant factors and properties of the system [7].

Existing autonomy evaluation methods can be divided into two main categories: contextual and non-contextual [5]. Contextual autonomy evaluation methods take into account features from both the mission and the environment (e.g., ALFUS [8]), whereas non-contextual methods rely only on implicit system capabilities and do not consider mission and environment features (e.g., NCAP [3]). Contextual autonomy evaluation methods provide mission-specific measures of the system based on the complexity of the mission and the environment where the operation takes place. Non-contextual autonomy evaluation methods, on the other hand, provide a potential measure of the system's independence as an overall expected score.

In this paper, we focus on the non-contextual autonomy evaluation methods. We specifically employ and build upon the Non-Contextual Autonomy Potential, or NCAP [3], for evaluating seven small Unmanned Aerial Systems (UAS) with different capabilities and specifications. We then discuss various methods for combining autonomy scores and their advantages and disadvantages.

2 Background on Non-Contextual Autonomy Potential (NCAP)

NCAP divides the architecture of a UAS into four layers: perception, modeling, planning, and execution [3]. Each of these four layers builds on the previous layer. The UAS uses its sensors to acquire raw data from the environment, encodes the data in a useful way to build a model (e.g., a map), uses its algorithms to compute a plan (e.g., a path), and executes the plan in the real world. These four layers form a loop that starts from perception, goes through modeling, planning, and execution, then updates the states and repeats. While the information acquisition and execution steps depend on the quality and quantity of the system's hardware, the modeling and planning (and to some extent the execution) steps depend on the quality of the system's software (i.e., algorithms).

In evaluating the perception layer, we consider both proprioceptive and exteroceptive sensors that are used to perceive the robot's status and its surroundings. The modeling layer includes algorithms that use the collected raw data to build an abstract model of the environment. Modeling includes various operations such as mapping, localization, and target and obstacle detection. The planning layer is where the UAS uses the internally built model to plan a sequence of actions to reach the goal, which is treated as high-level, human-provided knowledge such as safety concerns and mission goals. The planning layer might generate multiple plans but should be able to identify an optimal plan. Planning can include either path or behavior generation. The generated optimal plan includes a sequence of actions that should be executed by the UAS without requiring human assistance. Note that we consider robots within the same family (i.e., small unmanned aerial systems); comparing robot autonomy for systems in different families is beyond the scope of this paper.

3 UAS Platforms

In our evaluations, as depicted in Fig. 1, we examine seven small UAS platforms: the Cleo Robotics Dronut [1], Flyability Elios 2 [6], Lumenier Nighthawk 3 [9], Parrot ANAFI USA GOV [10], Skydio X2D [11], Teal Drones Golden Eagle [12], and Vantage Robotics Vesper [14]. These platforms have been selected due to their variety in capabilities and intended applications. While the Nighthawk 3, Parrot, Skydio X2D, Golden Eagle, and Vesper are intended for outdoor reconnaissance, the Dronut and Elios 2 are intended for indoor reconnaissance, specifically in urban and industrial environments. In our evaluations, the resulting data has been anonymized by assigning the platforms labels A through G without any specific ordering or correlation. For more information about these platforms, please see references.

Figure 1: From left to right: Cleo Robotics Dronut, Flyability Elios 2, Lumenier Nighthawk 3, Parrot ANAFI USA GOV, Skydio X2D, Teal Drones Golden Eagle, Vantage Robotics Vesper

4 Measuring NCAP

The existing NCAP framework allows us to combine component-level and engineering-level tests into a predictive measure of UAS autonomous performance. It should be noted that NCAP encapsulates the potential of a UAS to operate autonomously, not an evaluation of the UAS's actual task-based autonomous performance. To measure the NCAP score for a UAS platform, we measure its autonomy level and combine it with an overall component performance score. One of the shortcomings of NCAP is that it simply adds the two scores to produce a final autonomy score for the system. In the following sections, we explain how to measure the autonomy level and the component performance, denoted as $AL$ and $CP$, respectively. Then, we discuss methods for combining these scores, calculating a single autonomy score for our UAS platforms, and ranking them accordingly.

4.1 Measuring NCAP Autonomy Levels ($AL$)

The autonomy level, $AL$, is an overall measure of non-contextual autonomy that classifies the system into one of four classes, ranging from 0 (i.e., no autonomy) to 3 (i.e., full autonomy) [4]. NCAP assigns an autonomy level of $AL = 0$ to a UAS that only has perception sensors but does not use them to build a model. An example of $AL = 0$ is a UAS equipped with multiple cameras that is operated entirely by teleoperation. A UAS that captures raw data and constructs a model of the task or the environment is assigned an autonomy level of $AL = 1$. An example of $AL = 1$ is a UAS that uses the data from its RGB camera to detect an object in the environment but still requires teleoperation to move through the environment. Given a high-level goal, a UAS with $AL = 2$ uses a planning algorithm and the constructed model to generate a plan. An example of $AL = 2$ is a UAS equipped with a path planning algorithm that enables it to plan paths using a world model but requires a user to select the best path. Finally, a UAS with $AL = 3$ can execute the generated plan without human assistance. A UAS is only considered $AL = 3$, fully autonomous, if it requires no human input during its mission. Based on this classification process, Table 1 shows the autonomy level for each UAS in Fig. 1. The specifications and capabilities of each system in the different layers were used to determine the autonomy level for that platform.

| Platform | Perception | Modeling | Planning | Execution | $AL$ |
|---|---|---|---|---|---|
| UAS A | 2 RGB camera, thermal camera, LiDAR, GPS | SLAM capabilities | Obstacle avoidance | Avoids obstacles, | 3 |
| UAS B | RGB camera, IMU, thermal camera, 5 distance sensors | Models surroundings using distance sensors | None | None | 1 |
| UAS C | RGB camera, thermal | None | None | None | 0 |
| UAS D | 2 RGB cameras, IMU, thermal camera, GPS | Uses GPS and IMU for positioning in maps, visual modeling of targets | Geofencing and autonomous navigation | Target tracking, return to home | 3 |
| UAS E | 6 RGB cameras, thermal camera, GPS | Maps surrounding areas from camera imaging | Plans best path in environment | Obstacle avoidance, path execution | 3 |
| UAS F | RGB camera, thermal camera, GPS, IMU | None | None | None | 0 |
| UAS G | RGB camera, thermal camera, GPS | None | None | None | 0 |

Table 1: Evaluation and reasoning of autonomy levels of each UAS platform.
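The layered classification described in this section can be sketched as a simple rule: the autonomy level counts how many consecutive layers above perception a platform implements. The following is only an illustrative reading of the text, not the authors' implementation; the function name `autonomy_level` and the use of `None` for an absent capability are assumptions made for this sketch.

```python
from typing import Optional, Sequence


def autonomy_level(layers: Sequence[Optional[str]]) -> int:
    """Count consecutive implemented layers above perception.

    `layers` holds the (modeling, planning, execution) capabilities in order;
    None marks a layer the platform does not implement (an assumption of
    this sketch). The count stops at the first missing layer because each
    layer builds on the previous one.
    """
    level = 0
    for capability in layers:
        if capability is None:
            break
        level += 1
    return level


# (modeling, planning, execution) entries in the row order of Table 1.
table1_rows = [
    ("SLAM capabilities", "Obstacle avoidance", "Avoids obstacles"),
    ("Models surroundings using distance sensors", None, None),
    (None, None, None),
    ("GPS/IMU positioning, visual target modeling",
     "Geofencing and autonomous navigation",
     "Target tracking, return to home"),
    ("Maps surrounding areas from camera imaging",
     "Plans best path in environment",
     "Obstacle avoidance, path execution"),
    (None, None, None),
    (None, None, None),
]

levels = [autonomy_level(row) for row in table1_rows]
print(levels)
```

Under this rule, a platform with a model but no planner stays at level 1 even if it has capable hardware, which matches the teleoperated object-detection example in the text.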

4.2 Measuring NCAP Component Performance ($CP$)

To calculate the $CP$, we have selected a set of important features including flight time, charging time, RGB streaming resolution, Field of View (FOV), maximum range, thermal camera resolution, weight, maximum flight speed, number of sensors, and number of pre-programmed (i.e., smart) behaviors. Table 2 shows the extracted data for this set of features for each UAS platform. Note that any number or combination of features can be included in the calculation of $CP$; the calculation is not limited to the specific set of features selected here. Various methods can be used to aggregate the data in Table 2, and since NCAP does not specify a particular combination method, we investigate five methods in the next section.

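| UAS | Flight time | Charging time | RGB streaming resolution | FOV | Max range | Thermal camera resolution | Weight | Max flight speed | # of sensors | # of smart behaviors |
|---|---|---|---|---|---|---|---|---|---|---|
| UAS A | 15 | 50 | FHD | 100° | 2000 | N/A | 370 | 3 | 3 | 2 |
| UAS B | 10 | 90 | FHD30p | 114° | 500 | 160x120 | 1450 | 6.5 | 10 | 7 |
| UAS C | 22 | - | FHD | - | 2000 | 620x512 | 1200 | 3 | 4 | 5 |
| UAS D | 32 | 120 | HD | 84° | 4000 | 320x256 | 500 | 14.7 | 10 | 5 |
| UAS E | 23 | 120 | 4k60p | 200° | 3500 | 320p | 775 | 16 | 11 | 10 |
| UAS F | 30 | 45 | 4k | 90° | 3000 | 320x256 | 1044 | 22 | 7 | 3 |
| UAS G | 50 | - | FHD | 63° | 4000 | 320p | 697 | 20 | 6 | 2 |

Table 2: Selected UAS platform features used for measurements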

4.3 Methods for Combining Scores

In this section, we discuss methods for combining test scores and apply them to the data in Table 2 to calculate $CP$. The most common method is to normalize the data and use a weighted sum [13]. However, one of the main disadvantages of the weighted normalized sum is that different normalization techniques sometimes result in dissimilar scores. Additionally, we show that calculating the autonomy score using a weighted product yields more favorable results.

4.3.1 Weighted Normalized Sum

Weighted Normalized Sum is the most common method used for combining scores for a set of features denoted by $x_i$ for $i = 1, \dots, n$, where $n$ is the number of features. The normalization step is required to make features in different units comparable. Each normalization is applied per feature, with the maximum, minimum, sum, mean, and standard deviation taken over the platforms being compared. Several normalization techniques exist. We employ and compare four techniques:

  1. Divide by maximum ($x_i / \max(x_i)$). This normalization technique converts the maximum value to $1$, and the rest of the values become numbers less than $1$.

  2. Divide by the sum ($x_i / \sum x_i$). This normalization technique produces a proportional value for each original number.

  3. Range mapping from the current range to $[0, 1]$ ($(x_i - \min(x_i)) / (\max(x_i) - \min(x_i))$). This normalization technique considers both the minimum and maximum values and maps them to $0$ and $1$, respectively. All other values are converted to a number in the range $(0, 1)$.

  4. z-score ($(x_i - \mu) / \sigma$). In this statistical normalization technique, also known as the standard score, we first subtract the mean value $\mu$ from the data and then divide all values by the standard deviation $\sigma$ of the data.
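The four normalization techniques listed above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names are hypothetical, and the z-score here uses the population standard deviation, which is an assumption since the text does not say whether the population or sample form was used. The input is the flight-time column of Table 2.

```python
from statistics import mean, pstdev


def divide_by_max(values):
    """Scale so that the largest value becomes 1."""
    largest = max(values)
    return [v / largest for v in values]


def divide_by_sum(values):
    """Scale so that the values sum to 1 (preserves proportions)."""
    total = sum(values)
    return [v / total for v in values]


def range_mapping(values):
    """Map the minimum to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Subtract the mean, then divide by the (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Flight time of UAS A through G, taken from Table 2.
flight_time = [15, 10, 22, 32, 23, 30, 50]

for normalize in (divide_by_max, divide_by_sum, range_mapping, z_score):
    print(normalize.__name__, [round(v, 3) for v in normalize(flight_time)])
```

Only the first two techniques keep the ratios between the original values; range mapping and the z-score shift the data before scaling, so a platform with twice the flight time no longer has twice the normalized value.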

The weighted normalized sum can be represented as $CP = \sum_{i=1}^{n} w_i \, N(x_i)$, where $x_i$ is the $i$-th of $n$ features, $w_i$ is its weight, and $N(\cdot)$ represents one of the four normalization techniques. In our experiments, we use two weighting schemes: (a) uniform, where each feature is assigned the same weight value, and (b) a user-defined weight vector chosen according to the preferences and importance of each feature, with indices corresponding to the feature indices in Table 2.

Note that increasing the value of a feature does not always increase the autonomy of the system; for some features it has the reverse effect. For the features that do not comply with the "more is better" rule (e.g., charging time), we negated the effect of that feature in the sum. This can also be interpreted as a negative weight value. The resulting evaluations of the component performance for each normalization technique using the uniform weights and the user-defined weights can be seen in Table 3 and Table 4, respectively. These tables also show the platform rankings in parentheses, and it can be seen that different normalization techniques result in different rankings for the same data. To give perspective, the $AL$ scores from Table 1 are included in both tables.
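As a rough illustration of the weighted normalized sum with a negated "less is better" feature, the sketch below combines flight time and charging time for the five platforms in Table 2 that report both. The weights of +0.5 and -0.5, the choice of features, and all function names are assumptions made for this example; they are not the weights used in the experiments.

```python
def divide_by_max(values):
    """Normalize one feature so that its largest value becomes 1."""
    largest = max(values)
    return [v / largest for v in values]


def weighted_normalized_sum(features, weights, normalize):
    """Return one score per platform: sum_i w_i * N(x_i).

    `features` maps a feature name to its per-platform values and
    `weights` maps the same names to (possibly negative) weights.
    """
    normalized = {name: normalize(vals) for name, vals in features.items()}
    n_platforms = len(next(iter(features.values())))
    return [
        sum(weights[name] * normalized[name][p] for name in features)
        for p in range(n_platforms)
    ]


def ranking(labels, scores):
    """Order platform labels from highest to lowest score."""
    order = sorted(range(len(scores)), key=lambda p: scores[p], reverse=True)
    return [labels[p] for p in order]


# Platforms from Table 2 that report both features.
labels = ["A", "B", "D", "E", "F"]
features = {
    "flight_time": [15, 10, 32, 23, 30],
    "charging_time": [50, 90, 120, 120, 45],
}
# Illustrative weights: the negative sign encodes "less is better".
weights = {"flight_time": 0.5, "charging_time": -0.5}

scores = weighted_normalized_sum(features, weights, divide_by_max)
order = ranking(labels, scores)
print(order)
```

Passing a different normalization function as `normalize` is the only change needed to compare techniques on the same data.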

UAS A 0.18 (6) 0.33 (5) -0.98 (7) 0.05 (7) 2.48 (6) 3
UAS B 0.17 (7) 0.30 (7) 0.22 (2) 0.05 (6) 2.29 (7) 1
UAS C 0.20 (5) 0.30 (6) -0.09 (5) 0.06 (5) 2.60 (5) 0
UAS D 0.35 (4) 0.50 (4) 0.05 (4) 0.09 (4) 3.48 (4) 3
UAS E 0.53 (1) 0.72 (1) 0.93 (1) 0.14 (1) 4.63 (1) 3
UAS F 0.39 (2) 0.57 (2) 0.05 (3) 0.10 (3) 3.81 (2) 0
UAS G 0.38 (3) 0.52 (3) -0.27 (6) 0.10 (2) 3.56 (3) 0
Table 4: $CP$ score of each UAS (rank in parentheses) using different combination methods and user-defined weights. The last column is the $AL$ from Table 1.
UAS A 0.24 (7) 0.19 (7) -0.98 (7) 0.06 (7) 3.17 (7) 3
UAS B 0.43 (4) 0.43 (3) -0.08 (4) 0.12 (3) 4.66 (4) 1
UAS C 0.39 (6) 0.34 (6) -0.14 (6) 0.11 (6) 4.51 (5) 0
UAS D 0.48 (2) 0.47 (2) 0.01 (3) 0.13 (2) 5.36 (2) 3
UAS E 0.78 (1) 0.86 (1) 1.28 (1) 0.21 (1) 8.51 (1) 3
UAS F 0.44 (3) 0.43 (4) -0.10 (5) 0.11 (4) 5.01 (3) 0
UAS G 0.42 (5) 0.38 (5) 0.16 (2) 0.11 (5) 4.39 (6) 0
Table 3: $CP$ score of each UAS (rank in parentheses) using different combination methods and uniform weights. The last column is the $AL$ from Table 1.

4.3.2 Weighted Product

Each of these normalization methods has known disadvantages [13]. For instance, unlike the divide-by-sum technique, neither range mapping nor the z-score preserves the proportionality of the data. The z-score is also extremely sensitive to outlier data points. Consequently, we investigate the weighted product, which is an alternative method for combining scores. The weighted product is represented as $CP = \prod_{i=1}^{n} x_i^{w_i}$, where $x_i$ is the $i$-th of $n$ features and $w_i$ is its weight. It does not require a normalization step because rescaling a feature multiplies every platform's score by the same constant and therefore has no effect on the ranking. As in the weighted normalized sum method, negative weights represent the "less is better" features. Unlike in the weighted normalized sum methods, the weights in the weighted product method do not depend on the units of measurement of the features [13]. The UAS platform scores and their corresponding rankings using the weighted product method are also reported in Table 3 and Table 4, alongside the summing methods. It can be seen that all methods agree on UAS E being rank $1$, but there is no consensus on any other UAS platform's ranking.
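The scale invariance of the weighted product can be checked numerically. In the sketch below, one feature is rescaled by a constant factor of 60 (a hypothetical change of units), and the resulting ranking is compared with the original. The features, the weights of +0.5 and -0.5, and the function names are illustrative assumptions, not the configuration used in the experiments.

```python
import math


def weighted_product(features, weights):
    """Return one score per platform: prod_i x_i ** w_i."""
    n_platforms = len(next(iter(features.values())))
    return [
        math.prod(features[name][p] ** weights[name] for name in features)
        for p in range(n_platforms)
    ]


def ranking(labels, scores):
    """Order platform labels from highest to lowest score."""
    order = sorted(range(len(scores)), key=lambda p: scores[p], reverse=True)
    return [labels[p] for p in order]


# Platforms from Table 2 that report both features.
labels = ["A", "B", "D", "E", "F"]
features = {
    "flight_time": [15, 10, 32, 23, 30],
    "charging_time": [50, 90, 120, 120, 45],
}
# A negative weight encodes a "less is better" feature.
weights = {"flight_time": 0.5, "charging_time": -0.5}

original = weighted_product(features, weights)

# Rescale one feature by a constant factor, as a unit change would.
rescaled_features = dict(features)
rescaled_features["charging_time"] = [v / 60 for v in features["charging_time"]]
rescaled = weighted_product(rescaled_features, weights)

print(ranking(labels, original))
print(ranking(labels, rescaled))
```

Every score changes by the same factor ($60^{0.5}$ here, since the rescaled feature has weight $-0.5$), so the ordering of the platforms is unaffected.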

4.4 Combined NCAP Score vs. NCAP Coordinate

The combined NCAP score is calculated by combining the obtained autonomy level, $AL$, with the component performance score, $CP$. Combining these scores yields a final score that incorporates both the system's overall potential autonomy level and the potential of its components. In the original NCAP, the overall autonomy score is obtained by simply adding the two evaluations, $AL$ and $CP$ [5]. One of the issues with this method is that $AL$ and $CP$ represent very different measures: $AL$ classifies platforms into discrete classes and partially represents a qualitative evaluation of the systems, whereas $CP$ provides a quantitative evaluation of the system components. To address this problem, we represent the modified NCAP score as an NCAP coordinate, with $AL$ and $CP$ as the two axes. Fig. 2 illustrates the NCAP coordinate for each UAS using the discussed combining methods and the two feature weight vectors.

Figure 2: UAS platform autonomy measure represented in the NCAP coordinate with uniform weight values (left) and user-defined weight values (right).
UAS A 0.34 0.40 1.91 0.09 2.16
UAS B 2.03 2.05 2.12 2.00 3.08
UAS C 3.02 3.03 3.17 3.00 3.63
UAS D 0.18 0.23 0.88 0.05 1.15
UAS E 0 0 0 0 0
UAS F 3.00 3.00 3.13 3.00 3.11
UAS G 3.01 3.01 3.23 3.00 3.19
Table 6: Relative potential autonomy distance with respect to UAS E for user-defined weights.
UAS A 0.54 0.67 2.26 0.15 5.34
UAS B 2.03 2.05 2.42 2.00 4.34
UAS C 3.03 3.04 3.32 3.00 5.00
UAS D 0.30 0.38 1.26 0.08 3.15
UAS E 0 0 0 0 0
UAS F 3.02 3.03 3.30 3.00 4.61
UAS G 3.02 3.04 3.20 3.00 5.10
Table 5: Relative potential autonomy distance with respect to UAS E for uniform weights.

4.5 Potential Autonomy Distance

Representing the autonomy scores in the NCAP space allows for assigning an absolute measure of a system's overall autonomy, which we call the Potential Autonomy Distance (AD). We measure it as the Euclidean distance from a system's NCAP coordinate to the origin, $(0, 0)$, that is, $AD = \sqrt{AL^2 + CP^2}$. Since both $AL$ and $CP$ are non-negative values, a larger autonomy distance represents a system with higher potential autonomy.

Additionally, this representation allows for determining a relative measure of system autonomy when comparing multiple systems. To obtain this measure, we first calculate the potential autonomy distance (AD) for all the platforms, then select the platform with the highest distance as the reference, and find the relative AD of every other platform with respect to the reference. In our experiments, UAS E has the highest absolute autonomy distance, so we selected it as our reference. The resulting relative autonomy measures can be seen in Tables 5 and 6. A higher relative AD measure indicates more distance between that system and the system with the highest autonomy score, and less distance to the non-autonomy level (i.e., the origin).
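The two distance measures can be sketched as follows. The coordinates pair each platform's $AL$ with the first score column of the user-defined-weights table (Table 4). Two points are assumptions of this sketch rather than statements from the text: the ordering of the coordinate as ($AL$, $CP$), which does not affect the distances, and the reading of the relative measure as the Euclidean distance between a platform's coordinate and the reference coordinate. The latter appears consistent, up to rounding of the tabulated inputs, with the values reported in Table 6.

```python
import math


def autonomy_distance(al, cp):
    """Euclidean distance from the NCAP coordinate to the origin."""
    return math.hypot(al, cp)


def relative_distance(coord, reference):
    """Euclidean distance between a coordinate and the reference coordinate."""
    return math.hypot(coord[0] - reference[0], coord[1] - reference[1])


# (AL, CP) per platform: AL from Table 1, CP from the first score column
# of the user-defined-weights table.
coords = {
    "A": (3, 0.18),
    "B": (1, 0.17),
    "C": (0, 0.20),
    "D": (3, 0.35),
    "E": (3, 0.53),
    "F": (0, 0.39),
    "G": (0, 0.38),
}

# Absolute measure: distance of every platform from the origin.
ad = {name: autonomy_distance(*c) for name, c in coords.items()}

# Reference platform: the one with the largest absolute distance.
reference = max(ad, key=ad.get)

# Relative measure: distance of every platform from the reference.
relative = {name: relative_distance(c, coords[reference]) for name, c in coords.items()}

for name in coords:
    print(name, round(ad[name], 2), round(relative[name], 2))
```

Because $AL$ takes integer values while $CP$ under most normalizations stays below 1, platforms in different autonomy classes are separated mainly by the $AL$ axis, and $CP$ mostly distinguishes platforms within the same class.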

5 Conclusions

We have investigated five methods of combining features, and found that the weighted product method provides a consistent combination of the values. We also introduced the non-contextual autonomy coordinate and represented the overall autonomy of a system using an autonomy distance. We applied our method to a set of seven UAS and obtained their absolute autonomy scores as well as their relative scores with respect to the best system.


This work is sponsored by the Department of the Army, U.S. Army Combat Capabilities Development Command Soldier Center, award number W911QY-18-2-0006.


  • [1] Cleo Robotics Dronut. Note: 05-21-2021.
  • [2] R. Clothier, B. Williams, and T. Perez (2014) A review of the concept of autonomy in the context of the safety regulation of civil unmanned aircraft systems. In Proceedings of the Australian System Safety Conference 2013 (ASSC 2013), pp. 15–27.
  • [3] P. J. Durst, W. Gray, and M. Trentini (2011) A non-contextual model for evaluating the autonomy level of intelligent unmanned ground vehicles. In Proceedings of the 2011 Ground Vehicle Systems Engineering and Technology Symposium.
  • [4] P. J. Durst, W. Gray, and M. Trentini (2013) Development of a non-contextual model for determining the autonomy level of intelligent unmanned systems. In Unmanned Systems Technology XV, Vol. 8741, pp. 874111.
  • [5] P. J. Durst and W. Gray (2014) Levels of autonomy and autonomous system performance assessment for intelligent unmanned systems. Technical report, Engineering Research Center Vicksburg MS Geotechnical and Structures Lab.
  • [6] Flyability Elios 2. Note: 05-21-2021.
  • [7] N. Gyagenda, O. Gamal, and R. Hubert (2017) A non-contextual method for determining the degree of autonomy to develop in a mobile robot. IFAC 50 (2), pp. 271–276.
  • [8] H. Huang (2004) Autonomy levels for unmanned systems framework; volume I: terminology. NIST Special Publication: Gaithersburg, MD, USA, pp. 1011.
  • [9] Lumenier Nighthawk 3. Note: 05-21-2021.
  • [10] Parrot ANAFI. Note: 05-21-2021.
  • [11] Skydio X2D. Note: 05-21-2021.
  • [12] Teal Golden Eagle. Note: 05-21-2021.
  • [13] C. Tofallis (2014) Add or multiply? A tutorial on ranking and choosing with multiple criteria. INFORMS Transactions on Education 14 (3), pp. 109–119.
  • [14] Vantage Robotics Vesper. Note: 05-21-2021.