Research Project 2: Drone-supported AI-based Generation of 3D Maps of Indoor Radio Environments

by Ken Mendes, et al.

A Radio Environment Map (REM) is a powerful tool in enhancing the experience of radio-enabled agents but building such a REM can be a laborious undertaking, especially in three dimensions. This project shows how such a REM of an indoor three-dimensional space can be generated in an autonomous and scalable way. Building on the results of the preceding Research Project 1, multiple drones are used to map the WiFi signals present in such a space in a real-world environment where the drones are each able to visit 36 waypoints and collectively gather thousands of WiFi beacon data samples. This report also includes an analysis of the collected data and concludes by proposing machine-learning based techniques to predict the signal strength of known access points in locations not visited by the drones.




I Introduction

This research project is a continuation of Research Project I: Drone-based Autonomous Generation of 3D Maps of Indoor Radio Environments [1], where we showed how a customized Crazyflie drone can autonomously gather IEEE 802.11b, 802.11g and 802.11n beacon frame data in the 2.4 GHz Industrial, Scientific and Medical (ISM) band and send that data to a base station for storage and further processing. The drone was equipped with a custom integrated WiFi module and a high-accuracy Ultra Wide Band (UWB) positioning system. It received its instructions, including the waypoints to visit, from software running on the base station, which could strategically shut down the controlling radio to minimize interference while the drone scanned the IEEE 802.11 2.4 GHz band.

Building on that work, this project aims to provide a framework for using additional drones that can be seamlessly integrated into the system, allowing for sequential data collection with multiple drones. This results in a scalable way to efficiently gather beacon frame data within a room while making improvements to the drones’ stability when scanning compared to the previous project. As before, we are primarily interested in collecting the signal strength of access points in the vicinity. Once this data is collected, further analysis will be done to show the possibilities this method provides on a larger scale.

The drones gather this data at discrete locations within a three-dimensional space. We conclude this project by proposing machine-learning models that can predict the signal strength of these recorded access points with a reasonable error at positions within that space which were not visited.

II Scalability and enhancements

II-A Drones

This project uses two Crazyflie 2.1 [2] drones as shown in figure 1 for the data collection.

Fig. 1: Two customized Crazyflie 2.1 drones

The hardware configuration of these drones has not changed. Each is equipped with two expansion decks:

  • A Loco Positioning Deck that acts as a tag in the UWB-based Loco Positioning System (LPS), which allows the drones to calculate their position with decimetre-level accuracy

  • A custom deck with an ESP8266 WiFi module used to collect IEEE 802.11b, 802.11g and 802.11n signals in the 2.4 GHz ISM band

To benefit from upstream fixes and improvements, the custom firmware [3] of the drones was rebased on the 2021.06 [4] crazyflie-firmware release.

II-B Hovering improvements

During project 1, one of the major measures we took to minimize radio interference was to shut down the controlling radio during the three seconds that measurements were being collected by the WiFi module. Both operate in the 2.4 GHz ISM band and the reduction in interference was significant.

Unfortunately, we also saw some occasional drifting of the drone during those few seconds in which it lost contact with the base station. When the drone loses its radio connection, it also loses its ability to get new setpoints (target positions) from the base station. When no new setpoint is received for over 500 ms, the drone sets its attitude angles (pitch, roll and yaw) to 0 in order to keep itself stabilized. While this mechanism does provide some stability, it does not necessarily make the drone hold its position, for example when it has sufficient momentum or an unbalanced propeller.

Figure 2 details how the base station’s custom Python client can forward setpoints (target locations) to the Commander in the drone’s firmware by making use of the CFlib library.

Fig. 2: The Crazyflie commander framework

In order to make the drone hold its position after shutting down the radio connection, an extra FreeRTOS Task 2 was added to the driver of the ESP8266 deck that feeds the scanning position back to the drone’s commander every 100 ms during such a scan. This task is resumed at the start of the scanning task and suspended at the end of it so that it doesn’t interfere with regular waypoint activities. As a result of this feedback process, the drone is not just stable while scanning, but also actively maintains its position.

Listing: Hovering Task

```c
static void hoverWhileScanning(void* arg)
{
  setpoint_t hoverSetpoint;
  uint8_t hoverCount = 0;

  while (1) {
    // Hover while we are scanning: build a setpoint at the scanning position
    getHoverSetpoint(&hoverSetpoint, position.x, position.y, position.z);

    // Log the hover position once every 10 iterations
    if (hoverCount == 10) {
      consolePrintf("Hovering at position x: %f, y: %f, z: %f\n",
                    (double)hoverSetpoint.position.x,
                    (double)hoverSetpoint.position.y,
                    (double)hoverSetpoint.position.z);
      hoverCount = 0;
    }

    // Feed the setpoint to the commander with priority 3, then wait 100 ms
    commanderSetSetpoint(&hoverSetpoint, 3);
    hoverCount++;
    vTaskDelay(M2T(100));
  }
}
```

III Data collection

III-A Test environment

The 3D volume for the drones to scan is a rectangular cuboid 3.74 m long (x-axis), 3.20 m wide (y-axis) and 2.10 m high (z-axis), located in a living room in a large apartment building. This provides a real-world environment where signals of many access points in different configurations are available.

At each of the 8 corners of the cuboid, an LPS anchor is placed to enable the drone to calculate its position within the volume.

Anchor placement is done according to the 8 anchor reference setup of Bitcraze as shown in figure 3.

Fig. 3: 8 Anchors LPS reference setup

In order to minimize radio interference, the drones are run in sequence, not in parallel. Since we’re working with multiple drones, the Loco Positioning System is configured to use one of the Time Difference of Arrival protocols (TDoA2) instead of the Two Way Ranging (TWR) protocol.

Positions of the anchors are documented in table I.

Anchor ID x y z
TABLE I: LPS Anchor positions in meters

III-B Scan locations

The endurance test we ran during project 1 [1] showed that a Crazyflie drone in this configuration can operate for a little over 6 minutes reliably when flying stationary and performing a scan every 8 seconds. We can expect its endurance to be lower when the demands increase: visiting different locations and doing more frequent scans.

With this constraint in mind, 72 locations evenly spread over the volume to scan were identified, with each drone responsible for scanning 36 of them. The drones have 4 seconds to fly from one location to the next and require 3 seconds to perform a scan. Scanning 36 locations should therefore take at least $36 \times (4 + 3) = 252$ seconds, or 4 minutes and 12 seconds. If we add the time required to take off and land, and account for the more intensive itinerary, the drones will come close to their maximum operating time.
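The time budget can be checked with a few lines of Python. This is only a sketch: the constant and function names are chosen here for illustration, and the endurance figure is rounded down to the 6 minutes mentioned above.

```python
# Timing figures taken from the text; names are illustrative only.
WAYPOINTS_PER_DRONE = 36
TRANSIT_SECONDS = 4          # time allowed to fly between two waypoints
SCAN_SECONDS = 3             # duration of one WiFi scan
ENDURANCE_SECONDS = 6 * 60   # "a little over 6 minutes", rounded down


def mission_seconds(waypoints: int, transit: int, scan: int) -> int:
    """Lower bound on mission time: one transit plus one scan per waypoint."""
    return waypoints * (transit + scan)


total = mission_seconds(WAYPOINTS_PER_DRONE, TRANSIT_SECONDS, SCAN_SECONDS)
minutes, seconds = divmod(total, 60)
margin = ENDURANCE_SECONDS - total  # time left for take-off, landing and overhead

print(f"{total} s = {minutes} min {seconds} s, margin {margin} s")
```

Under these assumptions, less than two minutes remain for take-off, landing and any extra load from the itinerary.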

Fig. 4: Scanned locations

Figure 4 shows the distribution of the locations to scan (waypoints) for both drones. The eight black spheres represent the anchors of the Loco Positioning System.

III-C Client software

The drones are controlled by a base station: a laptop running a custom Python client [5] developed during project 1 which is able to communicate with the Crazyflie drones using the Python library provided by Bitcraze.

The client is responsible for sending the drones from waypoint to waypoint and instructing them to scan. Once a scan at a waypoint is finished, the drone will send the results back to the client where they are parsed, enriched with a timestamp and stored for further processing.

The collected samples will be tuples where the timestamp is set by the client upon receiving the other tuple elements from the drone.

For this project, the client was modified to control multiple drones sequentially, each with a matching set of waypoints and parameters: radio address, starting position and yaw. While this project demonstrates the approach with two drones, it can easily be scaled up to many more by adding further sets of waypoints and parameters. This keeps the added complexity of introducing an extra drone small and constant.
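The idea of one parameter set per drone can be sketched as follows. This is not the actual client code: `DroneConfig`, `run_sequentially` and `fake_mission` are hypothetical names, the radio addresses are placeholders, and the flight itself is replaced by a stand-in function so that the example runs without hardware.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Waypoint = Tuple[float, float, float]  # (x, y, z) in metres


@dataclass
class DroneConfig:
    """Per-drone parameters named in the text; field names are hypothetical."""
    radio_address: str
    start_position: Waypoint
    yaw: float
    waypoints: List[Waypoint]


def run_sequentially(configs: List[DroneConfig],
                     fly_mission: Callable[[DroneConfig], list]) -> list:
    """Run one drone after another and concatenate their samples."""
    samples = []
    for config in configs:  # strictly one drone in the air at a time
        samples.extend(fly_mission(config))
    return samples


def fake_mission(config: DroneConfig) -> list:
    """Stand-in for a real flight: one placeholder record per waypoint."""
    return [(config.radio_address, wp) for wp in config.waypoints]


fleet = [
    DroneConfig("radio-A", (0.0, 0.0, 0.0), 0.0, [(0.5, 0.5, 0.5), (1.0, 0.5, 0.5)]),
    DroneConfig("radio-B", (1.0, 0.0, 0.0), 0.0, [(2.0, 0.5, 0.5)]),
]
collected = run_sequentially(fleet, fake_mission)
print(len(collected))
```

In such a design, adding a third drone only means appending one more `DroneConfig` to the list, which matches the constant added complexity described above.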

IV Data processing

IV-A Exploration

Using this setup, data was collected for further analysis and processing. A total of 2696 samples were collected, 1495 by drone A and 1201 by drone B. During data collection, drone A was active for 5 minutes 3 seconds and drone B for precisely 5 minutes.

Table II shows a few interesting characteristics of the collected samples.

Characteristic Value
Distinct MAC addresses
Distinct SSIDs
Distinct channels
Median RSSI
TABLE II: Characteristics of collected samples

As mentioned in III-A, the data was collected in a large apartment building. This explains why there are many more MAC addresses than SSIDs: the major ISPs advertise their own networks on the routers and modems they provide, so SSIDs such as TelenetWiFree are broadcast by multiple devices.

Fig. 5: Amount of samples collected per MAC address

Figure 5 shows that some access points (MAC addresses) were only seen in a few locations while 11 access points were seen in all 72 scanned locations.

In figure 6 we see that the majority of the samples were collected in just three IEEE 802.11 channels: 1, 6 and 11.

Fig. 6: Amount of samples collected per IEEE 802.11 channel in the 2.4 GHz band

IV-B Difference in collected samples between the drones

As we saw in the previous section, there is a relatively large difference in the number of samples collected by drone A and drone B. This is unexpected and warrants further investigation.

When we look at the samples collected per drone and scanned location (figure 7), we see no obvious issues with the number of samples collected by drone B, except that it is generally lower than for drone A. However, there are environmental factors that can play a role:

  • The positive x-axis and negative y-axis point towards the centre of the apartment building where we can expect to see more signals.

  • There is a wall segment that is 40 cm wider where drone B’s measurements are taken compared to drone A, as illustrated in figure 7.

Fig. 7: Amount of samples per drone and scanned location

If we expect to see more signals (more access points) towards the centre of the building, then we should see that increase gradually, irrespective of which drone collected the sample. Figure 8 illustrates this with a histogram per axis that groups the x and y values in bins of 0.5 m, with the height representing the number of samples collected by the drones in that bin. The number of samples collected clearly increases with an increasing x-coordinate and a decreasing y-coordinate.

Fig. 8: Histograms showing the amount of samples collected per bin of 0.5 m along the x and y-axis
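Binning of this kind can be reproduced with NumPy. The sketch below uses made-up x-coordinates rather than the collected data, and the bin edges (0 m to 4 m in steps of 0.5 m) are an assumption chosen to cover the 3.74 m length of the volume.

```python
import numpy as np

# Hypothetical x-coordinates (metres) of collected samples; not the real data.
x = np.array([0.2, 0.4, 0.9, 1.3, 1.4, 1.6, 2.2, 2.3, 2.4, 2.6, 3.1, 3.2, 3.3, 3.6])

# Bin edges every 0.5 m: 0.0, 0.5, ..., 4.0
edges = np.arange(0.0, 4.0 + 0.5, 0.5)
counts, _ = np.histogram(x, bins=edges)

# Print the number of samples that fall into each 0.5 m bin
for left, n in zip(edges[:-1], counts):
    print(f"{left:.1f}-{left + 0.5:.1f} m: {n} samples")
```

The same call applied to the y-coordinates gives the second histogram.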

IV-C Pre-processing

A few pre-processing steps have been taken before continuing with the data:

  1. Since SSIDs can be shared between devices, they are not that useful and aren’t used. Where appropriate, signals will be grouped based on their MAC address.

  2. The timestamps are left out of consideration as well. The time difference between the first and last collected sample is less than 10 minutes.

  3. MAC addresses with fewer than 16 samples will be dropped (see figure 5). While this number is arbitrary, it’s not unreasonable since:

    • The goal of the project is to predict RSSI values of access points for which we have measurements

    • Enough data points per MAC address are required in order to build a reliable model. The data needs to be split up into a train / test set and potentially a validation set as well.

    • There is a sufficient amount of MAC addresses with (close to) the maximum of 72 samples.

  4. MAC and channel features will be considered categorical and one-hot encoded

This pre-processing results in 2565 retained samples (131 dropped) with the features and types as illustrated in table III:

Feature Type
x float64
y float64
z float64
rssi int64
mac object (one-hot encoded)
channel object (one-hot encoded)
TABLE III: Features and types of pre-processed samples
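The pre-processing steps listed above could be expressed with pandas roughly as follows. This is a sketch under assumptions: the column names `ssid`, `timestamp`, `mac`, `channel` and the function name `preprocess` are chosen here for illustration, and the example data is synthetic with a lowered threshold so that the filter has a visible effect.

```python
import pandas as pd

MIN_SAMPLES = 16  # threshold from step 3


def preprocess(samples: pd.DataFrame, min_samples: int = MIN_SAMPLES) -> pd.DataFrame:
    # Steps 1 and 2: SSIDs and timestamps are not used.
    df = samples.drop(columns=["ssid", "timestamp"], errors="ignore")
    # Step 3: keep only MAC addresses with at least `min_samples` samples.
    per_mac = df["mac"].map(df["mac"].value_counts())
    df = df[per_mac >= min_samples]
    # Step 4: treat MAC and channel as categorical and one-hot encode them.
    return pd.get_dummies(df, columns=["mac", "channel"])


# Tiny synthetic example (not the collected data), threshold lowered to 2
raw = pd.DataFrame({
    "x": [0.5, 1.0, 1.5, 2.0, 2.5],
    "y": [0.5, 0.5, 0.5, 0.5, 0.5],
    "z": [1.0, 1.0, 1.0, 1.0, 1.0],
    "rssi": [-60, -62, -61, -85, -63],
    "mac": ["aa:aa", "aa:aa", "aa:aa", "bb:bb", "aa:aa"],
    "channel": [1, 1, 1, 6, 1],
    "ssid": ["net1", "net1", "net1", "net2", "net1"],
    "timestamp": [0, 1, 2, 3, 4],
})
clean = preprocess(raw, min_samples=2)
print(clean.columns.tolist())
```

In this example the single sample of the second MAC address is dropped, and with it the only occurrence of channel 6.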

IV-D Loss function

For this regression problem, the accuracy of estimators will be measured based on the root mean square error of their predictions.
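For reference, the metric can be written as follows, where $N$ is the number of test samples, $y_i$ the measured RSSI and $\hat{y}_i$ the predicted RSSI of sample $i$. The symbol names are chosen here for illustration and do not come from the original report.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}
```

Because the errors are squared before averaging, a few large deviations weigh more heavily than many small ones.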

IV-E Train / Test split

In order to have an unbiased view on an estimator’s predictive capacity, the pre-processed data will be split into a training and a test set. For those estimators that require an additional validation set for tuning their hyperparameters, the validation set will be taken out of the training set.

IV-F Baseline estimator

In order to assess more elaborate estimators, we’ll use a baseline estimator that always returns the mean. The DummyRegressor class of the scikit-learn package was used for this with the strategy set to "mean" (the mean strategy performs slightly better than the median). Running this regressor on the whole dataset as-is yields a root mean square error (RMSE) of dBm. However, taking the mean RSSI over the whole dataset is not the best approach since we can expect the RSSI values of a single access point to be close to each other, but not necessarily close to those of (all) other access points.

The baseline was therefore adjusted to generate a regressor per MAC address, that way it will return the mean per access point. This resulted in a reduction of the error, with an RMSE now of dBm. This error will be used to compare other estimators to.
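A minimal sketch of such a per-MAC baseline is shown below. It assumes a pandas DataFrame with columns `x`, `y`, `z`, `mac` and `rssi`, that every MAC address in the test data also occurs in the training data, and it uses a handful of synthetic values rather than the collected samples. The helper names are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.dummy import DummyRegressor

COORDS = ["x", "y", "z"]


def fit_per_mac(train: pd.DataFrame) -> dict:
    """Fit one mean-predicting DummyRegressor per MAC address."""
    models = {}
    for mac, group in train.groupby("mac"):
        models[mac] = DummyRegressor(strategy="mean").fit(group[COORDS], group["rssi"])
    return models


def predict_per_mac(models: dict, data: pd.DataFrame) -> np.ndarray:
    """Predict each row with the regressor that belongs to its MAC address."""
    preds = pd.Series(np.nan, index=data.index)
    for mac, group in data.groupby("mac"):
        preds.loc[group.index] = models[mac].predict(group[COORDS])
    return preds.to_numpy()


def rmse(y_true, y_pred) -> float:
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))


# Synthetic data: two access points with clearly different signal levels
train = pd.DataFrame({"x": [0, 1, 0, 1], "y": [0, 0, 0, 0], "z": [0, 0, 0, 0],
                      "mac": ["aa", "aa", "bb", "bb"],
                      "rssi": [-40, -50, -80, -90]})
test = pd.DataFrame({"x": [0.5, 0.5], "y": [0, 0], "z": [0, 0],
                     "mac": ["aa", "bb"], "rssi": [-45, -85]})

global_model = DummyRegressor(strategy="mean").fit(train[COORDS], train["rssi"])
global_rmse = rmse(test["rssi"], global_model.predict(test[COORDS]))
per_mac_rmse = rmse(test["rssi"], predict_per_mac(fit_per_mac(train), test))
print(global_rmse, per_mac_rmse)
```

The synthetic values are deliberately extreme so that the difference between a global mean and a mean per access point is easy to see.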

IV-G k-Nearest Neighbors estimator

Our data is inherently spatial, as it represents signals in a 3D space, so a k-nearest neighbours regressor is a natural candidate.

The k-nearest neighbours regressor was implemented using the KNeighborsRegressor of the scikit-learn library. As features, the x, y, z coordinates were chosen as well as the one-hot encoded MAC addresses. Including the one-hot encoded MAC addresses has the advantage that samples with a different MAC address will be considered farther away than similar samples with the same MAC address.

The kNN regressor was configured to use Euclidean distance by setting metric=minkowski and p=2. Euclidean distance makes sense since we have x, y and z coordinates in a three-dimensional space. The weights and n_neighbors parameters were tuned using a grid search, where the optimal values were weights=distance and n_neighbors=5.

This resulted in an RMSE of dBm, slightly better than the baseline.
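The configuration described above could look roughly like this in scikit-learn. The data is synthetic, and the candidate values in the grid, the 5-fold cross-validation and the scoring string are assumptions made for this sketch; the report does not state which grid was searched.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

# Synthetic samples inside a 3.74 m x 3.20 m x 2.10 m volume (not the real data)
rng = np.random.default_rng(0)
n = 120
df = pd.DataFrame({
    "x": rng.uniform(0, 3.74, n),
    "y": rng.uniform(0, 3.20, n),
    "z": rng.uniform(0, 2.10, n),
    "mac": rng.choice(["ap1", "ap2", "ap3"], n),
})
# Made-up RSSI: a per-access-point level, a position-dependent term and noise
level = df["mac"].map({"ap1": -50, "ap2": -65, "ap3": -80})
df["rssi"] = level - 3 * df["x"] + rng.normal(0, 1, n)

# Features: coordinates plus one-hot encoded MAC addresses
X = pd.get_dummies(df[["x", "y", "z", "mac"]], columns=["mac"]).astype(float)
y = df["rssi"]

search = GridSearchCV(
    KNeighborsRegressor(metric="minkowski", p=2),  # p=2 gives Euclidean distance
    param_grid={"weights": ["uniform", "distance"], "n_neighbors": [3, 5, 8, 16]},
    scoring="neg_root_mean_squared_error",
    cv=5,
)
search.fit(X, y)
print(search.best_params_, -search.best_score_)
```

The best parameters found on this synthetic data have no bearing on the values reported for the real measurements.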

As mentioned earlier, the one-hot encoded MAC addresses play an important role. A sample with a different MAC address would have distance

$$d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2 + 1^2 + 1^2}$$

with the latter 2 terms coming from different values in the one-hot encoded columns. A sample from the same access point would have distance

$$d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}$$

because their one-hot encoded MAC columns would completely match.

When the KNeighborsRegressor’s weights parameter is set to "distance", RSSI values of neighbours are weighted by the distance to them. It would be interesting to place samples with a different MAC address even farther away than the one-hot terms currently do. This can be achieved by multiplying the one-hot encoded values by a chosen factor.

The optimal value for this factor was calculated by doing a grid search on values between 1 and 20. Based on the training set, performance was best when using a factor of 3, leading to an RMSE of dBm. This grid search also tuned the optimal value of the n_neighbors parameter upwards to 16.
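The effect of such a factor on the Euclidean distance can be illustrated with two samples taken at the same position but from different access points. The function name `scaled_features` and the example values are hypothetical.

```python
import numpy as np


def scaled_features(xyz: np.ndarray, one_hot: np.ndarray, factor: float) -> np.ndarray:
    """Concatenate coordinates with one-hot MAC columns multiplied by `factor`."""
    return np.hstack([xyz, factor * one_hot])


# Two samples at the same position, seen from two different access points
xyz = np.array([[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]])
one_hot = np.array([[1.0, 0.0], [0.0, 1.0]])

distances = {}
for factor in (1, 3):
    a, b = scaled_features(xyz, one_hot, factor)
    distances[factor] = float(np.linalg.norm(a - b))
    print(factor, distances[factor])
```

With a factor of 1 the two samples are $\sqrt{2}$ apart, and with a factor of 3 they are $3\sqrt{2}$ apart, so neighbours from other access points receive a smaller weight when the regressor weights by distance.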

IV-H k-Nearest Neighbors estimator per MAC address

Instead of giving samples with a different MAC address a greater distance, we can also apply the same technique as was done for the baseline estimator: build a k-nearest neighbours estimator per MAC address.

We keep the hyperparameters of these MAC-based regressors the same as in the previous section but exclude the one-hot encoded MAC addresses since we’re building a regressor per MAC address. While the result with an RMSE of dBm is quite close to that of the previous estimator, an improvement was expected, as there is no reason why taking samples of unrelated MAC addresses into account would yield a better result. Each regressor in this collection can only work with a small subset of the samples though, which might explain the lack of improvement.

IV-I Neural Network

The last solution to this regression problem is to build a neural network that can predict RSSI values of our test set. The Keras library was used to build and test the network and different solutions and configurations were considered, including:

  • Multiple hidden layers with a varying amount of nodes

  • Normalized RSSI values

  • Multiple inputs: one for the x, y, z coordinates and one for the one-hot encoded MAC addresses, combined into a common hidden layer

  • Different activation functions and optimizers

While many of these solutions had a competitive RMSE when run against the test set, a simple neural network with a single hidden layer of 16 nodes outperformed them all with an RMSE of 4.4870 dBm. While this is quite a bit better than our baseline, it does fall short of the best k-nearest neighbours solution discussed earlier.

This best performing neural network had the following configuration:

  • An input layer for the x, y, z coordinates and the one-hot encoded MAC addresses

  • A sigmoid activation function

  • A hidden layer with 16 fully connected nodes

  • A linear activation function

  • An output layer with a single node for the prediction

  • An Adam-based optimizer
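The shape of this network can be sketched as follows. The original network was built with Keras; scikit-learn's MLPRegressor is used here purely as a stand-in with the same layer layout, so this is not the authors' code. The sketch reads the list above as a sigmoid activation on the hidden layer and a linear activation on the output, which is an assumption, as are the synthetic data, `max_iter` and `random_state`.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic inputs: x, y, z coordinates plus one-hot encoded MAC addresses
rng = np.random.default_rng(0)
n_samples, n_macs = 200, 3
xyz = rng.uniform(0, 3, size=(n_samples, 3))
mac_idx = rng.integers(0, n_macs, n_samples)
one_hot = np.eye(n_macs)[mac_idx]
X = np.hstack([xyz, one_hot])

# Made-up RSSI targets: per-access-point level, position term and noise
levels = np.array([-50.0, -65.0, -80.0])
y = levels[mac_idx] - 3.0 * xyz[:, 0] + rng.normal(0, 1, n_samples)

model = MLPRegressor(
    hidden_layer_sizes=(16,),  # one hidden layer with 16 fully connected nodes
    activation="logistic",     # sigmoid activation in the hidden layer
    solver="adam",             # Adam optimizer
    max_iter=2000,
    random_state=0,
)
model.fit(X, y)  # the output layer of MLPRegressor is linear (identity)

print([w.shape for w in model.coefs_])
```

The weight matrices confirm the layout: 6 inputs feed 16 hidden nodes, which feed a single output node.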

IV-J Comparison

Figure 9 shows a comparison of the RMSEs of the different regressors that were tested. While the regressor that always predicted the mean RSSI of the training set didn’t perform well, the results of the other regressors are close to each other, making it difficult to propose one over the other.

Fig. 9: Comparison of the RMSEs of the tested regressors

V Future Work

While this project has shown how drones can be used to gather IEEE 802.11b, 802.11g and 802.11n beacon data at scale with the purpose of building a radio environment map, several opportunities for expansion remain. The drones together with the base station record timestamps with the data they gather. Exploring how the data changes and how the different regressors behave with data spanning multiple days or even weeks would add an extra dimension. Another interesting expansion would be to replace the custom ESP8266 deck with another lightweight sensor so that this system can be repurposed for collecting different types of data. Finally, Bitcraze, the company that designed the open Crazyflie platform, has finalized development on a new compatible infrared-based positioning system called Lighthouse. The range of this positioning system is smaller, but its accuracy and precision are competitive while requiring fewer anchors and being cheaper overall. This could allow for a solution that is easier to deploy.

VI Conclusion

This project has shown how a setup with an accurate positioning system, a customized drone and a controlling base station can be used at scale to build radio environment maps. Fully autonomous, it can gather thousands of IEEE 802.11b, 802.11g and 802.11n beacon data points in the 2.4 GHz band using multiple drones and with improved stability in a real-world environment.

Several methods are proposed in this report that enable us to predict RSSI values at unknown locations with a root mean square error smaller than 4.5 dBm, outperforming the baseline mean-based predictor.

VII Acknowledgement

I would like to thank my supervisor for this project, Dr. Filip Lemic, for his guidance and support over the course of this project. I would also like to thank Prof. Dr. Jeroen Famaey for giving me the opportunity to continue my research in this area.

Crazyflie related schematics and images (Figures 2 and 3) are copyright Bitcraze AB and are used according to their Creative Commons Attribution 3.0 License.