DronePaint: Swarm Light Painting with DNN-based Gesture Recognition

by   Valerii Serpiva, et al.

We propose a novel human-swarm interaction system that allows the user to directly control a swarm of drones in a complex environment by drawing trajectories through a hand gesture interface based on DNN-based gesture recognition. The developed CV-based system allows the user to control the swarm behavior in real time through gestures and motions, without additional devices, providing convenient tools to change the swarm's shape and formation. Two types of interaction were proposed and implemented to adjust the swarm hierarchy: trajectory drawing and free-form trajectory generation control. The experimental results revealed a high accuracy of the gesture recognition system (99.75%) and a high precision of the trajectory drawing (mean error of 5.6 cm compared with 3.1 cm for mouse drawing) over the three evaluated trajectory patterns. The proposed system can potentially be applied in complex environment exploration, spray painting using drones, and interactive drone shows, allowing users to create their own art objects with drone swarms.




1. Introduction

Human and aerial swarm interaction (HSI) nowadays serves multiple purposes, such as search and rescue operations, cargo delivery, and remote inspection. One prominent application of robotic swarms has recently emerged in the art industry, where drones serve as scalable and interactive tools for light and spray painting. For example, an autonomous drone equipped with a spray gun-holding arm was developed by (Vempati et al., 2018) for spray painting on various three-dimensional surfaces. Furthermore, multi-drone graffiti was proposed by (Uryasheva et al., 2019) with a task dispatch system based on the parametric greedy algorithm.

Several research papers focus on interactive art concepts where drones provide the color palette with LED arrays. For instance, (Dubois, 2015) proposed an interactive choreographic show where humans and drones move synchronously with precise swarm behavior computation. A tangible experience of HSI was introduced by (Gomes et al., 2016), where drones serve as a colorful interactive 3D display. Another practical approach was presented by (Knierim et al., 2018), who proposed drones with light beacons as a navigation system that projects map instructions into the real world.

With these considerations, a real-time control interface over the swarm is required to give the user an immersive real-time painting experience. Many researchers propose gesture-based interfaces as a versatile and intuitive tool for HSI. For example, a tactile interface for HSI with impedance-based swarm control was developed by (Tsykunov et al., 2019). A multi-channel robotic system for HSI in augmented reality was suggested by (Chen et al., 2020). (Suresh and Martínez, 2019) proposed a complex control approach in which arm gestures and motions, recorded by a wearable armband, control a swarm’s shape and formation. (Alonso-Mora et al., 2015) and (Kim et al., 2020) suggested real-time input interfaces with swarm formation control. However, their approaches were developed only for mobile robot operation in 2D space. Wearable devices for high user mobility were proposed by (Byun and Lee, 2019), who suggested an epidermal tactile sensor array to achieve direct teleoperation of the swarm by the human hand.

Previously developed systems have achieved low time delays and high control precision, yet their trajectory generation capability is limited to direct gesture input and simple hand motions. We propose the DronePaint system for drone light painting with DNN-based gesture recognition to make the way we communicate with drones intuitive and intelligent. With only a single camera and the developed software, even an inexperienced user will be capable of generating impressive light drawings in midair.

2. System overview

Before deploying the swarm of drones, the operator positions themselves in front of the webcam, which sends the captured footage to the gesture recognition module. As soon as DronePaint is activated, the module starts to recognize the operator’s position and hand gestures, awaiting the “take off” command to deploy the swarm (Fig. 2a). After the drones take off, the gesture recognition module generates a trajectory from the gestures drawn by the operator. The DronePaint interface allows the operator both to draw and to erase the trajectory (Fig. 2c, d) until the desired result is achieved. The trajectory is then processed by the trajectory planning module to make it suitable for the swarm of drones, and the processed trajectory is sent simultaneously to the swarm control module and the flight simulation module using the ROS framework. Finally, the swarm control module leads the swarm of drones along the received trajectory.

Figure 2. Example of gestures and commands to control a drone.

The drones finish their work when the operator gives the “land” command with the corresponding gesture (Fig. 2b).

2.1. System architecture

The developed DronePaint system software consists of three modules: human-swarm interface, trajectory processing module, and drone control system.

Figure 3. System architecture. The three key modules: human-swarm interface, trajectory processing module, and drone control system.

The hardware consists of a Vicon tracking system with 12 IR cameras for drone positioning, a PC with the mocap framework, a PC with the CV system and the drone-control framework, a Logitech HD Pro Webcam C920 (30 FPS) for recognizing the operator's hand movements and gestures, small Crazyflie 2.0 quadcopters, and a PC with the Unity environment for visual feedback provided to the operator (Fig. 3). Communication between all systems is performed through the ROS framework.

2.2. Gesture recognition

The human-swarm interface consists of two components: hand tracking and gesture recognition modules. The hand tracking module is built on the Mediapipe framework. It provides high-fidelity tracking of the hand by employing Machine Learning (ML) to infer 21 key points of a human hand from a single captured frame. The gesture recognition module is based on a Deep Neural Network (DNN) to achieve high precision in human gesture classification, which is used for drone control and trajectory generation.

For convenient drone control we propose 8 gestures: “one”, “two”, “three”, “four”, “five”, “okay”, “rock”, and “thumbs up”. A gesture dataset for model training was recorded by five participants. It consists of 8000 arrays with the coordinates of 21 key points of a human hand: 1000 per gesture (200 per person). We used normalized landmarks, i.e., angles between joints and pairwise landmark distances, as features to predict the gesture class. This resulted in an accuracy of 99.75% when validating on a test set (Fig. 4).

Figure 4. The classification accuracy of the developed gesture recognition system.
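The feature description above can be illustrated with a short sketch. It is not the authors' implementation: the normalization scheme (wrist at the origin, unit scale), the choice of three joint angles per finger, and the function names are assumptions made for illustration. Only the 21-landmark layout (index 0 is the wrist, followed by four landmarks per finger) follows the Mediapipe hand model.

```python
import math
from itertools import combinations

# Mediapipe hand layout: 0 is the wrist, then four landmarks per finger.
FINGER_CHAINS = [
    [0, 1, 2, 3, 4],      # thumb
    [0, 5, 6, 7, 8],      # index
    [0, 9, 10, 11, 12],   # middle
    [0, 13, 14, 15, 16],  # ring
    [0, 17, 18, 19, 20],  # pinky
]


def normalize(landmarks):
    """Move the wrist to the origin and scale the farthest landmark to distance 1.

    This normalization is an assumption; the paper does not specify one.
    """
    wrist = landmarks[0]
    shifted = [tuple(c - w for c, w in zip(p, wrist)) for p in landmarks]
    scale = max(math.hypot(*p) for p in shifted)
    if scale == 0:
        raise ValueError("degenerate hand: all landmarks coincide")
    return [tuple(c / scale for c in p) for p in shifted]


def joint_angle(a, b, c):
    """Angle at b, in radians, between the segments b->a and b->c."""
    u = [x - y for x, y in zip(a, b)]
    v = [x - y for x, y in zip(c, b)]
    norm = math.hypot(*u) * math.hypot(*v)
    if norm == 0:
        return 0.0
    cos = sum(x * y for x, y in zip(u, v)) / norm
    return math.acos(max(-1.0, min(1.0, cos)))


def landmark_features(landmarks):
    """Return 210 pairwise distances followed by 15 joint angles."""
    if len(landmarks) != 21:
        raise ValueError("expected 21 landmarks")
    pts = normalize(landmarks)
    distances = [math.dist(p, q) for p, q in combinations(pts, 2)]
    angles = [
        joint_angle(pts[a], pts[b], pts[c])
        for chain in FINGER_CHAINS
        for a, b, c in zip(chain, chain[1:], chain[2:])
    ]
    return distances + angles
```

Because the features are built from normalized landmarks, the vector does not change when the hand is shifted or uniformly scaled in the image, which is one plausible reason to prefer them over raw coordinates as classifier input.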

Using the landmark coordinates, we calculate the position and size of the hand in the image received from the USB camera. The palm size is then used to estimate the distance between the hand and the camera.
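One common way to turn an apparent palm size into a distance is the pinhole camera relation, in which the distance is inversely proportional to the size in pixels. The sketch below assumes this model; the paper does not state which relation was used, and the function names, the default palm width of 0.085 m, and the one-shot calibration step are all hypothetical.

```python
def calibrate_focal_length_px(palm_size_px, known_distance_m, palm_size_m=0.085):
    """Estimate the focal length in pixels from one measurement at a known distance.

    palm_size_m is an assumed physical palm width, not a value from the paper.
    """
    if palm_size_px <= 0 or known_distance_m <= 0:
        raise ValueError("palm size and distance must be positive")
    return palm_size_px * known_distance_m / palm_size_m


def hand_distance_m(palm_size_px, focal_length_px, palm_size_m=0.085):
    """Pinhole model: distance = focal length * real size / size in pixels."""
    if palm_size_px <= 0:
        raise ValueError("palm size in pixels must be positive")
    return focal_length_px * palm_size_m / palm_size_px


# Usage: calibrate once with the hand at 0.5 m, then reuse the focal length.
focal = calibrate_focal_length_px(palm_size_px=170.0, known_distance_m=0.5)
distance = hand_distance_m(palm_size_px=85.0, focal_length_px=focal)
```

Under this model, a palm that appears half as large in the image is estimated to be twice as far from the camera.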

2.3. Trajectory processing

To ensure a smooth flight of the drone, a trajectory with equidistant flight coordinates is required. The drawn trajectory may be uneven or contain unevenly distributed coordinates. Therefore, the trajectory processing module smooths the drawn trajectory with an alpha-beta filter (filtration coefficient equals 0.7) and then interpolates it uniformly (Fig. 5). After that, the trajectory coordinates are transformed from the DronePaint interface screen (pixels) to the flight zone coordinate system (meters). The coordinates of the generated trajectory are sequentially transferred to the drone control system using the ROS framework. The time intervals between each coordinate transfer depend on the distance between the coordinates and the flight speed of the drone.
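The processing steps described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the coefficient 0.7 is taken to be the alpha gain, the beta gain of 0.1 is an assumed value, the pixel-to-meter mapping is a plain linear scaling onto a hypothetical rectangular flight zone, and all function names are invented for the example.

```python
import math


def alpha_beta_smooth(points, alpha=0.7, beta=0.1, dt=1.0):
    """Smooth 2D points with an alpha-beta filter (beta and dt are assumed values)."""
    if not points:
        return []
    x, y = points[0]
    vx = vy = 0.0
    smoothed = [(x, y)]
    for zx, zy in points[1:]:
        px, py = x + vx * dt, y + vy * dt          # predict from the last state
        rx, ry = zx - px, zy - py                  # residual against the measurement
        x, y = px + alpha * rx, py + alpha * ry    # correct the position
        vx, vy = vx + beta * rx / dt, vy + beta * ry / dt  # correct the velocity
        smoothed.append((x, y))
    return smoothed


def resample_uniform(points, spacing):
    """Linearly interpolate points at equal arc-length steps along the polyline."""
    if len(points) < 2 or spacing <= 0:
        return list(points)
    out = [points[0]]
    carried = 0.0  # path length covered since the last emitted point
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0:
            continue
        d = spacing - carried
        while d <= seg:
            t = d / seg
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            d += spacing
        carried = seg - (d - spacing)
    return out


def pixels_to_meters(points, frame_size_px, zone_size_m):
    """Scale screen pixels to flight-zone meters; image y grows downward, so flip it."""
    w_px, h_px = frame_size_px
    w_m, h_m = zone_size_m
    return [(x / w_px * w_m, (1 - y / h_px) * h_m) for x, y in points]


def send_intervals(points_m, speed_m_s):
    """Time between consecutive setpoints: segment length divided by flight speed."""
    return [math.dist(p, q) / speed_m_s for p, q in zip(points_m, points_m[1:])]
```

A trailing piece of the path shorter than the spacing is dropped by `resample_uniform`, so the last setpoint may fall slightly before the end of the drawn stroke.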

Figure 5. Hand trajectory normalization. Trajectory recorded from hand movement (red line). Trajectory with filtration and linear interpolation (blue line).

2.4. Swarm control algorithm

The potential field approach was adapted and applied to UAVs for robust path planning and collision avoidance between the swarm units. The basic principle of this method lies in a modeled force field composed of two opposing forces: an attractive force and a repulsive force. The attractive force pulls the UAV toward the desired position, located on the drawn and processed trajectory, while the repulsive force pushes the UAV away from obstacles and other swarm units. The repulsive force centers are located on the obstacle surfaces and on the other UAVs.
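The principle can be illustrated with the classical artificial potential field formulation: a linear attractive term and a repulsive term that acts only within an influence radius. The sketch below uses that textbook form, which may differ from the adjusted version used in DronePaint; the gains, the influence radius, and the function names are assumed for illustration.

```python
import math


def attractive_force(pos, goal, k_att=1.0):
    """Pull toward the goal, proportional to the offset (k_att is an assumed gain)."""
    return tuple(k_att * (g - p) for p, g in zip(pos, goal))


def repulsive_force(pos, center, k_rep=0.05, influence_radius=0.5):
    """Push away from a repulsive center, active only inside the influence radius."""
    d = math.dist(pos, center)
    if d == 0 or d >= influence_radius:
        # Outside the radius there is no effect; at d == 0 the direction is undefined.
        return tuple(0.0 for _ in pos)
    gain = k_rep * (1.0 / d - 1.0 / influence_radius) / d ** 2
    return tuple(gain * (p - c) / d for p, c in zip(pos, center))


def total_force(pos, goal, repulsive_centers):
    """Sum the attractive force and the repulsive forces from all centers."""
    force = list(attractive_force(pos, goal))
    for center in repulsive_centers:
        for i, component in enumerate(repulsive_force(pos, center)):
            force[i] += component
    return tuple(force)


# Usage: one drone heading to its trajectory point with a neighbor 0.25 m to its right.
f = total_force((0.0, 0.0, 1.0), (1.0, 0.0, 1.0), [(0.25, 0.0, 1.0)])
```

In the usage line, the neighbor sits between the drone and its goal, so the repulsive term opposes the attractive one along the x axis. A controller would typically convert the summed force into a velocity or position setpoint for each drone.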

3. Experimental Evaluation


We invited 7 participants aged 22 to 28 years (mean=24.7, std=1.98) to test the DronePaint system. Of these, 14.3% had never interacted with drones before, 28.6% regularly deal with drones, and almost 87% were familiar with CV-based systems or had some experience with gesture recognition. To evaluate the performance of the proposed interface, we collected trajectories drawn by the participants with gestures (Fig. 6) and with a computer mouse. Mouse drawing serves as the reference for assessing the convenience and accuracy of the proposed method.

Figure 6. View of user screen. Hand-drawn trajectory and target trajectory of square shape are illustrated by red and green lines, respectively.

After that, the three best attempts were chosen to evaluate the performance of users with two trajectory generating interfaces. Fig. 7 shows two of three ground truth paths (solid blue line), where for each one a user traces the path several times by hand gesture motion (red dashed line) and a mouse (green dashed line).

3.1. Trajectory tracing error

The comparative results of gesture-drawn and mouse-drawn trajectory generation are presented in Table 1.

                  Square            Circle            Triangle
                  H       M         H       M         H       M
Max error, cm     18.61   10.51     17.36   8.18      12.81   6.23
Mean error, cm    6.45    3.69      6.33    3.29      4.13    2.19
RMSE, cm          8.09    4.61      7.85    4.03      5.15    2.70
Time, sec         15.50   5.52      13.47   4.89      12.04   4.50

Table 1. Comparison experiment for trajectories drawn by the hand (H) and mouse (M).
Figure 7. Square trajectory drawn by the hand gestures (red dashed line) and mouse (green dashed line).
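The error metrics reported in Table 1 can be computed along the following lines. The paper does not specify how the deviation from the ground truth path is measured, so this sketch assumes the error of each drawn point is its shortest distance to the ground truth polyline; the function names are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b in 2D."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def tracing_errors(drawn, reference):
    """Max error, mean error, and RMSE of drawn points against a reference polyline."""
    if not drawn or len(reference) < 2:
        raise ValueError("need at least one drawn point and two reference points")
    segments = list(zip(reference, reference[1:]))
    errors = [
        min(point_segment_distance(p, a, b) for a, b in segments) for p in drawn
    ]
    n = len(errors)
    return {
        "max": max(errors),
        "mean": sum(errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
    }


# Usage: a closed unit square as ground truth and three drawn points.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
metrics = tracing_errors([(0.5, 0.1), (1.2, 0.5), (0.5, 1.0)], square)
```

The units of the result follow the units of the input coordinates, so trajectories expressed in centimeters yield errors in centimeters, as in Table 1.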

The experimental results showed that the overall mean error equals 5.6 cm (95% confidence interval [CI], 4.6 cm to 6.6 cm) for the gesture-drawn trajectories and 3.1 cm (95% CI, 2.7 cm to 3.5 cm) for the mouse-drawn trajectories. The ANOVA results showed a statistically significant difference between the user’s interaction with trajectory patterns: F = 4.006, p-value = 0.025 < 0.05. On average, the trajectory generated with gestures deviates 2.5 cm farther from the “ground truth” path than the one drawn with a computer mouse. The higher positional error presumably occurred due to the lack of tangible experience during trajectory generation with the gesture interface. This problem could potentially be solved by integrating a haptic device, which would allow users to feel the displacement of their hand and position it more precisely.

4. Conclusions and Future Work

In this paper, a novel swarm control interface is proposed, in which the user leads the swarm by path drawing with the DNN-based gesture recognition and trajectory generation systems. Thus, DronePaint delivers a convenient and intuitive toolkit to the user without any additional devices, achieving both high accuracy and variety of swarm control. The developed system allowed the participants to achieve high accuracy in trajectory generation (mean error of 5.6 cm, which is 2.5 cm higher than with mouse input, and maximum error up to 9 cm higher than with mouse input).

In future work, we plan to add full-body tracking to control a multitude of agents. For example, the movement of the body or hands could change the drone’s speed and orientation, which would increase the number of human-swarm interaction scenarios. Additionally, we will explore methods to control a swarm with different positioning systems, such as GPS for outdoor performance. Finally, a swarm behavior algorithm for distributing tasks between drones will increase path generation performance and quality.

The proposed DronePaint system can potentially have a big impact on movie shooting, where a swarm of spotlights controlled by an operator could provide the desired lighting conditions. Additionally, it can be used in a new generation of light shows where each spectator will be able to control the drone display, e.g., navigate the plane, launch the rocket, or even draw the rainbow in the night sky.

The reported study was funded by RFBR and CNRS, project number 21-58-15006.


References

  • J. Alonso-Mora, S. Haegeli Lohaus, P. Leemann, R. Siegwart, and P. Beardsley (2015) Gesture based human - multi-robot swarm interaction and its application to an interactive display. In 2015 IEEE International Conference on Robotics and Automation (ICRA), pp. 5948–5953.
  • S. Byun and S. Lee (2019) Implementation of hand gesture recognition device applicable to smart watch based on flexible epidermal tactile sensor array. Micromachines 10 (10). ISSN 2072-666X.
  • M. Chen, P. Zhang, Z. Wu, and X. Chen (2020) A multichannel human-swarm robot interaction system in augmented reality. Virtual Reality & Intelligent Hardware 2 (6), pp. 518–533. ISSN 2096-5796.
  • C. Dubois (2015) Sparked: a live interaction between humans and quadcopters. In ACM SIGGRAPH 2015 Computer Animation Festival, SIGGRAPH ’15, New York, NY, USA, pp. 140. ISBN 9781450333269.
  • A. Gomes, C. Rubens, S. Braley, and R. Vertegaal (2016) BitDrones: towards using 3d nanocopter displays as interactive self-levitating programmable matter. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, CHI ’16, New York, NY, USA, pp. 770–780. ISBN 9781450333627.
  • L. H. Kim, D. S. Drew, V. Domova, and S. Follmer (2020) User-defined swarm robot control. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, CHI ’20, New York, NY, USA, pp. 1–13. ISBN 9781450367080.
  • P. Knierim, S. Maurer, K. Wolf, and M. Funk (2018) Quadcopter-projected in-situ navigation cues for improved location awareness. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, CHI ’18, New York, NY, USA, pp. 1–6. ISBN 9781450356206.
  • A. Suresh and S. Martínez (2019) Gesture based human-swarm interactions for formation control using interpreters. IFAC-PapersOnLine 51 (34), pp. 83–88. Note: 2nd IFAC Conference on Cyber-Physical and Human Systems CPHS 2018. ISSN 2405-8963.
  • E. Tsykunov, R. Agishev, R. Ibrahimov, L. Labazanova, A. Tleugazy, and D. Tsetserukou (2019) SwarmTouch: guiding a swarm of micro-quadrotors with impedance control using a wearable tactile interface. IEEE Transactions on Haptics 12 (3), pp. 363–374.
  • A. Uryasheva, M. Kulbeda, N. Rodichenko, and D. Tsetserukou (2019) DroneGraffiti: autonomous multi-uav spray painting. In ACM SIGGRAPH 2019 Studio, SIGGRAPH ’19, New York, NY, USA. ISBN 9781450363167.
  • A. S. Vempati, M. Kamel, N. Stilinovic, Q. Zhang, D. Reusser, I. Sa, J. Nieto, R. Siegwart, and P. Beardsley (2018) PaintCopter: an autonomous uav for spray painting on three-dimensional surfaces. IEEE Robotics and Automation Letters 3 (4), pp. 2862–2869.