I Introduction
Unmanned Aerial Vehicles (UAVs) require high-precision control of aircraft positioning, especially during landing and takeoff. This problem is challenging largely due to complex interactions of rotor airflows with the ground. The aerospace community has long recognized that aerodynamic forces change when helicopters or aircraft fly close to the ground. Such ground effects cause an increased lift force and a reduced aerodynamic drag, which can be both helpful and disruptive to flight stability [1], and the complications are exacerbated with multiple rotors. Therefore, performing automatic landing of UAVs is risk-prone, and requires expensive high-precision sensors as well as carefully designed controllers.
Compensating for ground effect is a long-standing problem in the aerial robotics community. Prior work has largely focused on mathematical modeling (e.g. [2]) as part of system identification (ID). These ground-effect models are later used to approximate aerodynamic forces during flights close to the ground and combined with controller design for feed-forward cancellation (e.g. [3]). However, existing theoretical ground effect models are derived based on steady-flow conditions, whereas most practical cases exhibit unsteady flow. Alternative approaches, such as integral or adaptive control methods, often suffer from slow response and delayed feedback. [4] employs Bayesian Optimization for open-air control but not for takeoff/landing. Given these limitations, the precision of existing fully automated systems for UAVs is still insufficient for landing and takeoff, thereby necessitating the guidance of a human UAV operator during those phases.
To capture complex aerodynamic interactions without being overly constrained by conventional modeling assumptions, we take a machine-learning (ML) approach to build a black-box ground effect model using Deep Neural Networks (DNNs). However, incorporating black-box models into a UAV controller faces three key challenges. First, DNNs are notoriously data-hungry and it is challenging to collect sufficient real-world training data. Second, due to high dimensionality, DNNs can be unstable and generate unpredictable output, which makes the system susceptible to instability in the feedback control loop. Third, DNNs are often difficult to analyze, which makes it difficult to design provably stable DNN-based controllers.
The aforementioned challenges pervade previous works using DNNs to capture high-order non-stationary dynamics. For example, [5, 6] use DNNs to improve system ID of helicopter aerodynamics but not for downstream controller design. Other approaches aim to generate reference inputs or trajectories from DNNs [7, 8, 9, 10]. However, such approaches can lead to challenging optimization problems [7], or rely heavily on a well-designed closed-loop controller and require a large amount of labeled training data [8, 9, 10]. A more classical approach of using DNNs is direct inverse control [11, 12, 13], but the non-parametric nature of a DNN controller also makes it challenging to guarantee stability and robustness to noise. [14] proposes a provably stable model-based Reinforcement Learning method based on Lyapunov analysis. However, their approach requires a potentially expensive discretization step and relies on the native Lipschitz constant of the DNN.
Contributions. In this paper, we propose a learning-based controller, NeuralLander, to improve the precision of quadrotor landing with guaranteed stability. Our approach directly learns the ground effect on coupled unsteady aerodynamics and vehicular dynamics. We use deep learning for system ID of residual dynamics and then integrate it with nonlinear feedback linearization control.
We train DNNs with spectral normalization of layer-wise weight matrices. We prove that the resulting controller is globally exponentially stable under bounded learning errors. This is achieved by exploiting the Lipschitz bound of spectrally normalized DNNs. It has earlier been shown that spectral normalization of DNNs leads to good generalization, i.e. stability in a learning-theoretic sense [15]. It is intriguing that spectral normalization simultaneously guarantees stability both in a learning-theoretic and a control-theoretic sense.
We evaluate NeuralLander for trajectory tracking of a quadrotor during takeoff, landing and near-ground maneuvers. NeuralLander is able to land a quadrotor much more accurately than a naive PD controller with a pre-identified system. In particular, we show that compared to the PD controller, NeuralLander can decrease error in the $z$ direction from 0.13 m to zero, and mitigate $x$ and $y$ drifts by 90% and 34% respectively, in 1D landing. Meanwhile, NeuralLander can decrease $z$ error from 0.12 m to zero in 3D landing (demo videos: https://youtu.be/C_K8MkC_SSQ). We also demonstrate that the learned ground-effect model can handle temporal dependency, and is an improvement over the steady-state theoretical models in use today.
II Problem Statement: Quadrotor Landing
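Given quadrotor states as global position $p \in \mathbb{R}^3$, velocity $v \in \mathbb{R}^3$, attitude rotation matrix $R \in SO(3)$, and body angular velocity $\omega \in \mathbb{R}^3$, we consider the following dynamics:
$$\dot{p} = v, \quad m\dot{v} = m g + R f_u + f_a, \quad \dot{R} = R S(\omega), \quad J\dot{\omega} = J\omega \times \omega + \tau_u + \tau_a, \tag{1}$$
where $m$ and $J$ are the mass and inertia matrix of the vehicle, $S(\cdot)$ is the skew-symmetric mapping, $g = [0, 0, -g]^\top$ is the gravity vector, and $f_u = [0, 0, T]^\top$ and $\tau_u = [\tau_x, \tau_y, \tau_z]^\top$ are the total thrust and body torques from four rotors predicted by a nominal model. We use $\eta = [T, \tau_x, \tau_y, \tau_z]^\top$ to denote the output wrench. The control input of squared motor speeds, $u = [n_1^2, n_2^2, n_3^2, n_4^2]^\top$, is linearly related to the output wrench, with the nominal relation given as $\eta = B_0 u$:
$$B_0 = \begin{bmatrix} c_T & c_T & c_T & c_T \\ 0 & c_T l_{arm} & 0 & -c_T l_{arm} \\ -c_T l_{arm} & 0 & c_T l_{arm} & 0 \\ -c_Q & c_Q & -c_Q & c_Q \end{bmatrix}, \tag{2}$$
where $c_T$ and $c_Q$ denote some empirical coefficient values for force and torque generated by an individual rotor, and $l_{arm}$ denotes the length of each rotor arm.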
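As a rough illustration of the nominal relation between squared motor speeds and the output wrench, the sketch below builds a 4×4 allocation matrix and applies it to a vector of motor speeds. The numeric coefficients (`C_T`, `C_Q`, `L_ARM`), the rotor ordering, and the sign pattern are assumptions chosen for illustration, not values from the bench test.

```python
# Hypothetical coefficients; real values would come from system identification.
C_T = 1.0e-7   # assumed thrust coefficient per rotor, N / RPM^2
C_Q = 1.0e-9   # assumed torque coefficient per rotor, N*m / RPM^2
L_ARM = 0.2    # assumed rotor arm length, m

# Assumed sign convention: rotors 2 and 4 act on roll, rotors 1 and 3 on pitch,
# and neighbouring rotors spin in opposite directions.
B0 = [
    [C_T, C_T, C_T, C_T],                    # total thrust T
    [0.0, C_T * L_ARM, 0.0, -C_T * L_ARM],   # roll torque
    [-C_T * L_ARM, 0.0, C_T * L_ARM, 0.0],   # pitch torque
    [-C_Q, C_Q, -C_Q, C_Q],                  # yaw torque
]

def wrench_from_motor_speeds(rpm):
    """Map four motor speeds to [T, tau_x, tau_y, tau_z] via eta = B0 u."""
    u = [n * n for n in rpm]  # control input: squared motor speeds
    return [sum(b * ui for b, ui in zip(row, u)) for row in B0]

eta = wrench_from_motor_speeds([5000.0, 5000.0, 5000.0, 5000.0])
```

With equal motor speeds the three torques cancel and only the total thrust remains, which is a quick sanity check on any chosen sign pattern.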
The key difficulty of precise landing is the influence of unknown disturbance forces $f_a = [f_{a,x}, f_{a,y}, f_{a,z}]^\top$ and torques $\tau_a = [\tau_{a,x}, \tau_{a,y}, \tau_{a,z}]^\top$, which originate from complex aerodynamic interactions between the quadrotor and the environment. For example, during the landing process, when the quadrotor is close to the ground, the vertical aerodynamic force $f_{a,z}$ will be significant. Also, as $v$ increases, air drag will be exacerbated, which contributes to $f_a$.
Problem Statement: For (1), our goal is to learn the unknown disturbance forces $f_a$ and torques $\tau_a$ from partial states and control inputs, in order to improve the controller accuracy. In this paper, we are only interested in position dynamics (the first two equations in eq. 1). As we mainly focus on landing and takeoff, the attitude dynamics is limited and the aerodynamic disturbance torque $\tau_a$ is bounded. We take a deep learning approach by approximating $f_a$ using a Deep Neural Network (DNN), followed by spectral normalization to guarantee the stability of the DNN outputs. We then design an exponentially-stabilizing controller that is more robust than one using only the nominal system dynamics. Training is done offline, and the learned dynamics is applied in the onboard controller in real time.
III Learning Stable DNN Dynamics
To learn the residual dynamics, we employ a deep neural network with Rectified Linear Unit (ReLU) activations. In general, DNNs equipped with ReLU converge faster during training, demonstrate more robust behavior with respect to hyperparameter changes, and have fewer vanishing gradient problems compared to other activation functions such as sigmoid and tanh [16].
III-A ReLU Deep Neural Networks
A ReLU deep neural network represents the functional mapping from the input $x$ to the output $f(x, \theta)$, parameterized by the DNN weights $\theta = W^1, \cdots, W^{L+1}$:
$$f(x, \theta) = W^{L+1} \phi\big(W^{L} \phi(W^{L-1} \cdots \phi(W^{1} x) \cdots)\big), \tag{3}$$
where the activation function $\phi(\cdot) = \max(\cdot, 0)$ is called the element-wise ReLU function. ReLU is less computationally expensive than tanh and sigmoid because it involves simpler mathematical operations. However, deep neural networks are usually trained by first-order gradient-based optimization, which is highly dependent on the curvature of the training objective and can be very unstable [17]. To alleviate this issue, we apply the spectral normalization technique [15] in the feedback control loop to guarantee stability.
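The layered mapping described above can be sketched in a few lines of plain Python. This is a minimal illustration assuming bias-free layers and no activation on the output layer; the helper names and the toy weights are hypothetical.

```python
def relu(v):
    """Element-wise ReLU: max(x, 0) for every entry."""
    return [max(x, 0.0) for x in v]

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relu_network(x, weights):
    """Apply ReLU after every layer except the last, which stays linear."""
    h = x
    for W in weights[:-1]:
        h = relu(matvec(W, h))
    return matvec(weights[-1], h)

# Toy two-layer network (hypothetical weights).
W1 = [[1.0, -1.0], [0.5, 2.0]]
W2 = [[1.0, 1.0]]
y = relu_network([1.0, 2.0], [W1, W2])
```

Here the first layer produces `[-1.0, 4.5]`, the ReLU clips the negative entry to zero, and the linear output layer sums the result.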
III-B Spectral Normalization
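Spectral normalization stabilizes DNN training by constraining the Lipschitz constant of the objective function. Spectral normalization has also been shown to generalize well [18], and in machine learning generalization is a notion of stability. Mathematically, the Lipschitz constant of a function $\|f\|_{\text{Lip}}$ is defined as the smallest value such that
$$\forall\, x, x': \quad \frac{\|f(x) - f(x')\|_2}{\|x - x'\|_2} \le \|f\|_{\text{Lip}}.$$
It is known that the Lipschitz constant of a general differentiable function $f$ is the maximum spectral norm (maximum singular value) of its gradient over its domain: $\|f\|_{\text{Lip}} = \sup_x \sigma(\nabla f(x))$.
The ReLU DNN in eq. 3 is a composition of functions. Thus we can bound the Lipschitz constant of the network by constraining the spectral norm of the linear map $g^l(x) = W^l x$ in each layer. For a linear map $g(x) = Wx$, the spectral norm is given by $\|g\|_{\text{Lip}} = \sup_x \sigma(\nabla g(x)) = \sup_x \sigma(W) = \sigma(W)$. Using the fact that the Lipschitz norm of the ReLU activation function $\phi(\cdot)$ is equal to 1, with the inequality $\|g_1 \circ g_2\|_{\text{Lip}} \le \|g_1\|_{\text{Lip}} \cdot \|g_2\|_{\text{Lip}}$, we can find the following bound on $\|f\|_{\text{Lip}}$:
$$\|f\|_{\text{Lip}} \le \|g^{L+1}\|_{\text{Lip}} \cdot \|\phi\|_{\text{Lip}} \cdots \|g^{1}\|_{\text{Lip}} = \prod_{l=1}^{L+1} \sigma(W^l). \tag{4}$$
In practice, we can apply spectral normalization to the weight matrices in each layer during training as follows:
$$\bar{W} = \frac{W}{\sigma(W)} \cdot \gamma^{\frac{1}{L+1}}, \tag{5}$$
where $\gamma$ is the intended Lipschitz constant of the network.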
The following lemma bounds the Lipschitz constant of a ReLU DNN with spectral normalization.
Lemma III.1
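For a multi-layer ReLU network $f(x, \theta)$, defined in eq. 3 without an activation function on the output layer, using spectral normalization, the Lipschitz constant of the entire network satisfies
$$\|f(x, \bar{\theta})\|_{\text{Lip}} \le \gamma,$$
with spectrally-normalized parameters $\bar{\theta} = \bar{W}^1, \cdots, \bar{W}^{L+1}$.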
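To make the bound concrete, the following sketch estimates each layer's largest singular value by power iteration and rescales the weights so that the product of the layer norms equals a target constant. It is a plain-Python illustration under assumptions: the function names, the toy weights, and the target `gamma = 2.0` are hypothetical, and a practical implementation would use a deep learning library's own routines.

```python
import math
import random

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def vec_norm(v):
    return math.sqrt(sum(x * x for x in v))

def spectral_norm(W, iters=200, seed=0):
    """Estimate the largest singular value of W by power iteration on W^T W.
    Assumes W is not the zero matrix."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in W[0]]
    Wt = [list(col) for col in zip(*W)]
    for _ in range(iters):
        v = matvec(Wt, matvec(W, v))
        n = vec_norm(v)
        v = [x / n for x in v]
    return vec_norm(matvec(W, v))

def spectrally_normalize(weights, gamma):
    """Rescale every layer so that sigma(W_bar) = gamma ** (1 / number_of_layers)."""
    per_layer = gamma ** (1.0 / len(weights))
    normalized = []
    for W in weights:
        scale = per_layer / spectral_norm(W)
        normalized.append([[w * scale for w in row] for row in W])
    return normalized

weights = [[[3.0, 0.0], [0.0, 1.0]], [[0.0, 2.0], [0.5, 0.0]]]
gamma = 2.0
normalized = spectrally_normalize(weights, gamma)

# Product of layer spectral norms: an upper bound on the network's Lipschitz constant.
lipschitz_bound = 1.0
for W in normalized:
    lipschitz_bound *= spectral_norm(W)
```

After normalization, `lipschitz_bound` matches `gamma` up to numerical error, which is the quantity the lemma bounds.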
III-C Constrained Training
We apply first-order gradient-based optimization to train the ReLU DNN. Estimating $f_a$ in (1) boils down to optimizing the parameters $\theta$ in the ReLU network model in eq. 3, given the observed value of $x$ and the target output. In particular, we want to control the Lipschitz constant of the ReLU network.
The optimization objective is as follows, where we minimize the prediction error with constrained Lipschitz constant:
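$$\underset{\theta}{\text{minimize}} \ \sum_{t=1}^{T} \frac{1}{T} \|y_t - f(x_t, \theta)\|_2 \quad \text{subject to} \quad \|f\|_{\text{Lip}} \le \gamma. \tag{6}$$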
Here $y_t$ denotes the observed disturbance forces and $x_t$ the observed states and control inputs. According to the upper bound in eq. 4, we can substitute the constraint by minimizing the spectral norm of the weights in each layer. We use stochastic gradient descent (SGD) to optimize eq. 6 and apply spectral normalization to regulate the weights. From Lemma III.1, the trained ReLU DNN has a bounded Lipschitz constant.
IV Neural Lander Controller Design
We design our controller to allow 3D landing trajectory tracking for quadrotors. Our controller integrates a DNN-based dynamic learning module with a proportional-derivative (PD) controller. In order to keep the design simple, we re-design the PD controller to account for the disturbance force term learned from the ReLU DNN. We solve for the resulting nonlinear controller using fixed-point iteration.
IV-A Reference Trajectory Tracking
The position tracking error is defined as . We design an integral controller with the composite variable:
(7) 
with as a positive diagonal matrix. Then is a manifold on which exponentially quickly. Now we have transformed the position tracking problem to a velocity tracking one, we would like the actual force exerted by the rotor to satisfy:
(8) 
so that the closed-loop dynamics would simply become . Hence, these exponentially-stabilizing dynamics guarantee that converge exponentially and globally to with bounded error, if is bounded [19, 20] (see Sec. V). Let denote the total desired force vector from the quadrotor, then total thrust and desired force direction can be computed from eq. 8,
(9) 
with being the direction of rotor thrust (typically axis of quadrotors). Using and fixing a desired yaw angle, SO(3) or a desired value of any attitude representation can be obtained [21]. We assume the attitude controller comes in the form of desired torque to be generated by the four rotors. One such example is:
(10) 
where with , or see [20] for SO(3) tracking control. Note that eq. 10 guarantees exponential tracking of a desired attitude trajectory within some bounded error in the presence of some disturbance torques.
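The step from a desired force vector to a thrust magnitude and a desired thrust direction can be sketched as follows. This is only an illustration under an assumed form: the thrust is taken as the projection of the desired force onto the current rotor axis, and the desired direction as the normalized desired force. The function name and the example vectors are hypothetical.

```python
import math

def thrust_and_direction(f_d, k_hat):
    """Return (thrust magnitude, desired thrust direction) for a desired force.

    Assumed form: T_d = f_d . k_hat and k_hat_d = f_d / ||f_d||, where k_hat is
    the current rotor thrust axis expressed in the world frame.
    """
    t_d = sum(a * b for a, b in zip(f_d, k_hat))
    magnitude = math.sqrt(sum(a * a for a in f_d))
    k_hat_d = [a / magnitude for a in f_d]
    return t_d, k_hat_d

# Desired force tilted toward +y while the vehicle is currently level.
T_d, k_hat_d = thrust_and_direction([0.0, 3.0, 4.0], [0.0, 0.0, 1.0])
```

The attitude controller would then be asked to rotate the vehicle so that its thrust axis aligns with `k_hat_d`.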
IV-B Learning-based Discrete-time Nonlinear Controller
Using methods described in Sec. III, we define as the approximation to the disturbance aerodynamic forces, with being the partial states used as input features. Then desired total force is revised as .
Because of the dependency of on , the control synthesis problem here uses a non-affine control input for :
(11) 
With , we propose the following fixed-point iterative method for solving eq. 11:
(12) 
and is the control input from the previous timestep in the controller. The stability of the system and convergence of controller eq. 12 will be proved in Sec. V.
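The fixed-point idea can be illustrated on a scalar toy problem. The sketch below assumes a single thrust channel in which the control `u` must satisfy `b0 * u + f_hat(u) = f_bar`, where `f_hat` stands in for the learned disturbance model; the function names, the gain `b0`, and the toy `f_hat` are hypothetical and are not the paper's controller.

```python
import math

def solve_fixed_point(f_bar, f_hat, b0, u_prev, iters=1):
    """Iterate u <- (f_bar - f_hat(u)) / b0, starting from the previous control input."""
    u = u_prev
    for _ in range(iters):
        u = (f_bar - f_hat(u)) / b0
    return u

def f_hat(u):
    """Toy disturbance model with Lipschitz constant 0.3 in u (hypothetical)."""
    return 0.3 * math.tanh(u)

# With b0 = 1 the iteration contracts by a factor of at most 0.3 per step.
u = solve_fixed_point(2.0, f_hat, 1.0, u_prev=0.0, iters=50)
residual = abs(1.0 * u + f_hat(u) - 2.0)
```

In a real-time loop one would typically run a single iteration per control step, warm-started from the previous timestep's input, rather than iterating to convergence as done here for checking.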
V Nonlinear Stability Analysis
The closed-loop tracking error analysis provides direct guidance on how to tune the neural network and controller parameters to improve control performance and robustness.
V-A Control Allocation as Contraction Mapping
We first show that converges to the solution of eq. 11 when all states are fixed.
Lemma V.1
Fixing all current states, define mapping based on eq. 12:
(13) 
If is Lipschitz continuous, and ; then is a contraction mapping, and converges to unique solution of .
with being a compact set of feasible control inputs; and given fixed states as , and , then:
Thus, . Hence, is a contraction mapping.
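For reference, a contraction mapping also gives explicit convergence rates. The following is the standard Banach fixed-point estimate, written with assumed symbols: $F$ for the mapping, $u_k = F(u_{k-1})$ for the iterates, $u^*$ for the fixed point, and $\alpha < 1$ for the contraction factor.

```latex
\|u_k - u^*\| \;\le\; \alpha^{k}\,\|u_0 - u^*\|,
\qquad
\|u_k - u^*\| \;\le\; \frac{\alpha^{k}}{1-\alpha}\,\|u_1 - u_0\| .
```

The second inequality is useful in practice because it bounds the remaining error using only the first step, without knowing $u^*$.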
V-B Stability of Learning-based Nonlinear Controller
Before continuing to prove stability of the full system, we make the following assumptions.
Assumption 1
The desired states along the position trajectory , , and are bounded.
Note that trajectory generation can guarantee tight bounds through optimization [21, 22] or simple clipping.
Assumption 2
updates much faster than the position controller, and the one-step difference of the control signal satisfies with a small positive .
Tikhonov's Theorem (Theorem 11.1 [23]) provides a foundation for such a time-scale separation, where converges much faster than the slower dynamics. From eq. 13, we can derive the following approximate relation with :
By using the fact that the frequencies of attitude control () and motor speed control () are much higher than that of the position controller () in practice, we can safely assume that , , and in one update step become negligible. Furthermore, can be limited internally by the attitude controller. It leads to:
With being a small constant and from Lemma V.1, we can deduce that rapidly converges to a small ultimate bound between each position controller update.
Assumption 3
The approximation error of over the compact sets , is upper bounded by , where .
DNNs have been shown to generalize well to the set of unseen events that are from almost the same distribution as a training set [24, 25]. This empirical observation is also theoretically studied in order to shed more light toward an understanding of the complexity of these models [26, 18, 27, 28]. Our experimental results show that our proposed training method in Sec. III generalizes well on unseen events and results in a better performance on unexplored data (Sec. VI-C). Composing our stability result rigorously with generalization error would be an interesting direction for future work. Based on these assumptions, we can now present our overall robustness result.
Theorem V.2
We begin the proof by selecting a Lyapunov function based on as , then by applying the controller eq. 8, we get the time-derivative of :
Let denote the minimum eigenvalue of the positive-definite matrix . By applying the Lipschitz property of the network approximator (Lemma III.1) and Assumption 2, we obtain
Using the Comparison Lemma [23], we define and to obtain
It can be shown that this leads to finite-gain stability and input-to-state stability (ISS) [29]. Furthermore, the hierarchical combination between and in eq. 7 yields (14). Note that disabling integral control in eq. 7 (i.e., ) results in .
By designing the controller gain and Lipschitz constant of the DNN, we ensure and achieve exponential tracking within bound .
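The notion of exponential tracking within a bound can be visualized with a scalar toy simulation. The sketch assumes a simplified first-order closed loop, `m * ds/dt = -k * s + eps`, with a constant residual `eps` standing in for the bounded learning error. The gain `k`, the residual, and the initial error are hypothetical values, and this simplification omits the Lipschitz-dependent terms of the full analysis.

```python
def simulate_composite_error(m, k, eps, s0, dt=1e-3, steps=10000):
    """Forward-Euler integration of m * ds/dt = -k * s + eps."""
    s = s0
    for _ in range(steps):
        s += dt * (-k * s + eps) / m
    return s

# 1.47 kg is the vehicle mass reported in the experiments; k and eps are assumed.
s_final = simulate_composite_error(m=1.47, k=4.0, eps=0.2, s0=1.0)
bound = 0.2 / 4.0  # steady-state error eps / k for this toy model
```

The error decays at rate `k / m` and settles at `eps / k`, so a larger gain or a smaller learning error tightens the ultimate bound in this toy model.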
VI Experiments
In our experiments, we evaluate both the generalization performance of our DNN as well as the overall control performance of NeuralLander. The experimental setup is composed of 17 motion capture cameras, the communication router for sending signals, and the drone. The data was collected from an Intel Aero quadrotor weighing 1.47 kg with a computer running on it (2.56 GHz Intel Atom x7 processor, 4 GB DDR3 RAM). We retrofit the drone with eight reflective markers to allow for accurate position, attitude and velocity estimation at 100 Hz. The Intel Aero drone and the test space are shown in Fig. 1.
VI-A Bench Test
To identify a good nominal model, we first performed bench tests to estimate , , , , and , which are mass, diameter of rotor, air density, gravity, and thrust coefficient, respectively. The nondimensional thrust coefficient, , is defined as . Note that is a function of propeller speed, , and here we used a nominal value when RPM (the idle RPM) for the following data collection session. How changes with is also discussed in Sec. VI-C.
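For readers unfamiliar with nondimensional thrust coefficients, the sketch below uses the common propeller convention of thrust divided by air density, rotation rate squared, and diameter to the fourth power. Whether the bench test used exactly this convention and these units is an assumption, and all numbers are hypothetical.

```python
def thrust_coefficient(thrust_n, rho, n_rev_per_s, diameter_m):
    """Common propeller convention (assumed here): c_T = T / (rho * n^2 * D^4)."""
    return thrust_n / (rho * n_rev_per_s ** 2 * diameter_m ** 4)

# Hypothetical bench reading: 3.6 N at 100 rev/s with a 0.25 m propeller.
c_t = thrust_coefficient(thrust_n=3.6, rho=1.2, n_rev_per_s=100.0, diameter_m=0.25)
```

Repeating the measurement at several rotation rates shows how the coefficient varies with propeller speed.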
VI-B Real-World Flying Data and Preprocessing
In order to estimate the effect of disturbance force , we collected states and control inputs, while flying the drone close to the ground, manually controlled by an expert pilot.
Our training data is shown in Fig. 2. We collected a single trajectory with varying heights and velocities. The trajectory has two parts. Part I (0 s–250 s in Fig. 2) contains maneuvers at different fixed heights $z$ (0.05 m–1.5 m) with random $x$ and $y$. This can be used to estimate the ground effect. Part II (250 s–350 s in Fig. 2) includes random $x$, $y$ and $z$ motions to cover the feasible state space as much as possible. For this part, we aim to learn about non-dominant aerodynamics such as air drag. We note that our training data is quite modest in size by the standards of deep learning.
Since our learning task is to regress from state and control inputs, we also need output data of . We utilized the relation from eq. 1 to calculate . Here is calculated based on the nominal from the bench test (Sec. VI-A). Our training set consists of sequences of , where is the observed value of . The entire dataset was split into training (60%), test (20%) and validation set (20%) for model hyperparameter tuning.
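The label computation and the data split can be sketched as follows. The code assumes translational dynamics of the form `m * dv/dt = m * g + R * f_u + f_a`, with thrust acting along the body z-axis, and rearranges them for the residual force. It also assumes a chronological 60/20/20 split. Function names and sample values are hypothetical.

```python
def residual_force(m, accel, R, thrust, g=9.81):
    """f_a = m * dv/dt - m * g_vec - R * f_u, with g_vec = [0, 0, -g], f_u = [0, 0, thrust]."""
    gravity = [0.0, 0.0, -g]
    # R * [0, 0, thrust] is the third column of R scaled by the thrust.
    thrust_world = [R[i][2] * thrust for i in range(3)]
    return [m * accel[i] - m * gravity[i] - thrust_world[i] for i in range(3)]

def split_sequence(samples, train=0.6, test=0.2):
    """Split samples in order into training, test and validation parts."""
    n_train = round(train * len(samples))
    n_test = round(test * len(samples))
    return (samples[:n_train],
            samples[n_train:n_train + n_test],
            samples[n_train + n_test:])

# Hover sanity check: zero acceleration, level attitude, thrust equal to weight.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
f_a = residual_force(m=1.47, accel=[0.0, 0.0, 0.0], R=identity, thrust=1.47 * 9.81)
train_set, test_set, val_set = split_sequence(list(range(10)))
```

In ideal hover the residual is zero, so any nonzero value computed from flight data near the ground reflects unmodeled aerodynamics or noise in the estimated acceleration.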
VI-C DNN Prediction Performance
We train using a deep ReLU network, where , with , , , corresponding to global height, global velocity, attitude, and control input. We build the ReLU network using PyTorch, an open-source deep learning library [30]. Our ReLU network consists of four fully-connected hidden layers, with input and output dimensions 12 and 3, respectively. We use spectral normalization (SN) eq. 5 to bound the Lipschitz constant.
To investigate how well our DNN can estimate , especially when close to the ground, we compare with a well-known 1D steady ground effect model [1, 3]:
(15) 
where is the thrust generated by propellers, is the rotation speed, is the idle RPM, and depends on the number and arrangement of propellers ( for a single propeller, but must be tuned for multiple propellers). Note that is a function of . Thus, we can derive from .
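For context, the classical steady ground-effect result from helicopter theory [1] gives the ratio of in-ground-effect to out-of-ground-effect thrust as a function of rotor radius and height. The sketch below implements that classical ratio with an optional tuning factor `mu`. How `mu` enters, and how this maps onto the exact model compared against here, is an assumption, as are the example numbers.

```python
def ground_effect_ratio(z, rotor_radius, mu=1.0):
    """Classical steady ratio T_in / T_out = 1 / (1 - mu * (R / (4 z))^2).

    mu = 1 corresponds to a single rotor; placing mu on the squared term is an
    assumption made for this sketch.
    """
    x = mu * (rotor_radius / (4.0 * z)) ** 2
    if x >= 1.0:
        raise ValueError("model is not valid this close to the ground")
    return 1.0 / (1.0 - x)

near = ground_effect_ratio(z=0.25, rotor_radius=0.5)   # low altitude
far = ground_effect_ratio(z=10.0, rotor_radius=0.5)    # far from the ground
```

The ratio grows quickly as the height approaches a fraction of the rotor radius and tends to one at altitude, which is the steady-state trend a learned model is compared against.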
Fig. 3(a) shows the comparison between the estimated from DNN and the theoretical ground effect model eq. 15 as we vary the global height (assuming when ). We see that our DNN can achieve much better estimates than the theoretical ground effect model. We further investigate the trend of with respect to the rotation speed . Fig. 3(b) shows the learned over the rotation speed at a given height, in comparison with the measured from the bench test. We observe that the increasing trend of the estimates is consistent with bench test results for .
To understand the benefits of SN, we compared predicted by DNNs trained both with and without SN. Fig. 3(c) shows the results. Note that −1 m/s to 1 m/s is covered in our training set but −2 m/s to −1 m/s is not. We see differences in:

Ground effect: increases as decreases, which is also shown in Fig. 3(a).

Air drag: increases as the drone goes down () and it decreases as the drone goes up ().

Generalization: the spectrally normalized DNN is much smoother and can also generalize to new input domains not contained in the training set.
In [18], the authors theoretically show that spectral normalization can provide tighter generalization guarantees on unseen data, which is consistent with our empirical results. An interesting future direction is to connect generalization theory more tightly with our robustness guarantees.
VI-D Control Performance
We used a PD controller as the baseline controller and implemented both the baseline and NeuralLander without an integral term in eqs. 7 and 8. First, we tested these two controllers on the 1D takeoff/landing task, i.e., moving the drone from to and then returning it to , as shown in Fig. 4. Second, we compared the controllers on the 3D takeoff/landing task, i.e., moving the drone from to and then returning it to , as shown in Fig. 5. For both tasks, we repeated the experiments times and computed the means and the standard deviations of the takeoff/landing trajectories (demo videos: https://youtu.be/C_K8MkC_SSQ).
From Figs. 4 and 5, we can conclude that the main benefits of our NeuralLander are: (a) In both 1D and 3D cases, NeuralLander can control the drone to precisely land on the ground surface while the baseline controller cannot land due to the ground effect. (b) In both 1D and 3D cases, NeuralLander could mitigate drifts in the $x$ and $y$ directions, as it also learned about non-dominant aerodynamics such as air drag.
In experiments, we observed that a naive un-normalized DNN () can even result in a crash, which also implies the importance of spectral normalization.
VII Conclusions
In this paper, we present NeuralLander, a deep-learning-based nonlinear controller with guaranteed stability for precise quadrotor landing. Compared to traditional ground effect models, NeuralLander is able to significantly improve control performance. The main benefits are: (1) our method can learn from coupled unsteady aerodynamics and vehicle dynamics, and provide more accurate estimates than theoretical ground effect models; (2) our model can capture both the ground effect and the non-dominant aerodynamics, and outperforms the conventional controller in all directions ($x$, $y$ and $z$); (3) we provide rigorous theoretical analysis of our method and guarantee the stability of the controller, which also implies generalization to unseen domains.
Future work includes further generalizing the capabilities of NeuralLander to handle unseen state and disturbance domains, including those generated by a wind fan array. Another interesting direction would be to capture long-term temporal correlation with RNNs.
Acknowledgement
The authors thank Joel Burdick, Mory Gharib and Daniel Pastor Moreno. The work is funded in part by Caltech’s Center for Autonomous Systems and Technologies and Raytheon Company.
References
 [1] I. Cheeseman and W. Bennett, “The effect of ground on a helicopter rotor in forward flight,” 1955.
 [2] K. Nonaka and H. Sugizaki, “Integral sliding mode altitude control for a small model helicopter with ground effect compensation,” in American Control Conference (ACC), 2011. IEEE, 2011, pp. 202–207.
 [3] L. Danjun, Z. Yan, S. Zongying, and L. Geng, “Autonomous landing of quadrotor based on ground effect modelling,” in Control Conference (CCC), 2015 34th Chinese. IEEE, 2015, pp. 5647–5652.
 [4] F. Berkenkamp, A. P. Schoellig, and A. Krause, “Safe controller optimization for quadrotors with Gaussian processes,” in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2016, pp. 493–496. [Online]. Available: https://arxiv.org/abs/1509.01066
 [5] P. Abbeel, A. Coates, and A. Y. Ng, “Autonomous helicopter aerobatics through apprenticeship learning,” The International Journal of Robotics Research, vol. 29, no. 13, pp. 1608–1639, 2010.
 [6] A. Punjani and P. Abbeel, “Deep learning helicopter dynamics models,” in Robotics and Automation (ICRA), 2015 IEEE International Conference on. IEEE, 2015, pp. 3223–3230.
 [7] S. Bansal, A. K. Akametalu, F. J. Jiang, F. Laine, and C. J. Tomlin, “Learning quadrotor dynamics using neural network for flight control,” in Decision and Control (CDC), 2016 IEEE 55th Conference on. IEEE, 2016, pp. 4653–4660.
 [8] Q. Li, J. Qian, Z. Zhu, X. Bao, M. K. Helwa, and A. P. Schoellig, “Deep neural networks for improved, impromptu trajectory tracking of quadrotors,” in Robotics and Automation (ICRA), 2017 IEEE International Conference on. IEEE, 2017, pp. 5183–5189.
 [9] S. Zhou, M. K. Helwa, and A. P. Schoellig, “Design of deep neural networks as add-on blocks for improving impromptu trajectory tracking,” in Decision and Control (CDC), 2017 IEEE 56th Annual Conference on. IEEE, 2017, pp. 5201–5207.
 [10] C. Sánchez-Sánchez and D. Izzo, “Real-time optimal control via deep neural networks: study on landing problems,” Journal of Guidance, Control, and Dynamics, vol. 41, no. 5, pp. 1122–1135, 2018.
 [11] S. Balakrishnan and R. Weil, “Neurocontrol: A literature survey,” Mathematical and Computer Modelling, vol. 23, no. 12, pp. 101–117, 1996.
 [12] M. T. Frye and R. S. Provence, “Direct inverse control using an artificial neural network for the autonomous hover of a helicopter,” in Systems, Man and Cybernetics (SMC), 2014 IEEE International Conference on. IEEE, 2014, pp. 4121–4122.
 [13] H. Suprijono and B. Kusumoputro, “Direct inverse control based on neural network for unmanned small helicopter attitude and altitude control,” Journal of Telecommunication, Electronic and Computer Engineering (JTEC), vol. 9, no. 22, pp. 99–102, 2017.
 [14] F. Berkenkamp, M. Turchetta, A. P. Schoellig, and A. Krause, “Safe model-based reinforcement learning with stability guarantees,” in Proc. of Neural Information Processing Systems (NIPS), 2017. [Online]. Available: https://arxiv.org/abs/1705.08551
 [15] T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida, “Spectral normalization for generative adversarial networks,” arXiv preprint arXiv:1802.05957, 2018.

 [16] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
 [17] T. Salimans and D. P. Kingma, “Weight normalization: A simple reparameterization to accelerate training of deep neural networks,” in Advances in Neural Information Processing Systems, 2016, pp. 901–909.
 [18] P. L. Bartlett, D. J. Foster, and M. J. Telgarsky, “Spectrally-normalized margin bounds for neural networks,” in Advances in Neural Information Processing Systems, 2017, pp. 6240–6249.
 [19] J. Slotine and W. Li, Applied Nonlinear Control. Prentice Hall, 1991.
 [20] S. Bandyopadhyay, S.-J. Chung, and F. Y. Hadaegh, “Nonlinear attitude control of spacecraft with a large captured object,” Journal of Guidance, Control, and Dynamics, vol. 39, no. 4, pp. 754–769, 2016.
 [21] D. Morgan, G. P. Subramanian, S.-J. Chung, and F. Y. Hadaegh, “Swarm assignment and trajectory optimization using variable-swarm, distributed auction assignment and sequential convex programming,” Int. J. Robotics Research, vol. 35, no. 10, pp. 1261–1285, 2016.
 [22] D. Mellinger and V. Kumar, “Minimum snap trajectory generation and control for quadrotors,” in 2011 IEEE International Conference on Robotics and Automation, May 2011, pp. 2520–2525.
 [23] H. Khalil, Nonlinear Systems, ser. Pearson Education. Prentice Hall, 2002.
 [24] C. Zhang, S. Bengio, M. Hardt, B. Recht, and O. Vinyals, “Understanding deep learning requires rethinking generalization,” arXiv preprint arXiv:1611.03530, 2016.

 [25] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
 [26] B. Neyshabur, S. Bhojanapalli, D. McAllester, and N. Srebro, “A PAC-Bayesian approach to spectrally-normalized margin bounds for neural networks,” arXiv preprint arXiv:1707.09564, 2017.
 [27] G. K. Dziugaite and D. M. Roy, “Computing nonvacuous generalization bounds for deep (stochastic) neural networks with many more parameters than training data,” arXiv preprint arXiv:1703.11008, 2017.
 [28] B. Neyshabur, S. Bhojanapalli, D. McAllester, and N. Srebro, “Exploring generalization in deep learning,” in Advances in Neural Information Processing Systems, 2017, pp. 5947–5956.
 [29] S.-J. Chung, S. Bandyopadhyay, I. Chang, and F. Y. Hadaegh, “Phase synchronization control of complex networks of Lagrangian systems on adaptive digraphs,” Automatica, vol. 49, no. 5, pp. 1148–1161, 2013.
 [30] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer, “Automatic differentiation in pytorch,” 2017.