I Introduction
In recent years, quadrotors have been widely used for civilian and law-enforcement purposes, such as providing aerial surveillance, carrying out rescue missions, transporting goods over distance, and performing surveying and inspection tasks [1, 2, 3, 4]. In all these applications, the quadrotor is required to track a desired trajectory or spatial path in order to perform the task safely and effectively.
Trajectory tracking for quadrotors poses a challenge for controller design. First, quadrotors are underactuated systems with nonlinear dynamics, making this a difficult control problem. Second, the trajectory tracking precision of quadrotors can be affected by many factors, including uncertainty in the turn-rate-to-thrust map, time delays that are difficult to quantify, aerodynamic effects, and other unpredictable factors such as friction in the actuators. Third, even in a perfect world, where the system dynamics are known exactly, a given classical controller cannot achieve perfect tracking for an arbitrary, generally feasible, desired trajectory.
Our goal is to achieve improved trajectory tracking control for quadrotors while taking into account three features that are crucial for most real-world trajectory tracking applications (Figure 1 shows our specific application):

Stability of the control system and robustness to reasonable disturbances must be guaranteed to ensure safety of the operation.

The tracking control system must be independent of a specific trajectory; in other words, the system should be able to precisely track a new trajectory without adaptation.

The computational resources needed for the control system should be manageable such that the algorithm can be applied to small vehicles with limited computational power.
Simple controllers such as typical proportional-integral-derivative (PID) controllers can achieve adequate performance under certain conditions, for example low speeds and accelerations, while having all the crucial features mentioned above [5, 6]. However, PID controllers are difficult to tune and they tend to behave poorly on more aggressive trajectories. There exist previous works on improving control for quadrotors and other robots, such as learning the dynamics or the inverse dynamics, iterative learning control, and Gaussian Process learning. However, we show in Section II that these approaches have drawbacks with respect to the three crucial features we identified above, which are relevant for real-time trajectory tracking.
In this paper, we propose a DNN-based control system that improves trajectory tracking performance by utilizing past flight experience. After offline training on relevant flight examples, a generalized model is obtained with the DNN. This model can be evaluated in real time to modify the reference signal given to the controller. With no prior knowledge of the system other than the training data, the proposed method demonstrates its ability to reduce trajectory tracking error by compensating for controller imperfections and unknown dynamics. Moreover, the DNN model is computationally efficient for real-time evaluation and effective even on arbitrary trajectories not trained on before, making it applicable to impromptu tracking tasks.
To validate the effectiveness of the proposed method and motivate this work, we implement an interactive fly-as-you-draw application, where the quadrotor takes off to follow an arbitrary, hand-drawn trajectory immediately after the user finishes drawing it. The application uses neural networks, pre-trained with quadrotor flight data collected from periodic training trajectories, to obtain reference signals in real time for an offboard feedback controller. This process is described in Figure 1. With this interactive application, we evaluate different DNN features and compare the DNN performance with that of the baseline nonlinear controller. Nine of the 30 user-drawn testing trajectories used in our experiments are shown in Figure 2. Through the experiments, we demonstrate that the proposed approach, with proper feature selection for the DNN learning, is able to consistently enhance trajectory tracking precision for complex, arbitrary hand-drawn trajectories. Moreover, because the DNN serves as a pre-block outside the feedback control loop, the proposed method can be generalized as an add-on to any black-box, stable feedback control system. These characteristics of the proposed approach demonstrate its potential in real-world applications that require highly precise maneuvering, such as monitoring and inspection tasks, aerobatics, skywriting and airborne filming.
The paper is organized as follows. Section II summarizes related previous work on advanced trajectory tracking control. In Section III, we state the problem, followed by the general methodology in Section IV. The experimental setup is presented in Section V, and the corresponding results are presented in Section VI. A brief summary and discussion of the results are presented in Section VII.
II Related Work
Neural networks (NNs) are a generic approach for approximating functions given a large amount of data. Previous work has demonstrated that NNs are able to learn the dynamics and the inverse dynamics of unmanned aerial vehicles (UAVs) [7, 8, 9]. The papers [7] and [8] use NNs to learn the dynamics of a helicopter and a quadrotor, respectively. In [9], an NN is used for direct inverse control of a quadrotor with promising initial simulated results for hover flight. However, by involving NNs in modifying the original feedback control loop, the stability of the control system will likely be affected. When previously unseen inputs are given to an NN that is part of the feedback control loop, the NN might generate unpredictable outputs, leading to instability of the system. Instead, we use DNNs (multi-layer NNs) to learn a model that directly determines the reference inputs to the feedback control loop. The proposed DNNs act as a pre-block outside the original feedback control loop and run at a lower update rate, which makes the system much less susceptible to instability.
Iterative learning control (ILC) is an approach that improves control precision by repeating the same task and learning from previous executions [10]. Through the repetition of one specific task, ILC learns an updated reference input and achieves high-precision tracking for this particular task. Unlike simple controllers that can fail to achieve aggressive maneuvers, previous work has demonstrated ILC’s ability to achieve high control precision on such tasks [11]. One significant drawback of this approach is that the experience of learning one specific task is not transferable to other tasks. Although [12] has shown that linear maps can optimize ILC initialization from previous experience, ILC still has to relearn through multiple iterations before achieving high precision on a new trajectory. Our approach allows training ahead of time, and the trained model generalizes to arbitrary trajectories without any adaptation process. This feature makes it suitable for applications that require the vehicle to complete the desired task with high precision in a timely manner.
Gaussian Process (GP) learning is receiving growing attention in the control community and has been used in various control problems. For instance, an accurate kinematic control of a cable-driven surgical robot is implemented in [13]. Similar to the idea of learning the reference input in our proposed method, the GP learns the reference input to the controller of the surgical robot, improving the tracking precision of the end-effector. Our approach differs from this GP learning approach in two ways: 1) we apply the method to a quadrotor system, which has different dynamics compared to surgical robots; and 2) we employ DNNs as our learning technique instead of GPs. One advantage of using DNNs is that DNNs can summarize data using a fixed-size model. A GP model grows as more data is collected, making the model large in terms of required storage and computationally expensive to evaluate. In contrast, when the size of the data set increases, DNNs adjust their parameters to better fit the training data without increasing the model size. Since modeling complex relations usually requires a large set of training data, the invariant model size makes DNNs more promising for control systems with complex dynamics, especially when computation is limited.
III Problem Statement
For a given dynamic system with a baseline feedback controller (see Figure 3), the problem is to learn a mapping from the desired trajectory and the current state to the reference input of the baseline controller, in order to enhance the tracking performance of the overall system for arbitrary desired trajectories. We define the desired trajectory as a sampled trajectory consisting of consecutive time steps, each specifying the desired state at that time step. Learning should be done offline, and the learned mapping is applied in real time.
IV Methodology
Our DNN learning approach aims to address the problem stated in Section III. In this section, we first introduce supervised learning with DNNs as the foundation in Section IV-A. Based on this, the proposed control design is presented in Section IV-B. The last subsection, Section IV-C, discusses the importance of feature selection for DNN learning.

IV-A Supervised Learning with a Deep Neural Network Model
Our approach builds upon supervised learning with DNNs. This learning process requires the preparation of a large number of labeled training examples and the training of the DNN on these examples. Each labeled training example consists of an input and expected output pair describing what the function should output for a specific input. The training of the DNN on these examples involves backpropagation to minimize the loss over all training examples [14], defined as the Euclidean distance between the network’s output and the expected output. After learning from the labeled training examples, the DNN summarizes a mapping from the training inputs to the training outputs.
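To make this concrete, the following minimal sketch (our own illustration, not the authors' code) trains a one-hidden-layer ReLU regressor by backpropagation on the mean squared (Euclidean) loss over a set of hypothetical labeled examples; the target function and all hyperparameters are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical labeled examples: inputs U and expected outputs Y.
U = rng.uniform(-1.0, 1.0, size=(256, 2))
Y = np.abs(U[:, :1] + 0.5 * U[:, 1:])       # made-up target function

# One hidden layer of 32 ReLU units.
W1 = rng.normal(0.0, 0.5, size=(2, 32)); b1 = np.zeros(32)
W2 = rng.normal(0.0, 0.5, size=(32, 1)); b2 = np.zeros(1)

lr = 0.02
for _ in range(5000):
    H = np.maximum(0.0, U @ W1 + b1)        # forward pass (ReLU layer)
    P = H @ W2 + b2                         # network output
    dP = 2.0 * (P - Y) / len(U)             # gradient of the mean squared loss
    dW2 = H.T @ dP; db2 = dP.sum(0)
    dH = (dP @ W2.T) * (H > 0)              # backpropagation through ReLU
    dW1 = U.T @ dH; db1 = dH.sum(0)
    W1 -= lr * dW1; b1 -= lr * db1          # gradient-descent update
    W2 -= lr * dW2; b2 -= lr * db2

loss = float(np.mean((np.maximum(0.0, U @ W1 + b1) @ W2 + b2 - Y) ** 2))
print(f"final training loss: {loss:.4f}")
```

After training, the network has summarized the input-output mapping of the examples, which is the same mechanism the paper uses (at larger scale, and with the Adam optimizer and dropout described in Section V-E).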
A feedforward DNN with ReLU activation units is used to learn the mapping formulated in Section III. To prepare the training examples for learning the target mapping from the flight data, we use the actual trajectory as the training input and the reference signal as the labeled output. The idea behind this selection is that if the actual trajectory had been the desired trajectory, then the DNN should provide the reference signal to achieve perfect tracking. The specification of the inputs and outputs for the DNN is discussed in greater detail in Section V-C.

IV-B DNN as Reference Generator
We modify the control system by adding a DNN block in front of the controller. At each time step, the trained DNN modifies the control signal in real time by giving the reference input to the controller based on the desired trajectory as well as the current state of the quadrotor.
Figure 3 highlights the difference between the original control system and the proposed one. The reference states generated by the DNN over consecutive time steps form the reference trajectory, and the actual trajectory completed by the vehicle over these time steps is observed. The actual trajectory is expected to closely match the desired trajectory.
In control systems, current state feedback enables the control system to reject external disturbances if the controller is designed properly. Similarly, the extra loop introduced in our proposed system enables the DNN to adjust its output reference according to the current state to compensate for disturbances. We choose the feedback rate for this extra loop to be much lower than that of the original control loop to ensure that the stability of the original control system is not disrupted by the DNN signals. For example, in our experiments on the quadrotor control system, we design our DNN to send reference states at 7 Hz, which is 10 times slower than the control loop operating at 70 Hz. Therefore, the DNN control signals can be non-intrusive to the original control system.
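The dual-rate structure described above can be sketched as follows. This is our own toy illustration, not the paper's system: a scalar first-order plant and proportional controller stand in for the quadrotor and its baseline controller, and a simple reference-shifting rule stands in for the DNN; the gains and the pre-compensation rule are made up.

```python
# Toy dual-rate loop: a slow (7 Hz) reference generator around a fast
# (70 Hz) feedback controller, mirroring the rate split in the paper.
x = 0.0                      # plant state
x_des = 1.0                  # desired state (constant, for simplicity)
x_ref = x_des                # reference held between slow updates
dt = 1.0 / 70.0              # fast-loop (controller) period

for k in range(700):         # 10 s of simulated time
    if k % 10 == 0:          # slow 7 Hz loop: recompute the reference
        # Placeholder "DNN": shift the reference to pre-compensate a
        # hypothetical steady-state lag of the closed loop.
        x_ref = x_des + 0.5 * (x_des - x)
    u = 4.0 * (x_ref - x)    # fast 70 Hz proportional controller
    x += dt * u              # first-order plant: x_dot = u

print(f"final tracking error: {abs(x_des - x):.6f}")
```

Because the reference is only refreshed every tenth controller step and changes slowly relative to the inner loop, the inner loop's stability margins are essentially unaffected, which is the non-intrusiveness argument made above.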
IV-C Feature Selection for the DNN
Ideally, at each time step, the DNN would receive all the given information (both the entire desired trajectory and the current state) and produce an optimal reference state as the input to the controller to minimize the quadrotor’s tracking error. However, this makes the input dimension huge and requires an exponentially increasing amount of training data. Therefore, selecting proper state information for the DNN input is crucial for making DNN learning effective.
A minimal feature selection is to use only the current desired state and the current state as the input, but this configuration may not provide enough information for the DNN to model hidden dynamics, including time delays, which deteriorate tracking performance and may even cause instability. Hence, we consider including future desired states in the DNN input: planning the control ahead of time by considering future desired states is expected to improve control performance. Model Predictive Control (MPC), an advanced control technique that takes future desired states into account, optimizes a control sequence ahead of time based on a dynamic model of the system [camacho2013model]. Instead of optimizing the control sequence based on a predetermined model, we expect that a properly trained DNN can directly determine an optimized control signal based on the future desired states and the current state of the system. We hypothesize that a DNN that considers the desired states in the near future can achieve better performance.
To validate this hypothesis, we conducted experiments on a quadrotor in a real-world environment to investigate 1) the effect of selecting different state information, including future desired states and current state feedback, as the input to the DNN on the DNN performance, and 2) the generalizability of this method for improving tracking performance across different trajectories.
V Experiment Setup
V-A Quadrotor Model and Experiment Platform
This subsection provides a brief overview of the quadrotor dynamics and the experiment platform. For more details about quadrotor dynamics and control, readers are referred to [5].
A typical quadrotor consists of a symmetrical cross-shaped frame with four propellers mounted at the ends of the four arms. The full state of the quadrotor consists of 12 components: the translational position of the quadrotor’s center of mass, the attitude (represented by the Euler angles roll, pitch and yaw), the translational velocity, and the rotational velocity.
The experiments are conducted on a Parrot AR.Drone 2.0 quadrotor. This commercial quadrotor suits the needs of this study as it features highly nonlinear dynamics, complex aerodynamics that are hard to model and, most importantly, an unmodified black-box onboard controller that controls the vehicle’s roll, pitch and yaw by adjusting the motor forces. The quadrotor’s states are measured by an overhead Vicon motion capture system, which features eight 4-megapixel Vicon cameras running at 200 Hz. A similar experimental setup is described in detail in [6]. The baseline control system used in this paper consists of two controllers: the onboard controller and the offboard controller. The offboard controller is a nonlinear controller, composed of a nonlinear transformation and a standard PD controller, and is implemented using the open-source Robot Operating System (ROS). It runs at 70 Hz, receives the quadrotor’s current state and the reference, and outputs to the onboard controller the desired roll, pitch, yaw velocity and z-velocity. The onboard controller runs at 200 Hz, receives the four commands from the offboard controller, and adjusts the four motor thrusts accordingly. The DNN feedback loop runs at 7 Hz, which is 10 times slower than the offboard controller.
V-B Task Performance
Each task performed by the quadrotor involves following one of the predefined desired trajectories in a two-dimensional plane, where these trajectories are hand-drawn through our interactive application (Figure 1). The error function for each task is defined as the root-mean-square (RMS) error of $N$ pairs of position coordinates, sampled at 7 Hz (the DNN feedback loop rate), between the desired trajectory and the observed trajectory:

$E = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \left\lVert \mathbf{p}_{d,k} - \mathbf{p}_{a,k} \right\rVert^{2}}$ (1)

where $\lVert \cdot \rVert$ is the Euclidean norm, and $\mathbf{p}_{d,k}$ and $\mathbf{p}_{a,k}$ are the position coordinates sampled at the $k$-th time step from the desired trajectory and the observed trajectory, respectively. The quadrotor in the experiment repeats each task with and without the aid of the trained DNN. The percentage reduction in error between corresponding flights is identified as the improvement of our method on this specific task:

$I = \frac{E_{\mathrm{base}} - E_{\mathrm{DNN}}}{E_{\mathrm{base}}} \times 100\%$ (2)

where $E_{\mathrm{DNN}}$ and $E_{\mathrm{base}}$ are the RMS errors in (1) with and without the DNN, respectively.
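The two metrics can be computed with a few lines of code. The sketch below is our own illustration with made-up trajectories (the actual experimental data is not reproduced here); the function names are ours.

```python
import numpy as np

def rms_error(p_desired, p_actual):
    """RMS of the Euclidean position errors over N sampled time steps, as in (1)."""
    diffs = np.linalg.norm(p_desired - p_actual, axis=1)
    return float(np.sqrt(np.mean(diffs ** 2)))

def improvement(e_baseline, e_dnn):
    """Percentage reduction in RMS error between two flights, as in (2)."""
    return 100.0 * (e_baseline - e_dnn) / e_baseline

# Hypothetical 2-D trajectories sampled at 7 Hz for 10 s.
t = np.linspace(0.0, 10.0, 70)
desired = np.stack([np.sin(t), np.cos(t)], axis=1)
baseline = desired + 0.3          # flight without the DNN: larger offset
with_dnn = desired + 0.1          # flight with the DNN: smaller offset

e_base = rms_error(desired, baseline)
e_dnn = rms_error(desired, with_dnn)
print(f"baseline RMS: {e_base:.3f} m, DNN RMS: {e_dnn:.3f} m, "
      f"improvement: {improvement(e_base, e_dnn):.1f}%")
```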
V-C DNN Input-Output Specification
In general, the trained DNN provides a mapping from the current and selected desired states to the reference state:

$\mathbf{x}_{r,k} = f_{\mathrm{DNN}}\left(\mathbf{x}_{k},\, \mathbf{x}_{d,k+n_1}, \ldots, \mathbf{x}_{d,k+n_L}\right)$ (3)

where $\mathbf{x}_{k}$ is the vehicle’s current state at the $k$-th time step, and $\mathbf{x}_{d,k+n_1}, \ldots, \mathbf{x}_{d,k+n_L}$ are the $L$ selected desired states from the desired trajectory. Each of the states mentioned above contains the full state of the quadrotor along with the translational acceleration in the z direction, $\ddot{z}$. Among the translational accelerations $\ddot{x}$, $\ddot{y}$ and $\ddot{z}$, only $\ddot{z}$ is included in these states because the controller used in the experiment requires only $\ddot{z}$, along with the full state of the quadrotor, as its inputs.
Based on this general input-output mapping provided by the DNN, we explore three different configurations of the DNN to investigate the influence of including the future desired states and/or the current state feedback as inputs to the DNN (as discussed in Section IV-C):

DNN with future desired states and the current state feedback;

DNN with future desired states and without the current state feedback;

DNN without future desired states and with the current state feedback.
All configurations consider one or more desired states at different time steps over the entire flight path as part of the input. Since we hypothesize that using future desired states can enhance tracking performance, we focus on the scenario where we only select the desired states from the current desired state and the future desired states, i.e., we select $\mathbf{x}_{d,k+n_j}$, where $n_j \geq 0$, for $j = 1, \ldots, L$.
For the first configuration, with the current state feedback, the current observed state and the selected desired states are given to the DNN at each time step. The actual input to the DNN consists of two major parts. The first part includes the state components taken from the current observed state and the selected desired states. The second part includes the selected desired positions expressed relative to the current observed position; relative, rather than absolute, position information is used in order to reduce the input data dimension. The input to the DNN for the second configuration is similar to that of the first configuration, with the current observed state replaced by the desired state from the previous time step, and the relative positions computed accordingly. The last configuration is a special case of the first configuration, in which the current desired state is the only selected desired state in the DNN input ($L = 1$ and $n_1 = 0$). For this configuration, the DNN does not consider future states and generates the reference state based only on the current observed state and the current desired state.
The output of the DNN in all three configurations is the reference state. In our experiment, we reduce data complexity by learning only the difference between the reference state and the current desired state in the translational position and velocity components.
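Assembling the input vector for the first configuration can be sketched as follows. This is our own reading of the specification above, with an assumed state layout (positions first, then the remaining components) and made-up dimensions and offsets; the helper name is ours.

```python
import numpy as np

POS = slice(0, 3)   # assumed layout: [x, y, z, remaining state components...]

def dnn_input(current_state, desired_traj, k, offsets):
    """Build the DNN feature vector at time step k.

    offsets: the selected future sample offsets n_1..n_L (all >= 0)
    into the desired trajectory, as in configuration 1.
    """
    selected = [desired_traj[min(k + n, len(desired_traj) - 1)]
                for n in offsets]
    features = [current_state[3:]]                 # non-position part, current
    features += [s[3:] for s in selected]          # non-position part, desired
    # Desired positions relative to the current observed position,
    # which reduces the input dimension vs. absolute positions.
    features += [s[POS] - current_state[POS] for s in selected]
    return np.concatenate(features)

# Hypothetical 13-component states (12-component full state plus z-acceleration),
# with L = 3 selected desired states.
traj = np.zeros((100, 13))
x_now = np.zeros(13)
phi = dnn_input(x_now, traj, k=42, offsets=(0, 4, 8))
print(phi.shape)   # -> (49,): 10 + 3*10 non-position components + 3*3 relative positions
```

The second configuration would pass the previous desired state in place of `x_now`; the third is the call with `offsets=(0,)`.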
V-D Data Collection for DNN Training
To train the DNN, we need to collect state data from real-world flights and select the training data. To that end, we design a 400-second trajectory that oscillates sinusoidally in the x, y and z directions with different combinations of frequencies and amplitudes, to cover the feasible state space as much as possible. In particular, each of the three directions has its own oscillation frequency (0.27 Hz, 0.20 Hz and 0.13 Hz for the x, y and z directions, respectively), and we gradually increase the amplitudes from 0 to 2 m in all directions. On this trajectory, the quadrotor reaches a maximum velocity of 1.5 m/s and a maximum acceleration of 4 m/s². The maxima for rotational velocity and rotational acceleration are 0.4 rad/s and 1 rad/s², respectively.
Using the baseline control system to follow the designed training trajectory, we collect a pair consisting of the current observed state and the current desired state at each time step. Approximately 10,000 raw data pairs are collected at a 7 Hz rate from four flights on the training trajectory, and then organized in consecutive time order.
We select and reorganize these raw data pairs to establish the labeled training set. Recall that the DNNs aim to learn a mapping from the current state and the L selected desired states to the reference state that should be given to the controller at the current time step. Consider any pair in the data set. If we treat its observed state as the current observed state of the quadrotor and the observed states at the selected later time steps as the selected desired states, then the desired state recorded at the next time step may be selected as the reference state, given that it is a point in a feasible reference sequence achieving perfect tracking with one sample delay. Thus, for each pair in the data set, we can form a training pair for learning the approximate mapping, in which the current and selected observed states form the input and the recorded desired state is the labeled output. A labeled data set can then be obtained by collecting these training pairs. Using the supervised learning technique described in Section IV-A with this labeled data, the DNNs are expected to learn an approximate mapping from the inputs to the output.
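The relabeling trick above can be sketched in a few lines. This is our own illustrative code, not the authors' pipeline; the exact one-step-delay indexing of the label and the state dimensions are our assumptions, and the logged data here is synthetic.

```python
import numpy as np

def make_training_pairs(observed, desired, offsets):
    """Form (input, label) pairs from logged (observed, desired) state rows.

    observed[k] is treated as the current state, observed[k + n] for the
    selected offsets n act as the "desired" states, and the logged desired
    state at step k + 1 is the label (reference state, one-sample delay).
    """
    pairs = []
    horizon = max(offsets)
    for k in range(len(observed) - horizon - 1):
        current = observed[k]
        selected = [observed[k + n] for n in offsets]
        label = desired[k + 1]
        pairs.append((np.concatenate([current, *selected]), label))
    return pairs

# Hypothetical log: 1000 time steps of 13-component states sampled at 7 Hz.
obs = np.random.default_rng(1).normal(size=(1000, 13))
des = np.roll(obs, -1, axis=0)          # stand-in for the logged desired states
pairs = make_training_pairs(obs, des, offsets=(1, 4, 8))
print(len(pairs), pairs[0][0].shape)    # -> 991 pairs, input dimension 4*13 = 52
```

For illustration the raw states are concatenated directly; in the paper the inputs are further processed into the relative-position features of Section V-C before training.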
V-E DNN Training
We train six different DNNs that share the same input states to find the mapping in (3), one for each of the position and velocity elements of the reference state; the other elements of the reference state remain intact. We construct the DNNs in this way because, in preliminary experiments, we observed that the six outputs can require different numbers of training iterations to converge. For example, the velocity and position in the z-axis are easier to train, whereas the positions in the horizontal plane are much harder to train. We also observe that the six DNNs demonstrate comparable performance after 2,000 training iterations; however, it is possible to merge them to jointly learn the six outputs. We construct the DNNs using TensorFlow, an open-source library originally developed by Google. Each DNN consists of four fully connected hidden layers, and each layer contains 128 neurons. 90% of the collected raw data pairs are used for training, while the rest are used for validation. The Adam optimizer is used to tune the weight and bias parameters of the DNNs, with the learning rate set to 0.0003 [15]. To prevent overfitting, a dropout rate of 0.5 is used [16]. For each training iteration, 30 training pairs, randomly selected from the 90% of the raw data pairs used for training, are used, and 2,000 iterations are performed for each output.

VI Experimental Results
VI-A Impact of Future States on DNN Tracking Performance
We first investigate the influence of introducing future desired states as part of the input to the DNNs. It is expected that the DNNs can use the future desired state information to better “plan” the flight path while compensating for the effects of hidden dynamics, including time delay and other factors. In Figure 4, we show that the DNNs with future desired states and current state feedback perform significantly better than both the baseline control and the DNNs with current state feedback but without future desired states. Also, Figure 5 highlights that the reference trajectories produced by the DNNs trained with future desired states and current state feedback effectively correct the tracking error in both directions.
VI-B Impact of Feedback on DNN Tracking Performance
We also investigate the influence of removing the current state feedback from the DNN inputs. Our experiments are conducted on the DNNs trained with future desired states. To remove the feedback loop, we keep the DNNs but replace the current state feedback with the desired state from the previous time step during actual flight, as discussed in Section V-C. Table I highlights that the DNN with current state feedback performs better than the DNN without current state feedback, while both obtain considerable improvements over the baseline control system. The fact that the DNN without current state feedback still achieves a comparable improvement offers an alternative offline method of improving tracking performance. It makes our approach more versatile, especially when computational resources are insufficient to support real-time calculations during flight.
                   Baseline Controller   DNN without Feedback   DNN with Feedback
RMS Error (m)            0.360                 0.232                 0.144
Peak Error (m)           0.605                 0.497                 0.356
Improvement (%)            -                    35.6                  59.9
VI-C DNN Tracking Performance on Arbitrary Trajectories
To investigate the generalizability of the trained DNNs to different trajectories, we evaluate the performance of the trained DNNs with future desired states and current state feedback on various trajectories. On a 50 s segment of the 3D training trajectory (as discussed in Section V-D), the DNNs outperform the baseline controller by 36%. The performance of the trained DNNs is also evaluated on unseen trajectories, including 30 different hand-drawn trajectories and one specific trajectory (Trajectory 4 in Figure 2) flown with different velocity profiles. The 30 drawn trajectories all have a maximum velocity of 0.6 m/s and a maximum acceleration of 2 m/s²; for Trajectory 4, the speed is changed by scaling the time domain along the desired trajectory. In Figure LABEL:fig:DNN_bar and Figure 7, we show that the DNNs with future desired states and current state feedback reduce the RMS tracking errors by 43% on average over the 30 testing trajectories and the training trajectory segment. A similar improvement is obtained on one specific trajectory flown at different speeds, as shown in Figure 6. Therefore, the DNNs trained with future desired states are capable of reducing the tracking error for trajectories with various shapes and speeds by a large margin, demonstrating their generalizability to unseen trajectories. Figure 8 presents a long-exposure image of the quadrotor following the letters “DSL” written by a visitor. Note that this differs from Trajectory 5 shown in Figure 2, since it was drawn by a different visitor.
VII Conclusions
In this paper, we have presented a DNN-based reference-learning method that learns from flight data and improves trajectory tracking control in quadrotor systems. By introducing information about future desired states into the training data, the DNNs were able to account for system delay and hidden dynamics, as shown by the significant overall reduction in tracking error. The main advantages of the proposed approach demonstrated by our experiments are that 1) it can be applied to various control systems with complex dynamics while ensuring stability of the systems; 2) it requires no prior knowledge of the system to train the DNNs, and the trained DNNs can be applied to unseen trajectories without any adaptation process; and 3) with wise feature selection and sufficient DNN training, the approach is computationally efficient with a very small model, while demonstrating good performance on general trajectories. We have shown these advantages through the implementation of an interactive “fly-as-you-draw” application, illustrating that the proposed method is readily applicable to various real-world trajectory tracking tasks. However, the overall improvement of this method is still limited by the training data for the DNNs. Intelligent choices of learning targets and effective neural network designs are potential extensions to further enhance trajectory tracking performance.
References
 [1] “Central American Drug Compound Recon,” October 2010. [Online]. Available: https://www.aeryon.com/casestudies/centralamericadrug
 [2] C. Gothner, “Deputies using drones as search-and-rescue tools,” August 2016. [Online]. Available: http://www.channel3000.com/news/deputiesusingdronesassearchandrescuetools/41297542
 [3] S. Shaw, “7-Eleven Teams with Flirtey for First Ever FAA-Approved Drone Delivery to Customer’s Home,” July 2016. [Online]. Available: http://corp.7eleven.com/news/072220167eleventeamswithflirteyforfirsteverfaaapproveddronedeliverytocustomershome
 [4] N. Michael, J. Fink, and V. Kumar, “Cooperative manipulation and transportation with aerial robots,” Autonomous Robots, vol. 30, no. 1, pp. 73–86, 2011.
 [5] Q. Lindsey, N. Michael, D. Mellinger, and V. Kumar, “The GRASP multiple micro-UAV testbed,” IEEE Robotics & Automation Magazine, vol. 17, no. 3, pp. 56–65, 2010.
 [6] S. Lupashin, M. Hehn, M. W. Mueller, A. P. Schoellig, M. Sherback, and R. D’Andrea, “A platform for aerial robotics research and demonstration: The Flying Machine Arena,” Mechatronics, vol. 24, no. 1, pp. 41–54, 2014.
 [7] A. Punjani and P. Abbeel, “Deep learning helicopter dynamics models,” in IEEE International Conference on Robotics and Automation (ICRA), 2015, pp. 3223–3230.
 [8] N. Mohajerin and S. L. Waslander, “Modular deep Recurrent Neural Network: Application to quadrotors,” in IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2014, pp. 1374–1379.
 [9] M. T. Frye and R. S. Provence, “Direct Inverse Control using an Artificial Neural Network for the Autonomous Hover of a Helicopter,” in IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2014, pp. 4121–4122.
 [10] A. P. Schoellig, F. L. Mueller, and R. D’Andrea, “Optimization-based iterative learning for precise quadrocopter trajectory tracking,” Autonomous Robots, vol. 33, pp. 103–127, 2012.
 [11] F. L. Mueller, A. P. Schoellig, and R. D’Andrea, “Iterative learning of feedforward corrections for high-performance tracking,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012, pp. 3276–3281.
 [12] M. Hamer, M. Waibel, and R. D’Andrea, “Knowledge Transfer for High-Performance Quadrocopter Maneuvers,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013, pp. 1714–1719.
 [13] J. Mahler, S. Krishnan, M. Laskey, S. Sen, A. Murali, B. Kehoe, S. Patil, J. Wang, M. Franklin, P. Abbeel, et al., “Learning accurate kinematic control of cable-driven surgical robots using data cleaning and gaussian process regression,” in IEEE International Conference on Automation Science and Engineering (CASE), 2014, pp. 532–539.
 [14] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
 [15] D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint, arXiv:1412.6980, 2014.
 [16] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: A Simple Way to Prevent Neural Networks from Overfitting,” Journal of Machine Learning Research, vol. 15, pp. 1929–1958, 2014.