
Temporal Convolutions for Multi-Step Quadrotor Motion Prediction

by Samuel Looper, et al.

Model-based control methods for robotic systems such as quadrotors, autonomous driving vehicles and flexible manipulators require motion models that generate accurate predictions of complex nonlinear system dynamics over long periods of time. Temporal Convolutional Networks (TCNs) can be adapted to this challenge by formulating multi-step prediction as a sequence-to-sequence modeling problem. We present End2End-TCN: a fully convolutional architecture that integrates future control inputs to compute multi-step motion predictions in one forward pass. We demonstrate the approach with a thorough analysis of TCN performance for the quadrotor modeling task, which includes an investigation of scaling effects and ablation studies. Ultimately, End2End-TCN provides a 55% reduction in prediction error over the state of the art on an aggressive indoor quadrotor flight dataset. The model yields accurate predictions across 90 timestep horizons over a 900 ms interval.



I Introduction

While autonomous robotic systems offer tremendous potential benefits in a wide range of commercial operations, their safe operation will require highly accurate localization and control methods for collision avoidance and action execution. Model-based state estimation and control have demonstrated strong performance and robustness across the operational domain of remote aircraft [14, 2, 10], autonomous vehicles [20, 15] and flexible manipulators [22], to name a few. As such, dynamic system modeling is critical to the effort of developing safe autonomous robotic systems that can perform precise motions throughout their operating envelopes.

As a primary motivating example, in this work we focus on multi-step prediction for quadrotor UAVs. Indeed, developing models of quadrotor flight solely from first principles has proven to be a challenge. Quadrotors are underactuated systems whose translational dynamics are tightly coupled with highly nonlinear rotational dynamics. In real-world environments, aerodynamics, motor dynamics, and asymmetrical mass distributions can be significant disturbances, but are often poorly characterized in most physics-based quadrotor models [17].

Another line of research focuses on developing statistical quadrotor models from measured flight data. Specifically, discrete-time neural network designs have shown the greatest promise in modeling complex quadrotor dynamics due to their strong expressive power. A recent work benchmarking neural network models on quadrotor state prediction employs Recurrent Neural Network (RNN) models to sequentially learn time-correlated features in quadrotor state telemetry time series data [24]. While these models have demonstrated state-of-the-art performance, they have several limitations. The sequential nature of these models leads to longer computation times due to the lack of parallelization, and can cause unstable gradients at training time [27]. Furthermore, current models are limited in their ability to learn time-correlated features over long time horizons [32].

Temporal convolution-based architectures provide a potential solution to the limits of RNNs for the task of quadrotor state modeling. Temporal Convolutional Networks (TCNs) have demonstrated the ability to accurately model time series in a variety of contexts [26],[3],[21] and have the potential to provide a sparse and efficient model able to learn features over long time histories. In this work, we apply TCNs to a discrete-time multi-step series forecasting problem, which we adapt to the non-autonomous dynamics of robotic systems. This allows for TCN models to be trained and evaluated on indoor quadrotor flight telemetry.

Thus, in this paper, we perform the first in-depth study of convolution-based architectures for quadrotor modeling. We present End2End-TCN: a novel method of applying TCNs to robotic system modeling by integrating the control input into the input state vector. This model surpasses the current state of the art and several alternative models in prediction accuracy, generating useful future state predictions over longer periods of time for the purpose of model-based control and state estimation. We perform a comprehensive series of experiments to characterize the performance of TCNs with respect to model size and past state history length. We further provide an analysis of prediction samples and error distributions to characterize model performance. Most importantly, we demonstrate that a TCN-based model can provide a memory-efficient representation of quadrotor dynamics and yield a 55% reduction in prediction error over a 900 ms interval.

II Related Works

Empirical Methods. As a result of the success of model-based quadrotor control methods, the dynamics of quadrotor flight have been extensively studied in the literature. Previous research that developed quadrotor test bed platforms [16], developed dynamical system models [9],[28] and characterized significant aerodynamic effects [23] has laid the foundation for a principled approach to developing quadrotor models. In these works, simplified models of quadrotor geometry, rotor thrust, and aerodynamics were used to derive equations of motion. Such physics-based models have been further refined by deriving more complex aerodynamic models [17] or by using blade element momentum theory [5] [6] to better characterize motor thrust. While many such models obtain parameter values through empirical measurement or offline system identification, recent works have used online parameter estimation to refine their physics-based models over time [7] [11] [35].

Neural Networks. Neural networks, on the other hand, provide powerful and flexible statistical models that can capture highly complex time-varying phenomena. In the field of statistical rotorcraft flight modeling, early work by Punjani and Abbeel [29] showed significant success in learning a nonlinear acceleration model for helicopter flight by training a simple artificial neural network on past flight telemetry, while others [4] learned a simpler linearizable model for LQR control. Such models may successfully learn a latent representation of flight data, but are not designed to specifically learn time-correlated features, which have been demonstrated to improve performance in sequence domain tasks. One need look no further than the field of stock price modeling, where early artificial neural networks [34] were quickly surpassed by LSTM models [38] and TCNs [13] specifically due to their ability to learn time-correlated features.

Sequence Modeling. In recent years, deeper networks with new neural network architectures have led to major breakthroughs in sequence modeling. Much of this research has focused on Recurrent Neural Networks (RNNs). Mohajerin et al. leveraged recurrent architectures towards quadrotor modeling by training RNNs with Long Short-Term Memory gated units on an indoor quadrotor dataset, which greatly improved prediction accuracy for future flight trajectories [24]. This sequential approach mirrors the way discrete dynamical system models are integrated forward in time. However, the ability of an RNN to model time-varying phenomena is limited by the size of its hidden state representation [37], and RNN performance degrades significantly as time horizons extend [32], both of which limit their usability for quadrotor flight modeling. RNNs also have limitations that make them ill-suited for online robotics applications. They are less computationally efficient than convolution-based architectures that can leverage parallel computation hardware [3] due to the cost of processing time series sequentially. Furthermore, RNNs can be challenging to train due to backpropagation through time, which can lead to gradient instability.


Temporal Convolutional Networks. While RNNs were the dominant approach for time series predictive modeling [30],[1],[31], convolution-based approaches have emerged recently as a viable alternative. Early work by van den Oord et al. on WaveNet [26] introduced the causal convolution, which modified the standard discrete convolution operation to maintain the temporal structure of time series inputs. Dilated convolutions can be employed to make predictions over large, fixed time horizons and the resulting network can be parallelized for computational efficiency. This results in sparse networks that learn time-correlated features in an efficient and deterministic manner, which are called Temporal Convolutional Networks (TCNs).
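The causal and dilated convolution described above can be illustrated with a minimal pure-Python sketch. The function name `causal_dilated_conv1d`, the zero left-padding and the toy kernel are assumptions made for illustration only; they are not taken from WaveNet or from the model studied here.

```python
def causal_dilated_conv1d(signal, kernel, dilation=1):
    """Causal 1-D convolution over a list of floats.

    output[t] depends only on signal[t], signal[t - dilation],
    signal[t - 2 * dilation], ... and never on future samples.
    Samples before the start of the signal are treated as zeros
    (an assumed left-padding choice).
    """
    output = []
    for t in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation  # look back k * dilation steps
            if idx >= 0:
                acc += weight * signal[idx]
        output.append(acc)
    return output


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [0.5, 0.5]  # kernel[0] weights the current sample, kernel[1] the delayed one

print(causal_dilated_conv1d(signal, kernel, dilation=1))  # [0.5, 1.5, 2.5, 3.5, 4.5]
print(causal_dilated_conv1d(signal, kernel, dilation=2))  # [0.5, 1.0, 2.0, 3.0, 4.0]
```

Because each output index can be computed independently of the others, the loop over `t` is the part that a convolutional implementation would run in parallel, in contrast to the step-by-step recurrence of an RNN.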

Studies have shown that TCNs outperform recurrent networks across a wide range of sequence modeling tasks [3]. TCNs were further explicitly applied to time series modeling by Borovykh et al. [8]. More relevant to quadrotor modeling, TCNs were used in action segmentation tasks [21] and were combined with Empirical Mode Decomposition (EMD) to predict dynamic climate effects [36]. These prior works demonstrate that TCNs have the ability to learn temporal patterns in robotic motion over long periods and model highly complex dynamical systems.

Many applications of deep learning in robotics learn temporal patterns by simply concatenating images or system state inputs [33]. However, this only works over short time periods. Recent work by Kaufmann et al. leveraged TCNs to process sensor input information in an end-to-end learning-based architecture for quadrotor control [19]. While this study demonstrates the utility of TCNs in the context of quadrotor state information processing, there is still a clear lack of research on the ability of TCNs to explicitly model robotic systems over a long horizon of future state predictions.

III Problem Formulation

By treating quadrotor flight dynamics as a time series predictive modeling problem, we can perform sequence-to-sequence modeling to learn a function that can predict future states. We first define a parameterization of the quadrotor state, which includes position and velocity in a world frame; orientation, represented by Euler rotation angles about axes XYZ from a body frame to the world frame; and rotation rate, represented by the time derivative of the XYZ Euler angles with respect to the body frame. Figure 1 shows the world frame and the body frame with respect to the quadrotor’s geometry. The geometry and reference frames follow a quadrotor X-configuration, where the roll and pitch axes are offset by 45 degrees from the rotor arms.

Fig. 1: Model of a quadrotor as a rigid body from a body and inertial frame.
Fig. 2: Full End2End-TCN architecture.

This state represents the quadrotor’s pose with 6 degrees of freedom (given the orientation representation) and a measure of its first rate of change. The full system is further characterized by a control input representing four motor commands, which are generated by the quadrotor’s controller and map linearly to desired rotor speeds.

In this discrete quadrotor state formulation, we consider a dynamic system represented by a function that maps a past state representation to a future state representation, and a function that maps a state representation to a state observation. In the non-autonomous case, the function maps both the past state and a control input to the state observation.

However, to fully leverage the ability of convolutional neural networks to compute state predictions in parallel, we extend this formulation to a multi-step prediction case spanning a fixed number of future time steps. Note that in the non-autonomous case, past and future control inputs will be required as inputs to this function, as each future state depends on the future control inputs. Furthermore, given the complexity of dynamic effects such as aerodynamics on quadrotor motion, the state parameterization above may not meet the Markov condition. Thus, we theorize that prediction accuracy will be improved by providing a sequence of input states. As such, we seek to model the function mapping a series of past states, past control inputs, and future control inputs to the series of future states. Note that this model assumes access to the full state representation, which is only possible in the case of weak nonlinear observability.

Modeling this discrete function can thus be formulated as a sequence-to-sequence modeling problem. We consider a sequence of prior system states, prior control inputs, and future control inputs, and seek to estimate future system states. Thus, given a sequence-to-sequence function generating a future system state prediction, we can use statistical methods to minimize a reconstruction loss over a set of known future quadrotor states.
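One way to write this mapping compactly is given below. All symbols are notation assumed for this sketch rather than fixed by the text: $\mathbf{x}_t$ is the quadrotor state at step $t$, $\mathbf{u}_t$ the control input, $P$ the number of past steps, $F$ the number of future steps, $\theta$ the model parameters, and $\mathcal{L}$ a per-step reconstruction loss. The exact time indexing of the control inputs is also an assumption.

```latex
\hat{\mathbf{x}}_{t+1:t+F}
  = f_\theta\!\left(\mathbf{x}_{t-P+1:t},\; \mathbf{u}_{t-P+1:t},\; \mathbf{u}_{t+1:t+F}\right),
\qquad
\theta^{\ast}
  = \arg\min_{\theta} \sum_{k=1}^{F}
    \mathcal{L}\!\left(\hat{\mathbf{x}}_{t+k},\, \mathbf{x}_{t+k}\right)
```

In the experiments reported later, the prediction horizon corresponds to $F = 90$ steps at 100 Hz, i.e. 900 ms.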


IV Methodology

Given historical quadrotor state data, neural network model inputs and labels are generated in a semi-supervised manner. As per the problem formulation, model inputs include prior quadrotor states, prior control inputs, and future control inputs. The sample labels correspond to a series of truncated quadrotor states, which include the translational and rotational velocities at each time step of the prediction horizon.

A fully convolutional neural network model, dubbed End2End-TCN, is trained on this time series data to provide quadrotor state predictions over multiple future time steps. Crucially, in order to make multiple predictions over this non-autonomous dynamical system, past and future control inputs must be integrated into the discrete sequence modeling problem formulation. End2End-TCN integrates this information into a fixed sequence length input, composed of augmented states defined separately for prior time steps and for future time steps.
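A minimal sketch of how such a fixed-length input could be assembled is shown below. The function name `build_input_sequence` is hypothetical, and filling the state slots of future steps with zeros is an assumption made here for illustration; the text does not specify how the future augmented states are populated beyond carrying the future control inputs.

```python
def build_input_sequence(past_states, past_controls, future_controls):
    """Concatenate per-step [state, control] vectors into one sequence.

    Assumption for this sketch: future steps have no known state, so
    their state slots are zero-filled and only the control is populated.
    """
    if len(past_states) != len(past_controls):
        raise ValueError("need one control input per past state")
    state_dim = len(past_states[0])
    # Prior steps: state followed by the control applied at that step.
    sequence = [list(x) + list(u) for x, u in zip(past_states, past_controls)]
    # Future steps: placeholder state followed by the planned control.
    sequence += [[0.0] * state_dim + list(u) for u in future_controls]
    return sequence


# Toy dimensions: 2 state values per step, 4 motor commands per step.
past_states = [[0.1, 0.2], [0.3, 0.4]]
past_controls = [[1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0]]
future_controls = [[3.0, 3.0, 3.0, 3.0]]

sequence = build_input_sequence(past_states, past_controls, future_controls)
print(len(sequence), len(sequence[0]))  # 3 6
```

Every row has the same width, so the whole sequence can be passed to a convolutional model as a single tensor with one channel per state or control component.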

The model is built on a series of causal convolutions, as first developed in [26], and as implemented in [3]. To achieve the desired effect, a causal convolution block is composed of a series of causal convolutions with dilations that increase exponentially at every layer, as depicted in figure 3.

Fig. 3: Causal convolutions over a series of layers with exponentially increasing dilation factor.
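The effect of exponentially increasing dilations on the length of history a block can see can be checked with a short calculation. The sketch below assumes one causal convolution per layer with dilation `dilation_base ** layer`; the kernel sizes and layer counts in the example are illustrative values, not the settings used for End2End-TCN.

```python
def receptive_field(kernel_size, num_layers, dilation_base=2):
    """Number of input time steps that can influence one output step.

    Assumes one causal convolution per layer, with the dilation of
    layer i equal to dilation_base ** i (i = 0, 1, ..., num_layers - 1).
    """
    field = 1
    for layer in range(num_layers):
        dilation = dilation_base ** layer
        field += (kernel_size - 1) * dilation
    return field


print(receptive_field(kernel_size=2, num_layers=4))  # 16
print(receptive_field(kernel_size=3, num_layers=5))  # 63
print(receptive_field(kernel_size=2, num_layers=8))  # 256
```

The receptive field grows exponentially with depth while the number of weights grows only linearly, which is why dilated stacks can cover long histories with comparatively few parameters.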

Causal convolution blocks are combined with a nonlinear activation function and batch normalization, with a residual connection applied for gradient stability. These blocks are stacked in a layered architecture to form a deep, overparameterized neural network as in [21] and [12] (see figure 2). End2End-TCN was designed to output a full time series of predicted states at every forward pass, allowing for simultaneous multi-step prediction of quadrotor states.

IV-A Physics-based Model

A key part of the study of TCNs for quadrotor modeling is ascertaining whether prior knowledge of the system’s dynamics is required to improve prediction accuracy. Consequently, we develop a physics-based model of quadrotor flight derived for the AscTec Pelican flights in the test set. This model is based on a simplified wireframe model of the quadrotor as per figure 1, which is represented by four arms with a uniform mass and a fixed length. For the specified platform, the arms form right angles with one another. Fixed to each arm is a rotor, which is modeled by a point mass generating a longitudinal thrust and a rotational torque. The body frame is defined such that the rotors lie on the XY plane, the x-axis points in the direction directly between the first and second rotors, and the z-axis points in the direction of the torques generated by any individual rotor. Figure 1 depicts the wireframe quadrotor model and the two corresponding reference frames (inertial and body). The complex motor and rotor dynamics are approximated as quadratic in the rotor angular velocity, in its discrete representation. This is based on the rotor dynamic equation in steady state with a freestream velocity of zero [23], which can be parameterized with respect to a thrust coefficient, the density of air, the rotor radius and the rotor area.
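For reference, the commonly used steady-state, zero-freestream form of the rotor thrust relation is sketched below. The symbols are assumed for this illustration: $T_i$ is the thrust of rotor $i$, $\omega_i$ its angular velocity, $C_T$ the thrust coefficient, $\rho$ the density of air, $r$ the rotor radius and $A$ the rotor disk area.

```latex
T_i = C_T \, \rho \, A \, r^{2} \, \omega_i^{2}
```

Since $C_T$, $\rho$, $A$ and $r$ are constant for a given platform and air density, this reduces to a single lumped coefficient multiplying $\omega_i^2$, which is the quadratic relationship referred to above.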


The total thrust and torques can thus be calculated from the individual rotor contributions in vectorized form.
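A sketch of this summation for an X-configuration is given below. The rotor ordering, the angles of the arms relative to the body x-axis, and the alternating spin directions are assumptions chosen for illustration (the text only states that the x-axis points between the first and second rotors); the function and constant names are hypothetical.

```python
import math

# Assumed layout: arm angles measured from the body x-axis, so that the
# x-axis points midway between rotor 1 (+45 deg) and rotor 2 (-45 deg).
ROTOR_ANGLES_DEG = (45.0, -45.0, -135.0, 135.0)
# Assumed spin directions: adjacent rotors counter-rotate.
SPIN_SIGNS = (1.0, -1.0, 1.0, -1.0)


def total_thrust_and_torque(thrusts, drag_torques, arm_length):
    """Sum rotor contributions into a total thrust and a body torque.

    Each rotor at position (x, y, 0) pushes along the body z-axis, so
    its moment is r x F = (y * T, -x * T, 0). Yaw torque comes from the
    signed rotor drag torques.
    """
    total_thrust = sum(thrusts)
    tau_x = tau_y = tau_z = 0.0
    for angle_deg, spin, thrust, drag in zip(
        ROTOR_ANGLES_DEG, SPIN_SIGNS, thrusts, drag_torques
    ):
        angle = math.radians(angle_deg)
        x = arm_length * math.cos(angle)
        y = arm_length * math.sin(angle)
        tau_x += y * thrust
        tau_y += -x * thrust
        tau_z += spin * drag
    return total_thrust, (tau_x, tau_y, tau_z)


# Equal thrusts: the moments cancel and only the net thrust remains.
print(total_thrust_and_torque([1.0] * 4, [0.1] * 4, arm_length=0.2))
# More thrust on the rotors with y > 0 produces a positive roll moment.
print(total_thrust_and_torque([2.0, 1.0, 1.0, 2.0], [0.1] * 4, arm_length=0.2))
```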


For state derivatives, we reference a quadrotor state in the form given by the problem formulation in section III. The derivative of position is simply the velocity. The orientation derivative can be obtained from the body rotation rates with an additional coordinate transform in matrix form.


Translational acceleration can be written with respect to the total thrust and torques derived above using Newton’s 2nd Law. Motor thrust is transformed from the body frame to the world frame, and additional accelerations due to gravity and translational drag are included. Lastly, rotational acceleration can be written from Euler’s Equations of Rotational Motion, with a body frame rotor torque and rotational drag.
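A generic rigid-body form of these two equations is sketched below. The notation is assumed for this illustration: $m$ is the vehicle mass, $\mathbf{v}$ the world-frame velocity, $\mathbf{R}_{WB}$ the rotation from body to world frame, $T$ the total thrust along the body z-axis, $\mathbf{g}$ the gravity vector, $\mathbf{J}$ the inertia matrix, $\boldsymbol{\omega}$ the body rotation rate and $\boldsymbol{\tau}$ the total rotor torque. Modeling both drag terms as linear, with coefficient matrices $\mathbf{D}_v$ and $\mathbf{D}_\omega$, is an additional assumption; the text does not specify their form.

```latex
m\,\dot{\mathbf{v}}
  = \mathbf{R}_{WB}\begin{bmatrix} 0 \\ 0 \\ T \end{bmatrix}
  + m\,\mathbf{g}
  - \mathbf{D}_v\,\mathbf{v},
\qquad
\mathbf{J}\,\dot{\boldsymbol{\omega}}
  = \boldsymbol{\tau}
  - \boldsymbol{\omega} \times \mathbf{J}\,\boldsymbol{\omega}
  - \mathbf{D}_\omega\,\boldsymbol{\omega}
```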


To perform motion prediction, the equations of motion are discretized for all state variables used in motion prediction as per section III. Parameters are either empirically measured or estimated using nonlinear system identification, as in [24]. Numerical forward integration is then performed using a real-valued variable-coefficient ODE (VODE) solver. The predicted state variables after an interval are compared to learning-based methods trained on motion prediction for the same discrete time interval.

IV-B Hybrid Models

On the other hand, we can use all or part of this physics-based model as a component in a hybrid architecture. We develop a series of hybrid models combining fully convolutional Temporal Convolutional Network component(s) with similar design parameters as End2End-TCN and the same total number of parameters. Physics-based components generate forward predictions in a sequential manner by forward integrating some or all of the dynamic system equations outlined in section IV-A. This results in three different Hybrid configurations. Motor-Hybrid uses a TCN component to model the aircraft’s rotor dynamics, generating motor thrust predictions for a given control input. AccelError-Hybrid uses a TCN component to model an additive term to the physics-based state derivative estimates, thus modeling the dynamics that are not captured by the simplified physics-based model. Lastly, Combined-Hybrid uses both TCN components of the models above.
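The additive-correction idea behind AccelError-Hybrid can be sketched as follows. This is a simplified illustration, not the architecture evaluated here: explicit Euler integration stands in for the ODE solver, `learned_correction` is a hypothetical stand-in for the TCN component (which would normally consume a history of states rather than a single state), and the toy dynamics are invented for the example.

```python
def hybrid_step(state, control, physics_derivative, learned_correction, dt):
    """One explicit-Euler step of physics derivative plus additive term."""
    base = physics_derivative(state, control)
    extra = learned_correction(state, control)
    return [s + dt * (b + e) for s, b, e in zip(state, base, extra)]


def rollout(state, controls, physics_derivative, learned_correction, dt):
    """Integrate sequentially, one control input per predicted step."""
    trajectory = []
    for control in controls:
        state = hybrid_step(
            state, control, physics_derivative, learned_correction, dt
        )
        trajectory.append(state)
    return trajectory


def toy_physics(state, control):
    # Simplified model: acceleration equals the commanded input.
    return [control[0]]


def toy_correction(state, control):
    # Stand-in for a learned residual: an unmodeled linear drag term.
    return [-0.1 * state[0]]


trajectory = rollout([0.0], [[1.0], [1.0]], toy_physics, toy_correction, dt=0.5)
print(trajectory)
```

The loop in `rollout` makes the sequential dependence explicit: each step needs the previous state, so the multi-step prediction cannot be produced in a single parallel pass.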

Fig. 4: Components of Motor-Hybrid (top), AccelError-Hybrid (middle) and Combined-Hybrid (bottom).
Fig. 5: Velocity and body rate prediction errors over time for End2End-TCN (green) and reference models on a log plot.

V Experimental Results

V-A Experimental Design

We validate this approach and characterize model performance with respect to its prediction accuracy on real quadrotor flights. We evaluate End2End-TCN and several alternative predictive models on the WAVE Laboratory AscTec Pelican Quadrotor Dataset [25], which utilized sensor fusion across inertial, GNSS, and vision-based systems to collect high-precision quadrotor state estimates. Data are interpolated to report full quadrotor states at a sample rate of 100 Hz. The dataset is comprised of a series of indoor quadrotor flights, bounded within a 5 x 5 x 5 m volume. This mostly includes near-hover flight, pseudo-random rotations and curves in all axes, all within the nominal flight envelope of the AscTec Pelican quadrotor. In total, the dataset consists of 54 flights, with over 1,388,410 total samples of quadrotor telemetry data, 10% of which is used in the test set for this experiment.

V-B Comparative Study

To validate the performance of End2End-TCN, we compare its velocity and body rate prediction accuracy with that of alternative models. This includes the current state-of-the-art result on this dataset, which was achieved by Mohajerin in [24] with an LSTM Recurrent Neural Network Hybrid model for multi-step quadrotor prediction. The model is also compared to a physics-based model, and a series of hybrid models with both TCN and physics-based components, as outlined in section IV.

MSE by prediction horizon
Model              | t = 0.01 s             | t = 0.45 s             | t = 0.90 s
                   | Velocity   Body Rate   | Velocity   Body Rate   | Velocity   Body Rate
Physics-based      | 0.00003    0.000572    | 0.0892     0.0981      | 0.938      1.08
LSTM Hybrid        | 0.00441    0.616       | 0.0217     2.30        | 0.0384     3.01
Motor-Hybrid       | 0.0100     0.00543     | 0.115      0.269       | 0.115      0.632
AccelError-Hybrid  | 0.0153     0.00356     | 0.200      0.187       | 0.205      0.625
Combined-Hybrid    | 0.0124     0.0126      | 0.178      0.535       | 0.192      1.02
End2End-TCN        | 0.000735   0.00197     | 0.00881    0.0352      | 0.0357     0.0464

TABLE I: Summary of multi-step prediction results across 90 time steps (900 ms).

We find that End2End-TCN outperforms the current state of the art and all alternative models across nearly the entire 90 step sample (corresponding to 900 ms). The most significant performance improvements are in rotation rates, where the fundamental kinematics rely on current and past quadrotor states. This may indicate that dilated convolutions are better suited for this type of long-term sequence modeling. We find that hybrid models perform significantly worse than the fully convolutional approach. This can mostly be attributed to the difficulty of integrating TCNs with numerically integrated dynamical system equations, which are sequential in nature. Hybrid models that have multiple TCN components, each with a fraction of the parameters of a single large End2End-TCN, likely suffer due to a fundamental lack of expressive power.

Lastly, we see that most TCN-based models represent a 2-10x improvement with respect to prediction accuracy when compared to the physics-based model over a longer time horizon, which indicates that these models learn generalizable unmodeled dynamics that have significant temporal effects. We find that TCN model errors typically plateau over time. While a constant acceleration error due to unmodeled disturbances may cause errors growing quadratically over time, End2End-TCN optimizes for accuracy across the flight sample over longer time periods where transient effects may not be statistically relevant.

V-C Analyzing Flight Samples

While End2End-TCN makes extremely accurate predictions for a majority of samples, overall accuracy is limited by a long tail in the error distribution as depicted for body rotation rate error in figure 6.

Fig. 6: Distribution of End2End-TCN body rate errors over time (Box-whisker plot: median (red line), 2nd and 3rd quartiles (blue box) and range)

These uncommon but large errors occur at the extremes of the quadrotor’s flight envelope. While using an L1-Norm loss function reduces prediction error overall, it constrains the model to learn the simple hover point dynamics, which are more frequent in the training and evaluation datasets. As such, flight samples in more aggressive maneuvers yield predictions that diverge significantly from the ground truth, as in figure 6. We find that samples with errors in the 90th percentile have significantly higher rates of change of position and motor command (i.e. faster and sharper turns). We also find an increase in the variance of pitch and roll angles, indicating that these samples are taken farther from the hover point of the quadrotor.

Fig. 7: Flight predictions with respect to ground truth for a selection of test samples including low error (frequent) cases and high error (infrequent) cases

It is hypothesized that this behavior is largely data-driven. The current dataset, comprised of stable, indoor flight, has few samples in the extreme ranges of the quadrotor’s flight envelope. However, in comparison, hybrid models appear to be more robust to these outlier samples. These models have significantly worse mean errors over time but a smaller standard deviation, which indicates that building models with a prior on the system’s dynamics may be an effective way to address a lack of data in certain flight modes.

V-D Scaling Effects

One of the main potential benefits of a fully convolutional architecture for quadrotor predictive modeling is its computational efficiency and memory footprint. Thus, we investigate the impact of model size on its predictive modeling performance. Table II shows the validation set accuracy results of End2End-TCN when varying the number of depth layers. Forward pass frequency was calculated on a test set running on an NVIDIA GeForce RTX 2080 Ti Graphics Processing Unit (GPU).

Overall, we find that End2End-TCN retains a significant amount of its predictive ability as the size of the network decreases, particularly for translational velocity. On the other hand, we see significant reductions in body rotation rate prediction accuracy, likely due to the nonlinear nature of these dynamics and their higher sensitivity to disturbances. Similarly, we find that reducing the observation window does not significantly degrade the performance of End2End-TCN.

One hypothesis for this behavior is that the current model is fundamentally limited by the size of the dataset rather than the size of the model. As demonstrated in language models and other sequence learning tasks [18], performance improvements from increasing model size are fundamentally capped if the size of the dataset does not increase accordingly. There may be additional factors about time-correlated data that make it less susceptible to performance increases from model scale. This view of a data-centric approach for further model scaling is supported by error distributions and the sparsity of data in certain flight modes.

# of layers | # of param. / fps (Hz) | MSE (t = 0.45 s)     | MSE (t = 0.90 s)
            |                        | Vel.     Ang. Vel.   | Vel.     Ang. Vel.
5           | 298,346 / 492.6        | 0.0102   0.0387      | 0.0423   0.0634
8           | 1,166,794 / 383.7      | 0.0088   0.0352      | 0.0357   0.0464
10          | 4,640,266 / 302.4      | 0.0087   0.0403      | 0.0353   0.0663
12          | 18,517,706 / 243.7     | 0.0148   0.0398      | 0.0412   0.0654

TABLE II: Prediction error with respect to model size.

V-E Ablation Studies

A series of ablation studies is performed on End2End-TCN to validate the model’s detailed design. We first compare a series of alternative architectures. This includes models with varying amounts of regularization layers (Batch Normalization and Dropout) and varying training loss functions (Euclidean, Manhattan, and Weighted Euclidean). The results of the study are summarized in table III for Batch Normalization (BN), Dropout (Drop), Shortened gradient path architecture (SG), Weighted L2-Norm loss function (WL2), and L1-Norm loss (L1). A crucial element of the design of End2End-TCN is the integration of future control inputs for the multi-step prediction of non-autonomous dynamical systems. In our ablation study, we consider two methods to achieve this. In the baseline model, past quadrotor states, past control inputs, and future control inputs are concatenated into a single model input sequence. We compare this approach to an architecture where only past quadrotor states and control inputs are fed to the first layer, while future control inputs are fed to an intermediate layer for the purposes of shortening their gradient path.

Fig. 8: Architecture with shortened gradient paths to future control inputs.

BN   Drop   SG   WL2   L1   | MSE Velocity   MSE Body Rate
                            | 0.0198         0.0715
                            | 0.0172         0.0401
                            | 0.0217         0.0433
                            | 0.0329         0.0440
                            | 0.0317         0.0700
                            | 0.0158         0.0396

TABLE III: Architecture Ablation Study

Firstly, we see that the alternative architecture performs significantly worse with respect to body rate error when compared to the final model. While this architecture was hypothesized to increase performance by shortening the gradient path to the most important features, namely the last quadrotor state and the control inputs, we see that the number of layers between these features and the output is too small to properly capture the nonlinear rotation dynamics of the quadrotor. Furthermore, reducing or eliminating batch normalization in End2End-TCN decreases performance, as does adding dropout to the model. These results mirror similar conclusions in the literature [21]. We also find that the L1-Norm loss function, which is more robust to outlier state errors, leads to better generalization to the test set than do L2 or weighted L2 loss functions.
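The robustness argument for the L1-Norm loss can be made concrete with a small numerical sketch. The error values below are invented for illustration and the helper names are hypothetical; the point is only that a single large error accounts for a larger share of a squared loss than of an absolute loss.

```python
def mean_l1(pred, target):
    """Mean absolute error (L1-Norm loss averaged over samples)."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def mean_l2(pred, target):
    """Mean squared error (squared L2-Norm loss averaged over samples)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def largest_error_share(pred, target, power):
    """Fraction of the summed |error| ** power due to the largest error."""
    terms = [abs(p - t) ** power for p, t in zip(pred, target)]
    return max(terms) / sum(terms)


target = [0.0, 0.0, 0.0, 0.0]
pred = [0.1, 0.1, 0.1, 2.0]  # three small errors and one outlier

print(mean_l1(pred, target), mean_l2(pred, target))
print(largest_error_share(pred, target, power=1))  # roughly 0.87
print(largest_error_share(pred, target, power=2))  # roughly 0.99
```

Under the squared loss nearly all of the training signal in this toy batch comes from the outlier, whereas the absolute loss leaves a larger share to the typical samples.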

VI Conclusion

This paper presents a detailed study of the use of Temporal Convolutional Networks for quadrotor state modeling and motion prediction. While classical modeling techniques characterize such robotic systems using prior knowledge of the system’s non-autonomous dynamics, we formulate this as a sequence modeling problem by performing discrete multi-step prediction. We segment quadrotor telemetry to train a fully convolutional neural network, End2End-TCN, in a semi-supervised fashion. End2End-TCN outperforms the previous state of the art by 55% and proves to be more effective than hybrid models and fully physics-based models. We demonstrate that End2End-TCN retains over 95% of its performance over shorter time intervals when the model is compressed by a factor of 3, and we further characterize model performance with an ablation study and an analysis of predicted flight samples.

This fully convolutional approach to quadrotor modeling is currently limited by the scale and distribution of training data, which is a bottleneck shared by many sequence-to-sequence models. Collecting data on aggressive quadrotor flight would reduce the model’s bias towards hover point dynamics and potentially reduce infrequent low-accuracy prediction samples. Further work is required to ascertain whether this method will generalize to outdoor environments with wind disturbances. Finally, End2End-TCN will be applied in model-based quadrotor control methods to further contextualize its accuracy and computational efficiency.


  • [1] M. Abdel-Nasser and K. Mahmoud (2019) Accurate photovoltaic power forecasting models using deep LSTM-RNN. Neural Computing and Applications 31 (7), pp. 2727–2740.
  • [2] K. Alexis, G. Nikolakopoulos, and A. Tzes (2012) Model predictive quadrotor control: attitude, altitude and position experimental studies. IET Control Theory & Applications 6 (12), pp. 1812–1827.
  • [3] S. Bai, J. Z. Kolter, and V. Koltun (2018) An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint arXiv:1803.01271.
  • [4] S. Bansal, A. K. Akametalu, F. J. Jiang, F. Laine, and C. J. Tomlin (2016) Learning quadrotor dynamics using neural network for flight control. In 2016 IEEE 55th Conference on Decision and Control (CDC), pp. 4653–4660.
  • [5] D. F. Barcelos, A. Kolaei, and G. Bramesfeld (2018) Performance prediction of multirotor vehicles using a higher order potential flow method. In 2018 AIAA Aerospace Sciences Meeting, pp. 1528.
  • [6] L. Bauersfeld, E. Kaufmann, P. Foehn, S. Sun, and D. Scaramuzza (2021) NeuroBEM: hybrid aerodynamic quadrotor model. arXiv preprint arXiv:2106.08015.
  • [7] C. Böhm, M. Scheiber, and S. Weiss. Filter-based online system-parameter estimation for multicopter UAVs.
  • [8] A. Borovykh, S. Bohte, and C. W. Oosterlee (2017) Conditional time series forecasting with convolutional neural networks. arXiv preprint arXiv:1703.04691.
  • [9] S. Bouabdallah and R. Siegwart (2007) Full control of a quadrotor. In 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 153–158.
  • [10] P. Bouffard, A. Aswani, and C. Tomlin (2012) Learning-based model predictive control on a quadrotor: onboard implementation and experimental results. In 2012 IEEE International Conference on Robotics and Automation, pp. 279–284.
  • [11] M. Burri, M. Bloesch, Z. Taylor, R. Siegwart, and J. Nieto (2018) A framework for maximum likelihood parameter identification applied on MAVs. Journal of Field Robotics 35 (1), pp. 5–22.
  • [12] Y. Chen, Y. Kang, Y. Chen, and Z. Wang (2020) Probabilistic forecasting with temporal convolutional neural network. Neurocomputing 399, pp. 491–501.
  • [13] S. Deng, N. Zhang, W. Zhang, J. Chen, J. Z. Pan, and H. Chen (2019) Knowledge-driven stock trend prediction and explanation via temporal convolutional network. In Companion Proceedings of The 2019 World Wide Web Conference, pp. 678–685.
  • [14] M. Greeff and A. P. Schoellig (2018) Flatness-based model predictive control for quadrotor trajectory tracking. In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 6740–6745.
  • [15] S. Grigorescu, B. Trasnea, T. Cocias, and G. Macesanu (2020) A survey of deep learning techniques for autonomous driving. Journal of Field Robotics 37 (3), pp. 362–386.
  • [16] G. Hoffmann, D. G. Rajnarayan, S. L. Waslander, D. Dostal, J. S. Jang, and C. J. Tomlin (2004) The Stanford testbed of autonomous rotorcraft for multi agent control (STARMAC). In The 23rd Digital Avionics Systems Conference (IEEE Cat. No. 04CH37576), Vol. 2, pp. 12–E.
  • [17] G. M. Hoffmann, H. Huang, S. L. Waslander, and C. J. Tomlin (2011) Precision flight control for a multi-vehicle quadrotor helicopter testbed. Control Engineering Practice 19 (9), pp. 1023–1036.
  • [18] J. Kaplan, S. McCandlish, T. Henighan, T. B. Brown, B. Chess, R. Child, S. Gray, A. Radford, J. Wu, and D. Amodei (2020) Scaling laws for neural language models. arXiv preprint arXiv:2001.08361. Cited by: §V-D.
  • [19] E. Kaufmann, A. Loquercio, R. Ranftl, M. Müller, V. Koltun, and D. Scaramuzza (2020) Deep drone acrobatics. arXiv preprint arXiv:2006.05768. Cited by: §II.
  • [20] J. Kong, M. Pfeiffer, G. Schildbach, and F. Borrelli (2015) Kinematic and dynamic vehicle models for autonomous driving control design. In 2015 IEEE Intelligent Vehicles Symposium (IV), pp. 1094–1099. Cited by: §I.
  • [21] C. Lea, M. D. Flynn, R. Vidal, A. Reiter, and G. D. Hager (2017) Temporal convolutional networks for action segmentation and detection. In

    Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

    pp. 156–165. Cited by: §I, §II, §IV, §V-E.
  • [22] Z. Liu, J. Liu, and W. He (2018) Dynamic modeling and vibration control for a nonlinear 3-dimensional flexible manipulator. International Journal of Robust and Nonlinear Control 28 (13), pp. 3927–3945. Cited by: §I.
  • [23] R. Mahony, V. Kumar, and P. Corke (2012) Multirotor aerial vehicles: modeling, estimation, and control of quadrotor. IEEE Robotics and Automation magazine 19 (3), pp. 20–32. Cited by: §II, §IV-A.
  • [24] N. Mohajerin and S. L. Waslander (2019) Multistep prediction of dynamic systems with recurrent neural networks. IEEE transactions on neural networks and learning systems 30 (11), pp. 3370–3383. Cited by: §I, §II, §IV-A, §V-B.
  • [25] N. Mohajerin (2017) Modeling dynamic systems for multi-step prediction with recurrent neural networks. Ph.D. Thesis, University of Waterloo. Cited by: §V-A.
  • [26] A. v. d. Oord, S. Dieleman, H. Zen, K. Simonyan, O. Vinyals, A. Graves, N. Kalchbrenner, A. Senior, and K. Kavukcuoglu (2016) Wavenet: a generative model for raw audio. arXiv preprint arXiv:1609.03499. Cited by: §I, §II, §IV.
  • [27] R. Pascanu, T. Mikolov, and Y. Bengio (2013) On the difficulty of training recurrent neural networks. In

    International conference on machine learning

    pp. 1310–1318. Cited by: §I, §II.
  • [28] P. Pounds, R. Mahony, and P. Corke (2006) Modelling and control of a quad-rotor robot. In Proceedings of the 2006 Australasian Conference on Robotics and Automation, pp. 1–10. Cited by: §II.
  • [29] A. Punjani and P. Abbeel (2015) Deep learning helicopter dynamics models. In 2015 IEEE International Conference on Robotics and Automation (ICRA), pp. 3223–3230. Cited by: §II.
  • [30] D. Salinas, V. Flunkert, J. Gasthaus, and T. Januschowski (2020) DeepAR: probabilistic forecasting with autoregressive recurrent networks. International Journal of Forecasting 36 (3), pp. 1181–1191. Cited by: §II.
  • [31] H. Shi, M. Xu, and R. Li (2017) Deep learning for household load forecasting—a novel pooling deep rnn. IEEE Transactions on Smart Grid 9 (5), pp. 5271–5280. Cited by: §II.
  • [32] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin (2017) Attention is all you need. arXiv preprint arXiv:1706.03762. Cited by: §I, §II.
  • [33] S. Wang, R. Clark, H. Wen, and N. Trigoni (2017) Deepvo: towards end-to-end visual odometry with deep recurrent convolutional neural networks. In 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 2043–2050. Cited by: §II.
  • [34] B. W. Wanjawa and L. Muchemi (2014) ANN model to predict stock prices at stock exchange markets. arXiv preprint arXiv:1502.06434. Cited by: §II.
  • [35] V. Wüest, V. Kumar, and G. Loianno (2019) Online estimation of geometric and inertia parameters for multirotor aerial vehicles. In 2019 International Conference on Robotics and Automation (ICRA), pp. 1884–1890. Cited by: §II.
  • [36] J. Yan, L. Mu, L. Wang, R. Ranjan, and A. Y. Zomaya (2020) Temporal convolutional networks for the advance prediction of enso. Scientific reports 10 (1), pp. 1–15. Cited by: §II.
  • [37] S. Zhang, Y. Wu, T. Che, Z. Lin, R. Memisevic, R. Salakhutdinov, and Y. Bengio (2016) Architectural complexity measures of recurrent neural networks. arXiv preprint arXiv:1602.08210. Cited by: §II.
  • [38] Z. Zhao, R. Rao, S. Tu, and J. Shi (2017) Time-weighted lstm model with redefined labeling for stock trend prediction. In

    2017 IEEE 29th international conference on tools with artificial intelligence (ICTAI)

    pp. 1210–1217. Cited by: §II.