I Introduction
While autonomous robotic systems offer tremendous potential benefits in a wide range of commercial operations, their safe operation will require highly accurate localization and control methods for collision avoidance and action execution. Model-based state estimation and control have demonstrated strong performance and robustness across the operational domains of remote aircraft [14, 2, 10], autonomous vehicles [20, 15], and flexible manipulators [22], to name a few. As such, dynamic system modeling is critical to the effort of developing safe autonomous robotic systems that can perform precise motions throughout their operating envelopes.
As a primary motivating example, in this work we focus on multistep prediction for quadrotor UAVs. Developing models of quadrotor flight solely from first principles has proven to be a challenge. Quadrotors are underactuated systems whose translational dynamics are tightly coupled with highly nonlinear rotational dynamics. In real-world environments, aerodynamics, motor dynamics, and asymmetrical mass distributions can be significant disturbances, but are often poorly characterized in physics-based quadrotor models [17].
Another line of research focuses on developing statistical quadrotor models from measured flight data. Specifically, discrete-time neural network designs have shown the greatest promise in modeling complex quadrotor dynamics due to their strong expressive power. A recent work benchmarking neural network models on quadrotor state prediction employs Recurrent Neural Network (RNN) models to sequentially learn time-correlated features in quadrotor state telemetry time series data [24]. While these models have demonstrated state-of-the-art performance, they have several limitations. The sequential nature of these models leads to longer computation times due to the lack of parallelization, and can cause unstable gradients at training time [27]. Furthermore, current models are limited in their ability to learn time-correlated features over long time horizons [32].
Temporal convolution-based architectures provide a potential solution to the limits of RNNs for the task of quadrotor state modeling. Temporal Convolutional Networks (TCNs) have demonstrated the ability to accurately model time series in a variety of contexts [26], [3], [21] and have the potential to provide a sparse and efficient model able to learn features over long time histories. In this work, we apply TCNs to a discrete-time multistep series forecasting problem, which we adapt to the nonautonomous dynamics of robotic systems. This allows TCN models to be trained and evaluated on indoor quadrotor flight telemetry.
Thus, in this paper, we perform the first in-depth study of convolution-based architectures for quadrotor modeling. We present End2EndTCN: a novel method of applying TCNs to robotic system modeling by integrating the control input into the input state vector. This model surpasses the current state of the art and several alternative models in prediction accuracy, generating useful future state predictions over longer periods of time for the purpose of model-based control and state estimation. We perform a comprehensive series of experiments to characterize the performance of TCNs with respect to model size and past state history length. We further provide an analysis of prediction samples and error distributions to characterize model performance. Most importantly, we demonstrate that a TCN-based model can provide a memory-efficient representation of quadrotor dynamics and yield a 55% reduction in prediction error over a 900 ms interval.
II Related Works
Empirical Methods. As a result of the success of model-based quadrotor control methods, the dynamics of quadrotor flight have been extensively studied in the literature. Previous research that developed quadrotor testbed platforms [16], developed dynamical system models [9], [28], and characterized significant aerodynamic effects [23] has laid the foundation for a principled approach to developing quadrotor models. In these works, simplified models of quadrotor geometry, rotor thrust, and aerodynamics were used to derive equations of motion. Such physics-based models have been further refined by deriving more complex aerodynamic models [17] or by using blade element momentum theory [5], [6] to better characterize motor thrust. While many such models obtain parameter values through empirical measurement or offline system identification, recent works have used online parameter estimation to refine their physics-based models over time [7], [11], [35].
Neural Networks. Neural networks, on the other hand, provide powerful and flexible statistical models that can capture highly complex time-varying phenomena. In the field of statistical rotorcraft flight modeling, early work by Punjani and Abbeel [29] showed significant success in learning a nonlinear acceleration model for helicopter flight by training a simple artificial neural network on past flight telemetry, while others [4] learned a simpler linearizable model for LQR control. Such models may successfully learn a latent representation of flight data, but are not designed specifically to learn time-correlated features, which have been demonstrated to improve performance in sequence domain tasks. One need look no further than the field of stock price modeling, where early artificial neural networks [34] were quickly surpassed by LSTM models [38] and TCNs [13], specifically due to their ability to learn time-correlated features.
Sequence Modeling. In recent years, deeper networks with new neural network architectures have led to major breakthroughs in sequence modeling. Much of this research has focused on Recurrent Neural Networks (RNNs). Mohajerin et al. leveraged recurrent architectures for quadrotor modeling by training RNNs with Long Short-Term Memory (LSTM) gated units on an indoor quadrotor dataset, which greatly improved prediction accuracy for future flight trajectories [24]. This sequential approach mirrors the way discrete dynamical system models are integrated forward in time. However, the ability of an RNN to model time-varying phenomena is limited by the size of its hidden state representation [37], and RNN performance degrades significantly as time horizons extend [32], both of which limit their usability for quadrotor flight modeling. RNNs also have limitations that make them ill-suited for online robotics applications. Because they process time series sequentially, they are less computationally efficient than convolution-based architectures that can leverage parallel computation hardware [3]. Furthermore, RNNs can be challenging to train due to backpropagation through time, which can lead to gradient instability [27].
Temporal Convolutional Networks. While RNNs were the dominant approach for time series predictive modeling [30], [1], [31], convolution-based approaches have emerged recently as a viable alternative. Early work by van den Oord et al. on WaveNet [26] introduced the causal convolution, which modified the standard discrete convolution operation to maintain the temporal structure of time series inputs. Dilated convolutions can be employed to make predictions over large, fixed time horizons, and the resulting network can be parallelized for computational efficiency. This results in sparse networks that learn time-correlated features in an efficient and deterministic manner, which are called Temporal Convolutional Networks (TCNs).
Studies have shown that TCNs outperform recurrent networks across a wide range of sequence modeling tasks [3]. TCNs were further explicitly applied to time series modeling by Borovykh et al. [8]. More relevant to quadrotor modeling, TCNs were used in action segmentation tasks [21] and were combined with Empirical Mode Decomposition (EMD) to predict dynamic climate effects [36]. These prior works demonstrate that TCNs have the ability to learn temporal patterns in robotic motion over long periods and model highly complex dynamical systems.
Many applications of deep learning in robotics learn temporal patterns by simply concatenating images or system state inputs [33]. However, this only works over short time periods. Recent work by Kaufmann et al. leveraged TCNs to process sensor input information in an end-to-end learning-based architecture for quadrotor control [19]. While this study demonstrates the utility of TCNs in the context of quadrotor state information processing, there is still a clear lack of research on the ability of TCNs to explicitly model robotic systems over a long horizon of future state predictions.
III Problem Formulation
By treating quadrotor flight dynamics as a time series predictive modeling problem, we can perform sequence-to-sequence modeling to learn a function that can predict future states. We first define a parameterization of the quadrotor state, which includes position and velocity in a world frame, orientation represented by Euler rotation angles about axes XYZ from a body frame to the world frame, and rotation rate represented by the time derivative of the XYZ Euler angles with respect to the body frame. The diagram below denotes the world frame and body frame with respect to the quadrotor’s geometry. The geometry and reference frames follow a quadrotor X-configuration, where the roll and pitch axes are offset by 45 degrees from the rotor arms.
This state represents the quadrotor’s pose with 6 degrees of freedom (given the orientation representation) and a measure of its first rate of change. The full system is further characterized by a control input representing four motor commands, which are generated by the quadrotor’s controller and map linearly to desired rotor speeds.
In this discrete quadrotor state formulation, we consider a dynamic system represented by a function that maps a past state representation to a future state representation, and a function that maps a state representation to a state observation. In the nonautonomous case, the function maps both the past state and a control input to the state observation.
However, to fully leverage the ability of convolutional neural networks to compute state predictions in parallel, we extend this formulation to a multistep prediction case of a given length. Note that in the nonautonomous case, past and future control inputs will be required as inputs to this function, as the future state is dependent on the future control input. Furthermore, given the complexity of dynamic effects such as aerodynamics on quadrotor motion, the state parameterization above may not meet the Markov condition. Thus, we hypothesize that prediction accuracy will be improved by providing a sequence of input states. As such, we seek to model the function mapping a series of past states, past control inputs, and future control inputs to the series of future states. Note that this model assumes access to the full state representation, which is only possible in the case of weak nonlinear observability.
Modeling this discrete function can thus be formulated as a sequence-to-sequence modeling problem. We consider a sequence of prior system states, prior control inputs, and future control inputs, and seek to estimate future system states. Thus, given a sequence-to-sequence function generating a future system state prediction, we can use statistical methods to minimize a reconstruction loss over a set of known future quadrotor states.
(1) 
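As a hedged illustration of this objective, one way to write it is sketched here. The notation is assumed for this sketch and is not taken from the original: $x_t$ is the state, $u_t$ the control input, $P$ and $F$ the numbers of past and future steps, $f_\theta$ the sequence-to-sequence model with parameters $\theta$, $N$ the number of training samples, and $\mathcal{L}$ a reconstruction loss such as an L1 or L2 norm. The time indexing of the future control inputs is likewise an assumption.

```latex
\hat{x}_{t+1:t+F} = f_\theta\left(x_{t-P+1:t},\; u_{t-P+1:t},\; u_{t+1:t+F}\right),
\qquad
\theta^{*} = \arg\min_{\theta} \sum_{n=1}^{N}
  \mathcal{L}\left(\hat{x}^{(n)}_{t+1:t+F},\; x^{(n)}_{t+1:t+F}\right)
```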
IV Methodology
Given historical quadrotor state data, neural network model inputs and labels are generated in a semi-supervised manner. As per the problem formulation, model inputs include prior quadrotor states, prior control inputs, and future control inputs. The sample labels correspond to a series of truncated quadrotor states, which include the translational and rotational velocities over the prediction interval.
A fully convolutional neural network model, dubbed End2EndTCN, is trained on this time series data to provide quadrotor state predictions over multiple future time steps. Crucially, in order to make multiple predictions over this nonautonomous dynamical system, past and future control inputs must be integrated into the discrete sequence modeling problem formulation. End2EndTCN integrates this information into a fixed-length input sequence composed of augmented states, with one form for prior time steps and another for future time steps.
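The construction of such a fixed-length input sequence can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `build_input_sequence`, the 12-dimensional state, and the choice to zero-fill the unknown state entries at future time steps are all assumptions made for this example.

```python
from typing import List, Sequence


def build_input_sequence(
    past_states: Sequence[Sequence[float]],
    past_controls: Sequence[Sequence[float]],
    future_controls: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate states and controls into one fixed-length sequence.

    Past steps carry [state, control]; future steps carry [zeros, control].
    Zero-filling the unknown future states is an assumption of this sketch.
    """
    if len(past_states) != len(past_controls):
        raise ValueError("past states and controls must have the same length")
    state_dim = len(past_states[0])
    sequence = [list(x) + list(u) for x, u in zip(past_states, past_controls)]
    sequence += [[0.0] * state_dim + list(u) for u in future_controls]
    return sequence


# Toy example: 3 past steps, 2 future steps, 12-D state, 4 motor commands.
past_x = [[0.1 * t] * 12 for t in range(3)]
past_u = [[0.5] * 4 for _ in range(3)]
future_u = [[0.6] * 4 for _ in range(2)]
seq = build_input_sequence(past_x, past_u, future_u)
print(len(seq), len(seq[0]))
```

Every row has the same width, so the whole sequence can be passed to a convolutional model as a single tensor of shape (time steps, channels).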
The model is built on a series of causal convolutions, as first developed in [26], and as implemented in [3]. To achieve the desired effect, a causal convolution block is composed of a series of causal convolutions with dilations that increase exponentially at every layer, as depicted in figure 3.
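To make the effect of exponentially increasing dilations concrete, the sketch below implements a single-channel causal dilated convolution in plain Python and checks how far back an input can influence the output after three layers. The kernel size of 2, the unit weights, and the helper names are illustrative assumptions and do not reflect the trained network.

```python
from typing import List, Sequence


def causal_dilated_conv(x: Sequence[float], weights: Sequence[float], dilation: int) -> List[float]:
    """1-D causal convolution: output[t] depends only on x[t], x[t-d], x[t-2d], ...

    weights[0] multiplies the current sample; samples before the start of the
    sequence are treated as zero (left padding).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, num_layers: int) -> int:
    """Receptive field of stacked causal convolutions with dilations 1, 2, 4, ..."""
    return 1 + (kernel_size - 1) * sum(2 ** i for i in range(num_layers))


# An impulse at t=0 shows how far forward in time its influence spreads,
# which equals how far back each output can see.
signal = [1.0] + [0.0] * 15
for layer in range(3):
    signal = causal_dilated_conv(signal, [1.0, 1.0], dilation=2 ** layer)
reach = max(t for t, v in enumerate(signal) if v != 0.0)
print(reach + 1, receptive_field(kernel_size=2, num_layers=3))
```

The receptive field grows exponentially with depth while the number of weights grows only linearly, which is why a modest number of layers can cover a long state history.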
Causal convolution blocks are combined with a nonlinear activation function and batch normalization, with a residual connection applied for gradient stability. These blocks are stacked in a layered architecture to form a deep, overparameterized neural network as in [21] and [12] (see figure 2). End2EndTCN was designed to output a full time series of predicted states at every forward pass, allowing for simultaneous multistep prediction of quadrotor states.
IV-A Physics-based Model
A key part of the study of TCNs for quadrotor modeling is ascertaining whether prior knowledge of the system’s dynamics is required to improve prediction accuracy. Consequently, we develop a physics-based model of quadrotor flight derived for the AsTec Pelican flights in the test set. This model is based on a simplified wireframe model of the quadrotor as per figure 1, which is represented by four arms, each with a uniform mass and a given length. For the specified platform, adjacent arms form a right angle with one another. Fixed to each arm is a rotor, which is modeled by a point mass generating a longitudinal thrust and a rotational torque. The body frame is defined such that the rotors lie on the XY plane, the x-axis points in the direction directly between the first and second rotors, and the z-axis points in the direction of the torques generated by any individual rotor. The diagram in figure 1 depicts the wireframe quadrotor model and the two corresponding reference frames (inertial and body). The complex motor and rotor dynamics are approximated by a quadratic relationship in the rotor angular velocity, in its discrete representation. This is based on the rotor dynamic equation in steady state with a free-stream velocity of zero [23], which can be parameterized with respect to a thrust coefficient, the density of air, the rotor radius, and the rotor area.
(2) 
(3) 
The total thrusts and torques can thus be calculated from individual rotor contributions in the vectorized equation below.
(4) 
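The rotor model and the summation of individual rotor contributions can be sketched as follows. This is an illustration under stated assumptions and not the authors' exact equations: the thrust law $T = C_T \rho A r^2 \omega^2$ is the commonly used static-thrust form, while the rotor ordering, the spin directions, and all numerical parameter values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def rotor_thrust(omega: float, c_t: float, rho: float, area: float, radius: float) -> float:
    """Static thrust model T = C_T * rho * A * r^2 * omega^2 (quadratic in rotor speed)."""
    return c_t * rho * area * radius ** 2 * omega ** 2


def total_thrust_and_torque(
    thrusts: Sequence[float],
    drag_torques: Sequence[float],
    arm_length: float,
    spin_signs: Sequence[int] = (1, -1, 1, -1),  # assumed rotor spin directions
) -> Tuple[float, List[float]]:
    """Sum rotor contributions in the body frame for an X-configuration.

    Rotor i is assumed to sit at angle 45 - 90*i degrees from the body x-axis,
    so the x-axis points between the first and second rotors.
    """
    total = sum(thrusts)
    torque = [0.0, 0.0, 0.0]
    for i, (t, q) in enumerate(zip(thrusts, drag_torques)):
        angle = math.radians(45.0 - 90.0 * i)
        px, py = arm_length * math.cos(angle), arm_length * math.sin(angle)
        # Moment r x F with r = (px, py, 0) and F = (0, 0, t).
        torque[0] += py * t
        torque[1] += -px * t
        torque[2] += spin_signs[i] * q
    return total, torque


# Hypothetical parameters: equal rotor speeds should give zero net body torque.
thrusts = [rotor_thrust(400.0, c_t=0.01, rho=1.225, area=0.05, radius=0.127) for _ in range(4)]
total, torque = total_thrust_and_torque(thrusts, [0.0] * 4, arm_length=0.21)
print(total, torque)
```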
For state derivatives, we reference the quadrotor state as defined in the problem formulation in Section III. The derivative of position is simply the velocity. The orientation derivative can be obtained from the body rotation rates with an additional coordinate transform in matrix form.
(5) 
Translational acceleration can be written with respect to the force and torque from equation 4 using Newton’s second law. Motor thrust is transformed from the body frame to the world frame, and additional accelerations due to gravity and translational drag are included. Lastly, rotational acceleration can be written from Euler’s equations of rotational motion, with a body-frame rotor torque and rotational drag.
(6) 
(7) 
To perform motion prediction, the equations of motion are discretized for all state variables used in motion prediction as per Section III. Parameters are either empirically measured or estimated using nonlinear system identification, as in [24]. Numerical forward integration is then performed using a real-valued variable-coefficient ODE (VODE) solver. The predicted state variables after a given interval are compared with those of learning-based methods trained on motion prediction for the same discrete time interval.
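The sequential nature of this forward integration can be illustrated with a short sketch. Note that it swaps in a fixed-step classical Runge-Kutta (RK4) integrator in place of the VODE solver, and uses a toy one-dimensional vertical model with hypothetical mass and drag values rather than the full equations of motion.

```python
from typing import Callable, List

State = List[float]


def rk4_step(f: Callable[[State, float], State], x: State, u: float, dt: float) -> State:
    """One classical Runge-Kutta step with the control held constant over dt."""
    def add(a: State, b: State, s: float) -> State:
        return [ai + s * bi for ai, bi in zip(a, b)]

    k1 = f(x, u)
    k2 = f(add(x, k1, dt / 2), u)
    k3 = f(add(x, k2, dt / 2), u)
    k4 = f(add(x, k3, dt), u)
    return [xi + dt / 6 * (a + 2 * b + 2 * c + d) for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def vertical_dynamics(x: State, thrust: float, mass: float = 1.0, drag: float = 0.1, g: float = 9.81) -> State:
    """Toy 1-D model: state is [height, vertical velocity]; values are hypothetical."""
    _, vz = x
    return [vz, thrust / mass - g - drag * vz / mass]


def rollout(x0: State, thrusts: List[float], dt: float = 0.01) -> List[State]:
    """Integrate forward one control sample at a time (100 Hz by default)."""
    states, x = [], list(x0)
    for u in thrusts:
        x = rk4_step(vertical_dynamics, x, u, dt)
        states.append(x)
    return states


# 90 steps at 100 Hz cover the 900 ms horizon; hover thrust keeps the state at rest.
trajectory = rollout([1.0, 0.0], [9.81] * 90)
print(trajectory[-1])
```

Each step depends on the previous one, so the rollout cannot be parallelized across the prediction horizon in the way a single convolutional forward pass can.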
IV-B Hybrid Models
Alternatively, we can use all or part of this physics-based model as a component in a hybrid architecture. We develop a series of hybrid models combining one or more fully convolutional TCN components, with design parameters similar to End2EndTCN and the same total number of parameters. Physics-based components generate forward predictions in a sequential manner by forward integrating some or all of the dynamic system equations outlined in Section IV-A. This results in three different hybrid configurations. MotorHybrid uses a TCN component to model the aircraft’s rotor dynamics, generating motor thrust predictions for a given control input. AccelErrorHybrid uses a TCN component to model an additive term to the physics-based state derivative estimates, thus modeling the dynamics that are not captured by the simplified physics-based model. Lastly, CombinedHybrid uses both TCN components of the models above.
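The additive structure described for AccelErrorHybrid can be sketched in a few lines. This is only a schematic: `toy_physics` and `toy_residual` are hypothetical stand-ins for the physics-based derivative and the trained TCN component, and the interface shown here is an assumption rather than the authors' design.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Model = Callable[[Sequence[float], Sequence[float]], Vector]


def hybrid_derivative(
    physics_model: Model,
    learned_residual: Model,
    state: Sequence[float],
    control: Sequence[float],
) -> Vector:
    """Physics-based state derivative plus an additive learned correction."""
    nominal = physics_model(state, control)
    correction = learned_residual(state, control)
    return [n + c for n, c in zip(nominal, correction)]


def toy_physics(state: Sequence[float], control: Sequence[float]) -> Vector:
    """Hypothetical 1-D model: vertical acceleration from mass-normalized thrust."""
    return [control[0] - 9.81]


def toy_residual(state: Sequence[float], control: Sequence[float]) -> Vector:
    """Stand-in for a trained TCN: here, an unmodeled linear drag term."""
    return [-0.1 * state[0]]


accel = hybrid_derivative(toy_physics, toy_residual, state=[2.0], control=[9.81])
print(accel)
```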
V Experimental Results
V-A Experimental Design
We validate this approach and characterize model performance with respect to its prediction accuracy on real quadrotor flights. We evaluate End2EndTCN and several alternative predictive models on the WAVE Laboratory AsTec Pelican Quadrotor Dataset [25], which utilized sensor fusion across inertial, GNSS, and vision-based systems to collect high-precision quadrotor state estimates. Data are interpolated to report full quadrotor states at a sample rate of 100 Hz. The dataset comprises a series of indoor quadrotor flights, bounded within a 5 x 5 x 5 m volume. These flights mostly include near-hover flight and pseudorandom rotations and curves in all axes, all within the nominal flight envelope of the AsTec Pelican quadrotor. In total, the dataset consists of 54 flights, with over 1,388,410 total samples of quadrotor telemetry data, 10% of which are used in the test set for this experiment.
V-B Comparative Study
To validate the performance of End2EndTCN, we compare its velocity and body rate prediction accuracy with that of alternative models. These include the current state-of-the-art result on this dataset, which was achieved by Mohajerin et al. [24] with a hybrid LSTM Recurrent Neural Network model applied to multistep quadrotor prediction. The model is also compared to a physics-based model and a series of hybrid models with both TCN and physics-based components, as outlined in Section IV.
We find that End2EndTCN outperforms the current state of the art and all alternative models across nearly the entire 90-step sample (corresponding to 900 ms). The most significant performance improvements are in rotation rates, where the fundamental kinematics rely on current and past quadrotor states. This may indicate that dilated convolutions are better suited for this type of long-term sequence modeling. We find that hybrid models perform significantly worse than the fully convolutional approach. This can mostly be attributed to the difficulty of integrating TCNs with numerically integrated dynamical system equations, which are sequential in nature. Hybrid models that have multiple TCN components, each with a fraction of the parameters of a single large End2EndTCN, likely suffer from a lack of expressive power.
Lastly, we see that most TCN-based models represent a 210x improvement in prediction accuracy when compared to the physics-based model over a longer time horizon, which indicates that these models learn generalizable unmodeled dynamics that have significant temporal effects. We find that TCN model errors typically plateau over time. While a constant acceleration error due to unmodeled disturbances may cause errors that grow quadratically over time, End2EndTCN optimizes for accuracy across the flight sample over longer time periods where transient effects may not be statistically relevant.
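The growth of integration errors under a constant acceleration error can be made explicit with a short worked example. The symbols are assumed for this illustration: $\delta a$ is a constant acceleration error, and $e_v$ and $e_p$ are the resulting velocity and position errors after time $t$, starting from zero initial error. The numerical value of $\delta a$ is hypothetical.

```latex
e_v(t) = \int_0^t \delta a \, d\tau = \delta a \, t,
\qquad
e_p(t) = \int_0^t e_v(\tau) \, d\tau = \tfrac{1}{2} \, \delta a \, t^2

% Example: \delta a = 0.1\ \mathrm{m/s^2},\ t = 0.9\ \mathrm{s}
e_v(0.9) = 0.09\ \mathrm{m/s}, \qquad e_p(0.9) = 0.0405\ \mathrm{m}
```

A sequentially integrated model therefore accumulates error without bound over the horizon, whereas a model trained directly on the full output sequence is penalized for error at every step.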
V-C Analyzing Flight Samples
While End2EndTCN makes extremely accurate predictions for a majority of samples, overall accuracy is limited by a long tail in the error distribution as depicted for body rotation rate error in figure 6.
These uncommon but large errors occur at the extremes of the quadrotor’s flight envelope. While using an L1-norm loss function reduces prediction error overall, it constrains the model to learn the simple hover point dynamics, which are more frequent in the training and evaluation datasets. As such, flight samples in more aggressive maneuvers yield predictions that diverge significantly from the ground truth, as in figure 6. We find that samples with errors in the 90th percentile have significantly higher rates of change of position and motor command (i.e., faster and sharper turns). We also find an increase in the variance of pitch and roll angles, indicating that these samples are taken farther from the hover point of the quadrotor.
It is hypothesized that this behavior is largely data-driven. The current dataset, composed of stable, indoor flight, has few samples in the extreme ranges of the quadrotor’s flight envelope. However, in comparison, hybrid models appear to be more robust to these outlier samples. These models have significantly worse mean errors over time but a smaller standard deviation, which indicates that building models with a prior on the system’s dynamics may be an effective way to address a lack of data in certain flight modes.
V-D Scaling Effects
One of the main potential benefits of a fully convolutional architecture for quadrotor predictive modeling is its computational efficiency and memory footprint. Thus, we investigate the impact of model size on its predictive modeling performance. Table II shows the validation set accuracy results of End2EndTCN when varying the number of depth layers. Forward pass frequency was calculated on a test set running on an NVIDIA GeForce RTX 2080 Ti Graphics Processing Unit (GPU).
Overall, we find that End2EndTCN retains a significant amount of its predictive ability as the size of the network decreases, particularly for translational velocity. On the other hand, we see significant reductions in body rotation rate prediction accuracy, likely due to the nonlinear nature of these dynamics and their higher sensitivity to disturbances. Similarly, we find that reducing the observation window does not significantly degrade the performance of End2EndTCN.
One hypothesis for this behavior is that the current model is fundamentally limited by the size of the dataset rather than the size of the model. As demonstrated in language models and other sequence learning tasks [18], performance improvements from increasing model size are fundamentally capped if the size of the dataset does not increase accordingly. There may be additional factors about time-correlated data that make it less susceptible to performance increases from model scale. This view of a data-centric approach for further model scaling is supported by error distributions and the sparsity of data in certain flight modes.
V-E Ablation Studies
A series of ablation studies is performed on End2EndTCN to validate the model’s detailed design. We first compare a series of alternative architectures. These include models with varying numbers of regularization layers (Batch Normalization and Dropout) and varying training loss functions (Euclidean, Manhattan, and Weighted Euclidean). The results of the study are summarized in table III for Batch Normalization (BN), Dropout (Drop), Shortened gradient path architecture (SG), Weighted L2-norm loss function (WL2), and L1-norm loss (L1). A crucial element of the design of End2EndTCN is the integration of future control inputs for the multistep prediction of nonautonomous dynamical systems. In our ablation study, we consider two methods to achieve this. In the baseline model, past quadrotor states, past control inputs, and future control inputs are concatenated into a single model input sequence. We compare this approach to an architecture where only past quadrotor states and control inputs are fed to the first layer, while future control inputs are fed to an intermediate layer in order to shorten their gradient path.
First, we see that the alternative architecture performs significantly worse with respect to body rate error when compared to the final model. While this architecture was hypothesized to increase performance by shortening the gradient path to the most important features, namely the last quadrotor state and the control inputs, we see that the number of layers between these features and the output is too small to properly capture the nonlinear rotation dynamics of the quadrotor. Furthermore, reducing or eliminating batch normalization in End2EndTCN decreases performance, as does adding dropout to the model. These results mirror similar conclusions in the literature [21]. We also find that the L1-norm loss function, which is more robust to outlier state errors, leads to better generalization to the test set than do L2 or weighted L2 loss functions.
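The difference in outlier sensitivity between the two loss families can be illustrated with a small numerical sketch. The error values are made up for this example and the function names are hypothetical; the point is only that a single large error dominates a squared-error loss far more than an absolute-error loss.

```python
from typing import Sequence


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error (Manhattan / L1-norm loss)."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def l2_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error (Euclidean / L2-norm loss)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


# Nine small body-rate errors and one large outlier (values are made up).
target = [0.0] * 10
pred = [0.1] * 9 + [5.0]
outlier_only = [0.0] * 9 + [5.0]

# Fraction of each loss that is contributed by the single outlier sample.
l1_share = l1_loss(outlier_only, target) / l1_loss(pred, target)
l2_share = l2_loss(outlier_only, target) / l2_loss(pred, target)
print(f"outlier share of L1 loss: {l1_share:.2f}")
print(f"outlier share of L2 loss: {l2_share:.2f}")
```

Because the squared loss is dominated almost entirely by the outlier, its gradients are driven by rare samples, while the absolute loss spreads its attention more evenly across the frequent near-hover samples.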
VI Conclusion
This paper presents a detailed study of the use of Temporal Convolutional Networks for quadrotor state modeling and motion prediction. While classical modeling techniques characterize such robotic systems using prior knowledge of the system’s nonautonomous dynamics, we formulate this as a sequence modeling problem by performing discrete multistep prediction. We segment quadrotor telemetry to train a fully convolutional neural network, End2EndTCN, in a semi-supervised fashion. End2EndTCN outperforms the previous state of the art by 55% and proves to be more effective than hybrid models and fully physics-based models. We demonstrate that End2EndTCN retains over 95% of its performance over shorter time intervals when the model is compressed by a factor of 3, and we further characterize model performance with an ablation study and an analysis of predicted flight samples.
This fully convolutional approach to quadrotor modeling is currently limited by the scale and distribution of training data, which is a bottleneck shared by many sequence-to-sequence models. Collecting data on aggressive quadrotor flight would reduce the model’s bias towards hover point dynamics and potentially reduce infrequent low-accuracy prediction samples. Further work is required to ascertain whether this method will generalize to outdoor environments with wind disturbances. Finally, End2EndTCN will be applied in model-based quadrotor control methods to further contextualize its accuracy and computational efficiency.
References
 [1] (2019) Accurate photovoltaic power forecasting models using deep LSTM-RNN. Neural Computing and Applications 31 (7), pp. 2727–2740. Cited by: §II.
 [2] (2012) Model predictive quadrotor control: attitude, altitude and position experimental studies. IET Control Theory & Applications 6 (12), pp. 1812–1827. Cited by: §I.
 [3] (2018) An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint arXiv:1803.01271. Cited by: §I, §II, §II, §IV.
 [4] (2016) Learning quadrotor dynamics using neural network for flight control. In 2016 IEEE 55th Conference on Decision and Control (CDC), pp. 4653–4660. Cited by: §II.
 [5] (2018) Performance prediction of multirotor vehicles using a higher order potential flow method. In 2018 AIAA aerospace sciences meeting, pp. 1528. Cited by: §II.
 [6] (2021) NeuroBEM: hybrid aerodynamic quadrotor model. arXiv preprint arXiv:2106.08015. Cited by: §II.
 [7] Filter-based online system-parameter estimation for multicopter UAVs. Cited by: §II.
 [8] (2017) Conditional time series forecasting with convolutional neural networks. arXiv preprint arXiv:1703.04691. Cited by: §II.
 [9] (2007) Full control of a quadrotor. In 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 153–158. Cited by: §II.
 [10] (2012) Learning-based model predictive control on a quadrotor: onboard implementation and experimental results. In 2012 IEEE International Conference on Robotics and Automation, pp. 279–284. Cited by: §I.
 [11] (2018) A framework for maximum likelihood parameter identification applied on mavs. Journal of Field Robotics 35 (1), pp. 5–22. Cited by: §II.
 [12] (2020) Probabilistic forecasting with temporal convolutional neural network. Neurocomputing 399, pp. 491–501. Cited by: §IV.
 [13] (2019) Knowledge-driven stock trend prediction and explanation via temporal convolutional network. In Companion Proceedings of The 2019 World Wide Web Conference, pp. 678–685. Cited by: §II.
 [14] (2018) Flatness-based model predictive control for quadrotor trajectory tracking. In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 6740–6745. Cited by: §I.
 [15] (2020) A survey of deep learning techniques for autonomous driving. Journal of Field Robotics 37 (3), pp. 362–386. Cited by: §I.
 [16] (2004) The stanford testbed of autonomous rotorcraft for multi agent control (STARMAC). In The 23rd Digital Avionics Systems Conference (IEEE Cat. No. 04CH37576), Vol. 2, pp. 12–E. Cited by: §II.
 [17] (2011) Precision flight control for a multi-vehicle quadrotor helicopter testbed. Control Engineering Practice 19 (9), pp. 1023–1036. Cited by: §I, §II.
 [18] (2020) Scaling laws for neural language models. arXiv preprint arXiv:2001.08361. Cited by: §VD.
 [19] (2020) Deep drone acrobatics. arXiv preprint arXiv:2006.05768. Cited by: §II.
 [20] (2015) Kinematic and dynamic vehicle models for autonomous driving control design. In 2015 IEEE Intelligent Vehicles Symposium (IV), pp. 1094–1099. Cited by: §I.

 [21] (2017) Temporal convolutional networks for action segmentation and detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 156–165. Cited by: §I, §II, §IV, §VE.
 [22] (2018) Dynamic modeling and vibration control for a nonlinear 3-dimensional flexible manipulator. International Journal of Robust and Nonlinear Control 28 (13), pp. 3927–3945. Cited by: §I.
 [23] (2012) Multirotor aerial vehicles: modeling, estimation, and control of quadrotor. IEEE Robotics and Automation magazine 19 (3), pp. 20–32. Cited by: §II, §IVA.
 [24] (2019) Multistep prediction of dynamic systems with recurrent neural networks. IEEE transactions on neural networks and learning systems 30 (11), pp. 3370–3383. Cited by: §I, §II, §IVA, §VB.
 [25] (2017) Modeling dynamic systems for multistep prediction with recurrent neural networks. Ph.D. Thesis, University of Waterloo. Cited by: §VA.
 [26] (2016) Wavenet: a generative model for raw audio. arXiv preprint arXiv:1609.03499. Cited by: §I, §II, §IV.

 [27] (2013) On the difficulty of training recurrent neural networks. In International Conference on Machine Learning, pp. 1310–1318. Cited by: §I, §II.
 [28] (2006) Modelling and control of a quadrotor robot. In Proceedings of the 2006 Australasian Conference on Robotics and Automation, pp. 1–10. Cited by: §II.
 [29] (2015) Deep learning helicopter dynamics models. In 2015 IEEE International Conference on Robotics and Automation (ICRA), pp. 3223–3230. Cited by: §II.
 [30] (2020) DeepAR: probabilistic forecasting with autoregressive recurrent networks. International Journal of Forecasting 36 (3), pp. 1181–1191. Cited by: §II.
 [31] (2017) Deep learning for household load forecasting—a novel pooling deep rnn. IEEE Transactions on Smart Grid 9 (5), pp. 5271–5280. Cited by: §II.
 [32] (2017) Attention is all you need. arXiv preprint arXiv:1706.03762. Cited by: §I, §II.
 [33] (2017) DeepVO: towards end-to-end visual odometry with deep recurrent convolutional neural networks. In 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 2043–2050. Cited by: §II.
 [34] (2014) ANN model to predict stock prices at stock exchange markets. arXiv preprint arXiv:1502.06434. Cited by: §II.
 [35] (2019) Online estimation of geometric and inertia parameters for multirotor aerial vehicles. In 2019 International Conference on Robotics and Automation (ICRA), pp. 1884–1890. Cited by: §II.
 [36] (2020) Temporal convolutional networks for the advance prediction of enso. Scientific reports 10 (1), pp. 1–15. Cited by: §II.
 [37] (2016) Architectural complexity measures of recurrent neural networks. arXiv preprint arXiv:1602.08210. Cited by: §II.

 [38] (2017) Time-weighted LSTM model with redefined labeling for stock trend prediction. In 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI), pp. 1210–1217. Cited by: §II.