1 Introduction
Time series forecasting has numerous industrial and scientific applications in logistics, predictive maintenance, energy, manufacturing, agriculture, healthcare, sales, climate science, and many other domains. While classical methods such as ARIMA and Exponential Smoothing often give good results (Hyndman and Athanasopoulos, 2018), machine learning is becoming an increasingly attractive option for improving the models' representation power and scaling to larger datasets and higher dimensionalities. In fact, it has recently been shown that purely ML-based approaches built on generic deep learning architectures can beat classical methods on a variety of tasks (Oreshkin et al., 2019; Flunkert et al., 2017).
Perhaps more important than sheer accuracy, the arrival of modern machine learning opens the opportunity to re-think the forecasting practices and software. For example, classical methods typically require training one model per time series, whereas ML models usually work best when trained on datasets containing large numbers of time series. This and other paradigm changes – such as better support for high-dimensional data, iterative training based on stochastic gradient descent, or the possibility to tailor loss functions for specific needs – require new tools and appropriate APIs. In particular, user-friendly and powerful APIs are important to make ML approaches as easy to use as classical techniques, which is necessary for larger-scale adoption by practitioners.
Several strong time series forecasting toolkits exist; however, they focus on classical methods or do not support deep learning and training models on multiple series (Hyndman and Khandakar, 2008; Jiang et al., 2021; Löning et al., 2019; Hosseini et al., 2021; Bhatnagar et al., 2021), or they have lower-level APIs (Alexandrov et al., 2020; Beitner et al.). Darts offers a new, relatively high-level API that unifies classical and ML-based forecasting models.
2 Design Principles and Main Features of Darts
2.1 Time Series Representation
Darts has its own TimeSeries data container type, which represents one time series. TimeSeries are immutable and provide guarantees that the data represents a well-formed time series with correct shape, type, and sorted time index. TimeSeries can be indexed either with Pandas DatetimeIndex or Int64Index (Wes McKinney, 2010). TimeSeries wrap around three-dimensional xarray DataArrays (Hoyer and Hamman, 2017), whose dimensions are (time, component, sample), where component represents the dimensions of multivariate series and sample represents samples of stochastic time series. The TimeSeries class provides several methods to convert to and from other common types, such as Pandas DataFrames or NumPy arrays (Harris et al., 2020). It can also perform convenient operations, such as math operations, indexing, splitting, time-differencing, interpolating, mapping functions, embedding timestamps, plotting, or computing marginal quantiles. To preserve immutability, TimeSeries carry their own copy of the data, and they rely heavily on NumPy views to access this data efficiently without copying (e.g., when training models). The main advantage of using a dedicated type offering such guarantees is that all models in Darts can consume and produce TimeSeries, which in turn helps to offer a consistent API. For instance, it is easy to have models consume the outputs of other models.
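As a brief sketch of this container (the DataFrame contents and column names are invented for illustration):

```python
import numpy as np
import pandas as pd
from darts import TimeSeries

# A toy DataFrame with a time column and two components.
df = pd.DataFrame({
    "time": pd.date_range("2020-01-01", periods=100, freq="D"),
    "demand": np.random.rand(100),
    "temperature": np.random.rand(100),
})

# Construction validates shape, dtype, and the sorted time index.
series = TimeSeries.from_dataframe(df, time_col="time",
                                   value_cols=["demand", "temperature"])

# Operations return new TimeSeries (immutability is preserved).
train, val = series.split_after(0.75)  # split at 75% of the time axis
diffed = series.diff()                 # time-differencing
arr = series.values()                  # NumPy array of the underlying values
```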
2.2 Unified High-Level Forecasting API
All models in Darts support the same basic fit(series: TimeSeries) -> None and predict(n: int) -> TimeSeries interface to be trained on a single series and forecast n time steps after the end of that series. In addition, most models also provide richer functionality; for instance, the ability to be trained on a Sequence of TimeSeries (using calls like fit([series1, series2, ...])). Models can have different internal mechanics (e.g., sequence-to-sequence, fixed-length, recurrent, auto-regressive), and this unified API makes it possible to seamlessly compare, backtest, and ensemble diverse models without having to know their inner workings.
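A minimal sketch of this interface on toy data (the model choices and hyper-parameter values are illustrative):

```python
import numpy as np
import pandas as pd
from darts import TimeSeries
from darts.models import ExponentialSmoothing, NBEATSModel

# Two toy monthly series with a yearly seasonality.
times = pd.date_range("2000-01-01", periods=120, freq="M")
vals = np.sin(2 * np.pi * np.arange(120) / 12)
series1 = TimeSeries.from_times_and_values(times, vals + np.random.normal(0, 0.1, 120))
series2 = TimeSeries.from_times_and_values(times, -vals + np.random.normal(0, 0.1, 120))

# All models share the same basic interface:
model = ExponentialSmoothing()
model.fit(series1)
forecast = model.predict(n=12)  # 12 steps after the end of series1

# Most ML-based models can additionally be trained on several series at once:
nbeats = NBEATSModel(input_chunk_length=24, output_chunk_length=12, n_epochs=5)
nbeats.fit([series1, series2])
forecast_nb = nbeats.predict(n=12, series=series1)
```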
Some of the models implemented in Darts at the time of writing are: (V)ARIMA, Exponential Smoothing, AutoARIMA (Smith et al., 2017), Theta (Assimakopoulos and Nikolopoulos, 2000), Prophet (Taylor and Letham, 2018), FFT-based forecasting, RNN models similar to DeepAR (Flunkert et al., 2017), N-BEATS (Oreshkin et al., 2019), TCN (Bai et al., 2018) and generic regression models that can wrap around any external tabular regression model (such as scikit-learn models). The list is constantly expanding and we welcome external and reference implementations of new models.
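For example, wrapping a scikit-learn regressor (Pedregosa et al., 2011) with the generic RegressionModel might look as follows (the lags value is illustrative):

```python
from sklearn.ensemble import RandomForestRegressor
from darts.datasets import AirPassengersDataset
from darts.models import RegressionModel

series = AirPassengersDataset().load()

# Use the last 12 target values as features of an arbitrary sklearn regressor.
model = RegressionModel(lags=12, model=RandomForestRegressor())
model.fit(series)
pred = model.predict(n=12)
```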
2.3 Meta-Learning
An important part of Darts is its support for meta-learning, i.e., the ability to train one model on a potentially large number of separate time series (Oreshkin et al., 2021). The darts.utils.data module contains various implementations of time series datasets, which specify how to slice series (and potential covariates) into training samples. Darts selects a model-specific default slicing logic, but it can also be user-defined in a custom way if needed. All neural networks are implemented using PyTorch (Paszke et al., 2019) and support GPU training and inference. Large datasets that do not fit in memory can be consumed by relying on custom Sequence implementations that load the data lazily.
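As a hedged sketch of this mechanism (PastCovariatesSequentialDataset is the sequential slicing implementation exposed by darts.utils.data at the time of writing, and fit_from_dataset() the corresponding training entry point; exact names may vary across versions, and all hyper-parameter values are illustrative):

```python
import numpy as np
from darts import TimeSeries
from darts.models import TCNModel
from darts.utils.data import PastCovariatesSequentialDataset

series1 = TimeSeries.from_values(np.random.rand(200))
series2 = TimeSeries.from_values(np.random.rand(300))

# Slide (input, output) windows over both series to build training samples;
# covariates could be passed alongside the targets.
train_ds = PastCovariatesSequentialDataset(
    [series1, series2],
    input_chunk_length=24,
    output_chunk_length=12,
)

# Torch-based models can be trained directly from such a dataset:
model = TCNModel(input_chunk_length=24, output_chunk_length=12, n_epochs=5)
model.fit_from_dataset(train_ds)
```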
2.4 Support for Past and Future Covariates
Several models support covariate series as a way to specify external data that is potentially helpful for forecasting the target series. Darts differentiates future covariates, which are known into the future (such as weather forecasts), from past covariates, which are known only into the past. The models accept past_covariates and/or future_covariates arguments, which makes it clear whether future values are required at inference time and reduces the risk of mistakes. Covariate series need not be aligned with the targets, as the alignment is done by the slicing logic based on the respective time axes.
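A minimal sketch, using a calendar attribute as a past covariate (datetime_attribute_timeseries from darts.utils.timeseries_generation builds such encodings; chunk lengths and epochs are illustrative):

```python
from darts.datasets import AirPassengersDataset
from darts.models import TCNModel
from darts.utils.timeseries_generation import datetime_attribute_timeseries

series = AirPassengersDataset().load()

# The month number, on the same time axis as the target.
covariates = datetime_attribute_timeseries(series, attribute="month")

model = TCNModel(input_chunk_length=24, output_chunk_length=12, n_epochs=5)
model.fit(series, past_covariates=covariates)
pred = model.predict(n=12, past_covariates=covariates)
```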
2.5 Probabilistic Forecasting
Some models (and nearly all deep learning models) in Darts support probabilistic forecasting. The joint distributions over components and time are represented by storing Monte Carlo samples directly in the TimeSeries objects. This representation is very flexible, as it does not rely on any parametric form and can capture arbitrary joint distributions. The computational cost of sampling is usually negligible, as samples can be computed efficiently in a vectorized way using batching. Probabilistic deep learning models can fit arbitrary likelihood forms, as long as the negative log-likelihood loss is differentiable. At the time of writing, Darts provides 16 distributions out-of-the-box (both continuous and discrete, univariate and multivariate). Finally, it offers a way to specify time-independent priors on the distributions' parameters, expressing prior beliefs about the output distributions.
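As an illustration, here is a sketch of a probabilistic TCN with a Laplace output distribution and a prior on its scale parameter (LaplaceLikelihood and its prior_b argument follow darts.utils.likelihood_models at the time of writing; all values are illustrative):

```python
from darts.datasets import AirPassengersDataset
from darts.dataprocessing.transformers import Scaler
from darts.models import TCNModel
from darts.utils.likelihood_models import LaplaceLikelihood

series = Scaler().fit_transform(AirPassengersDataset().load())

# The network outputs Laplace parameters at each step; prior_b expresses
# a time-independent prior belief on the distribution's scale.
model = TCNModel(
    input_chunk_length=24,
    output_chunk_length=12,
    likelihood=LaplaceLikelihood(prior_b=0.1),
    n_epochs=5,
)
model.fit(series)

# 500 Monte Carlo sample paths, stored directly in the returned TimeSeries:
pred = model.predict(n=12, num_samples=500)
median = pred.quantile_timeseries(0.5)  # a marginal quantile, as a deterministic series
```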
2.6 Other Features
Darts contains many additional features, such as transformers and pipelines for data pre-processing, backtesting (all models offer a backtest() method; see the sketch after this paragraph), grid-search for model selection, extensive metrics, a dynamic time warping module (Berndt and Clifford, 1994), and ensemble models (with the possibility of using a regression model to learn the ensemble itself). Darts also contains filtering models such as Kalman filters and Gaussian Processes, which offer probabilistic modelling of time series. Finally, the darts.datasets module contains a variety of publicly available datasets that can be conveniently loaded as TimeSeries.
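For instance, a minimal backtesting sketch (the start point, horizon, and metric are illustrative choices):

```python
from darts.datasets import AirPassengersDataset
from darts.metrics import mape
from darts.models import ExponentialSmoothing

series = AirPassengersDataset().load()

# Simulate historical forecasting: starting at 60% of the series,
# repeatedly (re)train and forecast 12 steps ahead, then average the MAPE.
model = ExponentialSmoothing()
err = model.backtest(series, start=0.6, forecast_horizon=12, metric=mape)
```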
3 Usage Example
The code below shows how to fit a single TCN model (Bai et al., 2018) with default hyper-parameters on two different (and completely distinct) series, and how to forecast one of them. The network outputs the parameters of a Laplace distribution. The code contains a complete predictive pipeline, from loading and pre-processing the data to plotting the forecast with arbitrary quantiles.
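A minimal sketch consistent with this description (dataset loaders come from darts.datasets and the likelihood from darts.utils.likelihood_models; hyper-parameters other than the required input/output chunk lengths are left at their defaults, and the chunk-length values are illustrative):

```python
from darts.datasets import AirPassengersDataset, MonthlyMilkDataset
from darts.dataprocessing.transformers import Scaler
from darts.models import TCNModel
from darts.utils.likelihood_models import LaplaceLikelihood

# Load and normalize two completely distinct monthly series.
air = AirPassengersDataset().load()
milk = MonthlyMilkDataset().load()
scaler_air, scaler_milk = Scaler(), Scaler()
air = scaler_air.fit_transform(air)
milk = scaler_milk.fit_transform(milk)

# One TCN, trained jointly on both series, emitting Laplace parameters.
model = TCNModel(
    input_chunk_length=24,
    output_chunk_length=12,
    likelihood=LaplaceLikelihood(),
)
model.fit([air, milk])

# Probabilistic forecast of one of the series, plotted with arbitrary quantiles.
pred = model.predict(series=air, n=36, num_samples=500)
air.plot(label="actual")
pred.plot(label="forecast", low_quantile=0.05, high_quantile=0.95)
```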

4 Conclusions
Darts is an attempt at democratizing modern machine learning forecasting approaches and unifying them (along with classical approaches) under a common, user-friendly API. The library is still under active development; future work includes extending the API to anomaly detection and time series classification models, supporting irregularly spaced data (e.g., point processes), and providing a collection of models pre-trained on large datasets, similar to what exists in the computer vision and NLP domains.
References
- Alexandrov et al. (2020) Alexander Alexandrov, Konstantinos Benidis, Michael Bohlke-Schneider, Valentin Flunkert, Jan Gasthaus, Tim Januschowski, Danielle C. Maddix, Syama Rangapuram, David Salinas, Jasper Schulz, Lorenzo Stella, Ali Caner Türkmen, and Yuyang Wang. Gluonts: Probabilistic and neural time series modeling in python. Journal of Machine Learning Research, 21(116):1–6, 2020. URL http://jmlr.org/papers/v21/19-820.html.
- Assimakopoulos and Nikolopoulos (2000) V. Assimakopoulos and K. Nikolopoulos. The theta model: a decomposition approach to forecasting. International Journal of Forecasting, 16(4):521–530, 2000. ISSN 0169-2070. doi: https://doi.org/10.1016/S0169-2070(00)00066-2. URL https://www.sciencedirect.com/science/article/pii/S0169207000000662. The M3-Competition.
- Bai et al. (2018) Shaojie Bai, J. Zico Kolter, and Vladlen Koltun. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. CoRR, abs/1803.01271, 2018. URL http://arxiv.org/abs/1803.01271.
- Beitner et al. PyTorch forecasting. https://github.com/jdb78/pytorch-forecasting. [Online; accessed 27-September-2021].
- Berndt and Clifford (1994) Donald J Berndt and James Clifford. Using dynamic time warping to find patterns in time series. In KDD workshop, volume 10, pages 359–370. Seattle, WA, USA, 1994.
- Bhatnagar et al. (2021) Aadyot Bhatnagar, Paul Kassianik, Chenghao Liu, Tian Lan, Wenzhuo Yang, Rowan Cassius, Doyen Sahoo, Devansh Arpit, Sri Subramanian, Gerald Woo, Amrita Saha, Arun Kumar Jagota, Gokulakrishnan Gopalakrishnan, Manpreet Singh, K C Krithika, Sukumar Maddineni, Daeki Cho, Bo Zong, Yingbo Zhou, Caiming Xiong, Silvio Savarese, Steven Hoi, and Huan Wang. Merlion: A machine learning library for time series, 2021.
- Flunkert et al. (2017) Valentin Flunkert, David Salinas, and Jan Gasthaus. Deepar: Probabilistic forecasting with autoregressive recurrent networks. CoRR, abs/1704.04110, 2017. URL http://arxiv.org/abs/1704.04110.
- Harris et al. (2020) Charles R. Harris, K. Jarrod Millman, Stéfan J. van der Walt, Ralf Gommers, Pauli Virtanen, David Cournapeau, Eric Wieser, Julian Taylor, Sebastian Berg, Nathaniel J. Smith, Robert Kern, Matti Picus, Stephan Hoyer, Marten H. van Kerkwijk, Matthew Brett, Allan Haldane, Jaime Fernández del Río, Mark Wiebe, Pearu Peterson, Pierre Gérard-Marchant, Kevin Sheppard, Tyler Reddy, Warren Weckesser, Hameer Abbasi, Christoph Gohlke, and Travis E. Oliphant. Array programming with NumPy. Nature, 585(7825):357–362, September 2020. doi: 10.1038/s41586-020-2649-2. URL https://doi.org/10.1038/s41586-020-2649-2.
- Hosseini et al. (2021) Hosseini et al. Linkedin greykite. https://github.com/linkedin/greykite, 2021. [Online; accessed 27-September-2021].
- Hoyer and Hamman (2017) S. Hoyer and J. Hamman. xarray: N-D labeled arrays and datasets in Python. Journal of Open Research Software, 5(1), 2017. doi: 10.5334/jors.148. URL http://doi.org/10.5334/jors.148.
- Hyndman and Khandakar (2008) Rob J Hyndman and Yeasmin Khandakar. Automatic time series forecasting: the forecast package for R. Journal of Statistical Software, 27(3):1–22, 2008. URL https://www.jstatsoft.org/article/view/v027i03.
- Hyndman and Athanasopoulos (2018) Robin John Hyndman and George Athanasopoulos. Forecasting: Principles and Practice. OTexts, Australia, 2nd edition, 2018.
- Jiang et al. (2021) Jiang et al. Facebook kats. https://github.com/facebookresearch/Kats, 2021. [Online; accessed 27-September-2021].
- Löning et al. (2019) Markus Löning, Anthony J. Bagnall, Sajaysurya Ganesh, Viktor Kazakov, Jason Lines, and Franz J. Király. sktime: A unified interface for machine learning with time series. CoRR, abs/1909.07872, 2019. URL http://arxiv.org/abs/1909.07872.
- Oreshkin et al. (2019) Boris N. Oreshkin, Dmitri Carpov, Nicolas Chapados, and Yoshua Bengio. N-BEATS: neural basis expansion analysis for interpretable time series forecasting. CoRR, abs/1905.10437, 2019. URL http://arxiv.org/abs/1905.10437.
- Oreshkin et al. (2021) Boris N. Oreshkin, Dmitri Carpov, Nicolas Chapados, and Yoshua Bengio. Meta-learning framework with applications to zero-shot time-series forecasting. Proceedings of the AAAI Conference on Artificial Intelligence, 35(10):9242–9250, May 2021. URL https://ojs.aaai.org/index.php/AAAI/article/view/17115.
- Paszke et al. (2019) Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. Pytorch: An imperative style, high-performance deep learning library. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems 32, pages 8024–8035. Curran Associates, Inc., 2019. URL http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf.
- Pedregosa et al. (2011) F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
- Smith et al. (2017) Taylor G. Smith et al. pmdarima: Arima estimators for Python, 2017. URL http://www.alkaline-ml.com/pmdarima. [Online; accessed 27-September-2021].
- Taylor and Letham (2018) Sean J Taylor and Benjamin Letham. Forecasting at scale. The American Statistician, 72(1):37–45, 2018.
- Wes McKinney (2010) Wes McKinney. Data Structures for Statistical Computing in Python. In Stéfan van der Walt and Jarrod Millman, editors, Proceedings of the 9th Python in Science Conference, pages 56–61, 2010. doi: 10.25080/Majora-92bf1922-00a.