LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data

by Ali Eshragh, et al.

We apply methods from randomized numerical linear algebra (RandNLA) to develop improved algorithms for the analysis of large-scale time series data. We first develop a new fast algorithm to estimate the leverage scores of an autoregressive (AR) model in big data regimes. We show that, with high probability, the approximations are accurate to within (1+O(ε)) of the true leverage scores. We then exploit these theoretical results to develop an efficient algorithm, called LSAR, for fitting an appropriate AR model to big time series data. The proposed algorithm is guaranteed, with high probability, to find the maximum likelihood estimates of the parameters of the underlying true AR model, and its worst-case running time significantly improves on those of the state-of-the-art alternatives in big data regimes. Empirical results on large-scale synthetic and real data strongly support the theoretical results and demonstrate the efficacy of this new approach. To the best of our knowledge, this paper is the first attempt to establish a nexus between RandNLA and big time series data analysis.
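To make the idea of leverage score sampling for AR fitting concrete, the following is a minimal sketch, not the LSAR algorithm itself. It assumes that an AR(p) model is fitted by ordinary least squares on a lagged design matrix, and it computes exact leverage scores from a thin QR factorization, whereas the abstract describes a fast approximation of those scores. The function names (ar_design, leverage_scores, sampled_ar_fit, simulate_ar) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def ar_design(y, p):
    """Build the lagged design matrix X and response z for an AR(p) least-squares fit."""
    n = len(y)
    # Row for time t holds (y[t-1], ..., y[t-p]), for t = p, ..., n-1.
    X = np.column_stack([y[p - k : n - k] for k in range(1, p + 1)])
    return X, y[p:]


def leverage_scores(X):
    """Exact leverage scores: squared row norms of Q from a thin QR of X."""
    Q, _ = np.linalg.qr(X)  # default mode is 'reduced'
    return np.sum(Q**2, axis=1)


def sampled_ar_fit(y, p, s, rng):
    """Fit AR(p) on s rows sampled with probability proportional to leverage."""
    X, z = ar_design(y, p)
    lev = leverage_scores(X)
    probs = lev / lev.sum()
    idx = rng.choice(len(z), size=s, replace=True, p=probs)
    # Rescale sampled rows so the subsampled problem is unbiased for X^T X.
    w = 1.0 / np.sqrt(s * probs[idx])
    coef, *_ = np.linalg.lstsq(X[idx] * w[:, None], z[idx] * w, rcond=None)
    return coef


def simulate_ar(phi, n, rng):
    """Simulate n observations of an AR(len(phi)) process with unit Gaussian noise."""
    p = len(phi)
    y = np.zeros(n + p)
    noise = rng.standard_normal(n + p)
    for t in range(p, n + p):
        # y[t-p:t][::-1] is (y[t-1], ..., y[t-p]), matching the order of phi.
        y[t] = np.dot(phi, y[t - p : t][::-1]) + noise[t]
    return y[p:]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = simulate_ar(np.array([0.5, -0.3]), 20000, rng)  # assumed toy parameters
    X, z = ar_design(y, 2)
    full, *_ = np.linalg.lstsq(X, z, rcond=None)
    print("full-data fit:", full)
    print("sampled fit  :", sampled_ar_fit(y, 2, 2000, rng))
```

In this sketch the sampled fit solves a least-squares problem on a small fraction of the rows, and the rescaling weights keep the subsampled normal equations unbiased. Computing exact leverage scores costs about as much as solving the full problem, which is why a fast approximation of the scores matters in big data regimes.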



