LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data

11/27/2019
by Ali Eshragh, et al.

We apply methods from randomized numerical linear algebra (RandNLA) to develop improved algorithms for the analysis of large-scale time series data. We first develop a new fast algorithm to estimate the leverage scores of an autoregressive (AR) model in big data regimes. We show that, with high probability, the approximations lie within a factor of (1+O(ε)) of the true leverage scores. These theoretical results are then used to develop an efficient algorithm, called LSAR, for fitting an appropriate AR model to big time series data. The proposed algorithm is guaranteed, with high probability, to find the maximum likelihood estimates of the parameters of the underlying true AR model, and its worst-case running time significantly improves on that of the state-of-the-art alternatives in big data regimes. Empirical results on large-scale synthetic and real data strongly support the theoretical results and demonstrate the efficacy of this new approach. To the best of our knowledge, this paper is the first attempt to establish a nexus between RandNLA and big time series data analysis.
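To make the idea of leverage score sampling for an AR fit concrete, here is a minimal sketch in Python with NumPy. It is not the LSAR algorithm from the paper: it computes the leverage scores exactly through a thin QR factorization, whereas the paper's contribution is a fast approximation of those scores. All function names (`simulate_ar`, `ar_design`, `exact_leverage_scores`, `sampled_ar_fit`) and the AR(2) coefficients are assumptions chosen for illustration.

```python
import numpy as np


def simulate_ar(phi, n, rng):
    """Simulate n observations of an AR(p) process with unit-variance Gaussian noise."""
    p = len(phi)
    x = np.zeros(n + p)
    eps = rng.standard_normal(n + p)
    for t in range(p, n + p):
        # x[t-p:t][::-1] is (x[t-1], ..., x[t-p]), matching the order of phi.
        x[t] = np.dot(phi, x[t - p:t][::-1]) + eps[t]
    return x[p:]


def ar_design(x, p):
    """Build the lagged design matrix X and response y for a least-squares AR(p) fit."""
    n = len(x)
    # Row i holds (x[t-1], ..., x[t-p]) for t = p + i.
    X = np.column_stack([x[p - k:n - k] for k in range(1, p + 1)])
    y = x[p:]
    return X, y


def exact_leverage_scores(X):
    """Leverage score of row i = squared norm of row i of Q, where X = QR (thin QR)."""
    Q, _ = np.linalg.qr(X)
    return np.sum(Q**2, axis=1)


def sampled_ar_fit(x, p, s, rng):
    """Fit AR(p) on s rows sampled with probability proportional to leverage score."""
    X, y = ar_design(x, p)
    lev = exact_leverage_scores(X)
    probs = lev / lev.sum()
    idx = rng.choice(len(y), size=s, replace=True, p=probs)
    # Rescale sampled rows so the subsampled problem is unbiased for the full one.
    w = 1.0 / np.sqrt(s * probs[idx])
    coef, *_ = np.linalg.lstsq(X[idx] * w[:, None], y[idx] * w, rcond=None)
    return coef


rng = np.random.default_rng(0)
series = simulate_ar([0.5, -0.3], 20000, rng)  # hypothetical stationary AR(2)
X_full, y_full = ar_design(series, 2)
full_coef, *_ = np.linalg.lstsq(X_full, y_full, rcond=None)
sampled_coef = sampled_ar_fit(series, 2, 2000, rng)
print("full least squares:", full_coef)
print("leverage-sampled  :", sampled_coef)
```

In this sketch the sampled fit solves a least-squares problem with 2,000 rows instead of roughly 20,000. Computing the exact scores already costs as much as the full fit, which is why a fast approximation of the leverage scores, as the abstract describes, is what would make the approach worthwhile at scale.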

research
03/16/2021

Rollage: Efficient Rolling Average Algorithm to Estimate ARMA Models for Big Time Series Data

We develop a new method to estimate an ARMA model in the presence of big...
research
12/24/2021

Toeplitz Least Squares Problems, Fast Algorithms and Big Data

In time series analysis, when fitting an autoregressive model, one must ...
research
09/19/2018

Parameter Estimation of Heavy-Tailed AR Model with Missing Data via Stochastic EM

The autoregressive (AR) model is a widely used model to understand time ...
research
02/23/2015

Iteratively reweighted adaptive lasso for conditional heteroscedastic time series with applications to AR-ARCH type processes

Shrinkage algorithms are of great importance in almost every area of sta...
research
09/21/2022

Designing PIDs for Reproducible Science Using Time-Series Data

As part of the investigation done by the IEEE Standards Association P295...
research
05/30/2018

Efficient Sequential and Parallel Algorithms for Estimating Higher Order Spectra

Polyspectral estimation is a problem of great importance in the analysis...
research
05/27/2019

Counting Causal Paths in Big Times Series Data on Networks

Graph or network representations are an important foundation for data mi...