Optimal Sampling Designs for Multi-dimensional Streaming Time Series with Application to Power Grid Sensor Data

03/14/2023
by   Rui Xie, et al.

Internet of Things (IoT) systems generate massive, high-speed, temporally correlated streaming data and are often tied to online inference tasks under computational or energy constraints. Online analysis of these streaming time series data often faces a trade-off between statistical efficiency and computational cost. One important approach to balancing this trade-off is sampling, where only a small portion of the data is selected for model fitting and updating. Motivated by the demands of dynamic relationship analysis in IoT systems, we study the data-dependent sample selection and online inference problem for a multi-dimensional streaming time series, aiming to provide low-cost real-time analysis of high-speed power grid electricity consumption data. Inspired by the D-optimality criterion in the design of experiments, we propose a class of online data reduction methods that achieve an optimal sampling criterion and improve the computational efficiency of the online analysis. We show that the optimal solution amounts to a strategy that is a mixture of Bernoulli sampling and leverage score sampling. The leverage score sampling involves auxiliary estimations that have a computational advantage over recursive least squares updates. Theoretical properties of these auxiliary estimations are also discussed. When applied to European power grid consumption data, the proposed leverage-score-based sampling methods outperform the benchmark sampling method in online estimation and prediction. The general applicability of the sampling-assisted online estimation method is assessed via simulation studies.
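To make the idea of mixing Bernoulli sampling with leverage score sampling more concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the abstract does not give the exact form of the optimal probabilities or of the auxiliary estimations, so the formula below is an assumption chosen for illustration. The names `leverage_score`, `mixture_probability`, `sample_stream`, and the parameters `rate` and `alpha`, are hypothetical. The inverse Gram matrix `gram_inv` is taken as given (for example, from a pilot batch), and each `x` stands for a regressor vector such as lagged observations.

```python
import random


def leverage_score(x, gram_inv):
    """Quadratic form x^T G^{-1} x for a vector x and a d-by-d matrix G^{-1}."""
    d = len(x)
    return sum(x[i] * gram_inv[i][j] * x[j] for i in range(d) for j in range(d))


def mixture_probability(score, mean_score, rate, alpha):
    """Assumed mixture: alpha * uniform rate + (1 - alpha) * leverage-proportional rate.

    The result is capped at 1 so it is always a valid probability.
    """
    if mean_score > 0:
        leverage_part = rate * score / mean_score
    else:
        leverage_part = rate  # fall back to uniform when no scale is available
    return min(1.0, alpha * rate + (1.0 - alpha) * leverage_part)


def sample_stream(stream, gram_inv, rate=0.1, alpha=0.5, seed=0):
    """Decide online, one observation at a time, whether to keep each point.

    Returns (index, x, weight) triples, where weight = 1 / p is the usual
    inverse-probability weight for a downstream weighted estimator.
    """
    rng = random.Random(seed)
    kept = []
    total_score, count = 0.0, 0
    for t, x in enumerate(stream):
        h = leverage_score(x, gram_inv)
        total_score += h
        count += 1
        # Running mean of the scores normalizes the leverage-proportional part.
        p = mixture_probability(h, total_score / count, rate, alpha)
        if rng.random() < p:
            kept.append((t, x, 1.0 / p))
    return kept
```

With `alpha = 1` the rule reduces to plain Bernoulli sampling at the given `rate`; with `alpha = 0` points are kept in proportion to their leverage relative to the running mean, so unusual observations are retained more often.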


