Protecting Time Series Data with Minimal Forecast Loss

06/30/2021
by Matthew J. Schneider, et al.

Anonymization requirements in data protection legislation can degrade forecasts. To measure the potential severity of this problem, we derive theoretical bounds on the loss to forecasts from additive exponential smoothing models that are fit to protected data. Following the anonymization guidelines of the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), we develop the k-nearest Time Series (k-nTS) Swapping and k-means Time Series (k-mTS) Shuffling methods, which create protected time series data that minimizes the loss to forecasts while preventing a data intruder from detecting privacy issues. For efficient and effective decision making, we formally model the simultaneous data swapping within each cluster as an integer programming problem for a perfect matching. We call this a two-party data privacy framework because our optimization model includes the utilities of both a data provider and a data intruder. We apply our data protection methods to thousands of time series and find that they maintain the forecasts and patterns (level, trend, and seasonality) of the time series well compared to the standard data protection methods suggested in legislation. Substantively, our paper addresses the challenge of protecting time series data that is used for forecasting. Our findings suggest the managerial importance of incorporating the concerns of forecasters into the data protection itself.
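To make the notion of "loss to forecasts" concrete, the following is a minimal sketch and not the paper's method. It assumes simple exponential smoothing with a hypothetical smoothing parameter `alpha`, a synthetic series, and additive Gaussian noise as a stand-in for a generic protection method. The names `ses_forecast`, `add_noise`, and `forecast_loss` are illustrative only. k-nTS Swapping and k-mTS Shuffling are not implemented here.

```python
import random


def ses_forecast(series, alpha=0.3):
    """One-step-ahead forecast from simple (additive) exponential smoothing.

    alpha is an assumed smoothing parameter, not a value from the paper.
    """
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def add_noise(series, scale, rng):
    """Stand-in protection method: add zero-mean Gaussian noise to each value."""
    return [y + rng.gauss(0.0, scale) for y in series]


# Synthetic series (assumption): a linear trend plus small random variation.
rng = random.Random(0)
original = [100 + 0.5 * t + rng.gauss(0.0, 2.0) for t in range(60)]
protected = add_noise(original, scale=5.0, rng=rng)

# Forecast loss here is the absolute gap between the two one-step forecasts.
f_orig = ses_forecast(original)
f_prot = ses_forecast(protected)
forecast_loss = abs(f_orig - f_prot)
print(f"original: {f_orig:.2f}  protected: {f_prot:.2f}  loss: {forecast_loss:.2f}")
```

In a sketch like this, raising `scale` strengthens the protection but tends to widen the gap between the two forecasts, which is the trade-off the abstract describes.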
