Model selection for count timeseries with applications in forecasting number of trips in bike-sharing systems and its volatility
Forecasting the number of trips in bike-sharing systems, and the volatility of that number over time, is crucial for planning and optimizing such systems. This paper develops timeseries models to forecast hourly count data and to estimate its volatility. Such models must account for complex patterns at several temporal scales (hourly, daily, weekly and annual) as well as for temporal correlation. Capturing this structure requires a large number of parameters. A structural model selection approach is used to choose them: at each step, the method explores the parameter space for one group of covariates, where each group is constructed to represent a particular structure in the model. The statistical models are extensions of Generalized Linear Models to timeseries data. One challenge in using such models is the explosive behavior of the simulated values. To address this issue, we develop a technique that damps a simulated value if it falls outside an admissible interval. The admissible interval is defined using measures of variability of the left and right tails. A new definition of outliers is proposed based on these variability measures and is shown to be useful for asymmetric distributions.
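To make the damping idea concrete, the following is a minimal sketch and not the paper's actual procedure. The abstract does not specify the variability measures or the damping rule, so everything here is an assumption made for illustration: the left and right spreads are taken as the distances from the median to the first and third quartiles, the multiplier `k` and the factor `shrink` are hypothetical tuning parameters, and the function names are invented.

```python
import statistics


def admissible_interval(history, k=3.0):
    """Asymmetric interval around the median (illustrative assumption).

    The left and right spreads are measured separately, so a right-skewed
    count series gets a wider upper margin than lower margin.
    """
    q1, med, q3 = statistics.quantiles(history, n=4, method="inclusive")
    left_spread = med - q1    # variability of the left side
    right_spread = q3 - med   # variability of the right side
    lower = max(0.0, med - k * left_spread)  # counts cannot be negative
    upper = med + k * right_spread
    return lower, upper


def damp(value, lower, upper, shrink=0.5):
    """Pull a simulated value back toward the interval if it lies outside.

    Only the excess beyond the violated bound is scaled by `shrink`;
    values inside the interval are returned unchanged.
    """
    if value > upper:
        return upper + shrink * (value - upper)
    if value < lower:
        return lower - shrink * (lower - value)
    return value


def is_outlier(value, lower, upper):
    """Flag values outside the asymmetric interval (assumed definition)."""
    return value < lower or value > upper


# Hypothetical hourly trip counts with a long right tail.
history = [2, 3, 3, 4, 5, 6, 9, 14, 20]
lower, upper = admissible_interval(history)
damped = damp(41, lower, upper)
```

With this toy history the quartiles are 3, 5 and 9, so the interval is [0, 17]: the upper margin (12) is twice the unclamped lower margin (6), which is the kind of asymmetry a symmetric rule based on a single standard deviation would miss. A simulated value of 41 is damped to 29, while a value of 10 passes through unchanged.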