
Deep Factors for Forecasting
Producing probabilistic forecasts for large collections of similar and/or dependent time series is a practically relevant and challenging task. Classical time series models fail to capture complex patterns in the data, and multivariate techniques struggle to scale to large problem sizes. However, their reliance on strong structural assumptions makes them data-efficient and allows them to provide uncertainty estimates. The converse is true for models based on deep neural networks, which can learn complex patterns and dependencies given enough data. In this paper, we propose a hybrid model that incorporates the benefits of both approaches. Our new method is data-driven and scalable via a latent, global, deep component. It also handles uncertainty through a local classical model. We provide both theoretical and empirical evidence for the soundness of our approach through a necessary and sufficient decomposition of exchangeable time series into a global and a local part. Our experiments demonstrate the advantages of our model in terms of data efficiency, accuracy, and computational complexity.
05/28/2019 ∙ by Yuyang Wang, et al.
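The global/local decomposition described above can be illustrated with a toy sketch: each series is a weighted combination of shared latent (global) factors plus a local stochastic term. The shapes, sinusoidal factors, and white-noise local model below are hypothetical stand-ins, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, K = 100, 5, 2  # time steps, series, number of global factors

# Global component: K shared latent factors (simple sinusoids here,
# standing in for the outputs of a deep network).
t = np.arange(T)
factors = np.stack([np.sin(2 * np.pi * t / 24),
                    np.cos(2 * np.pi * t / 168)])   # shape (K, T)

# Each series mixes the shared factors with its own weights ...
weights = rng.normal(size=(N, K))
global_part = weights @ factors                     # shape (N, T)

# ... and adds a local stochastic component (white noise here; a
# classical local model handles uncertainty in the paper).
local_part = 0.1 * rng.normal(size=(N, T))
series = global_part + local_part
```

The global part is shared across all series (scalable, data-driven), while the local part is fit per series (uncertainty-aware).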

Approximate Bayesian Inference in Linear State Space Models for Intermittent Demand Forecasting at Scale
We present a scalable and robust Bayesian inference method for linear state space models. The method is applied to demand forecasting in the context of a large e-commerce platform, paying special attention to intermittent and bursty target statistics. Inference is approximated by the Newton-Raphson algorithm, reduced to linear-time Kalman smoothing, which allows us to operate on problems several orders of magnitude larger than in previous related work. In a study on large real-world sales datasets, our method outperforms competing approaches on fast- and medium-moving items.
09/22/2017 ∙ by Matthias Seeger, et al.
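As a rough illustration of the linear-time Kalman recursion that inference reduces to here, the following is a minimal filtering pass for a scalar local-level model. This is a simplified stand-in for intuition only, not the paper's approximate-inference scheme.

```python
import numpy as np

def kalman_filter(y, sigma_obs=1.0, sigma_state=0.1):
    """Filtering pass for a scalar local-level model:
       x_t = x_{t-1} + state noise,  y_t = x_t + observation noise.
       Runs in O(T), the linear-time primitive Kalman smoothing builds on."""
    mu, P = 0.0, 1.0                        # prior mean and variance
    means = []
    for obs in y:
        P = P + sigma_state ** 2            # predict step
        K = P / (P + sigma_obs ** 2)        # Kalman gain
        mu = mu + K * (obs - mu)            # update step
        P = (1 - K) * P
        means.append(mu)
    return np.array(means)

y = np.array([1.0, 1.2, 0.9, 1.1, 5.0])    # last point is a burst
filtered = kalman_filter(y)
```

Each observation costs a constant amount of work, which is what makes operating on very large problem collections feasible.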

Online Learning with Pairwise Loss Functions
Efficient online learning with pairwise loss functions is a crucial component in building large-scale learning systems that maximize the area under the Receiver Operating Characteristic (ROC) curve. In this paper, we investigate the generalization performance of online learning algorithms with pairwise loss functions. We show that the existing proof techniques for generalization bounds of online algorithms with a univariate loss cannot be directly applied to pairwise losses. We derive the first result providing data-dependent bounds for the average risk of the sequence of hypotheses generated by an arbitrary online learner in terms of an easily computable statistic, and show how to extract a low-risk hypothesis from the sequence. We demonstrate the generality of our results by applying them to two important problems in machine learning. First, we analyze two online algorithms for bipartite ranking: one is a natural extension of the perceptron algorithm, and the other uses online convex optimization. Second, we provide a risk-bound analysis for an online algorithm for supervised metric learning.
01/22/2013 ∙ by Yuyang Wang, et al.
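The perceptron-style extension to bipartite ranking can be sketched as a buffered learner that compares each incoming example against past examples of the opposite class and updates whenever a (positive, negative) pair is mis-ranked. This toy data and update rule are a hypothetical illustration of the pairwise-loss setting, not the paper's exact algorithm.

```python
import numpy as np

def online_pairwise_perceptron(X, y):
    """Single pass over a stream: pair each new example with all buffered
    examples of the opposite class; update the weights whenever a
    (positive, negative) pair is mis-ranked."""
    w = np.zeros(X.shape[1])
    seen = []
    for x, label in zip(X, y):
        for xp, lp in seen:
            if lp == label:
                continue
            pos, neg = (x, xp) if label == 1 else (xp, x)
            if w @ pos <= w @ neg:          # mis-ranked pair
                w += pos - neg
        seen.append((x, label))
    return w

def pairwise_auc(w, X, y):
    """Fraction of correctly ranked (positive, negative) pairs, i.e. the AUC."""
    s = X @ w
    return (s[y == 1][:, None] > s[y == 0][None, :]).mean()

X = np.array([[1.0, 0.5], [0.8, 0.6], [0.9, 0.2],
              [-1.0, -0.4], [-0.7, -0.6], [-0.9, -0.3]])
y = np.array([1, 1, 1, 0, 0, 0])
w = online_pairwise_perceptron(X, y)
```

Note the pairwise structure: each step's loss depends on past examples, which is exactly why univariate-loss proof techniques do not transfer directly.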

Nonparametric Bayesian Mixed-effect Model: a Sparse Gaussian Process Approach
Multi-task learning models using Gaussian processes (GP) have been developed and successfully applied in various applications. The main difficulty with this approach is the computational cost of inference using the union of examples from all tasks. Therefore, sparse solutions that avoid using the entire data directly, and instead use a set of informative "representatives," are desirable. The paper investigates this problem for the grouped mixed-effect GP model, where each individual response is given by a fixed effect, taken from one of a set of unknown groups, plus a random individual effect function that captures variations among individuals. Such models have been widely used in previous work, but no sparse solutions have been developed. The paper presents the first sparse solution for such problems, showing how the sparse approximation can be obtained by maximizing a variational lower bound on the marginal likelihood, generalizing ideas from single-task Gaussian processes to handle the mixed-effect model as well as grouping. Experiments using artificial and real data validate the approach, showing that it can recover the performance of inference with the full sample, that it outperforms baseline methods, and that it outperforms state-of-the-art sparse solutions for other multi-task GP formulations.
11/28/2012 ∙ by Yuyang Wang, et al.
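The flavor of "representative"-based sparse GP approximations can be seen in a plain Nyström sketch: a rank-m approximation of the full kernel matrix built from m inducing inputs. The paper maximizes a variational lower bound instead of using this plug-in approximation, so this only illustrates the cost structure; the RBF kernel and the inducing grid are assumptions.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    """Squared-exponential kernel between row-wise point sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))       # full sample (n = 200)
Z = np.linspace(-3, 3, 15)[:, None]         # m = 15 inducing "representatives"

K_nn = rbf(X, X)                            # exact n x n kernel, O(n^2) storage
K_nm = rbf(X, Z)
K_mm = rbf(Z, Z) + 1e-8 * np.eye(len(Z))    # jitter for numerical stability

# Nystrom approximation: K_nn ~= K_nm K_mm^{-1} K_mn, costing O(n m^2).
K_approx = K_nm @ np.linalg.solve(K_mm, K_nm.T)

rel_err = np.linalg.norm(K_nn - K_approx) / np.linalg.norm(K_nn)
```

With m well-placed representatives, the approximation error is small while inference cost drops from cubic in n to linear in n.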

Infinite Shift-invariant Grouped Multi-task Learning for Gaussian Processes
Multi-task learning leverages shared information among data sets to improve the learning performance of individual tasks. The paper applies this framework to data where each task is a phase-shifted periodic time series. In particular, we develop a novel Bayesian nonparametric model capturing a mixture of Gaussian processes, where each task is a sum of a group-specific function and a component capturing individual variation, in addition to each task being phase shifted. We develop an efficient EM algorithm to learn the parameters of the model. As a special case, we obtain the Gaussian mixture model and EM algorithm for phase-shifted periodic time series. Furthermore, we extend the proposed model with a Dirichlet Process prior, leading to an infinite mixture model capable of automatic model selection. A variational Bayesian approach is developed for inference in this model. Experiments in regression, classification, and class discovery demonstrate the performance of the proposed models using both synthetic data and real-world time series data from astrophysics. Our methods are particularly useful when the time series are sparsely and non-synchronously sampled.
03/05/2012 ∙ by Yuyang Wang, et al.

Gini-regularized Optimal Transport with an Application to Spatio-Temporal Forecasting
Rapidly growing product lines and services require finer-granularity forecasts that consider geographic locales. However, an open question remains: how should the quality of a spatio-temporal forecast be assessed? In this manuscript we introduce a metric to evaluate spatio-temporal forecasts. This metric is based on an Optimal Transport (OT) problem: a constrained OT objective function using the Gini impurity function as a regularizer. We demonstrate through computer experiments both the qualitative and the quantitative characteristics of the Gini-regularized OT problem. Moreover, we show that the Gini-regularized OT problem converges to the classical OT problem when considered as a function of λ, the regularization parameter. The convergence to the classical OT solution is faster than for the state-of-the-art entropic-regularized OT [Cuturi, 2013], and results in a numerically more stable algorithm.
12/07/2017 ∙ by Lucas Roberts, et al.
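For reference, the entropic-regularized baseline of Cuturi (2013), against which the Gini regularizer is compared, is computed with Sinkhorn matrix-scaling iterations. The Gini-regularized solver itself is not reproduced here; this shows only the baseline on a tiny transport problem.

```python
import numpy as np

def sinkhorn(a, b, C, lam=10.0, iters=200):
    """Entropy-regularized OT (Cuturi, 2013).
    a, b: source/target marginals; C: cost matrix; lam: regularization.
    Alternates row/column scalings of K = exp(-lam * C)."""
    K = np.exp(-lam * C)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]     # approximate transport plan
    return P, (P * C).sum()             # plan and its transport cost

a = np.ones(3) / 3                      # uniform source marginal
b = np.ones(3) / 3                      # uniform target marginal
C = np.abs(np.subtract.outer(np.arange(3), np.arange(3))).astype(float)
P, cost = sinkhorn(a, b, C)
```

As λ grows, the regularized plan approaches the classical OT solution; the paper's claim is that the Gini regularizer reaches that limit faster and more stably than this entropic scheme.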

MmWave Beam Prediction with Situational Awareness: A Machine Learning Approach
Millimeter-wave communication is challenging in the highly mobile vehicular context. Traditional beam training is inadequate for meeting low overhead and latency requirements. In this paper, we propose to combine machine learning tools and situational awareness to learn the beam information (power, optimal beam index, etc.) from past observations. We consider forms of situational awareness that are specific to the vehicular setting, including the locations of the receiver and the surrounding vehicles. We leverage regression models to predict the received power with different beam power quantizations. The results show that situational awareness can largely improve the prediction accuracy, and that the model can sustain throughput with little performance loss at almost zero overhead.
05/23/2018 ∙ by Yuyang Wang, et al.
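A minimal sketch of regressing received power on situational features, using synthetic data with a hypothetical log-distance path-loss model. The feature set and propagation model below are illustrative assumptions, not the paper's dataset or regressors.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical situational features: receiver position (x, y) and a
# scalar blocker feature (e.g. distance to the nearest vehicle).
X = rng.uniform(0, 100, size=(500, 3))

dist = np.linalg.norm(X[:, :2], axis=1)     # receiver distance from transmitter
feats = np.column_stack([np.log10(dist + 1.0), X[:, 2], np.ones(len(X))])

# Synthetic received power: log-distance path loss, blocker attenuation, noise.
power = -20.0 * feats[:, 0] - 0.05 * X[:, 2] + rng.normal(scale=0.5, size=len(X))

# Least-squares regression on the situational features.
w = np.linalg.lstsq(feats, power, rcond=None)[0]
pred = feats @ w
r2 = 1 - ((power - pred) ** 2).sum() / ((power - power.mean()) ** 2).sum()
```

Once trained, predicting the power of candidate beams from location features alone replaces exhaustive beam sweeping, which is where the near-zero overhead comes from.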

Deep Factors with Gaussian Processes for Forecasting
A large collection of time series poses significant challenges for classical and neural forecasting approaches. Classical time series models fail to fit data well and to scale to large problems, but succeed at providing uncertainty estimates. The converse is true for deep neural networks. In this paper, we propose a hybrid model that incorporates the benefits of both approaches. Our new method is data-driven and scalable via a latent, global, deep component. It also handles uncertainty through a local classical Gaussian Process model. Our experiments demonstrate that our method obtains higher accuracy than state-of-the-art methods.
11/30/2018 ∙ by Danielle C. Maddix, et al.

GluonTS: Probabilistic Time Series Models in Python
We introduce Gluon Time Series (GluonTS) [<https://gluonts.mxnet.io>], a library for deep-learning-based time series modeling. GluonTS simplifies the development of and experimentation with time series models for common tasks such as forecasting or anomaly detection. It provides all necessary components and tools that scientists need for quickly building new models, for efficiently running and analyzing experiments, and for evaluating model accuracy.
06/12/2019 ∙ by Alexander Alexandrov, et al.