1 Introduction
Large companies of all kinds define, evolve and rely on enterprise-wide forecasting systems that model and predict many aspects of business development. Central to such business analyses are revenue forecasting components that operate at multiple scales in time and across business enterprises. In large retail supermarket companies, forecasts are impacted by multiscale influences such as company-wide policy, regional differences, variation across Categories of items bought and sold, and demand for individual items at individual stores, among many other influences on revenue streams. In large and diverse supermarket chains, forecast information at multiple levels of aggregation, devolving to groups of items (Categories) and groups of stores, referred to as Local Store Groups (LSGs), is utilized by downstream decision makers in the enterprise. In this setting, we discuss aspects of a large case study that evolves modeling approaches to aid and inform these complex decision processes.
In business sales forecasting, information about demand filters from the bottom-up in terms of consumer behavior that underlies item-level sales. In parallel, information about supply, projected sales targets and macroeconomic considerations filters from the top-down, often in formats that are not easily compatible with statistical forecasting models. Models generating revenue forecasts for product Categories and groups of stores thus need to integrate bottom-up and top-down information. Forecast outputs also need to be in a form that Category managers, store managers and executives can utilize. In major companies with many stores and products, what may appear to be very small improvements in forecast accuracy at the levels of groups of items and groups of stores can translate to very major revenue impact at the enterprise level; hence modeling developments that yield apparently modest improvements at the “micro” levels are of major interest.
In this work, we discuss aspects of a long-term case study of revenue forecasting for a large grocery chain. There are two primary dimensions of interest: Local Store Groups, groups of policy-similar stores (in terms of geography or management); and Categories, defined groups of similar or related items on sale. The business setting defines a focus on forecasting revenue 12 weeks ahead for every LSG-Category pair. There are multiple challenges in this and related settings. While patterns of Category demand are related across LSGs, there is also considerable heterogeneity by LSG and Category. Sharing information has the potential to improve forecasts, especially for smaller LSGs and Categories, but it is not obvious at what level to share information due to the heterogeneity. Key questions arise on how to utilize Category-level information on discounts and pricing, in particular. The focus on longer-term forecasting (a forecast horizon of 12 weeks or more to feed into longer-term planning and decisions) defines challenges to all forecasting approaches.
A number of downstream business questions are informed by revenue forecasts. The primary interest is in forecasting for 12 weeks ahead to feed into pricing decisions; even very small improvements in forecast accuracy at LSG and Category levels can translate to large monetary gains across the system. The grocery chain is also interested in understanding the roles of pricing and promotion strategies, for both LSGs and Categories, and in exploring “What-if?” scenarios where pricing and discounts are altered and the impact of these changes assessed. This necessitates interpretable models such that: (i) the roles of such control and predictor variables can be assessed; (ii) users can intervene in the models in informed ways; and (iii) forecast uncertainties are fully characterized for proper use in downstream decision making. There is also interest in understanding dependencies between Categories, particularly in relation to possible “cannibalization” effects that might occur when one Category is subject to more aggressive discount policies than another that might “compete” for customer purchases. There is also the evident need for models to be open, responsive and adaptable over time, as realized consumer behavior and grocery demand are inherently time-varying. We address these desiderata using customized classes of dynamic linear models (West and Harrison, 1997; Prado et al., 2021) applied to revenue time series at the LSG-Category level, with multiscale extensions (e.g. Berry and West, 2020; West, 2020) to represent key aspects of multivariate relationships.
Statistical forecasting has a long history in revenue management across industries. Models must address basic questions of seasonality, stochastic variation in demand, price sensitivity, and computational efficiency (e.g. Weatherford, 2016). More recently, machine learning and algorithmic approaches have been explored for revenue forecasting. Pundir et al. (2020) and Lei and Cailan (2021) use random forests and support vector machines, while Mishev et al. (2019) and Chu and Zhang (2003) explore deep learning methods. Such approaches can yield forecast accuracy improvements, especially in short-term forecasting and when time-variation is very limited. They are, however, challenging to interpret and typically neither probabilistic nor dynamic. Particularly in the retail domain, Bayesian dynamic models have been successful in terms of forecasting accuracy, and are substantially preferable in terms of interpretation, openness to intervention, and fully probabilistic forecasting (e.g. Berry and West, 2020; Berry et al., 2020; Yanchenko et al., 2021).
Our case study also involves methodological contributions. We extend multiscale models (e.g. Berry and West, 2020; Berry et al., 2020; Yanchenko et al., 2021) to allow sharing of discount information, and represent multivariate structure in pricing and revenue via a recoupled system of univariate models. These are embedded in the case study discussion throughout.
Section 2 introduces the retail setting and data. Section 3 describes the multiscale modeling framework, noting the role of the decouple/recouple approach in engendering scalability of multivariate models. Section 4 discusses selected results, highlighting: (i) retail Categories that benefit from multiscale modeling in improved revenue forecasting, and others that do not; (ii) contexts where forecasts can be improved by joint modeling of pricing, revenue and dependencies across Categories; and (iii) aspects of cross-Category dependencies. Concluding comments are in Section 5.
2 Setting and Data
The setting is revenue forecasting at the LSG-Category level for a large grocery chain. The forecasting level of interest here is across groups of items (Categories) and groups of stores (LSGs). Each Category is a collection of (a large number of) related items; each LSG is a subset of (a small number of) regionally proximate stores. LSGs, in general, share traits in terms of discounts offered and pricing, though there is variability across LSGs and Categories. It is thus important to allow for variability by LSG and Category, while also allowing information sharing, as appropriate, to potentially increase forecast accuracy.
The data provide 2 calendar years of weekly information for 100 product Categories across 9 LSGs in one geographic region of the USA. This includes weekly revenue (in US dollars) and detailed information about pricing and promotion for each Category and LSG. Several “breadth of discount” measures (weighted averages across items within each Category) exist and we use three: Temporary Price Reduction (TPR) percent, a percent measure of advertising on the front page of leaflets (AdFront percent), and a percent measure of special stock displays in the back of stores (DspBack percent). Each of these discount measures represents the percentage of items within each Category with each type of discount, weighted by how often each item has historically been purchased. Other information includes the weighted average of discounted price of items within a Category, referred to as the Net Price; this is a quantity that turns out to be quite useful in forecasting weekly LSG-Category level revenue. Throughout, all revenue results are scaled by a random factor.
In Figure 2, we see that revenue varies both by LSG and Category. Over all 104 weeks, however, Category revenue trends appear similar across LSGs, though different in scale (Figure 2). While there do appear to be potential holiday effects for some Categories, we do not explicitly take holidays into account here. Both pricing (Figure 4) and discounts (Figure 6) tend to be very similar across LSGs, and to vary considerably by Category. While each LSG has some control over individual discounts for that particular group of stores, there is coordination among the LSGs in terms of pricing and promotion decisions. Pricing, in particular, tends to be very similar between LSGs over time, and in general, fairly stable for most Categories (Figure 4). Variation in the Net Price variable over time and between LSGs is largely a function of discounting, as the Net Price variable is the weighted average of price actually paid by customers after taking any discounts into account. On the other hand, there is much more variation over time in terms of TPR percent (Figure 6). Again, TPR trends are similar across LSGs, though they vary considerably by Category. TPR percent tends to be the most variable of the three available discount measures.
3 Methodology
3.1 Multi-Scale Modeling
We are interested in forecasting revenue $k = 12$ weeks ahead for each LSG-Category pair. Discount information is set multiple weeks in advance, so discount covariates can be treated as known 12 weeks into the future. However, Net Price needs to be forecast to be used as a covariate at this forecast horizon, as Net Price depends on the discounts seen by individual customers. To improve the revenue forecasts at the LSG-Category level, we utilize aggregate multiscale discount information across LSGs, extending the approach of Berry and West (2020).
Multiscale analysis enables forecast information from aggregate levels to inform lower-level forecasts, and is inherently hierarchical by design. Multiscale models are of critical interest as alternatives to far more computationally demanding hierarchical models (e.g. Salinas et al., 2019; Sen et al., 2019). Multiscale approaches share information across series while enabling parallel estimation of univariate models (Berry and West, 2020; Berry, 2019; West, 2020). This enables scaling to large numbers of time series such as are frequently seen in business contexts; computations scale linearly in the number of series. Importantly, this avoids the need for large, complex Markov chain Monte Carlo or particle filtering methods, while retaining the ability to improve multi-step ahead forecasts for individual series by incorporating multiscale “dynamic factor” signals. Scalability is especially relevant in demand forecasting settings, where there are very many noisy, sparse and heterogeneous individual series. However, there often exist cross-sectional or other hierarchical structures in this type of data (across items, for example) that can be leveraged as aggregate, multiscale signals to improve forecasts at the lowest level. Our models here build on this background.
Let $y_{t,i,j}$ be the revenue for week $t$, Category $i$ and LSG $j$, and $y_{t,i} = \sum_j y_{t,i,j}$ be the revenue aggregated across LSGs for each Category $i$. Then, let $\mathbf{x}_{t,i,j}$ be the vector of discount measures (TPR percent, AdFront percent and DspBack percent), with $\bar{\mathbf{x}}_{t,i}$ denoting its average across LSGs. Here $\mathbf{x}_{t,i,j}$ is known 12 weeks in advance and we aim to forecast $y_{t+k,i,j}$ for all $i, j$ at $k = 12$ weeks into the future. Our modeling strategy is to:

1. Model aggregate revenue across LSGs (multiscale): $p(y_{t,i} \mid \bar{\mathbf{x}}_{t,i})$.

2. Extract inferred effects of aggregate discounts from model (1): $\phi_{t,i}$.

3. Model revenue: $p(y_{t,i,j} \mid \mathbf{x}_{t,i,j}, \phi_{t,i})$.

This model for revenue depends on LSG-Category specific discount information ($\mathbf{x}_{t,i,j}$) and multiscale discount information across LSGs ($\phi_{t,i}$); see Figure 7. This defines a flexible baseline model. Section 4.2 discusses extensions to include Category pricing information that can yield revenue forecasting improvements.
This hierarchical, multiscale approach allows each LSG-Category pair to “see” common, aggregate revenue responses to discounts differently and allows for sharing of information and personalization of the common trends for each specific LSG. This approach increases forecast accuracy for many LSG-Category pairs for 12-week ahead revenue forecasts, in particular for smaller LSGs that build on information from larger LSGs. On a key technical point, we use “plug-in” point forecasts of the multiscale effects of discount predictors, choosing the current (time $t$) posterior mean of the effect in the aggregate model. This understates uncertainty in resulting revenue forecast distributions as it ignores uncertainty about aggregate discount effects. Applied evaluations lead us to accept this practical sidestep of full uncertainty characterization, as it has modest practical impact. At the cost of more extensive computation it is, of course, easy to extend the analysis to include full uncertainty characterization, repeating the analysis with Monte Carlo samples of the discount effect; see Berry and West (2020) in related models. This more computationally intensive analysis, across numerous LSGs and Categories, can aid in understanding how relevant (or, in this case study, how practically limited) the impact of this second-order uncertainty is.
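To make the plug-in versus full-uncertainty comparison concrete, the following is a minimal simulation sketch, not the implementation used in the case study. It assumes a normal forecast distribution for log revenue whose mean is linear in the multiscale effect; the names `base`, `gamma`, `phi_mean`, `phi_sd` and `q`, and all numerical values, are hypothetical and chosen only for illustration.

```python
import random
import statistics


def forecast_samples(base, gamma, phi_mean, phi_sd, q,
                     n=20000, plug_in=True, seed=1):
    """Simulate log-revenue forecasts with mean base + gamma * phi.

    plug_in=True fixes phi at its posterior mean (the plug-in analysis);
    plug_in=False draws phi from N(phi_mean, phi_sd^2), so that
    uncertainty about the multiscale effect is propagated.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        phi = phi_mean if plug_in else rng.gauss(phi_mean, phi_sd)
        samples.append(rng.gauss(base + gamma * phi, q ** 0.5))
    return samples


# Hypothetical values, for illustration only.
settings = dict(base=10.0, gamma=0.8, phi_mean=-0.2, phi_sd=0.1, q=0.04)
sd_plug_in = statistics.stdev(forecast_samples(plug_in=True, **settings))
sd_full = statistics.stdev(forecast_samples(plug_in=False, **settings))
print(f"plug-in sd: {sd_plug_in:.3f}, full sd: {sd_full:.3f}")
```

Under these assumptions the plug-in forecast standard deviation is $\sqrt{q}$, while the full analysis gives $\sqrt{q + \gamma^2 s^2}$ with $s$ the posterior standard deviation of the effect; the gap is small whenever $\gamma s$ is small relative to $\sqrt{q}$.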
3.2 Dynamic Linear Models
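DLMs define the core class of time series models for all levels in the multiscale setting of Figure 7. For a generic univariate time series $y_t$ observed at discrete times $t = 1, 2, \ldots$, information at time $t$ is denoted by $\mathcal{D}_t = \{y_t, \mathcal{D}_{t-1}, \mathcal{I}_{t-1}\}$, where $\mathcal{I}_{t-1}$ represents any additional relevant information beyond the observed data. A DLM has the form
$$y_t = \mathbf{F}_t' \boldsymbol{\theta}_t + \nu_t, \qquad \boldsymbol{\theta}_t = \mathbf{G}_t \boldsymbol{\theta}_{t-1} + \boldsymbol{\omega}_t, \tag{1}$$
where:

$\mathbf{F}_t$ is a vector of known covariates at time $t$,

$\boldsymbol{\theta}_t$ is the state vector, which evolves via a first-order Markov process,

$\mathbf{G}_t$ is a known state evolution matrix,

$\boldsymbol{\omega}_t$ is the stochastic innovation vector, with the $\boldsymbol{\omega}_t$ independent over time, and

$\nu_t$ is the zero-mean observation noise, with the $\nu_t$ independent over time and independent of the $\boldsymbol{\omega}_t$.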
Sequential learning in the DLM proceeds naturally via computationally easy updates and forecasting algorithms. Analysis at the level of each univariate series is standard (West and Harrison, 1997; Prado et al., 2021).
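As an illustration of the sequential updates, here is a minimal sketch of forward filtering for a dynamic regression DLM. It is a simplification under stated assumptions, not the PyBats implementation: the observation variance `v` is taken as known and constant, the evolution matrix is the identity, and a single discount factor `delta` inflates the prior variance. All function and argument names are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def dlm_filter_step(m: Vector, C: Matrix, F: Sequence[float], y: float,
                    v: float = 1.0, delta: float = 0.98
                    ) -> Tuple[Vector, Matrix, float, float]:
    """One filtering step for a DLM with G = I (assumed for simplicity).

    Returns the updated state mean and variance, plus the one-step
    forecast mean f and variance q computed before observing y.
    """
    p = len(m)
    # Evolution: with G = I the prior mean is unchanged and the prior
    # variance is inflated by the discount factor, R = C / delta.
    a = list(m)
    R = [[C[i][j] / delta for j in range(p)] for i in range(p)]
    # One-step forecast: f = F'a and q = F'RF + v.
    RF = [sum(R[i][j] * F[j] for j in range(p)) for i in range(p)]
    f = sum(F[i] * a[i] for i in range(p))
    q = sum(F[i] * RF[i] for i in range(p)) + v
    # Update: adaptive vector A = RF / q and forecast error e = y - f.
    A = [rf / q for rf in RF]
    e = y - f
    m_new = [a[i] + A[i] * e for i in range(p)]
    C_new = [[R[i][j] - A[i] * A[j] * q for j in range(p)] for i in range(p)]
    return m_new, C_new, f, q


def dlm_filter(ys, Fs, m0, C0, v=1.0, delta=0.98):
    """Run the filter over a whole series, collecting (f, q) forecasts."""
    m, C = list(m0), [row[:] for row in C0]
    forecasts = []
    for y, F in zip(ys, Fs):
        m, C, f, q = dlm_filter_step(m, C, F, y, v, delta)
        forecasts.append((f, q))
    return m, C, forecasts
```

Each step costs a fixed number of small matrix operations, which is why fitting many univariate models in parallel scales linearly in the number of series.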
3.3 Modeling Details
Revenue is modeled on the log scale using normal DLMs with a trend term and additional covariates; each univariate DLM has the vector $\mathbf{F}_t$ with a leading element of 1 followed by entries representing potential seasonal components and known predictor/covariate values. Among the latter, the aggregate revenue model for $y_{t,i}$ uses the average discounts across LSGs, $\bar{\mathbf{x}}_{t,i}$, as additional covariates and has yearly seasonality represented by the fundamental (52 week) harmonic model component. The LSG-Category revenue model for $y_{t,i,j}$ has multiscale discount information included as predictor values; here $\mathbf{F}_t$ has elements $\mathbf{x}_{t,i,j}$ and $\phi_{t,i}$ as covariates, again with yearly seasonality defined by the first harmonic. All models use the same specific state evolution discount factors to define rates of change over time of state vectors. This completes the basic DLM outlook for each univariate revenue series.
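The structure of the regression vector and evolution matrix described above can be sketched as follows. This is an assumption-laden illustration: the trend is taken to be a local level, the yearly seasonal component uses the standard two-element harmonic form with a rotation block, and the helper names `regression_vector` and `evolution_matrix` are hypothetical rather than part of any library.

```python
import math
from typing import List, Optional, Sequence


def regression_vector(discounts: Sequence[float],
                      multiscale_effect: Optional[float] = None) -> List[float]:
    """F_t = (1 | 1, 0 | discounts [, multiscale effect]).

    The leading 1 is the level, (1, 0) picks out the first harmonic, and
    the remaining entries are known covariate values for week t.
    """
    F = [1.0, 1.0, 0.0] + [float(x) for x in discounts]
    if multiscale_effect is not None:
        F.append(float(multiscale_effect))
    return F


def evolution_matrix(n_covariates: int, period: float = 52.0) -> List[List[float]]:
    """Block-diagonal G: level, 2x2 harmonic rotation, identity for covariates."""
    p = 3 + n_covariates
    G = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    w = 2.0 * math.pi / period  # fundamental frequency for a 52-week cycle
    G[1][1], G[1][2] = math.cos(w), math.sin(w)
    G[2][1], G[2][2] = -math.sin(w), math.cos(w)
    return G


# Example: three discount measures plus a plug-in multiscale effect.
F = regression_vector([0.30, 0.10, 0.05], multiscale_effect=-0.2)
G = evolution_matrix(n_covariates=4)
```

The aggregate model would use the same construction with the average discounts as covariates and no multiscale term.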
In terms of customized predictor information, Category price discount covariates that are negligible over all weeks are not included (some Categories are rarely discounted, especially various alcohol Categories). Similarly, covariates that are static for many weeks have a small amount of noise added to them to stabilize the modeling; this is a common approach in machine learning and has connections to ridge regression. Here, we add noise to control variables to (1) stabilize inference when there is not much variation in the covariates, and (2) reflect potential noise in the estimation of these control variables out to 12 weeks in advance, for some increased robustness in the models for practical application. All LSG-Category pairs are modeled separately as univariate DLMs as described in Section 3.2. Recoupling is then induced by sharing information within the overarching multiscale framework. Analysis is implemented in PyBats (Lavine and Cron, 2020).
4 Selected Results
Models were fit and evaluated over the first year of data to define selection of DLM discount factors. The detailed forecasting analysis and selected evaluations are based on then running the analyses sequentially over the second year of data with out-of-sample forecasts generated each week for the following 12 weeks. Empirical forecast accuracy measures are all on the 12-week horizon. Section 4.1 gives selected examples where multiscale modeling improves revenue forecasts and others where it does not. Section 4.2 highlights situations where adding information to the multiscale models is shown to improve revenue forecasting, with rationalization and discussion of business implications. Section 4.3 explores aspects of dependencies across Categories with a view to advising potential competing goals in Category-wide pricing and discount strategies. Throughout, all revenue results are scaled by a random factor.
4.1 Multi-Scale Revenue Forecasting
4.1.1 Some Aggregate Results
A first interest is in identifying Categories and LSGs where there are forecast improvements using the multiscale analysis that shares discount information across LSGs, as described in Section 3. Using the MAPE metric, the results vary by LSG-Category pair, as seen in Figure 8. About 45% of the LSG-Category pairs benefit from the inclusion of multiscale discount information, having lower MAPE values. Again, at this enterprise-wide level of forecasting, even very small improvements in MAPE can lead to large increases in revenue, so these cases are of key interest. Identifying cases that are better forecast without the multiscale information is just as important; these LSG-Category pairs will be forecast using their individual models.
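The comparison just described can be sketched in a few lines. The code below is illustrative only: it computes MAPE for a pair of series and then, given per-pair MAPE values from a baseline and a multiscale model, selects the better model for each LSG-Category pair. The dictionary layout and the function names are assumptions made for this example.

```python
from typing import Dict, Sequence, Tuple

Pair = Tuple[str, str]  # (LSG, Category)


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (actuals must be nonzero)."""
    return 100.0 * sum(abs(a - f) / abs(a)
                       for a, f in zip(actual, forecast)) / len(actual)


def choose_models(baseline: Dict[Pair, float],
                  multiscale: Dict[Pair, float]) -> Dict[Pair, str]:
    """Pick, per LSG-Category pair, the model with the lower MAPE."""
    return {pair: "multiscale" if multiscale[pair] < baseline[pair]
            else "baseline" for pair in baseline}


def share_improved(choices: Dict[Pair, str]) -> float:
    """Fraction of pairs for which the multiscale model is preferred."""
    return sum(c == "multiscale" for c in choices.values()) / len(choices)
```

In practice the selection would be made on a held-out period so that the choice of model is not tuned to the same weeks used to report accuracy.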
4.1.2 Revenue Forecasts
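We now focus on specific LSG-Category examples that benefit from multiscale information. In addition to the forecasts themselves, we look at the regression effect of the discount information from the multiscale model to illuminate the impact of the multiscale information. For each Monte Carlo sample of the state vector from the multiscale model across LSGs, the discount regression effect is
$$\phi_{t,i} = \bar{\mathbf{x}}_{t,i}' \boldsymbol{\beta}_{t,i},$$
where $\bar{\mathbf{x}}_{t,i}$ is the vector of average discounts across LSGs for Category $i$ in week $t$ and $\boldsymbol{\beta}_{t,i}$ is the sampled subvector of the state vector holding the corresponding discount regression coefficients. This represents the overall impact of the multiscale discount information.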
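A minimal sketch of how such a regression effect might be summarized from Monte Carlo samples is given below. It assumes the effect is the inner product of the aggregate discount vector with the sampled discount coefficients, and reports nearest-rank quantiles; the function names and the choice of quantile rule are assumptions for illustration.

```python
from typing import Sequence, Tuple


def regression_effect(x_bar: Sequence[float], beta: Sequence[float]) -> float:
    """Combined discount effect for one sampled coefficient vector."""
    return sum(x * b for x, b in zip(x_bar, beta))


def summarize_effect(x_bar: Sequence[float],
                     beta_samples: Sequence[Sequence[float]],
                     probs: Tuple[float, ...] = (0.05, 0.5, 0.95)
                     ) -> Tuple[float, ...]:
    """Nearest-rank quantiles of the effect over Monte Carlo samples.

    With the default probs this gives the median and the end points of
    a 90% interval.
    """
    effects = sorted(regression_effect(x_bar, b) for b in beta_samples)
    n = len(effects)
    return tuple(effects[round(p * (n - 1))] for p in probs)
```

Plotting these three summaries week by week gives the kind of online regression-effect trajectory with 90% intervals discussed in the examples that follow.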
Some general points and findings are noted first. Forecasting 12 weeks ahead is challenging. An evaluation on 1 week ahead forecasts could be misleading in terms of the main longer-term horizon of interest. Then, we find that in the cases where multiscale information improves the forecasts at the 12-week horizon, it also does so at the 1 week ahead forecast horizon. Further, multiscale information can improve the forecasts of both large and small LSGs. Additionally, Category discount information is absolutely critical to include in the revenue forecasting models, either as multiscale information or not. As the main control variable, the discount information is able to produce good forecasts alone for the majority of LSGs and Categories. Finally, some Categories have clear and strong holiday effects. With only two years of data here, there is not enough information to estimate holiday effects directly, but we discuss possible approaches to addressing holiday information in more detail in Section 4.2.
One Category that particularly benefits from multiscale information is the Sugars & Sweeteners Category, with forecasts for two LSGs and the multiscale regression effect shown in Figure 9. Across both larger and smaller LSGs, the inclusion of multiscale discount information defines MAPE-optimal forecasts that are more accurate than those from the no multiscale model, especially over weeks 10 and 30. For Sugars & Sweeteners, around weeks 10-20 in (c) there is a dynamic, negative discount regression effect, compared to the rest of the weeks; this translates to lower forecasts from the multiscale model compared to the no multiscale model in Figure 9. This negative regression effect pulls the forecasts down in this region, leading to more accurate forecasts. This response to discounts in terms of the revenue is shared across LSGs and well captured by the multiscale model, leading to improved forecasts for this specific Category. Additionally, in both the forecasts and regression effect in Figure 9, there are strong holiday effects around week 30 (the week of December 15).
Figure 10 shows similar forecast summaries at the 1 week ahead horizon. While overall forecast accuracy is naturally higher than that for the 12-week horizon, note that the multiscale model still leads to improved forecasts for these LSG-Category pairs.
Figure 9: Frames (a) and (b) show 12-week ahead forecasts from the multiscale and the no multiscale models for the Sugars & Sweeteners Category for two LSGs. Average MAPE values over the year are shown in the legends. The point forecasts are MAPE-optimal, shading shows 90% credible intervals in the multiscale model, and points are the observed revenue values. For all LSGs, the inclusion of multiscale information improves the forecasts. Frame (c) shows the online estimated regression effects, with 90% credible intervals, of the combined discount predictor information.
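For readers unfamiliar with MAPE-optimal point forecasts, the sketch below shows one way to compute them from forecast samples. It relies on the standard result that expected absolute percentage error is minimized by the median of the distribution with density proportional to $p(y)/y$, which for samples is a weighted median with weights $1/y$. How the point forecasts in the case study were actually computed is not shown here; the function name and the simulated lognormal example are assumptions.

```python
import math
import random
import statistics
from typing import Sequence


def mape_optimal_forecast(samples: Sequence[float]) -> float:
    """Weighted median of positive samples with weights 1 / y."""
    if any(y <= 0 for y in samples):
        raise ValueError("samples must be strictly positive")
    weighted = sorted((y, 1.0 / y) for y in samples)
    half = 0.5 * sum(w for _, w in weighted)
    cumulative = 0.0
    for y, w in weighted:
        cumulative += w
        if cumulative >= half:
            return y
    return weighted[-1][0]


# Illustration with simulated lognormal revenue (log-mean 0, log-sd 0.5).
rng = random.Random(0)
draws = [math.exp(rng.gauss(0.0, 0.5)) for _ in range(20000)]
point = mape_optimal_forecast(draws)
print(point, statistics.median(draws))
```

For a lognormal forecast with log-mean $\mu$ and log-variance $\sigma^2$ the MAPE-optimal value is $\exp(\mu - \sigma^2)$, which lies below the ordinary median $\exp(\mu)$; the simulated values above should reflect that ordering.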
Broth/Dry Soup is an example of a Category where the value of the multiscale information varies by LSG. In the larger LSGs in Figure 11, there is little benefit from the multiscale information and the multiscale model tends to under-forecast around weeks 20-30. However, there is real benefit from the multiscale information for the smaller LSG. This is a common finding in hierarchical models: smaller groups (here LSGs) can benefit more from sharing of information across larger groups due to the increased shrinkage on smaller groups. The multiscale discount information improves forecasts the most for smaller LSGs generally. In this example, note also the change in regression effect around weeks 20-30, shown in (c). The multiscale regression effect tends to lead to better forecasts for this time period for the smaller LSGs, as compared to the larger LSGs which under-forecast here. Forecasts for both models also naturally improve at the shorter, 1 week ahead, forecast horizon; see Figure 12.
Finally, Baked Sweet Goods is an example of a Category where multiscale information does not improve revenue forecasts. In general, from weeks 35-50, the multiscale model tends to over-forecast, as reflected in both the forecasts themselves and the positive regression effect for this time period in Figure 13. For weeks prior to week 35, the regression effect is approximately 0 and the no multiscale and multiscale models give very similar forecasts. This Category could perhaps benefit from other types of multiscale information that are more relevant, especially in early weeks when there are minimal discount regression effects. One week ahead forecasts are given in Figure 14.
4.2 Extending the Revenue Models
Additional information from the grocery chain offers potential to further improve revenue forecasting in specific settings. Here, we focus on the role of Category level pricing and holiday effects.
4.2.1 Pricing
Additional pricing information is available via the Net Price variable; this is an average measure of the Net Price realized by customers (including discounts), averaged over customers within LSG and Category. We find that jointly modeling and forecasting Net Price together with revenue can further improve revenue forecasts quite generally. Updating the details in Section 3, the modifications are as follows.
Let $n_{t,i,j}$ be the Net Price for week $t$, Category $i$ and LSG $j$; we need to forecast $n_{t+k,i,j}$ as it incorporates realized discounts received by customers and so is uncertain in future weeks. We define a joint model by coupling two univariate dynamic models: one for Net Price and one for revenue that extends the earlier DLM to also include Net Price as a predictor. This decouple/recouple approach enables customization of each of the univariate models as well as sensitive modeling of dependence of revenue on Net Price. In summary, with revenue $y_{t,i,j}$, aggregate revenue $y_{t,i}$, discounts $\mathbf{x}_{t,i,j}$ (averaging to $\bar{\mathbf{x}}_{t,i}$ across LSGs) and multiscale discount effect $\phi_{t,i}$ as in Section 3.1, for each LSG-Category pair over weeks $t$ we:

1. Model Net Price: $p(n_{t,i,j} \mid \mathbf{x}_{t,i,j}, r_{t,i,j})$.

2. Model revenue across LSGs (multiscale): $p(y_{t,i} \mid \bar{\mathbf{x}}_{t,i})$.

3. Extract imputed values of the discount state vectors from model (2): $\phi_{t,i}$, as before.

4. Model revenue: $p(y_{t,i,j} \mid \mathbf{x}_{t,i,j}, \phi_{t,i}, n_{t,i,j})$, now also conditional on imputed values of $n_{t,i,j}$.

At the final model stage, the imputed values of $n_{t,i,j}$ can be any selected point forecasts; the baseline choice is a “plug-in” analysis that uses the forecast median of Net Price from its univariate model. This can be refined to run analyses repeatedly over a range of values or a Monte Carlo forecast sample of Net Price to understand if uncertainty under-quantification using the plug-in analysis is practically meaningful. The revenue model also includes both the LSG-Category specific discount information and multiscale discount information across LSGs, as before. The Net Price model uses the LSG-Category specific discount information and pricing information without discounts (the latter being $r_{t,i,j}$, which is a control variable for the grocery chain).
Selected aggregate results are highlighted in Figure 15. With the set of univariate DLMs without multiscale and Net Price extensions (“No Multi-Scale”) as baseline, this shows average revenue forecast MAPE values from (i) a revenue model with Net Price information only, (ii) the original multiscale revenue model, and (iii) the more general revenue model with both multiscale and Net Price information of this section. Compared to the baseline, 28% of the LSG-Category pairs are improved with the Net Price model, 45% with the multiscale model, and 37% with the multiscale and Net Price model. A number of specific LSG-Category pairs particularly benefit from the inclusion of pricing information, while others do not.
One Category where the combination of pricing and multiscale discount information improves revenue forecasts is Craft/Micro Beers. This Category is rarely discounted and, when it is, the discounts tend to be small. There is also some retail price drift separate from discount information that can be helpful for this Category (see further comments in Supplementary Materials). Forecast comparisons and regression effects are given in Figure 16. The regression effect is generally insignificant over time. We do see that, around weeks 35-40, the larger negative regression effect pulls down the forecasts in the multiscale model, improving 12-week forecast accuracy for this time period. While there is limited explanatory information in LSG-specific or multiscale discounts for this Category, they nevertheless have practical value in revenue forecasting.
4.2.2 Holiday Effects
Some product Categories exhibit clear, important but sporadic holiday effects. The Sugars & Sweeteners Category, for example, shows effects particularly around Christmas (Figure 17). However, two years of data do not provide historical information sufficient to incorporate holiday week dummy variables, or holiday-specific transfer response model components over the week before, of and after the holiday period, such as are standard in Bayesian forecasting in commercial settings (West and Harrison, 1997, Sections 9.3 and 11.2). Transfer response models designed specifically for local holiday effects have been utilized in related models in our setting, and coded for public access and incorporation into revenue models (Lavine and Cron, 2020).
The revenue models in further development for routine application are developed this way, but for our interest here we are mainly concerned with the impact of holiday events on forecast accuracy summaries. In terms of basic empirical accuracy impact, it is easy to re-evaluate MAPE (or other) metrics across all LSGs and Categories over the year of test data by simply dropping the (rare) holiday weeks from the summary. This does not wholly re-evaluate accuracy, since the model analysis includes those weeks and so the sequential updating analysis is inevitably perturbed (negatively) by poor forecasts at holiday times that are not explicitly modeled as they might be, as noted above. But, simply masking out a few holiday weeks from the forecast error evaluation gives at least a lower bound on potential improvements.
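The masking idea amounts to excluding a small set of week indices from the accuracy summary. A minimal sketch follows; the function name `masked_mape` and the zero-based week indexing are assumptions for illustration, and the numbers in the usage lines are made up.

```python
from typing import Collection, Sequence


def masked_mape(actual: Sequence[float], forecast: Sequence[float],
                masked_weeks: Collection[int] = ()) -> float:
    """MAPE in percent, skipping the (zero-based) weeks in masked_weeks."""
    terms = [abs(a - f) / abs(a)
             for week, (a, f) in enumerate(zip(actual, forecast))
             if week not in masked_weeks]
    return 100.0 * sum(terms) / len(terms)


# Made-up series in which the final week is a holiday with a large miss.
actual = [100.0, 100.0, 100.0, 200.0]
forecast = [110.0, 90.0, 100.0, 100.0]
print(masked_mape(actual, forecast))                    # all weeks
print(masked_mape(actual, forecast, masked_weeks={3}))  # holiday dropped
```

Only the evaluation changes here; the forecasts themselves still come from a model that was updated on the holiday weeks.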
More formally, a fully Bayesian feed-forward intervention approach simply defines each holiday week as a known time when major departures from the routine model forecasts are expected, and treats the outcome data for those few weeks as missing observations. This is effectively building in a “holiday week” random intervention effect specific to each holiday, and with very high prior uncertainty. The result is that the state vectors in the baseline models will be protected from what may be large forecast errors in the forward filtering and updating analysis (Berry and West, 2020; West and Harrison, 1997, Section 11.2.4).
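The effect of treating holiday weeks as missing can be illustrated with a deliberately simple local-level model. This is a sketch under stated assumptions (scalar state, known observation variance `v`, a single discount factor `delta`), not the models used in the case study; holiday weeks are represented by `None` and all names and values are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def local_level_filter(ys: Sequence[Optional[float]], m0: float = 0.0,
                       C0: float = 1.0, v: float = 1.0,
                       delta: float = 0.95) -> List[Tuple[float, float]]:
    """Forward filter for a local-level DLM; None marks a missing week.

    On a missing week the state still evolves (its variance is inflated
    by the discount factor) but no update is made, so the level is
    protected from that week's outcome.
    """
    m, C = m0, C0
    path = []
    for y in ys:
        R = C / delta                 # evolution step
        if y is None:
            C = R                     # missing: prior becomes posterior
        else:
            q = R + v                 # one-step forecast variance
            A = R / q                 # adaptive coefficient
            m = m + A * (y - m)       # update level with forecast error
            C = R - A * A * q
        path.append((m, C))
    return path


# A flat series with one holiday spike at index 20.
series = [10.0] * 20 + [30.0] + [10.0] * 5
masked = [None if t == 20 else y for t, y in enumerate(series)]
with_spike = local_level_filter(series, m0=10.0)
protected = local_level_filter(masked, m0=10.0)
print(with_spike[20][0], protected[20][0])
```

In the unmasked run the level jumps after the spike and takes several weeks to recover, whereas in the masked run it is unaffected.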
Identifying the week of Thanksgiving, the week of Christmas and the week after Christmas (New Year's) for the Sugars & Sweeteners Category leads to strong aggregate improvements in terms of lower MAPE values; the net reduction in empirical MAPE values averaged over LSGs, Categories and across the 1-year evaluation period is about 7-8%. This indicates that the three holiday periods have a substantial impact on forecast accuracy metrics. Some Categories are far more impacted than others, of course, and implementation of the models for routine use will customize developments for holidays as needed. For a subset of Categories, including specific holiday effects formally with more data is likely to be beneficial to revenue forecasts. This has been found to be the case internally by the grocery chain on separate data for which a longer period of time is available on some Categories and LSGs.
4.3 Exploration of Cross-Category Dependence
There are business interests in identifying whether discounts for one Category affect sales and hence revenue in other Categories. Identifying such relationships has potential to yield forecast accuracy improvements by including relevant cross-Category discount predictors in revenue models. Then, if higher discounts for Category A are associated with higher sales for Category B, the products within the Categories are potential complements and cross-Category promotion strategies may be of interest to management. A store or LSG could offer discounts in Category A to induce customers to also purchase products in Category B at lesser discounts. On the other hand, if higher discounts for Category A are associated with lower sales for Category B, then products within the two Categories are possible substitutes of each other, and discounts potentially “cannibalize” cross-Category sales. If some products in Category A are heavily discounted and sales within Category B decrease, then consumption has merely shifted and apparent sales lift in Category A is masking potentially store-level, or LSG-level, drops in revenue.
We identify potential pairs for cross-Category analysis by examining relationships between standardized forecast errors from the log revenue models. This is exemplified here using the 12-week multiscale revenue models incorporating multiscale discount, Net Price, and the primary discount variables as predictors. Post-forecasting exploration of 12-week ahead forecast errors is key as these realized errors are implicitly already free (modulo the assumed adequacy of the models) of the effects of Category-specific discounts and other effects that may generate spurious indications of cross-Category relationships. The latter include, for example, any patterns of local trend and/or seasonality that may be common to Category revenue and discount decisions; e.g. sales of hot chocolate increase in the winter while discounts on ice cream decrease. Further, we use realized errors standardized under their $k$-step ahead forecast distributions (here $k = 12$); this appropriately accounts for series-specific residual volatility over time prior to evaluating cross-Category correlations.
It is also important to examine consistency of any potential cross-Category relationships across Local Store Groups. Each LSG has, in theory, the ability to independently select discount strategies in any Category for stores in the LSG. If an observed cross-Category relationship is consistent across LSGs, then that Category pair is of more interest for further exploration.
We highlight some summaries of exploratory analysis using the top Category combinations. Figure 18 presents a heatmap of cross-Category correlations of forecast errors from the 12-week ahead multiscale revenue models with TPR% discount. For each pair of Categories, the correlation is that between realized forecast errors in the first Category and TPR% discount in the second, evaluated over the 52-week forecasting test period and averaged across the 9 LSGs. While many pairwise correlations are apparently negligible, interest lies in exploring specific example pairs where the correlation seems highest. First note, however, that the corresponding correlation heatmap based on raw revenue data rather than on the model-based forecast errors shows substantial numbers of much higher correlations (Supplementary Material, Figure 27). The naive analysis using raw revenue generates many apparently interesting but spurious suggestions of cross-Category relationships that disappear when evaluation uses forecast errors instead of revenue. A more important comparison is with the corresponding heatmap of correlations using 1-week rather than 12-week ahead forecast errors (Supplementary Material, Figure 28). Analysis at the longer forecast horizon shows evidence of some stronger correlations than using 1-week forecast errors. This is important since the 12-week horizon is most relevant for business decisions; at that horizon, forecasts are generally less accurate than the 1-week forecasts, so there is more room for improvement by incorporating cross-Category promotion strategies in the 12-week forecasting models.
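The correlation summary behind such a heatmap can be sketched as below. This is an illustrative sketch under stated assumptions, not the code used in the study: the array layout (LSG, week, Category) and the function name `average_cross_correlations` are hypothetical, and Pearson correlation is assumed.

```python
import numpy as np

def average_cross_correlations(errors, discounts):
    """Average cross-Category correlations across LSGs.

    errors, discounts: arrays of shape (n_lsg, n_weeks, n_categories),
    holding standardized forecast errors and TPR% discounts (assumed layout).
    Returns an (n_categories, n_categories) matrix whose [i, j] entry is the
    Pearson correlation between errors in Category i and discounts in
    Category j, computed within each LSG over weeks and then averaged.
    """
    errors = np.asarray(errors, dtype=float)
    discounts = np.asarray(discounts, dtype=float)
    # Center each series over the week axis, then scale to unit norm.
    e = errors - errors.mean(axis=1, keepdims=True)
    d = discounts - discounts.mean(axis=1, keepdims=True)
    with np.errstate(invalid="ignore", divide="ignore"):
        # A constant series (e.g. a never-discounted Category) yields NaN.
        e = e / np.sqrt((e ** 2).sum(axis=1, keepdims=True))
        d = d / np.sqrt((d ** 2).sum(axis=1, keepdims=True))
    # Sum over weeks t: per-LSG correlation matrices of shape (l, i, j).
    per_lsg = np.einsum("lti,ltj->lij", e, d)
    return per_lsg.mean(axis=0)
```

Keeping the per-LSG matrices before averaging also makes it straightforward to check whether the sign of a given pair's correlation is consistent across LSGs.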
We highlight one particular pair of Categories: Cold Cereal and NF (organic) Milk. Discounts for Cold Cereal have the largest correlation with NF Milk forecast errors across the examined Categories in two of the nine LSGs; the correlation is positive in every LSG and is the fourth largest when averaged across LSGs. Figure 19 shows 12-week forecast errors for NF Milk against Cold Cereal TPR% discount for each of the LSGs. Slightly positive, albeit rather weak and noisy, relationships are consistent with the view that the models tend to underpredict NF Milk revenues when Cold Cereal experiences higher discounts. The concordance across several LSGs is important in supporting the view that this is a systematic, potentially causal relationship. From a forecasting viewpoint, the potential for such a cross-Category association to be useful is explored by re-estimating the revenue model for NF Milk, now extended to also include the Cold Cereal TPR% discount as a predictor. That analysis was performed and confirmed point forecast improvements; the 12-week ahead MAPE metric averaged over the 52-week test period is reduced for 6 of the 9 LSGs and remains essentially unchanged for the other 3. We again stress that even very small improvements in this measure of forecast accuracy at the LSG level can be of real practical importance in informing planning, promotion and logistics, with meaningful business revenue impact.
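A comparison of this kind, between a baseline model and one extended with a cross-Category predictor, can be sketched as follows. This is a hedged illustration only: the function names `mape` and `count_improved_lsgs` are hypothetical, MAPE is assumed to be computed on strictly positive revenue values, and the forecasts are taken as given rather than produced by any particular model.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error over the last axis (weeks), in percent.

    Assumes strictly positive actual values (e.g. revenue).
    """
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual), axis=-1)

def count_improved_lsgs(actual, baseline_forecast, extended_forecast):
    """Compare per-LSG MAPE of two sets of point forecasts.

    Inputs have shape (n_lsg, n_weeks). Returns the two per-LSG MAPE
    vectors and the number of LSGs where the extended model is strictly better.
    """
    base = mape(actual, baseline_forecast)
    ext = mape(actual, extended_forecast)
    return base, ext, int(np.sum(ext < base))
```

In practice a tolerance would be needed to decide when a MAPE is "essentially unchanged"; the strict inequality here is a simplification.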
5 Summary Comments
Our case study of revenue modeling at the LSG-Category level for a large grocery chain has extended Bayesian multiscale dynamic modeling to enable integration of series-specific as well as cross-Category discount information in forecasting several hundred multivariate revenue time series. A few summaries here of the much broader analysis exhibit key practical aspects of these joint models for pricing and revenue, and examine features of cross-Category dependencies. Substantial heterogeneity across both LSGs and Categories offers opportunities for multiscale, aggregate information sharing to improve LSG-Category specific forecasts. The multiscale signal in this setting is the aggregate state vector information related to discounts for each Category across LSGs, and we find that this can improve multi-step ahead (12-week ahead) forecasts for about half of the main Categories of interest to the company. The baseline dynamic models should be maintained for the other Categories; they already define forecasting advances in relying on LSG-Category predictors generated from pricing and discount information, as well as benefiting from the inherent adaptability over time of Bayesian DLMs. For the LSG-Category cases that do benefit from the multiscale extension, forecast improvements are practically relevant, and some are quite large in terms of revenue implications.
There are several avenues for future development in applications and methodology. The company is involved in developing broader evaluation on more extensive data sets, including explicit integration of holiday effects in the DLMs. Additional exploration of other types of multiscale information, for example across groups of similar Categories, is one direction that raises potential for further improvements in forecast accuracy. In particular, alcohol (and other) Categories that are rarely discounted are likely to benefit from information sharing across multiple contextually related Categories, in addition to across LSGs. Additionally, measures of traffic, such as weekly transactions within an LSG containing items of a given Category, might be jointly modeled with Category revenue to improve forecast accuracy. Exploring cross-Category dependence is possible with this modeling approach and is of interest for the application, specifically to understand how discounts in one Category impact revenue in another Category. This ties into more formal “What-if?” decision analyses (also known as “scenario forecasting”) to explore, for example, how changes in pricing or promotions for specific Categories lead to changes in revenue in the same or other Categories. The potential for extending this line of thinking to a causal basis, involving real-time experimentation, is clearly an open and interesting area, though as yet not a direction addressed in public-domain R&D linked to this specific study.
We note that the model analyses presented and summarized can be developed by interested readers and potential users based on prototype code available in PyBats (Lavine and Cron, 2020).
References
Probabilistic forecasting of heterogeneous consumer transaction-sales time series. International Journal of Forecasting 36, pp. 552–569.
Bayesian forecasting of many count-valued time series. Journal of Business and Economic Statistics 38, pp. 872–887.
Bayesian dynamic modeling and forecasting of count time series. Ph.D. Thesis, Department of Statistical Science, Duke University.
A comparative study of linear and nonlinear models for aggregate retail sales forecasting. International Journal of Production Economics 86, pp. 217–231.
Lavine and Cron (2020). PyBats: a Python package for Bayesian Analysis of Time Series and Bayesian forecasting. https://pypi.org/project/pybats/
Comparison of multiple machine learning models based on enterprise revenue forecasting. In 2021 Asia-Pacific Conference on Communications Technology and Computer Science (ACCTCS), pp. 354–359.
Forecasting corporate revenue by using deep-learning methodologies. In 2019 International Conference on Control, Artificial Intelligence, Robotics Optimization (ICCAIRO), pp. 115–120.
Time series: modeling, computation & inference. 2nd edition, Chapman & Hall/CRC Press. ISBN 9781498747028.
Machine learning for revenue forecasting: a case study in retail business. In 2020 11th IEEE Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), pp. 201–207.
High-dimensional multivariate forecasting with low-rank Gaussian copula processes. In Advances in Neural Information Processing Systems, Vol. 32, pp. 6827–6837.
Think globally, act locally: a deep neural network approach to high-dimensional time series forecasting. In Advances in Neural Information Processing Systems, Vol. 32.
The history of forecasting models in revenue management. Journal of Revenue and Pricing Management 15, pp. 212–221.
Bayesian forecasting of multivariate time series: Scalability, structure uncertainty and decisions (with discussion). Annals of the Institute of Statistical Mathematics 72, pp. 1–44.
Bayesian forecasting and dynamic models. 2nd edition, Springer-Verlag, New York, Inc.
Hierarchical dynamic modeling for individualized Bayesian forecasting (submitted). arXiv: 2101.03408.
Additional Joint Pricing and Revenue Forecasting Summaries
Observed covariates and retail price information (panel (a)) for the Craft/Micro Beers example shown in Section 4.2. This Category is rarely discounted and there is some observed price drift over time. Net Price forecasts for this chosen LSG-Category pair are given in panel (b).
Additional Aspects of Cross-Category Dependence
Figures 27 and 28 display heatmaps of cross-Category correlations of actual revenue and of realized 1-week ahead revenue forecast errors, respectively, with TPR% discount. As in the case of 12-week ahead forecast errors (main paper Section 4.3 and Figure 18), these are computed over the 52-week forecasting test period and averaged across the 9 LSGs.