Doubly High-Dimensional Contextual Bandits: An Interpretable Model for Joint Assortment-Pricing

by Junhui Cai, et al.

Key challenges in running a retail business include how to select products to present to consumers (the assortment problem), and how to price products (the pricing problem) to maximize revenue or profit. Instead of considering these problems in isolation, we propose a joint approach to assortment-pricing based on contextual bandits. Our model is doubly high-dimensional, in that both context vectors and actions are allowed to take values in high-dimensional spaces. In order to circumvent the curse of dimensionality, we propose a simple yet flexible model that captures the interactions between covariates and actions via a (near) low-rank representation matrix. The resulting class of models is reasonably expressive while remaining interpretable through latent factors, and includes various structured linear bandit and pricing models as particular cases. We propose a computationally tractable procedure that combines an exploration/exploitation protocol with an efficient low-rank matrix estimator, and we prove bounds on its regret. Simulation results show that this method has lower regret than state-of-the-art methods applied to various standard bandit and pricing models. Real-world case studies on the assortment-pricing problem, from an industry-leading instant noodles company to an emerging beauty start-up, underscore the gains achievable using our method. In each case, we show at least three-fold gains in revenue or profit by our bandit method, as well as the interpretability of the latent factor models that are learned.
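To make the low-rank idea concrete, here is a minimal sketch in Python with NumPy. It is not the authors' procedure: it assumes a bilinear reward `x @ theta @ a` with an exactly low-rank `theta`, isotropic Gaussian exploration, a simple moment estimator followed by a truncated SVD, and an explore-then-commit schedule over unit-norm actions. All function names, dimensions, and parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def make_low_rank(d_x, d_a, rank, rng):
    """Hypothetical ground-truth interaction matrix of the given rank."""
    U = rng.normal(size=(d_x, rank))
    V = rng.normal(size=(d_a, rank))
    return U @ V.T / np.sqrt(rank)

def truncated_svd(M, rank):
    """Best rank-`rank` approximation of M in Frobenius norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

def estimate_low_rank(theta, n_explore, rank, noise, rng):
    """Exploration phase (assumed design): draw standard Gaussian contexts
    and actions, observe noisy bilinear rewards, and form a moment estimate.
    With independent standard Gaussian x and a, E[r * x a^T] = theta, so the
    sample average is unbiased; truncating it to the target rank removes
    most of the estimation noise."""
    d_x, d_a = theta.shape
    X = rng.normal(size=(n_explore, d_x))
    A = rng.normal(size=(n_explore, d_a))
    r = np.einsum("ni,ij,nj->n", X, theta, A) + noise * rng.normal(size=n_explore)
    moment = (X * r[:, None]).T @ A / n_explore
    return truncated_svd(moment, rank)

def exploit_regret(theta, theta_hat, n_rounds, rng):
    """Commit phase: for each new context pick the unit-norm action that is
    greedy under theta_hat, and compare with the best unit-norm action under
    the true theta. Returns per-round regret and the per-round optimal value."""
    X = rng.normal(size=(n_rounds, theta.shape[0]))
    G_hat = X @ theta_hat                      # estimated reward gradients
    actions = G_hat / np.linalg.norm(G_hat, axis=1, keepdims=True)
    G_true = X @ theta
    reward = np.sum(G_true * actions, axis=1)  # noiseless expected reward
    best = np.linalg.norm(G_true, axis=1)      # value of the optimal action
    return best - reward, best

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = make_low_rank(d_x=20, d_a=15, rank=2, rng=rng)
    theta_hat = estimate_low_rank(theta, n_explore=5000, rank=2, noise=0.1, rng=rng)
    regret, best = exploit_regret(theta, theta_hat, n_rounds=200, rng=rng)
    rel_err = np.linalg.norm(theta_hat - theta) / np.linalg.norm(theta)
    print("relative estimation error:", rel_err)
    print("mean regret / mean optimal value:", regret.mean() / best.mean())
```

The rank truncation is what keeps the sketch workable when both dimensions are large: the number of effective parameters scales with `rank * (d_x + d_a)` rather than `d_x * d_a`. The left and right singular vectors of `theta_hat` play the role of latent factors for contexts and actions, respectively.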




Low-rank Bandit Methods for High-dimensional Dynamic Pricing

We consider high dimensional dynamic multi-product pricing with an evolv...

A Simple Unified Framework for High Dimensional Bandit Problems

Stochastic high dimensional bandit problems with low dimensional structu...

High Dimensional Latent Panel Quantile Regression with an Application to Asset Pricing

We propose a generalization of the linear panel quantile regression mode...

On High-dimensional and Low-rank Tensor Bandits

Most existing studies on linear bandits focus on the one-dimensional cha...

Multi-Agent Dynamic Pricing in a Blockchain Protocol Using Gaussian Bandits

The Graph Protocol indexes historical blockchain transaction data and ma...

Dynamic Assortment Personalization in High Dimensions

We study the problem of dynamic assortment personalization with large, h...

Online Action Learning in High Dimensions: A New Exploration Rule for Contextual ε_t-Greedy Heuristics

Bandit problems are pervasive in various fields of research and are also...
