Perishability of Data: Dynamic Pricing under Varying-Coefficient Models
We consider a firm that sells a large number of products to its customers in an online fashion. Each product is described by a high dimensional feature vector, and the market value of a product is assumed to be linear in the values of its features. Parameters of the valuation model are unknown and can change over time. The firm sequentially observes a product's features and can use the historical sales data (binary sale/no sale feedbacks) to set the price of current product, with the objective of maximizing the collected revenue. We measure the performance of a dynamic pricing policy via regret, which is the expected revenue loss compared to a clairvoyant that knows the sequence of model parameters in advance. We propose a pricing policy based on projected stochastic gradient descent (PSGD) and characterize its regret in terms of time T, features dimension d, and the temporal variability in the model parameters, δ_t. We consider two settings. In the first one, feature vectors are chosen antagonistically by nature and we prove that the regret of PSGD pricing policy is of order O(√(T) + ∑_t=1^T √(t)δ_t). In the second setting (referred to as stochastic features model), the feature vectors are drawn independently from an unknown distribution. We show that in this case, the regret of PSGD pricing policy is of order O(d^2 T + ∑_t=1^T tδ_t/d).
READ FULL TEXT 
  
  
     share
 share