EXP4-DFDC: A Non-Stochastic Multi-Armed Bandit for Cache Replacement

09/23/2020
by   Farzana Beente Yusuf, et al.

In this work we study a variant of the well-known multi-armed bandit (MAB) problem in which feedback is delayed and the loss declines over time. We introduce an algorithm, EXP4-DFDC, to solve this MAB variant and demonstrate that its regret vanishes as time increases. We also show that LeCaR, a previously published machine learning-based cache replacement algorithm, is an instance of EXP4-DFDC. Our results can be used to provide insight into the choice of hyperparameters and to optimize future LeCaR instances.
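
To make the two properties named above concrete, here is a minimal sketch of a generic exponential-weights update in which the loss arrives after a delay and shrinks with that delay. It is not the EXP4-DFDC algorithm from the paper, whose details are not given in this abstract. The function name `update_weights`, its parameters, and the default values for `learning_rate` and `discount` are all assumptions chosen for illustration.

```python
import math


def update_weights(weights, penalized, delay, learning_rate=0.45, discount=0.9):
    """Illustrative exponential-weights step with a delayed, decaying loss.

    weights       -- current expert weights (assumed to sum to 1)
    penalized     -- index of the expert whose earlier advice is being penalized
    delay         -- steps elapsed between that advice and the feedback
    learning_rate -- hypothetical step size, not taken from the paper
    discount      -- hypothetical decay factor in (0, 1), not taken from the paper
    """
    # The loss declines geometrically with the feedback delay.
    loss = discount ** delay

    # Multiplicative update: only the penalized expert is scaled down.
    updated = list(weights)
    updated[penalized] *= math.exp(-learning_rate * loss)

    # Renormalize so the weights remain a probability distribution.
    total = sum(updated)
    return [w / total for w in updated]
```

In a cache-replacement setting the experts could be, for example, two eviction policies, and the delay could be the time between an eviction and the later miss on the evicted item. Under this sketch, a mistake that is discovered late costs the responsible expert less than one discovered quickly.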
