Recommendation strategies are typically evaluated by using previously lo...
Coagent networks for reinforcement learning (RL) [Thomas and Barto, 2011...
Explanation is a key component for the adoption of reinforcement learnin...
Cognitive biases are mental shortcuts humans use in dealing with informa...
Smoothed online combinatorial optimization considers a learner who repea...
Off-policy evaluation methods are important in recommendation systems an...
Online reinforcement learning (RL) algorithms are often difficult to dep...
Most reinforcement learning (RL) recommendation systems designed for edg...
Many real-world applications require aligning two temporal sequences, in...
Many real-world sequential decision-making problems involve critical sys...
Strategic recommendations (SR) refer to the problem where an intelligent...
Most reinforcement learning methods are based upon the key assumption th...
The Markov decision process (MDP) formulation used to model many real-wo...
In many real-world sequential decision-making problems, the number of av...
Most model-free reinforcement learning methods leverage state representa...
Posterior sampling for reinforcement learning (PSRL) is a popular algori...
Structured high-cardinality data arises in many domains, and poses a maj...