Modern recommender systems lie at the heart of complex ecosystems that c...
Algorithms for offline bandits must optimize decisions in uncertain
envi...
A prominent challenge of offline reinforcement learning (RL) is the issu...
We present a representation-driven framework for reinforcement learning....
While popularity bias is recognized to play a role in recommender (and ...
We introduce Dynamic Contextual Markov Decision Processes (DCMDPs), a no...
We present the problem of reinforcement learning with exogenous terminat...
We consider the problem of using expert data with unobserved confounders...
Cooperative multi-agent reinforcement learning (MARL) faces significant
...
Mixture models are an expressive hypothesis class that can approximate a...
Maximum Entropy (MaxEnt) reinforcement learning is a powerful learning
p...
Offline reinforcement learning approaches can generally be divided into
pr...
We identify a fundamental phenomenon of heterogeneous one-dimensional ra...
We study linear contextual bandits with access to a large, partially
obs...
Recent advances in Reinforcement Learning have highlighted the difficult...
This work studies the problem of batch off-policy evaluation for
Reinfor...
We identify a fundamental problem in policy gradient-based methods in
co...
We introduce Act2Vec, a general framework for learning context-based act...
Model selection on validation data is an essential step in machine learn...
The dynamics of infectious disease spread are crucial in determining the...