
Universal Off-Policy Evaluation
When faced with sequential decision-making problems, it is often useful ...
High-Confidence Off-Policy (or Counterfactual) Variance Estimation
Many sequential decision-making systems leverage data collected using pr...
Towards Safe Policy Improvement for Non-Stationary MDPs
Many real-world sequential decision-making problems involve critical sys...
Reinforcement Learning for Strategic Recommendations
Strategic recommendations (SR) refer to the problem where an intelligent...
Evaluating the Performance of Reinforcement Learning Algorithms
Performance evaluations are critical for quantifying algorithmic advance...
Optimizing for the Future in Non-Stationary MDPs
Most reinforcement learning methods are based upon the key assumption th...
Classical Policy Gradient: Preserving Bellman's Principle of Optimality
We propose a new objective function for finite-horizon episodic Markov d...
Reinforcement Learning When All Actions are Not Always Available
The Markov decision process (MDP) formulation used to model many real-wo...
Lifelong Learning with a Changing Action Set
In many real-world sequential decision making problems, the number of av...
Learning Action Representations for Reinforcement Learning
Most model-free reinforcement learning methods leverage state representa...
Fusion Graph Convolutional Networks
Semi-supervised node classification involves learning to classify unlabe...
HOPF: Higher Order Propagation Framework for Deep Collective Classification
Given a graph wherein every node has certain attributes associated with ...
