We consider estimation of parameters defined as linear functionals of so...
Ranking interfaces are everywhere in online platforms. There is thus an ...
Reinforcement Learning with Human Feedback (RLHF) is a paradigm in which...
In this paper, we investigate the problem of offline reinforcement learn...
We study the problem of estimating the distribution of the return of a p...
In this paper, we study nonparametric estimation of instrumental variabl...
In offline reinforcement learning (RL) we have no opportunity to explore...
Reinforcement learning (RL) is one of the most vibrant research frontier...
We study off-policy evaluation (OPE) for partially observable MDPs (POMD...
In this paper we study online Reinforcement Learning (RL) in partially o...
We study reinforcement learning with function approximation for large-sc...
We study Reinforcement Learning for partially observable dynamical syste...
We present BRIEE (Block-structured Representation learning with Interlea...
We consider the fixed-budget best arm identification problem in the mult...
We consider off-policy evaluation (OPE) in Partially Observable Markov D...
This work studies the question of Representation Learning in RL: how can...
We study model-based offline Reinforcement Learning with general functio...
This paper studies offline Imitation Learning (IL) where an agent learns...
We study the estimation of causal parameters when not all confounders ar...
We offer a theoretical characterization of off-policy evaluation (OPE) i...
We study the regret of reinforcement learning from offline data generate...
We study off-policy evaluation (OPE) from multiple logging policies, eac...
Offline reinforcement learning, wherein one uses off-policy data logged ...
We study the efficient off-policy evaluation of natural stochastic polic...
We consider the evaluation and training of a new policy for the evaluati...
Policy gradient methods in reinforcement learning update policy paramete...
We consider the efficient estimation of a low-dimensional parameter in t...
We provide theoretical investigations into off-policy evaluation in rein...
Off-policy evaluation (OPE) in reinforcement learning is notoriously dif...
Off-policy evaluation (OPE) in reinforcement learning allows one to eval...
Off-policy evaluation (OPE) in both contextual bandits and reinforcement...
Many statistical models are given in the form of non-normalized densitie...
We propose estimation methods for unnormalized models with missing data....
Parameter estimation of unnormalized models is a challenging problem bec...
Parameter estimation of unnormalized models is a challenging problem bec...
How to deal with nonignorable response is often a challenging problem en...
There are many models, often called unnormalized models, whose normalizi...
Generative adversarial networks (GANs) are successful deep generative mo...