Nathan Kallus

research

∙ 08/19/2023

Large Language Models as Zero-Shot Conversational Recommenders

In this paper, we present empirical studies on conversational recommenda...

0 Zhankui He, et al. ∙

research

∙ 07/25/2023

Source Condition Double Robust Inference on Functionals of Inverse Problems

We consider estimation of parameters defined as linear functionals of so...

0 Andrew Bennett, et al. ∙

research

∙ 07/21/2023

JoinGym: An Efficient Query Optimization Environment for Reinforcement Learning

In this paper, we present JoinGym, an efficient and lightweight query op...

0 Kaiwen Wang, et al. ∙

research

∙ 05/25/2023

The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning

While distributional reinforcement learning (RL) has demonstrated empiri...

0 Kaiwen Wang, et al. ∙

research

∙ 05/24/2023

Provable Offline Reinforcement Learning with Human Feedback

In this paper, we investigate the problem of offline reinforcement learn...

0 Wenhao Zhan, et al. ∙

research

∙ 04/20/2023

B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding

Estimating heterogeneous treatment effects from observational data is a ...

0 Miruna Oprescu, et al. ∙

research

∙ 02/10/2023

Minimax Instrumental Variable Regression and L_2 Convergence Guarantees without Identification or Closedness

In this paper, we study nonparametric estimation of instrumental variabl...

0 Andrew Bennett, et al. ∙

research

∙ 02/07/2023

Near-Minimax-Optimal Risk-Sensitive Reinforcement Learning with CVaR

In this paper, we study risk-sensitive Reinforcement Learning (RL), focu...

0 Kaiwen Wang, et al. ∙

research

∙ 02/05/2023

Refined Value-Based Offline RL under Realizability and Partial Coverage

In offline reinforcement learning (RL) we have no opportunity to explore...

0 Masatoshi Uehara, et al. ∙

research

∙ 01/29/2023

Smooth Non-Stationary Bandits

In many applications of online decision making, the environment is non-s...

0 Su Jia, et al. ∙

research

∙ 12/29/2022

Near-Optimal Non-Parametric Sequential Tests and Confidence Sequences with Possibly Dependent Observations

Sequential testing, always-valid p-values, and confidence sequences prom...

0 Aurélien Bibaut, et al. ∙

research

∙ 12/13/2022

A Review of Off-Policy Evaluation in Reinforcement Learning

Reinforcement learning (RL) is one of the most vibrant research frontier...

0 Masatoshi Uehara, et al. ∙

research

∙ 11/11/2022

The Implicit Delta Method

Epistemic uncertainty quantification is a crucial part of drawing credib...

0 Nathan Kallus, et al. ∙

research

∙ 10/26/2022

Provable Safe Reinforcement Learning with Binary Feedback

Safety is a crucial necessity in many applications of reinforcement lear...

0 Andrew Bennett, et al. ∙

research

∙ 08/17/2022

Debiased Inference on Identified Linear Functionals of Underidentified Nuisances via Penalized Minimax Estimation

We study generic inference on identified linear functionals of nonunique...

0 Nathan Kallus, et al. ∙

research

∙ 07/26/2022

Future-Dependent Value-Based Off-Policy Evaluation in POMDPs

We study off-policy evaluation (OPE) for partially observable MDPs (POMD...

3 Masatoshi Uehara, et al. ∙

research

∙ 07/12/2022

Learning Bellman Complete Representations for Offline Policy Evaluation

We study representation learning for Offline Reinforcement Learning (RL)...

6 Jonathan D. Chang, et al. ∙

research

∙ 06/24/2022

Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings

We study reinforcement learning with function approximation for large-sc...

6 Masatoshi Uehara, et al. ∙

research

∙ 06/24/2022

Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems

We study Reinforcement Learning for partially observable dynamical syste...

26 Masatoshi Uehara, et al. ∙

research

∙ 05/23/2022

Robust and Agnostic Learning of Conditional Distributional Treatment Effects

The conditional average treatment effect (CATE) is the best point predic...

0 Nathan Kallus, et al. ∙

research

∙ 05/20/2022

What's the Harm? Sharp Bounds on the Fraction Negatively Affected by Treatment

The fundamental problem of causal inference – that we never observe coun...

0 Nathan Kallus, et al. ∙

research

∙ 04/13/2022

Estimating Structural Disparities for Face Models

In machine learning, disparity metrics are often defined by measuring th...

8 Shervin Ardeshir, et al. ∙

research

∙ 02/19/2022

Doubly Robust Distributionally Robust Off-Policy Evaluation and Learning

Off-policy evaluation and learning (OPE/L) use offline observational dat...

0 Nathan Kallus, et al. ∙

research

∙ 02/15/2022

Long-term Causal Inference Under Persistent Confounding via Data Combination

We study the identification and estimation of long-term treatment effect...

0 Guido Imbens, et al. ∙

research

∙ 01/15/2022

Treatment Effect Risk: Bounds and Inference

Since the average treatment effect (ATE) measures the change in social w...

0 Nathan Kallus, et al. ∙

research

∙ 12/21/2021

Doubly-Valid/Doubly-Sharp Sensitivity Analysis for Causal Inference with Unmeasured Confounding

We study the problem of constructing bounds on the average treatment eff...

0 Jacob Dorn, et al. ∙

research

∙ 11/16/2021

An Empirical Evaluation of the Impact of New York's Bail Reform on Crime Using Synthetic Controls

We conduct an empirical evaluation of the impact of New York's bail refo...

0 Angela Zhou, et al. ∙

research

∙ 10/28/2021

Proximal Reinforcement Learning: Efficient Off-Policy Evaluation in Partially Observed Markov Decision Processes

In applications of offline reinforcement learning to observational data,...

10 Andrew Bennett, et al. ∙

research

∙ 10/19/2021

Stateful Offline Contextual Policy Evaluation and Learning

We study off-policy evaluation and learning from sequential data in a st...

0 Nathan Kallus, et al. ∙

research

∙ 10/06/2021

Residual Overfit Method of Exploration

Exploration is a crucial aspect of bandit and reinforcement learning alg...

0 James McInerney, et al. ∙

research

∙ 08/09/2021

Controlling for Unmeasured Confounding in Panel Data Using Minimal Bridge Functions: From Two-Way Fixed Effects to Factor Models

We develop a new approach for identifying and estimating average causal ...

0 Guido Imbens, et al. ∙

research

∙ 06/15/2021

Control Variates for Slate Off-Policy Evaluation

We study the problem of off-policy evaluation from batched contextual ba...

0 Nikos Vlassis, et al. ∙

research

∙ 06/03/2021

Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning

Empirical risk minimization (ERM) is the workhorse of machine learning, ...

7 Aurélien Bibaut, et al. ∙

research

∙ 06/01/2021

Post-Contextual-Bandit Inference

Contextual bandit algorithms are increasingly replacing non-adaptive A/B...

1 Aurélien Bibaut, et al. ∙

research

∙ 03/25/2021

Causal Inference Under Unmeasured Confounding With Negative Controls: A Minimax Learning Approach

We study the estimation of causal parameters when not all confounders ar...

0 Nathan Kallus, et al. ∙

research

∙ 02/05/2021

Finite Sample Analysis of Minimax Offline Reinforcement Learning: Completeness, Fast Rates and First-Order Efficiency

We offer a theoretical characterization of off-policy evaluation (OPE) i...

2 Masatoshi Uehara, et al. ∙

research

∙ 01/31/2021

Fast Rates for the Regret of Offline Reinforcement Learning

We study the regret of reinforcement learning from offline data generate...

15 Yichun Hu, et al. ∙

research

∙ 12/21/2020

Fairness, Welfare, and Equity in Personalized Pricing

We study the interplay of fairness, welfare, and equity considerations i...

0 Nathan Kallus, et al. ∙

research

∙ 12/17/2020

The Variational Method of Moments

The conditional moment problem is a powerful formulation for describing ...

7 Andrew Bennett, et al. ∙

research

∙ 12/05/2020

Rejoinder: New Objectives for Policy Learning

I provide a rejoinder for discussion of "More Efficient Policy Learning ...

0 Nathan Kallus, et al. ∙

research

∙ 11/05/2020

Fast Rates for Contextual Linear Optimization

Incorporating side observations of predictive features can help reduce u...

1 Yichun Hu, et al. ∙

research

∙ 10/21/2020

Optimal Off-Policy Evaluation from Multiple Logging Policies

We study off-policy evaluation (OPE) from multiple logging policies, eac...

0 Nathan Kallus, et al. ∙

research

∙ 08/17/2020

Stochastic Optimization Forests

We study conditional stochastic optimization problems, where we leverage...

21 Nathan Kallus, et al. ∙

research

∙ 07/27/2020

Off-policy Evaluation in Infinite-Horizon Reinforcement Learning with Latent Confounders

Off-policy evaluation (OPE) in reinforcement learning is an important pr...

5 Andrew Bennett, et al. ∙

research

∙ 06/06/2020

Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies

Offline reinforcement learning, wherein one uses off-policy data logged ...

15 Nathan Kallus, et al. ∙

research

∙ 06/06/2020

Efficient Evaluation of Natural Stochastic Policies in Offline Reinforcement Learning

We study the efficient off-policy evaluation of natural stochastic polic...

4 Nathan Kallus, et al. ∙

research

∙ 05/06/2020

On the Optimality of Randomization in Experimental Design: How to Randomize for Minimax Variance and Design-Based Inference

I study the minimax-optimal design for a two-arm controlled experiment w...

4 Nathan Kallus, et al. ∙

research

∙ 05/06/2020

DTR Bandit: Learning to Make Response-Adaptive Decisions With Low Regret

Dynamic treatment regimes (DTRs) are personalized, sequential treatm...

5 Yichun Hu, et al. ∙

research

∙ 04/06/2020

Comment: Entropy Learning for Dynamic Treatment Regimes

I congratulate Profs. Binyan Jiang, Rui Song, Jialiang Li, and Donglin Z...

3 Nathan Kallus, et al. ∙

research

∙ 03/27/2020

On the role of surrogates in the efficient estimation of treatment effects with limited outcome data

We study the problem of estimating treatment effects when the outcome of...

2 Nathan Kallus, et al. ∙
