Yash Chandak

research

∙ 06/26/2023

Supervised Pretraining Can Learn In-Context Reinforcement Learning

Large transformer models trained on diverse datasets have shown a remark...

Jonathan N. Lee, et al.

research

∙ 05/16/2023

Coagent Networks: Generalized and Scaled

Coagent networks for reinforcement learning (RL) [Thomas and Barto, 2011...

James E. Kostas, et al.

research

∙ 05/01/2023

Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition

Representation learning and exploration are among the key challenges for...

Yash Chandak, et al.

research

∙ 02/23/2023

Asymptotically Unbiased Off-Policy Policy Evaluation when Reusing Old Data in Nonstationary Environments

In this work, we consider the off-policy policy evaluation problem for c...

Vincent Liu, et al.

research

∙ 02/06/2023

Optimization using Parallel Gradient Evaluations on Multiple Parameters

We propose a first-order method for convex optimization, where instead o...

Yash Chandak, et al.

research

∙ 01/24/2023

Off-Policy Evaluation for Action-Dependent Non-Stationary Environments

Methods for sequential decision-making are often built upon a foundation...

Yash Chandak, et al.

research

∙ 12/06/2022

Understanding Self-Predictive Learning for Reinforcement Learning

We study the learning dynamics of self-predictive learning for reinforce...

Yunhao Tang, et al.

research

∙ 12/16/2021

On Optimizing Interventions in Shared Autonomy

Shared autonomy refers to approaches for enabling an autonomous agent to...

Weihao Tan, et al.

research

∙ 11/06/2021

SOPE: Spectrum of Off-Policy Estimators

Many sequential decision making problems are high-stakes and require off...

Christina J. Yuan, et al.

research

∙ 04/26/2021

Universal Off-Policy Evaluation

When faced with sequential decision-making problems, it is often useful ...

Yash Chandak, et al.

research

∙ 01/25/2021

High-Confidence Off-Policy (or Counterfactual) Variance Estimation

Many sequential decision-making systems leverage data collected using pr...

Yash Chandak, et al.

research

∙ 10/23/2020

Towards Safe Policy Improvement for Non-Stationary MDPs

Many real-world sequential decision-making problems involve critical sys...

Yash Chandak, et al.

research

∙ 09/15/2020

Reinforcement Learning for Strategic Recommendations

Strategic recommendations (SR) refer to the problem where an intelligent...

Georgios Theocharous, et al.

research

∙ 06/30/2020

Evaluating the Performance of Reinforcement Learning Algorithms

Performance evaluations are critical for quantifying algorithmic advance...

Scott M. Jordan, et al.

research

∙ 05/17/2020

Optimizing for the Future in Non-Stationary MDPs

Most reinforcement learning methods are based upon the key assumption th...

Yash Chandak, et al.

research

∙ 06/06/2019

Classical Policy Gradient: Preserving Bellman's Principle of Optimality

We propose a new objective function for finite-horizon episodic Markov d...

Philip S. Thomas, et al.

research

∙ 06/05/2019

Reinforcement Learning When All Actions are Not Always Available

The Markov decision process (MDP) formulation used to model many real-wo...

Yash Chandak, et al.

research

∙ 06/05/2019

Lifelong Learning with a Changing Action Set

In many real-world sequential decision making problems, the number of av...

Yash Chandak, et al.

research

∙ 02/01/2019

Learning Action Representations for Reinforcement Learning

Most model-free reinforcement learning methods leverage state representa...

Yash Chandak, et al.

research

∙ 05/31/2018

Fusion Graph Convolutional Networks

Semi-supervised node classification involves learning to classify unlabe...

Priyesh Vijayan, et al.

research

∙ 05/31/2018

HOPF: Higher Order Propagation Framework for Deep Collective Classification

Given a graph wherein every node has certain attributes associated with ...

Priyesh Vijayan, et al.
