Yinlam Chow

research ∙ 02/21/2023

Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management

Reinforcement learning (RL) has shown great promise for developing dialo...

Dhawal Gupta, et al.

research ∙ 07/25/2022

Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning

Despite recent advances in natural language understanding and generation...

Deborah Cohen, et al.

research ∙ 05/31/2022

A Mixture-of-Expert Approach to RL-based Dialogue Management

Despite recent advancements in language models (LMs), their application ...

Yinlam Chow, et al.

research ∙ 05/10/2022

Efficient Risk-Averse Reinforcement Learning

In risk-averse reinforcement learning (RL), the goal is to optimize some...

Ido Greenberg, et al.

research ∙ 02/10/2022

SAFER: Data-Efficient and Safe Reinforcement Learning via Skill Acquisition

Though many reinforcement learning (RL) problems involve learning polici...

Dylan Slack, et al.

research ∙ 02/06/2022

Discovering Personalized Semantics for Soft Attributes in Recommender Systems using Concept Activation Vectors

Interactive recommender systems (RSs) allow users to express intent, pre...

Christina Göpfert, et al.

research ∙ 12/01/2020

Non-Stationary Latent Bandits

Users of recommender systems often behave in a non-stationary fashion, d...

Joey Hong, et al.

research ∙ 10/22/2020

CoinDICE: Off-Policy Confidence Interval Estimation

We study high-confidence behavior-agnostic off-policy evaluation in rein...

Bo Dai, et al.

research ∙ 10/11/2020

Safe Reinforcement Learning with Natural Language Constraints

In this paper, we tackle the problem of learning control policies for ta...

Tsung-Yen Yang, et al.

research ∙ 06/24/2020

Control-Aware Representations for Model-based Reinforcement Learning

A major challenge in modern reinforcement learning (RL) is efficient con...

Brandon Cui, et al.

research ∙ 06/15/2020

Latent Bandits Revisited

A latent bandit problem is one in which the learning agent knows the arm...

Joey Hong, et al.

research ∙ 06/15/2020

Piecewise-Stationary Off-Policy Optimization

Off-policy learning is a framework for evaluating and optimizing policie...

Joey Hong, et al.

research ∙ 06/09/2020

Variational Model-based Policy Optimization

Model-based reinforcement learning (RL) algorithms allow us to combine m...

Yinlam Chow, et al.

research ∙ 03/02/2020

Predictive Coding for Locally-Linear Control

High-dimensional observations and unknown dynamics are major challenges ...

Rui Shu, et al.

research ∙ 02/08/2020

BRPO: Batch Residual Policy Optimization

In batch reinforcement learning (RL), one often constrains a learned pol...

Sungryull Sohn, et al.

research ∙ 12/04/2019

AlgaeDICE: Policy Gradient from Arbitrary Experience

In many real-world applications of reinforcement learning (RL), interact...

Ofir Nachum, et al.

research ∙ 09/26/2019

CAQL: Continuous Action Q-Learning

Value-based reinforcement learning (RL) methods like Q-learning have sho...

Moonkyung Ryu, et al.

research ∙ 09/04/2019

Prediction, Consistency, Curvature: Representation Learning for Locally-Linear Control

Many real-world sequential decision-making problems can be formulated as...

Nir Levine, et al.

research ∙ 06/10/2019

DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections

In many real-world reinforcement learning applications, access to the en...

Ofir Nachum, et al.

research ∙ 01/28/2019

Lyapunov-based Safe Policy Optimization for Continuous Control

We study continuous action reinforcement learning problems in which it i...

Yinlam Chow, et al.

research ∙ 09/07/2018

A Block Coordinate Ascent Algorithm for Mean-Variance Optimization

Risk management in dynamic decision problems is a primary concern in man...

Bo Liu, et al.

research ∙ 08/13/2018

Risk-Sensitive Generative Adversarial Imitation Learning

We study risk-sensitive imitation learning where the agent's goal is to ...

Jonathan Lacotte, et al.

research ∙ 05/20/2018

A Lyapunov-based Approach to Safe Reinforcement Learning

In many real-world reinforcement learning (RL) problems, besides optimiz...

Yinlam Chow, et al.

research ∙ 02/10/2018

Path Consistency Learning in Tsallis Entropy Regularized MDPs

We study the sparse entropy-regularized reinforcement learning (ERL) pro...

Ofir Nachum, et al.

research ∙ 02/10/2018

More Robust Doubly Robust Off-policy Evaluation

We study the problem of off-policy evaluation (OPE) in reinforcement lea...

Mehrdad Farajtabar, et al.

research ∙ 07/13/2016

Safe Policy Improvement by Minimizing Robust Baseline Regret

An important problem in sequential decision-making under uncertainty is ...

Marek Petrik, et al.

research ∙ 12/05/2015

Risk-Constrained Reinforcement Learning with Percentile Risk Criteria

In many sequential decision-making problems one is interested in minimiz...

Yinlam Chow, et al.

research ∙ 09/29/2015

Two Phase Q-learning for Bidding-based Vehicle Sharing

We consider one-way vehicle sharing systems where customers can rent a c...

Yinlam Chow, et al.

research ∙ 06/06/2015

Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach

In this paper we address the problem of decision making within a Markov ...

Yinlam Chow, et al.

research ∙ 02/13/2015

Policy Gradient for Coherent Risk Measures

Several authors have recently developed risk-sensitive policy gradient m...

Aviv Tamar, et al.

research ∙ 06/12/2014

Algorithms for CVaR Optimization in MDPs

In many sequential decision-making problems we may want to manage risk b...

Yinlam Chow, et al.
