Yonathan Efroni

research

∙ 10/31/2022

Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information

Learning to control an agent from data collected offline in a rich pixel...

0 Riashat Islam, et al. ∙

research

∙ 10/05/2022

Tractable Optimality in Episodic Latent MABs

We consider a multi-armed bandit problem with M latent contexts, where a...

0 Jeongyeol Kwon, et al. ∙

research

∙ 10/05/2022

Reward-Mixing MDPs with a Few Latent Contexts are Learnable

We consider episodic reinforcement learning in reward-mixing Markov deci...

0 Jeongyeol Kwon, et al. ∙

research

∙ 07/17/2022

Guaranteed Discovery of Controllable Latent States with Multi-Step Inverse Models

A person walking along a city street who tries to model all aspects of t...

17 Alex Lamb, et al. ∙

research

∙ 06/09/2022

Sample-Efficient Reinforcement Learning in the Presence of Exogenous Information

In real-world reinforcement learning applications the learner's observat...

22 Yonathan Efroni, et al. ∙

research

∙ 02/08/2022

Provable Reinforcement Learning with a Short-Term Memory

Real-world sequential decision making problems commonly involve partial ...

0 Yonathan Efroni, et al. ∙

research

∙ 01/30/2022

Coordinated Attacks against Contextual Bandits: Fundamental Limits and Defense Mechanisms

Motivated by online recommendation systems, we propose the problem of fi...

0 Jeongyeol Kwon, et al. ∙

research

∙ 10/17/2021

Provable RL with Exogenous Distractors via Multistep Inverse Dynamics

Many real-world applications of reinforcement learning (RL) require the ...

4 Yonathan Efroni, et al. ∙

research

∙ 10/12/2021

Sparsity in Partially Controllable Linear Systems

A fundamental concept in control theory is that of controllability, wher...

0 Yonathan Efroni, et al. ∙

research

∙ 10/12/2021

Dare not to Ask: Problem-Dependent Guarantees for Budgeted Bandits

We consider a stochastic multi-armed bandit setting where feedback is li...

0 Nadav Merlis, et al. ∙

research

∙ 10/07/2021

Reinforcement Learning in Reward-Mixing MDPs

Learning a near optimal policy in a partially observable system remains ...

0 Jeongyeol Kwon, et al. ∙

research

∙ 03/24/2021

Minimax Regret for Stochastic Shortest Path

We study the Stochastic Shortest Path (SSP) problem in which an agent ha...

15 Alon Cohen, et al. ∙

research

∙ 02/09/2021

RL for Latent MDPs: Regret Guarantees and a Lower Bound

In this work, we consider the regret minimization problem for reinforcem...

0 Jeongyeol Kwon, et al. ∙

research

∙ 08/13/2020

Reinforcement Learning with Trajectory Feedback

The computational model of reinforcement learning is based upon the abil...

46 Yonathan Efroni, et al. ∙

research

∙ 06/11/2020

Bandits with Partially Observable Offline Data

We study linear contextual bandits with access to a large, partially obs...

0 Guy Tennenholtz, et al. ∙

research

∙ 05/20/2020

Mirror Descent Policy Optimization

We propose deep Reinforcement Learning (RL) algorithms inspired by mirro...

0 Manan Tomar, et al. ∙

research

∙ 03/04/2020

Exploration-Exploitation in Constrained MDPs

In many sequential decision-making problems, the goal is to optimize a u...

0 Yonathan Efroni, et al. ∙

research

∙ 02/19/2020

Optimistic Policy Optimization with Bandit Feedback

Policy optimization methods are one of the most widely used classes of R...

0 Yonathan Efroni, et al. ∙

research

∙ 10/07/2019

Multi-step Greedy Policies in Model-Free Deep Reinforcement Learning

Multi-step greedy policies have been extensively used in model-based Rei...

0 Manan Tomar, et al. ∙

research

∙ 09/10/2019

Multi-Step Greedy and Approximate Real Time Dynamic Programming

Real Time Dynamic Programming (RTDP) is a well-known Dynamic Programming...

0 Yonathan Efroni, et al. ∙

research

∙ 09/06/2019

Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs

Trust region policy optimization (TRPO) is a popular and empirically suc...

0 Lior Shani, et al. ∙

research

∙ 05/27/2019

Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies

State-of-the-art efficient model-based Reinforcement Learning (RL) algor...

0 Yonathan Efroni, et al. ∙

research

∙ 01/26/2019

Action Robust Reinforcement Learning and Applications in Continuous Control

A policy is said to be robust if it maximizes the reward while consideri...

0 Chen Tessler, et al. ∙

research

∙ 12/13/2018

Revisiting Exploration-Conscious Reinforcement Learning

The objective of Reinforcement Learning is to learn an optimal policy by...

0 Lior Shani, et al. ∙

research

∙ 09/06/2018

How to Combine Tree-Search Methods in Reinforcement Learning

Finite-horizon lookahead policies are abundantly used in Reinforcement L...

0 Yonathan Efroni, et al. ∙

research

∙ 05/21/2018

Multiple-Step Greedy Policies in Online and Approximate Reinforcement Learning

Multiple-step lookahead policies have demonstrated high empirical compet...

0 Yonathan Efroni, et al. ∙

research

∙ 02/10/2018

Beyond the One Step Greedy Approach in Reinforcement Learning

The famous Policy Iteration algorithm alternates between policy improvem...

0 Yonathan Efroni, et al. ∙
