Shie Mannor

research

∙ 09/03/2023

Solving Non-Rectangular Reward-Robust MDPs via Frequency Regularization

In robust Markov decision processes (RMDPs), it is assumed that the rewa...

0 Uri Gadot, et al. ∙

research

∙ 07/25/2023

Implicitly Normalized Explicitly Regularized Density Estimation

We propose a new approach to non-parametric density estimation, that is ...

0 Mark Kozdoba, et al. ∙

research

∙ 06/24/2023

Individualized Dosing Dynamics via Neural Eigen Decomposition

Dosing models often use differential equations to model biological dynam...

0 Stav Belogolovsky, et al. ∙

research

∙ 06/09/2023

Robust Reinforcement Learning via Adversarial Kernel Approximation

Robust Markov Decision Processes (RMDPs) provide a framework for sequent...

0 Kaixin Wang, et al. ∙

research

∙ 05/31/2023

Representation-Driven Reinforcement Learning

We present a representation-driven framework for reinforcement learning....

0 Ofir Nabati, et al. ∙

research

∙ 05/02/2023

CALM: Conditional Adversarial Latent Models for Directable Virtual Characters

In this work, we present Conditional Adversarial Latent Models (CALM), a...

0 Chen Tessler, et al. ∙

research

∙ 03/12/2023

Twice Regularized Markov Decision Processes: The Equivalence between Robustness and Regularization

Robust Markov decision processes (MDPs) aim to handle changing or partia...

0 Esther Derman, et al. ∙

research

∙ 01/31/2023

An Efficient Solution to s-Rectangular Robust Markov Decision Processes

We present an efficient robust value iteration for s-rectangular robust M...

0 Navdeep Kumar, et al. ∙

research

∙ 01/31/2023

Policy Gradient for s-Rectangular Robust Markov Decision Processes

We present a novel robust policy gradient method (RPG) for s-rectangular...

0 Navdeep Kumar, et al. ∙

research

∙ 01/30/2023

SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree Search

Despite the popularity of policy gradient methods, they are known to suf...

0 Gal Dalal, et al. ∙

research

∙ 01/26/2023

Train Hard, Fight Easy: Robust Meta Reinforcement Learning

A major challenge of reinforcement learning (RL) in real-world applicati...

0 Ido Greenberg, et al. ∙

research

∙ 01/03/2023

Towards Deployable RL – What's Broken with RL Research and a Potential Fix

Reinforcement learning (RL) has demonstrated great potential, but is cur...

0 Shie Mannor, et al. ∙

research

∙ 12/13/2022

DiffStack: A Differentiable and Modular Control Stack for Autonomous Vehicles

Autonomous vehicle (AV) stacks are typically built in a modular fashion,...

0 Peter Karkus, et al. ∙

research

∙ 10/05/2022

Tractable Optimality in Episodic Latent MABs

We consider a multi-armed bandit problem with M latent contexts, where a...

0 Jeongyeol Kwon, et al. ∙

research

∙ 10/05/2022

Reward-Mixing MDPs with a Few Latent Contexts are Learnable

We consider episodic reinforcement learning in reward-mixing Markov deci...

0 Jeongyeol Kwon, et al. ∙

research

∙ 10/03/2022

Policy Gradient for Reinforcement Learning with General Utilities

In Reinforcement Learning (RL), the goal of agents is to discover an opt...

0 Navdeep Kumar, et al. ∙

research

∙ 09/28/2022

SoftTreeMax: Policy Gradient with Tree Search

Policy-gradient methods are widely used for learning control policies. T...

0 Gal Dalal, et al. ∙

research

∙ 07/19/2022

Actor-Critic based Improper Reinforcement Learning

We consider an improper reinforcement learning setting where a learner i...

0 Mohammadi Zaki, et al. ∙

research

∙ 07/05/2022

Implementing Reinforcement Learning Datacenter Congestion Control in NVIDIA NICs

Cloud datacenters are exponentially growing both in numbers and size. Th...

0 Benjamin Fuhrer, et al. ∙

research

∙ 06/26/2022

Analysis of Stochastic Processes through Replay Buffers

Replay buffers are a key component in many reinforcement learning scheme...

0 Shirli Di-Castro Shashua, et al. ∙

research

∙ 05/30/2022

Reinforcement Learning with a Terminator

We present the problem of reinforcement learning with exogenous terminat...

0 Guy Tennenholtz, et al. ∙

research

∙ 05/28/2022

Efficient Policy Iteration for Robust Markov Decision Processes via Regularization

Robust Markov decision processes (MDPs) provide a general framework to m...

0 Navdeep Kumar, et al. ∙

research

∙ 05/10/2022

Efficient Risk-Averse Reinforcement Learning

In risk-averse reinforcement learning (RL), the goal is to optimize some...

9 Ido Greenberg, et al. ∙

research

∙ 04/18/2022

Optimizing Tensor Network Contraction Using Reinforcement Learning

Quantum Computing (QC) stands to revolutionize computing, but is current...

0 Eli A. Meirom, et al. ∙

research

∙ 03/12/2022

Whats Missing? Learning Hidden Markov Models When the Locations of Missing Observations are Unknown

The Hidden Markov Model (HMM) is one of the most widely used statistical...

0 Binyamin Perets, et al. ∙

research

∙ 02/02/2022

Learning to reason about and to act on physical cascading events

Reasoning and interacting with dynamic environments is a fundamental pro...

0 Yuval Atzmon, et al. ∙

research

∙ 01/31/2022

Continuous Forecasting via Neural Eigen Decomposition of Stochastic Dynamics

Motivated by a real-world problem of blood coagulation control in Hepari...

0 Stav Belogolovsky, et al. ∙

research

∙ 01/30/2022

The Geometry of Robust Value Functions

The space of value functions is a fundamental concept in reinforcement l...

0 Kaixin Wang, et al. ∙

research

∙ 01/30/2022

Coordinated Attacks against Contextual Bandits: Fundamental Limits and Defense Mechanisms

Motivated by online recommendation systems, we propose the problem of fi...

0 Jeongyeol Kwon, et al. ∙

research

∙ 01/28/2022

Planning and Learning with Adaptive Lookahead

The classical Policy Iteration (PI) algorithm alternates between greedy ...

0 Aviv Rosenberg, et al. ∙

research

∙ 10/13/2021

On Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning

We consider the problem of using expert data with unobserved confounders...

0 Guy Tennenholtz, et al. ∙

research

∙ 10/12/2021

Twice regularized MDPs and the equivalence between robustness and regularization

Robust Markov decision processes (MDPs) aim to handle changing or partia...

0 Esther Derman, et al. ∙

research

∙ 10/12/2021

Dare not to Ask: Problem-Dependent Guarantees for Budgeted Bandits

We consider a stochastic multi-armed bandit setting where feedback is li...

0 Nadav Merlis, et al. ∙

research

∙ 10/07/2021

Reinforcement Learning in Reward-Mixing MDPs

Learning a near optimal policy in a partially observable system remains ...

0 Jeongyeol Kwon, et al. ∙

research

∙ 10/05/2021

Continuous-Time Fitted Value Iteration for Robust Policies

Solving the Hamilton-Jacobi-Bellman equation is important in many domain...

1 Michael Lutter, et al. ∙

research

∙ 10/01/2021

Sim and Real: Better Together

Simulation is used extensively in autonomous systems, particularly in ro...

0 Shirli Di-Castro Shashua, et al. ∙

research

∙ 09/22/2021

Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning

Cooperative multi-agent reinforcement learning (MARL) faces significant ...

0 Roy Zohar, et al. ∙

research

∙ 07/04/2021

Improve Agents without Retraining: Parallel Tree Search with Off-Policy Correction

Tree Search (TS) is crucial to some of the most influential successes in...

0 Assaf Hallak, et al. ∙

research

∙ 05/25/2021

Robust Value Iteration for Continuous Control Tasks

When transferring a control policy from simulation to a physical system,...

2 Michael Lutter, et al. ∙

research

∙ 05/10/2021

Value Iteration in Continuous Actions, States and Time

Classical value iteration approaches are not applicable to environments ...

4 Michael Lutter, et al. ∙

research

∙ 05/01/2021

Better than the Best: Gradient-based Improper Reinforcement Learning for Network Scheduling

We consider the problem of scheduling in constrained queueing networks w...

0 Mohammadi Zaki, et al. ∙

research

∙ 04/06/2021

Using Kalman Filter The Right Way: Noise Estimation Is Not Optimal

Determining the noise parameters of a Kalman Filter (KF) has been resear...

0 Ido Greenberg, et al. ∙

research

∙ 03/18/2021

Maximum Entropy Reinforcement Learning with Mixture Policies

Mixture models are an expressive hypothesis class that can approximate a...

0 Nir Baram, et al. ∙

research

∙ 02/22/2021

Action Redundancy in Reinforcement Learning

Maximum Entropy (MaxEnt) reinforcement learning is a powerful learning p...

0 Nir Baram, et al. ∙

research

∙ 02/22/2021

GELATO: Geometrically Enriched Latent Model for Offline Reinforcement Learning

Offline reinforcement learning approaches can generally be divided to pr...

0 Guy Tennenholtz, et al. ∙

research

∙ 02/18/2021

Reinforcement Learning for Datacenter Congestion Control

We approach the task of network congestion control in datacenters using ...

0 Chen Tessler, et al. ∙

research

∙ 02/16/2021

Improper Learning with Gradient-based Policy Optimization

We consider an improper reinforcement learning setting where the learner...

0 Mohammadi Zaki, et al. ∙

research

∙ 02/13/2021

Online Apprenticeship Learning

In Apprenticeship Learning (AL), we are given a Markov Decision Process ...

0 Lior Shani, et al. ∙

research

∙ 02/09/2021

RL for Latent MDPs: Regret Guarantees and a Lower Bound

In this work, we consider the regret minimization problem for reinforcem...

0 Jeongyeol Kwon, et al. ∙

research

∙ 02/07/2021

Dimension Free Generalization Bounds for Non Linear Metric Learning

In this work we study generalization guarantees for the metric learning ...

0 Mark Kozdoba, et al. ∙
