Gal Dalal

research

∙ 01/30/2023

SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree Search

Despite the popularity of policy gradient methods, they are known to suf...

Gal Dalal, et al.

research

∙ 09/28/2022

SoftTreeMax: Policy Gradient with Tree Search

Policy-gradient methods are widely used for learning control policies. T...

Gal Dalal, et al.

research

∙ 07/05/2022

Implementing Reinforcement Learning Datacenter Congestion Control in NVIDIA NICs

Cloud datacenters are exponentially growing both in numbers and size. Th...

Benjamin Fuhrer, et al.

research

∙ 05/30/2022

Reinforcement Learning with a Terminator

We present the problem of reinforcement learning with exogenous terminat...

Guy Tennenholtz, et al.

research

∙ 01/28/2022

Planning and Learning with Adaptive Lookahead

The classical Policy Iteration (PI) algorithm alternates between greedy ...

Aviv Rosenberg, et al.

research

∙ 10/13/2021

On Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning

We consider the problem of using expert data with unobserved confounders...

Guy Tennenholtz, et al.

research

∙ 07/04/2021

Improve Agents without Retraining: Parallel Tree Search with Off-Policy Correction

Tree Search (TS) is crucial to some of the most influential successes in...

Assaf Hallak, et al.

research

∙ 02/18/2021

Reinforcement Learning for Datacenter Congestion Control

We approach the task of network congestion control in datacenters using ...

Chen Tessler, et al.

research

∙ 01/28/2021

Acting in Delayed Environments with Non-Stationary Markov Policies

The standard Markov Decision Process (MDP) formulation hinges on the ass...

Esther Derman, et al.

research

∙ 12/08/2020

The Architectural Implications of Distributed Reinforcement Learning on CPU-GPU Systems

With deep reinforcement learning (RL) methods achieving results that exc...

Ahmet Inci, et al.

research

∙ 11/20/2019

A Tale of Two-Timescale Reinforcement Learning with the Tightest Finite-Time Bound

Policy evaluation in reinforcement learning is often conducted using two...

Gal Dalal, et al.

research

∙ 09/06/2018

How to Combine Tree-Search Methods in Reinforcement Learning

Finite-horizon lookahead policies are abundantly used in Reinforcement L...

Yonathan Efroni, et al.

research

∙ 05/21/2018

Multiple-Step Greedy Policies in Online and Approximate Reinforcement Learning

Multiple-step lookahead policies have demonstrated high empirical compet...

Yonathan Efroni, et al.

research

∙ 02/10/2018

Beyond the One Step Greedy Approach in Reinforcement Learning

The famous Policy Iteration algorithm alternates between policy improvem...

Yonathan Efroni, et al.

research

∙ 01/26/2018

Safe Exploration in Continuous Action Spaces

We address the problem of deploying a reinforcement learning (RL) agent ...

Gal Dalal, et al.

research

∙ 01/01/2018

Chance-Constrained Outage Scheduling using a Machine Learning Proxy

Outage scheduling aims at defining, over a horizon of several months to ...

Gal Dalal, et al.

research

∙ 04/04/2017

Finite Sample Analyses for TD(0) with Function Approximation

TD(0) is one of the most commonly used algorithms in reinforcement learn...

Gal Dalal, et al.

research

∙ 11/30/2016

Unit Commitment using Nearest Neighbor as a Short-Term Proxy

We devise the Unit Commitment Nearest Neighbor (UCNN) algorithm to be us...

Gal Dalal, et al.

research

∙ 03/06/2016

Hierarchical Decision Making In Electricity Grid Management

The power grid is a complex and vital system that necessitates careful r...

Gal Dalal, et al.
