
-
Policy Optimization as Online Learning with Mediator Feedback
Policy Optimization (PO) is a widely used approach to address continuous...
-
An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits
In the contextual linear bandit setting, algorithms built on the optimis...
-
Option Hedging with Risk Averse Reinforcement Learning
In this paper we show how risk-averse reinforcement learning can be used...
-
Inverse Reinforcement Learning from a Gradient-based Learner
Inverse Reinforcement Learning addresses the problem of inferring an exp...
-
Newton-based Policy Optimization for Games
Many learning problems involve multiple agents optimizing different inte...
-
A Policy Gradient Method for Task-Agnostic Exploration
In a reward-free environment, what is a suitable intrinsic objective for...
-
Sequential Transfer in Reinforcement Learning with a Generative Model
We are interested in how to design reinforcement learning agents that pr...
-
Time-Variant Variational Transfer for Value Functions
In most transfer learning approaches to reinforcement learning (RL) the ...
-
A Novel Confidence-Based Algorithm for Structured Bandits
We study finite-armed stochastic bandits where the rewards of each arm m...
-
Online Joint Bid/Daily Budget Optimization of Internet Advertising Campaigns
Pay-per-click advertising includes various formats (e.g., search, contex...
-
Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning
The choice of the control frequency of a system has a relevant impact on...
-
MushroomRL: Simplifying Reinforcement Learning Research
MushroomRL is an open-source Python library developed to simplify the pr...
-
Risk-Averse Trust Region Optimization for Reward-Volatility Reduction
In real-world decision-making problems, for instance in the fields of fi...
-
Gradient-Aware Model-based Policy Search
Traditional model-based reinforcement learning approaches learn a model ...
-
Policy Space Identification in Configurable Environments
We study the problem of identifying the policy space of a learning agent...
-
Feature Selection via Mutual Information: New Theoretical Insights
Mutual information has been successfully adopted in filter feature-selec...
-
An Intrinsically-Motivated Approach for Learning Highly Exploring and Fast Mixing Policies
What is a good exploration strategy for an agent that interacts with an ...
-
Smoothing Policies and Safe Policy Gradients
Policy gradient algorithms are among the best candidates for the much an...
-
Policy Optimization via Importance Sampling
Policy optimization is an effective reinforcement learning approach to s...
-
Stochastic Variance-Reduced Policy Gradient
In this paper, we propose a novel reinforcement-learning algorithm cons...
-
Configurable Markov Decision Processes
In many real-world problems, there is the possibility to configure, to a...
-
Importance Weighted Transfer of Samples in Reinforcement Learning
We consider the transfer of experience samples (i.e., tuples < s, a, s',...
-
Cost-Sensitive Approach to Batch Size Adaptation for Gradient Descent
In this paper, we propose a novel approach to automatically determine th...
-
Unimodal Thompson Sampling for Graph-Structured Arms
We study, to the best of our knowledge, the first Bayesian algorithm for...
-
Multi-objective Reinforcement Learning with Continuous Pareto Frontier Approximation Supplementary Material
This document contains supplementary material for the paper "Multi-objec...
-
Transfer from Multiple MDPs
Transfer reinforcement learning (RL) methods leverage on the experience ...