
-
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Designing off-policy reinforcement learning algorithms is typically a ve...
-
Instrumental Variable Value Iteration for Causal Offline Reinforcement Learning
In offline reinforcement learning (RL) an optimal policy is learnt solel...
-
A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm for Bilevel Optimization
This paper proposes a new algorithm – the Momentum-assisted Single-times...
-
Is Pessimism Provably Efficient for Offline RL?
We study offline reinforcement learning (RL), which aims to learn an opt...
-
Risk-Sensitive Deep RL: Variance-Constrained Actor-Critic Provably Finds Globally Optimal Policy
While deep reinforcement learning has achieved tremendous successes in v...
-
Variational Transport: A Convergent Particle-Based Algorithm for Distributional Optimization
We consider the optimization problem of minimizing a functional defined ...
-
On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
The classical theory of reinforcement learning (RL) has focused on tabul...
-
Provable Fictitious Play for General Mean-Field Games
We propose a reinforcement learning algorithm for stationary mean-field ...
-
Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning
Temporal-Difference (TD) learning with nonlinear smooth function approxi...
-
Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time
Reinforcement learning is a powerful tool to learn the optimal policy of...
-
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
We study the global convergence and global optimality of actor-critic, o...
-
Understanding Implicit Regularization in Over-Parameterized Nonlinear Statistical Model
We study the implicit regularization phenomenon induced by simple optimi...
-
A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic
This paper analyzes a two-timescale stochastic algorithm for a class of ...
-
Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach
Structural equation models (SEMs) are widely used in sciences, ranging f...
-
Dynamic Regret of Policy Optimization in Non-stationary Environments
We consider reinforcement learning (RL) in episodic MDPs with adversaria...
-
On the Global Optimality of Model-Agnostic Meta-Learning
Model-agnostic meta-learning (MAML) formulates meta-learning as a bileve...
-
Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret
We study risk-sensitive reinforcement learning in episodic Markov decisi...
-
Provably Efficient Causal Reinforcement Learning with Confounded Observational Data
Empowered by expressive function approximators such as neural networks, ...
-
Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning
Multi-agent reinforcement learning (MARL) achieves significant empirical...
-
Neural Certificates for Safe Control Policies
This paper develops an approach to learn a policy of a dynamical system ...
-
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Temporal-difference and Q-learning play a key role in deep reinforcement...
-
An efficient Gehan-type estimation for the accelerated failure time model with clustered and censored data
In medical studies, the collected covariates usually contain underlying ...
-
Generative Adversarial Imitation Learning with Neural Networks: Global Optimality and Convergence Rate
Generative adversarial imitation learning (GAIL) demonstrates tremendous...
-
Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees
Graph representation learning is a ubiquitous task in machine learning w...
-
Upper Confidence Primal-Dual Optimization: Stochastically Constrained Markov Decision Processes with Adversarial Losses and Unknown Transitions
We consider online learning for episodic Markov decision processes (MDPs...
-
Provably Efficient Safe Exploration via Primal-Dual Policy Optimization
We study the Safe Reinforcement Learning (SRL) problem using the Constra...
-
Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium
We develop provably efficient reinforcement learning algorithms for two-...
-
On Computation and Generalization of Generative Adversarial Imitation Learning
Generative Adversarial Imitation Learning (GAIL) is a powerful and pract...
-
Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework
This paper develops a Pontryagin differentiable programming (PDP) method...
-
Natural Actor-Critic Converges Globally for Hierarchical Linear Quadratic Regulator
Multi-agent reinforcement learning has been successfully applied to a nu...
-
Provably Efficient Exploration in Policy Optimization
While policy-based reinforcement learning (RL) achieves tremendous succe...
-
Decentralized Multi-Agent Reinforcement Learning with Networked Agents: Recent Advances
Multi-agent reinforcement learning (MARL) has long been a significant an...
-
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
Recent years have witnessed significant advances in reinforcement learni...
-
Convergent Policy Optimization for Safe Reinforcement Learning
We study the safe reinforcement learning problem with nonlinear function...
-
Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games
We study discrete-time mean-field Markov games with infinite numbers of ...
-
Credible Sample Elicitation by Deep Learning, for Deep Learning
It is important to collect credible training samples (x,y) for building ...
-
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence
Policy gradient methods with actor-critic schemes demonstrate tremendous...
-
Robust One-Bit Recovery via ReLU Generative Networks: Improved Statistical Rates and Global Landscape Analysis
We study the robust one-bit compressed sensing problem whose goal is to ...
-
Fast multi-agent temporal-difference learning via homotopy stochastic primal-dual optimization
We consider a distributed multi-agent policy evaluation problem in reinf...
-
More Supervision, Less Computation: Statistical-Computational Tradeoffs in Weakly Supervised Learning
We consider the weakly supervised binary classification problem where th...
-
On the Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost
Despite the empirical success of the actor-critic algorithm, its theoret...
-
Stochastic Convergence Results for Regularized Actor-Critic Methods
In this paper, we present a stochastic convergence proof, under suitable...
-
Provably Efficient Reinforcement Learning with Linear Function Approximation
Modern Reinforcement Learning (RL) is commonly applied to practical prob...
-
A Communication-Efficient Multi-Agent Actor-Critic Algorithm for Distributed Reinforcement Learning
This paper considers a distributed reinforcement learning problem in whi...
-
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
Proximal policy optimization and trust region policy optimization (PPO a...
-
Policy Optimization Provably Converges to Nash Equilibria in Zero-Sum Linear Quadratic Games
We study the global convergence of policy optimization for finding the N...
-
Neural Temporal-Difference Learning Converges to Global Optima
Temporal-difference learning (TD), coupled with neural networks, is amon...
-
A Multi-Agent Off-Policy Actor-Critic Algorithm for Distributed Reinforcement Learning
This paper extends off-policy reinforcement learning to the multi-agent ...
-
Finite-Sample Analyses for Fully Decentralized Multi-Agent Reinforcement Learning
Despite the increasing interest in multi-agent reinforcement learning (M...
-
Provable Gaussian Embedding with One Observation
The success of machine learning methods heavily relies on having an appr...