Yingbin Liang

research

∙ 08/17/2023

Model-Free Algorithm with Improved Sample Efficiency for Zero-Sum Markov Games

The problem of two-player zero-sum Markov games has recently attracted i...

0 Songtao Feng, et al. ∙

research

∙ 08/10/2023

Provably Efficient Algorithm for Nonstationary Low-Rank MDPs

Reinforcement learning (RL) under changing environment models many real-...

0 Yuan Cheng, et al. ∙

research

∙ 08/07/2023

Non-Convex Bilevel Optimization with Time-Varying Objective Functions

Bilevel optimization has become a powerful tool in a wide variety of mac...

0 Sen Lin, et al. ∙

research

∙ 08/01/2023

Doubly Robust Instance-Reweighted Adversarial Training

Assigning importance weights to adversarial data has achieved great succ...

0 Daouda Sow, et al. ∙

research

∙ 07/01/2023

Provably Efficient UCB-type Algorithms For Learning Predictive State Representations

The general sequential decision-making problem, which includes Markov de...

0 Ruiquan Huang, et al. ∙

research

∙ 06/14/2023

Theoretical Hardness and Tractability of POMDPs in RL with Partial Hindsight State Information

Partially observable Markov decision processes (POMDPs) have been widely...

0 Ming Shi, et al. ∙

research

∙ 06/08/2023

Generalization Performance of Transfer Learning: Overparameterized and Underparameterized Regimes

Transfer learning is a useful technique for achieving improved performan...

0 Peizhong Ju, et al. ∙

research

∙ 06/01/2023

Non-stationary Reinforcement Learning under General Function Approximation

General function approximation is a powerful tool to handle large state ...

0 Songtao Feng, et al. ∙

research

∙ 04/09/2023

Theoretical Characterization of the Generalization Performance of Overfitted Meta-Learning

Meta-learning has arisen as a successful method for improving training p...

0 Peizhong Ju, et al. ∙

research

∙ 03/20/2023

Improved Sample Complexity for Reward-free Reinforcement Learning under Low-rank MDPs

In reward-free reinforcement learning (RL), an agent explores the enviro...

0 Yuan Cheng, et al. ∙

research

∙ 02/28/2023

M-L2O: Towards Generalizable Learning-to-Optimize by Test-Time Fast Self-Adaptation

Learning to Optimize (L2O) has drawn increasing attention as it often re...

0 Junjie Yang, et al. ∙

research

∙ 02/22/2023

Learning to Generalize Provably in Learning to Optimize

Learning to optimize (L2O) has gained increasing popularity, which autom...

0 Junjie Yang, et al. ∙

research

∙ 02/12/2023

Theory on Forgetting and Generalization of Continual Learning

Continual learning (CL), which aims to learn a sequence of tasks, has at...

0 Sen Lin, et al. ∙

research

∙ 02/08/2023

A Near-Optimal Algorithm for Safe Reinforcement Learning Under Instantaneous Hard Constraints

In many applications of Reinforcement Learning (RL), it is critically im...

0 Ming Shi, et al. ∙

research

∙ 02/08/2023

Near-Optimal Adversarial Reinforcement Learning with Switching Costs

Switching costs, which capture the costs for changing policies, are rega...

0 Ming Shi, et al. ∙

research

∙ 02/02/2023

Algorithm Design for Online Meta-Learning with Task Boundary Detection

Online meta-learning has recently emerged as a marriage between batch me...

0 Daouda Sow, et al. ∙

research

∙ 01/01/2023

Theoretical Characterization of How Neural Network Pruning Affects its Generalization

It has been observed in practice that applying pruning-at-initialization...

0 Hongru Yang, et al. ∙

research

∙ 01/01/2023

Sharper analysis of sparsely activated wide neural networks with trainable biases

This work studies training one-hidden-layer overparameterized ReLU netwo...

0 Hongru Yang, et al. ∙

research

∙ 08/18/2022

Global Convergence of Two-timescale Actor-Critic for Solving Linear Quadratic Regulator

The actor-critic (AC) reinforcement learning algorithms have been the po...

0 Xuyang Chen, et al. ∙

research

∙ 06/28/2022

Safe Exploration Incurs Nearly No Additional Sample Complexity for Reward-free RL

While the primary goal of the exploration phase in reward-free reinforce...

0 Ruiquan Huang, et al. ∙

research

∙ 06/18/2022

Provable Generalization of Overparameterized Meta-learning Trained with SGD

Despite the superior empirical success of deep meta-learning, theoretica...

0 Yu Huang, et al. ∙

research

∙ 06/13/2022

Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward

The remarkable success of reinforcement learning (RL) heavily relies on ...

0 Tengyu Xu, et al. ∙

research

∙ 06/13/2022

Provable Benefit of Multitask Representation Learning in Reinforcement Learning

As representation learning becomes a powerful technique to reduce sample...

0 Yuan Cheng, et al. ∙

research

∙ 05/27/2022

Will Bilevel Optimizers Benefit from Loops

Bilevel optimization has arisen as a powerful tool for solving a variety...

7 Kaiyi Ji, et al. ∙

research

∙ 03/31/2022

Data Sampling Affects the Complexity of Online SGD over Dependent Data

Conventional machine learning applications typically assume that data sa...

0 Shaocong Ma, et al. ∙

research

∙ 03/01/2022

A Constrained Optimization Approach to Bilevel Optimization with Multiple Inner Minima

Bilevel optimization has found extensive applications in modern machine ...

0 Daouda Sow, et al. ∙

research

∙ 02/07/2022

Model-Based Offline Meta-Reinforcement Learning with Regularization

Existing offline reinforcement learning (RL) methods face a few major ch...

0 Sen Lin, et al. ∙

research

∙ 10/20/2021

Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process

The problem of constrained Markov decision process (CMDP) is investigate...

0 Tianjiao Li, et al. ∙

research

∙ 10/13/2021

ES-Based Jacobian Enables Faster Bilevel Optimization

Bilevel optimization (BO) has arisen as a powerful tool for solving many...

0 Daouda Sow, et al. ∙

research

∙ 10/13/2021

PER-ETD: A Polynomially Efficient Emphatic Temporal Difference Learning Method

Emphatic temporal difference (ETD) learning (Sutton et al., 2016) is a s...

0 Ziwei Guan, et al. ∙

research

∙ 07/06/2021

A Unified Off-Policy Evaluation Approach for General Value Function

General Value Function (GVF) is a powerful tool to represent both the pr...

0 Tengyu Xu, et al. ∙

research

∙ 06/08/2021

Provably Faster Algorithms for Bilevel Optimization

Bilevel optimization has been widely applied in many important machine l...

0 Junjie Yang, et al. ∙

research

∙ 02/23/2021

Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality

Designing off-policy reinforcement learning algorithms is typically a ve...

0 Tengyu Xu, et al. ∙

research

∙ 02/09/2021

Proximal Gradient Descent-Ascent: Variable Convergence under KŁ Geometry

The gradient descent-ascent (GDA) algorithm has been widely applied to s...

11 Ziyi Chen, et al. ∙

research

∙ 02/07/2021

Lower Bounds and Accelerated Algorithms for Bilevel Optimization

Bilevel optimization has recently attracted growing interests due to its...

0 Kaiyi Ji, et al. ∙

research

∙ 11/11/2020

A Primal Approach to Constrained Policy Optimization: Global Optimality and Finite-Time Analysis

Safe reinforcement learning (SRL) problems are typically modeled as cons...

0 Tengyu Xu, et al. ∙

research

∙ 11/10/2020

Sample Complexity Bounds for Two Timescale Value-based Reinforcement Learning Algorithms

Two timescale stochastic approximation (SA) has been widely used in valu...

0 Tengyu Xu, et al. ∙

research

∙ 10/15/2020

Provably Faster Algorithms for Bilevel Optimization and Applications to Meta-Learning

Bilevel optimization has arisen as a powerful tool for many machine lear...

0 Kaiyi Ji, et al. ∙

research

∙ 09/29/2020

Finite-Time Analysis for Double Q-learning

Although Q-learning is one of the most successful algorithms for finding...

5 Huaqing Xiong, et al. ∙

research

∙ 08/09/2020

Spectral Algorithms for Community Detection in Directed Networks

Community detection in large social networks is affected by degree heter...

25 Zhe Wang, et al. ∙

research

∙ 07/30/2020

Momentum Q-learning with Finite-Sample Convergence Guarantee

Existing studies indicate that momentum ideas in conventional optimizati...

8 Bowen Weng, et al. ∙

research

∙ 07/29/2020

Feedback Capacities of Gaussian Multiple-Access Wiretap Channels

The feedback capacities of the Gaussian multiple-access channel (GMAC) a...

0 Bin Dai, et al. ∙

research

∙ 07/15/2020

Analysis of Q-learning with Adaptation and Momentum Restart for Gradient Descent

Existing convergence analyses of Q-learning mostly focus on the vanilla ...

0 Bowen Weng, et al. ∙

research

∙ 06/24/2020

When Will Generative Adversarial Imitation Learning Algorithms Attain Global Convergence

Generative adversarial imitation learning (GAIL) is a popular inverse re...

0 Ziwei Guan, et al. ∙

research

∙ 06/16/2020

Convergence of Meta-Learning with Task-Specific Adaptation over Partial Parameters

Although model-agnostic meta-learning (MAML) is a very successful algori...

0 Kaiyi Ji, et al. ∙

research

∙ 06/16/2020

Enhanced First and Zeroth Order Variance Reduced Algorithms for Min-Max Optimization

Min-max optimization captures many important machine learning problems s...

0 Tengyu Xu, et al. ∙

research

∙ 05/07/2020

Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms

As an important type of reinforcement learning algorithms, actor-critic ...

0 Tengyu Xu, et al. ∙

research

∙ 04/27/2020

Improving Sample Complexity Bounds for Actor-Critic Algorithms

The actor-critic (AC) algorithm is a popular method to find an optimal p...

8 Tengyu Xu, et al. ∙

research

∙ 02/26/2020

Proximal Gradient Algorithm with Momentum and Flexible Parameter Restart for Nonconvex Optimization

Various types of parameter restart schemes have been proposed for accele...

0 Yi Zhou, et al. ∙

research

∙ 02/18/2020

Multi-Step Model-Agnostic Meta-Learning: Convergence and Improved Algorithms

As a popular meta-learning approach, the model-agnostic meta-learning (M...

7 Kaiyi Ji, et al. ∙
