
Linear Convergence of Entropy-Regularized Natural Policy Gradient with Linear Function Approximation
Natural policy gradient (NPG) methods with function approximation achiev...
The Complexity of Nonconvex-Strongly-Concave Minimax Optimization
This paper studies the complexity for finding approximate stationary poi...
Simulation Studies on Deep Reinforcement Learning for Building Control with Human Interaction
The building sector consumes the largest energy in the world, and there ...
Sample Complexity and Overparameterization Bounds for Projection-Free Neural TD Learning
We study the dynamics of temporal-difference learning with neural networ...
Provably-Efficient Double Q-Learning
In this paper, we establish a theoretical comparison between the asympto...
Biased Stochastic Gradient Descent for Conditional Stochastic Optimization
Conditional Stochastic Optimization (CSO) covers a variety of applicatio...
Periodic Q-Learning
The use of target networks is a common practice in deep reinforcement le...
Global Convergence and Variance-Reduced Optimization for a Class of Nonconvex-Nonconcave Minimax Problems
Nonconvex minimax problems appear frequently in emerging machine learnin...
A Unified Switching System Perspective and O.D.E. Analysis of Q-Learning Algorithms
In this paper, we introduce a unified framework for analyzing a large fa...
Optimization for Reinforcement Learning: From Single Agent to Cooperative Agents
This article reviews recent advances in multi-agent reinforcement learni...
Sample Complexity of Sample Average Approximation for Conditional Stochastic Optimization
In this paper, we study a class of stochastic optimization problems, ref...
Exponential Family Estimation via Adversarial Dynamics Embedding
We present an efficient algorithm for maximum likelihood estimation (MLE...
Target-Based Temporal Difference Learning
The use of target networks has been a popular and key component of recen...
Quadratic Decomposable Submodular Function Minimization: Theory and Practice
We introduce a new convex optimization problem, termed quadratic decompo...
Kernel Exponential Family Estimation via Doubly Dual Embedding
We investigate penalized maximum log-likelihood estimation for exponenti...
Quadratic Decomposable Submodular Function Minimization
We introduce a new convex optimization problem, termed quadratic decompo...
Nonparametric Hawkes Processes: Online Estimation and Generalization Bounds
In this paper, we design a nonparametric online algorithm for estimating...
Smoothed Dual Embedding Control
We revisit the Bellman optimality equation with Nesterov's smoothing tec...
Boosting the Actor with Dual Critic
This paper proposes a new actor-critic-style algorithm called Dual Actor...
Stochastic Generative Hashing
Learning-based binary hashing has become a powerful paradigm for fast se...
Fast and Simple Optimization for Poisson Likelihood Models
Poisson likelihood models have been prevalently used in imaging, social ...
Learning from Conditional Distributions via Dual Embeddings
Many machine learning tasks, such as learning with invariance and policy...
Provable Bayesian Inference via Particle Mirror Descent
Bayesian methods are appealing in their flexibility in modeling complex ...
Scalable Kernel Methods via Doubly Stochastic Gradients
The general perception is that kernel methods are not scalable, and neur...
Stochastic ADMM for Nonsmooth Optimization
We present a stochastic setting for optimization problems with nonsmooth...
Niao He
