
Randomized Exploration is NearOptimal for Tabular MDP
We study exploration using randomized value functions in Thompson Sampli...
read it

Provably Efficient Policy Gradient Methods for TwoPlayer ZeroSum Markov Games
Policy gradient methods are widely used in solving twoplayer zerosum g...
read it

Improved Corruption Robust Algorithms for Episodic Reinforcement Learning
We study episodic reinforcement learning under unknown adversarial corru...
read it

FineGrained GapDependent Bounds for Tabular MDPs via Adaptive MultiStep Bootstrap
This paper presents a new modelfree algorithm for episodic finitehoriz...
read it

VarianceAware Confidence Set: VarianceDependent Bound for Linear Bandits and HorizonFree Bound for Linear Mixture MDP
We show how to construct varianceaware confidence sets for linear bandi...
read it

A Provably Efficient Algorithm for Linear Markov Decision Process with Low Switching Cost
Many realworld applications, such as those in medical domains, recommen...
read it

Nearly Minimax Optimal Rewardfree Reinforcement Learning
We study the rewardfree reinforcement learning framework, which is part...
read it

Is Reinforcement Learning More Difficult Than Bandits? A Nearoptimal Algorithm Escaping the Curse of Horizon
Episodic reinforcement learning and contextual bandits are two widely st...
read it

How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks
We study how neural networks trained by gradient descent extrapolate, i....
read it

On RewardFree Reinforcement Learning with Linear Function Approximation
Rewardfree reinforcement learning (RL) is a framework which is suitable...
read it

Qlearning with Logarithmic Regret
This paper presents the first nonasymptotic result showing that a model...
read it

When is Particle Filtering Efficient for POMDP Sequential Planning?
Particle filtering is a popular method for inferring latent states in st...
read it

Is Long Horizon Reinforcement Learning More Difficult Than Short Horizon Reinforcement Learning?
Learning to plan for long horizons is a central challenge in episodic re...
read it

Provably Efficient Exploration for RL with Unsupervised Learning
We study how to use unsupervised learning for efficient exploration in r...
read it

Provable Representation Learning for Imitation Learning via Bilevel Optimization
A common strategy in modern learning systems is to learn a representatio...
read it

FewShot Learning via Learning the Representation, Provably
This paper studies fewshot learning via representation learning, where ...
read it

Agnostic Qlearning with Function Approximation in Deterministic Systems: Tight Bounds on Approximation Error and Sample Complexity
The current paper studies the problem of agnostic Qlearning with functi...
read it

Overparameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality
Adversarial training is a popular method to give neural nets robustness ...
read it

Optimism in Reinforcement Learning with Generalized Linear Function Approximation
We design a new provably efficient algorithm for episodic reinforcement ...
read it

Enhanced Convolutional Neural Tangent Kernels
Recent research shows that for training with ℓ_2 loss, convolutional neu...
read it

Continuous Control with Contexts, Provably
A fundamental challenge in artificial intelligence is to build an agent ...
read it

Is a Good Representation Sufficient for Sample Efficient Reinforcement Learning?
Modern deep learning methods provide an effective means to learn good re...
read it

Harnessing the Power of Infinitely Wide Deep Nets on Smalldata Tasks
Recent research shows that the following two models are equivalent: (a) ...
read it

Dual Sequential Monte Carlo: Tunneling Filtering and Planning in Continuous POMDPs
We present the DualSMC network that solves continuous POMDPs by learning...
read it

Towards Understanding the Importance of Shortcut Connections in Residual Networks
Residual Network (ResNet) is undoubtedly a milestone in deep learning. R...
read it

Provably Efficient Qlearning with Function Approximation via Distribution Shift Error Checking Oracle
Qlearning with function approximation is one of the most popular method...
read it

What Can Neural Networks Reason About?
Neural networks have successfully been applied to solving reasoning task...
read it

Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels
While graph kernels (GKs) are easy to train and enjoy provable theoretic...
read it

Hitting Time of Stochastic Gradient Langevin Dynamics to Stationary Points: A Direct Analysis
Stochastic gradient Langevin dynamics (SGLD) is a fundamental algorithm ...
read it

On Exact Computation with an Infinitely Wide Neural Net
How well does a classic deep net architecture like AlexNet or VGG19 clas...
read it

Global Convergence of Adaptive Gradient Methods for An Overparameterized Neural Network
Adaptive gradient methods like AdaGrad are widely used in optimizing neu...
read it

Acceleration via Symplectic Discretization of HighResolution Differential Equations
We study firstorder optimization methods obtained by discretizing ordin...
read it

Provably efficient RL with Rich Observations via Latent State Decoding
We study the exploration problem in episodic MDPs with rich observations...
read it

FineGrained Analysis of Optimization and Generalization for Overparameterized TwoLayer Neural Networks
Recent works have cast some light on the mystery of why deep nets fit an...
read it

Width Provably Matters in Optimization for Deep Linear Neural Networks
We prove that for an Llayer fullyconnected linear neural network, if t...
read it

Gradient Descent Finds Global Minima of Deep Neural Networks
Gradient descent finds a global minimum in training deep neural networks...
read it

Understanding the Acceleration Phenomenon via HighResolution Differential Equations
Gradientbased optimization algorithms can be studied from the perspecti...
read it

Gradient Descent Provably Optimizes Overparameterized Neural Networks
One of the mystery in the success of neural networks is randomly initial...
read it

Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced
We study the implicit regularization imposed by gradient descent for lea...
read it

Robust Nonparametric Regression under Huber's εcontamination Model
We consider the nonparametric regression problem under Huber's ϵcontam...
read it

How Many Samples are Needed to Learn a Convolutional Neural Network?
A widespread folklore for explaining the success of convolutional neural...
read it

Improved Learning of Onehiddenlayer Convolutional Neural Networks with Overlaps
We propose a new algorithm to learn a onehiddenlayer convolutional neu...
read it

Fast and Sample Efficient Inductive Matrix Completion via MultiPhase Procrustes Flow
We revisit the inductive matrix completion problem that aims to recover ...
read it

On the Power of Overparametrization in Neural Networks with Quadratic Activation
We provide new theoretical insights on why overparametrization is effec...
read it

NearLinear Time Local Polynomial Nonparametric Estimation
Local polynomial regression (Fan & Gijbels, 1996) is an important class ...
read it

Linear Convergence of the PrimalDual Gradient Method for ConvexConcave Saddle Point Problems without Strong Convexity
We consider the convexconcave saddle point problem _x_y f(x)+y^ A xg(y...
read it

Gradient Descent Learns Onehiddenlayer CNN: Don't be Afraid of Spurious Local Minima
We consider the problem of learning a onehiddenlayer neural network wi...
read it

When is a Convolutional Filter Easy To Learn?
We analyze the convergence of (stochastic) gradient descent algorithm fo...
read it

Gradient Descent Can Take Exponential Time to Escape Saddle Points
Although gradient descent (GD) almost always escapes saddle points asymp...
read it

Stochastic Variance Reduction Methods for Policy Evaluation
Policy evaluation is a crucial step in many reinforcementlearning proce...
read it
Simon S. Du
is this you? claim profile
Research Intern at facebook 2017, Research Intern at Microsoft 2016, Consulting Intern at Accenture 2015, Research Assistant at UC Berkeley from 20132014, Software Engineering Intern at Google 2014, Research Assistant at Bay Area Intellectual Property Group 2013, Consulting Intern at CCID Consulting Co. 2012, PhD in Machine Learning Department at Carnegie Mellon University 20152020