
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Reinforcement learning is a framework for interactive decision-making wi...

You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Nature Gradient
Despite superior performance on various natural language processing task...

1×N Block Pattern for Network Sparsity
Though network sparsity emerges as a promising direction to overcome the...

MARL with General Utilities via Decentralized Shadow Reward Actor-Critic
We posit a new mechanism for cooperation in multi-agent reinforcement le...

Towards Compact CNNs via Collaborative Compression
Channel pruning and tensor decomposition have received extensive attenti...

Thin-Film Smoothed Particle Hydrodynamics Fluid
We propose a particle-based method to simulate thin-film fluid that join...

Learning Good State and Action Representations via Tensor Decomposition
The transition kernel of a continuous-state-action Markov decision proce...

On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method
Policy gradient gives rise to a rich class of reinforcement learning (RL...

Bootstrapping Statistical Inference for Off-Policy Evaluation
Bootstrapping provides a flexible and effective approach for assessing t...

On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
The classical theory of reinforcement learning (RL) has focused on tabul...

High-Dimensional Sparse Linear Bandits
Stochastic linear bandits with high-dimensional sparse features are a pr...

Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient
This paper provides a statistical analysis of high-dimensional batch Rei...

Online Sparse Reinforcement Learning
We investigate the hardness of online reinforcement learning in fixed ho...

Generalized Leverage Score Sampling for Neural Networks
Leverage score sampling is a powerful technique that originates from the...

Variational Policy Gradient Method for Reinforcement Learning with General Utilities
In recent years, reinforcement learning (RL) systems with general goals ...

Model-Based Reinforcement Learning with Value-Targeted Regression
This paper studies model-based reinforcement learning (RL) for regret mi...

Concept Annotation for Intelligent Textbooks
With the increased popularity of electronic textbooks, there is a growin...

Knowledge Annotation for Intelligent Textbooks
With the increased popularity of electronic textbooks, there is a growin...

Cautious Reinforcement Learning via Distributional Risk in the Dual Domain
We study the estimation of risk-sensitive policies in reinforcement lear...

Sketching Transformed Matrices with Applications to Natural Language Processing
Suppose we are given a large matrix A=(a_i,j) that cannot be stored in m...

Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation
This paper studies the statistical theory of batch data reinforcement le...

Unsupervised Common Question Generation from Multiple Documents using Reinforced Contrastive Coordinator
Web search engines today return a ranked list of document links in respo...

Continuous Control with Contexts, Provably
A fundamental challenge in artificial intelligence is to build an agent ...

Characterizing Deep Learning Training Workloads on Alibaba-PAI
Modern deep learning models have been exploited in various domains, incl...

Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity
In this paper, we settle the sampling complexity of solving discounted t...

Voting-Based Multi-Agent Reinforcement Learning
The recent success of single-agent reinforcement learning (RL) encourage...

Learning Markov models via low-rank optimization
Modeling unknown systems from data is a precursor of system optimization...

Feature-Based Q-Learning for Two-Player Stochastic Games
Consider a two-player zero-sum stochastic game where the transition func...

Learning low-dimensional state embeddings and metastable clusters from time series data
This paper studies how to find compact state embeddings from high-dimens...

RL4health: Crowdsourcing Reinforcement Learning for Knee Replacement Pathway Optimization
Joint replacement is the most common inpatient surgical treatment in the...

Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound
Exploration in reinforcement learning (RL) suffers from the curse of dim...

Learning to Control in Metric Space with Optimal Regret
We study online reinforcement learning for finite-horizon deterministic ...

Sample-Optimal Parametric Q-Learning with Linear Transition Models
Consider a Markov decision process (MDP) that admits a set of state-acti...

Graph-Adaptive Pruning for Efficient Inference of Convolutional Neural Networks
In this work, we propose a graph-adaptive pruning (GAP) method for effic...

State Aggregation Learning from Markov Transition Data
State aggregation is a model reduction method rooted in control theory a...

A bird's-eye view on coherence, and a worm's-eye view on cohesion
Generating coherent and cohesive long-form texts is a challenging proble...

Adaptive Low-Nonnegative-Rank Approximation for State Aggregation of Markov Chains
This paper develops a low-nonnegative-rank approximation method to ident...

Diffusion Approximations for Online Principal Component Estimation and Global Convergence
In this paper, we propose to adopt the diffusion approximation tools to ...

Scalable Bilinear π Learning Using State and Action Features
Approximate linear programming (ALP) represents one of the major algorit...

Estimation of Markov Chain via Rank-constrained Likelihood
This paper studies the recovery and state compression of low-rank Markov...

Dimensionality Reduction for Stationary Time Series via Stochastic Nonconvex Optimization
Stochastic optimization naturally arises in machine learning. Efficient ...

Variance Reduction Methods for Sublinear Reinforcement Learning
This work considers the problem of provably optimal reinforcement learni...

State Compression of Markov Processes via Empirical Low-Rank Estimation
Model reduction is a central problem in analyzing complex systems and hi...

Improved Oracle Complexity of Variance Reduced Methods for Nonsmooth Convex Stochastic Composition Optimization
We consider the nonsmooth convex composition optimization problem where ...

Improved Incremental First-Order Oracle Complexity of Variance Reduced Methods for Nonsmooth Convex Stochastic Composition Optimization
We consider the nonsmooth convex composition optimization problem where ...

Primal-Dual π Learning: Sample Complexity and Sublinear Run Time for Ergodic Markov Decision Problems
Consider the problem of approximating the optimal policy of a Markov dec...

Online Factorization and Partition of Complex Networks From Random Walks
Finding the reduced-dimensional structure is critical to understanding c...

Stochastic Primal-Dual Methods and Sample Complexity of Reinforcement Learning
We study the online estimation of the optimal policy of a Markov decisio...

Accelerating Stochastic Composition Optimization
Consider the stochastic composition optimization problem where the objec...

Near-Optimal Stochastic Approximation for Online Principal Component Estimation
Principal component analysis (PCA) has been a prominent tool for high-di...
Mengdi Wang
Assistant Professor, Department of Operations Research and Financial Engineering, Princeton University