
Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity
In this paper, we settle the sampling complexity of solving discounted t...
Diffusion Approximations for Online Principal Component Estimation and Global Convergence
In this paper, we propose to adopt the diffusion approximation tools to ...
Online Factorization and Partition of Complex Networks From Random Walks
Finding the reduced-dimensional structure is critical to understanding c...
Stochastic Primal-Dual Methods and Sample Complexity of Reinforcement Learning
We study the online estimation of the optimal policy of a Markov decisio...
Accelerating Stochastic Composition Optimization
Consider the stochastic composition optimization problem where the objec...
Near-Optimal Stochastic Approximation for Online Principal Component Estimation
Principal component analysis (PCA) has been a prominent tool for high-di...
Random Multi-Constraint Projection: Stochastic Gradient Methods for Convex Optimization with Many Constraints
Consider convex optimization problems subject to a large number of const...
Stochastic Compositional Gradient Descent: Algorithms for Minimizing Compositions of Expected-Value Functions
Classical stochastic gradient methods are well suited for minimizing exp...
Improved Incremental First-Order Oracle Complexity of Variance Reduced Methods for Nonsmooth Convex Stochastic Composition Optimization
We consider the nonsmooth convex composition optimization problem where ...
Improved Oracle Complexity of Variance Reduced Methods for Nonsmooth Convex Stochastic Composition Optimization
We consider the nonsmooth convex composition optimization problem where ...
Dimensionality Reduction for Stationary Time Series via Stochastic Nonconvex Optimization
Stochastic optimization naturally arises in machine learning. Efficient ...
Variance Reduction Methods for Sublinear Reinforcement Learning
This work considers the problem of provably optimal reinforcement learni...
State Compression of Markov Processes via Empirical Low-Rank Estimation
Model reduction is a central problem in analyzing complex systems and hi...
Primal-Dual π Learning: Sample Complexity and Sublinear Run Time for Ergodic Markov Decision Problems
Consider the problem of approximating the optimal policy of a Markov dec...
Estimation of Markov Chain via Rank-constrained Likelihood
This paper studies the recovery and state compression of low-rank Markov...
Scalable Bilinear π Learning Using State and Action Features
Approximate linear programming (ALP) represents one of the major algorit...
Adaptive Low-Nonnegative-Rank Approximation for State Aggregation of Markov Chains
This paper develops a low-nonnegative-rank approximation method to ident...
State Aggregation Learning from Markov Transition Data
State aggregation is a model reduction method rooted in control theory a...
A bird's-eye view on coherence, and a worm's-eye view on cohesion
Generating coherent and cohesive long-form texts is a challenging proble...
Graph-Adaptive Pruning for Efficient Inference of Convolutional Neural Networks
In this work, we propose a graph-adaptive pruning (GAP) method for effic...
Sample-Optimal Parametric Q-Learning with Linear Transition Models
Consider a Markov decision process (MDP) that admits a set of state-acti...
Learning to Control in Metric Space with Optimal Regret
We study online reinforcement learning for finite-horizon deterministic ...
Reinforcement Leaning in Feature Space: Matrix Bandit, Kernels, and Regret Bound
Exploration in reinforcement learning (RL) suffers from the curse of dim...
Learning low-dimensional state embeddings and metastable clusters from time series data
This paper studies how to find compact state embeddings from high-dimens...
RL4health: Crowdsourcing Reinforcement Learning for Knee Replacement Pathway Optimization
Joint replacement is the most common inpatient surgical treatment in the...
Voting-Based Multi-Agent Reinforcement Learning
The recent success of single-agent reinforcement learning (RL) encourage...
Learning Markov models via low-rank optimization
Modeling unknown systems from data is a precursor of system optimization...
Feature-Based Q-Learning for Two-Player Stochastic Games
Consider a two-player zero-sum stochastic game where the transition func...
Continuous Control with Contexts, Provably
A fundamental challenge in artificial intelligence is to build an agent ...
Unsupervised Common Question Generation from Multiple Documents using Reinforced Contrastive Coordinator
Web search engines today return a ranked list of document links in respo...
Sketching Transformed Matrices with Applications to Natural Language Processing
Suppose we are given a large matrix A=(a_i,j) that cannot be stored in m...
Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation
This paper studies the statistical theory of batch data reinforcement le...
Characterizing Deep Learning Training Workloads on Alibaba-PAI
Modern deep learning models have been exploited in various domains, incl...
Mengdi Wang
Assistant Professor, Department of Operations Research and Financial Engineering, Princeton University