Mengdi Wang

Assistant Professor in the Department of Operations Research and Financial Engineering at Princeton University

  • Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity

    In this paper, we settle the sampling complexity of solving discounted two-player turn-based zero-sum stochastic games up to polylogarithmic factors. Given a stochastic game with discount factor γ∈(0,1), we provide an algorithm that computes an ϵ-optimal strategy with high probability given Õ((1-γ)^-3 ϵ^-2) samples from the transition function for each state-action pair. Our algorithm runs in time nearly linear in the number of samples and uses space nearly linear in the number of state-action pairs. Since stochastic games generalize Markov decision processes (MDPs), our runtime and sample complexities are optimal due to Azar et al. (2013). We achieve our results by showing how to generalize near-optimal Q-learning-based algorithms for MDPs, in particular that of Sidford et al. (2018), to two-player strategy-computation algorithms. This overcomes limitations of standard Q-learning and of strategy-iteration or alternating-minimization approaches, and we hope it will pave the way for future reinforcement learning results by facilitating the extension of MDP results to multi-agent settings with little loss. (A toy sampling-based sketch follows below.)

    08/29/2019 ∙ by Aaron Sidford, et al.
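
    This is not the paper's algorithm (which is a variance-reduced, Q-learning-style method); as a point of reference only, the toy sketch below does plain Monte Carlo minimax value iteration under the same generative-model access, and it does not attain the Õ((1-γ)^-3 ϵ^-2) bound. All names (sample_next, owner, reward) are hypothetical placeholders.

        import numpy as np

        def sampled_minimax_vi(sample_next, owner, reward, gamma, m, iters):
            """Toy sampled value iteration for a turn-based zero-sum game.

            owner[s] = +1 if the max-player controls state s, -1 otherwise;
            sample_next(s, a, k) draws an int array of k next states from the
            unknown transition function (the generative-model access assumed
            in the paper).
            """
            n_states, n_actions = reward.shape
            v = np.zeros(n_states)
            q = np.zeros((n_states, n_actions))
            for _ in range(iters):
                for s in range(n_states):
                    for a in range(n_actions):
                        ns = sample_next(s, a, m)      # m Monte Carlo samples
                        q[s, a] = reward[s, a] + gamma * v[ns].mean()
                # whoever owns the state optimizes in their own direction
                v = np.where(owner > 0, q.max(axis=1), q.min(axis=1))
            return v, q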

  • Diffusion Approximations for Online Principal Component Estimation and Global Convergence

    In this paper, we adopt diffusion approximation tools to study the dynamics of Oja's iteration, an online stochastic gradient descent method for principal component analysis. Oja's iteration maintains a running estimate of the true principal component from streaming data and enjoys low time and space complexity. We show that Oja's iteration for the top eigenvector generates a continuous-state, discrete-time Markov chain over the unit sphere. We characterize Oja's iteration in three phases using diffusion approximation and weak convergence tools. Our three-phase analysis further provides a finite-sample error bound for the running estimate, which matches the minimax information lower bound for principal component analysis under the additional assumption of bounded samples. (The update is sketched below.)

    08/29/2018 ∙ by Chris Junchi Li, et al.
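
    Oja's iteration itself is simple to state; the sketch below shows the rank-one update and renormalization that keep the iterate on the unit sphere, which is exactly the Markov chain the paper analyzes. The streaming interface and the constant step size are illustrative choices, not the paper's prescriptions.

        import numpy as np

        def oja_top_eigvec(stream, d, eta=0.01, seed=0):
            """Oja's iteration for the top principal component: one rank-one
            stochastic gradient step per sample, then renormalize, so the
            iterate w evolves on the unit sphere."""
            rng = np.random.default_rng(seed)
            w = rng.standard_normal(d)
            w /= np.linalg.norm(w)
            for x in stream:                 # x: one streaming data vector
                w += eta * x * (x @ w)       # Hebbian / SGD step
                w /= np.linalg.norm(w)       # retract to the unit sphere
            return w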

  • Online Factorization and Partition of Complex Networks From Random Walks

    Finding the reduced-dimensional structure is critical to understanding complex networks. Existing approaches such as spectral clustering are applicable only when the full network is explicitly observed. In this paper, we focus on the online factorization and partition of implicit large-scale networks based on observations from an associated random walk. We formulate this as a nonconvex stochastic factorization problem and propose an efficient and scalable stochastic generalized Hebbian algorithm. The algorithm is able to process dependent state-transition data dynamically generated by the underlying network and learn a low-dimensional representation for each vertex. By applying a diffusion approximation analysis, we show that the continuous-time limiting process of the stochastic algorithm converges globally to the "principal components" of the Markov chain and achieves a nearly optimal sample complexity. Given the learned low-dimensional representations, we further apply clustering techniques to recover the network partition. We show that when the associated Markov process is lumpable, one can recover the partition exactly with high probability. We apply the proposed approach to model the traffic flow of Manhattan as city-wide random walks. Using our algorithm to analyze taxi trip data, we discover a latent partition of Manhattan that closely matches the traffic dynamics. (A schematic update is sketched below.)

    05/22/2017 ∙ by Lin F. Yang, et al.
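
    As a rough illustration (not the paper's exact update), the sketch below runs a generalized Hebbian (GHA) step on each observed transition (i, j), treating it as a rank-one sample of a symmetrized co-occurrence matrix; the rows of U are the per-vertex embeddings, which one would then cluster to recover the partition. The symmetrization and step size are assumptions.

        import numpy as np

        def gha_random_walk(transitions, n, k, eta=0.05, seed=0):
            """Schematic GHA update driven by random-walk transitions (i, j),
            each treated as a sample A_t = (e_i e_j^T + e_j e_i^T) / 2."""
            rng = np.random.default_rng(seed)
            U = rng.standard_normal((n, k)) / np.sqrt(n)
            for i, j in transitions:
                au = np.zeros((n, k))             # A_t @ U: nonzero rows i, j
                au[i] += 0.5 * U[j]
                au[j] += 0.5 * U[i]
                m = U.T @ au                      # k x k matrix U^T A_t U
                U += eta * (au - U @ np.triu(m))  # Hebbian term - decorrelation
            return U                              # row U[v]: embedding of vertex v

        # afterwards, cluster the rows, e.g. sklearn.cluster.KMeans(k).fit(U)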

  • Stochastic Primal-Dual Methods and Sample Complexity of Reinforcement Learning

    We study the online estimation of the optimal policy of a Markov decision process (MDP). We propose a class of Stochastic Primal-Dual (SPD) methods that exploit the inherent minimax duality of the Bellman equations. The SPD methods update a few coordinates of the value and policy estimates as each new state transition is observed. These methods use small storage and have low computational complexity per iteration. The SPD methods find an absolute-ϵ-optimal policy, with high probability, using O(|S|^4 |A|^2 σ^2 / ((1-γ)^6 ϵ^2)) iterations/samples for the infinite-horizon discounted-reward MDP and O(|S|^4 |A|^2 H^6 σ^2 / ϵ^2) for the finite-horizon MDP. (A simplified primal-dual loop is sketched below.)

    12/08/2016 ∙ by Yichen Chen, et al.
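
    A heavily simplified sketch of the primal-dual idea: view the Bellman equation as a saddle point between the value vector v and a dual variable lam over state-action pairs, and update the coordinates touched by each observed transition. The exponentiated dual step and all constants below are illustrative guesses, not the paper's exact scheme.

        import numpy as np

        def spd_mdp(sample, n_s, n_a, gamma, n_iter, alpha=0.1, beta=0.1):
            """Schematic stochastic primal-dual update for a discounted MDP.
            sample() returns one observed transition (s, a, r, s2)."""
            v = np.zeros(n_s)
            lam = np.full((n_s, n_a), 1.0 / (n_s * n_a))  # dual on the simplex
            for _ in range(n_iter):
                s, a, r, s2 = sample()
                td = r + gamma * v[s2] - v[s]       # sampled Bellman residual
                lam[s, a] *= np.exp(beta * td)      # multiplicative dual ascent
                lam /= lam.sum()                    # full renormalization, for
                                                    # simplicity only; the paper's
                                                    # updates stay cheap per step
                v[s]  += alpha * lam[s, a]          # primal descent: only the two
                v[s2] -= alpha * gamma * lam[s, a]  # coordinates seen this step
            return v, lam.argmax(axis=1)            # value estimate, greedy policy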

  • Accelerating Stochastic Composition Optimization

    Consider the stochastic composition optimization problem, where the objective is a composition of two expected-value functions. We propose a new stochastic first-order method, the accelerated stochastic compositional proximal gradient (ASC-PG) method, which updates based on queries to the sampling oracle using two different timescales. ASC-PG is the first proximal gradient method for the stochastic composition problem that can handle nonsmooth regularization penalties. We show that ASC-PG converges faster than the best known algorithms and achieves the optimal sample-error complexity in several important special cases. We further demonstrate the application of ASC-PG to reinforcement learning and conduct numerical experiments. (A stripped-down sketch follows below.)

    07/25/2016 ∙ by Mengdi Wang, et al.
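
    A stripped-down sketch of the compositional proximal step (the acceleration/extrapolation is omitted): y tracks the inner expectation on a fast timescale while x takes a proximal gradient step, here with an l1 penalty standing in for the nonsmooth regularizer. The oracle names sample_g, sample_jac_g, sample_grad_f are placeholders.

        import numpy as np

        def soft_threshold(z, t):
            """Prox of t * ||.||_1, a canonical nonsmooth penalty."""
            return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

        def asc_pg_sketch(sample_g, sample_jac_g, sample_grad_f, x0,
                          n_iter, alpha=0.01, beta=0.1, lam=0.1):
            """Two-timescale compositional proximal gradient sketch for
            min_x E_v f_v(E_w g_w(x)) + lam * ||x||_1 (acceleration omitted)."""
            x = x0.copy()
            y = sample_g(x)                  # tracker of the inner mean E_w g_w(x)
            for _ in range(n_iter):
                y = (1 - beta) * y + beta * sample_g(x)      # fast timescale
                grad = sample_jac_g(x).T @ sample_grad_f(y)  # chain-rule estimate
                x = soft_threshold(x - alpha * grad, alpha * lam)  # prox step
            return x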

  • Near-Optimal Stochastic Approximation for Online Principal Component Estimation

    Principal component analysis (PCA) has been a prominent tool for high-dimensional data analysis. Online algorithms that estimate the principal component by processing streaming data are of tremendous practical and theoretical interest. Despite their rich applications, theoretical convergence analysis remains largely open. In this paper, we cast online PCA as a stochastic nonconvex optimization problem, and we analyze the online PCA algorithm as a stochastic approximation iteration. The stochastic approximation iteration processes data points incrementally and maintains a running estimate of the principal component. We prove for the first time a nearly optimal finite-sample error bound for the online PCA algorithm. Under a subgaussian assumption, we show that the finite-sample error bound closely matches the minimax information lower bound. (An empirical check is sketched below.)

    03/16/2016 ∙ by Chris Junchi Li, et al.
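
    A quick empirical check in the spirit of the analysis: run the stochastic approximation (Oja) update on i.i.d. Gaussian (hence subgaussian) data with a spiked covariance and measure the sin^2 error against the true principal component. All constants here are arbitrary choices.

        import numpy as np

        rng = np.random.default_rng(0)
        d, n, eta = 20, 50_000, 0.005
        Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
        eigs = np.concatenate(([1.0], np.full(d - 1, 0.1)))  # spiked spectrum
        Sigma = Q @ np.diag(eigs) @ Q.T
        top = Q[:, 0]                          # true principal component
        L = np.linalg.cholesky(Sigma)
        w = rng.standard_normal(d)
        w /= np.linalg.norm(w)
        for _ in range(n):                     # stochastic approximation iteration
            x = L @ rng.standard_normal(d)     # one streaming sample
            w += eta * x * (x @ w)
            w /= np.linalg.norm(w)
        print("sin^2 error:", 1.0 - (w @ top) ** 2)   # should be close to 0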

  • Random Multi-Constraint Projection: Stochastic Gradient Methods for Convex Optimization with Many Constraints

    Consider convex optimization problems subject to a large number of constraints. We focus on stochastic problems in which the objective takes the form of an expected value and the feasible set is the intersection of a large number of convex sets. We propose a class of algorithms that perform stochastic gradient descent and random feasibility updates simultaneously. At every iteration, the algorithms sample a number of projection points onto randomly selected small subsets of the constraints. Three feasibility update schemes are considered: averaging over the random projected points, projecting onto the most distant sample, and projecting onto a special polyhedral set constructed from the sample points. We prove the almost sure convergence of these algorithms and analyze the iterates' feasibility error and optimality error, respectively. We provide new convergence rate benchmarks for stochastic first-order optimization with many constraints. The rate analysis and numerical experiments reveal that the algorithm using the polyhedral-set projection scheme is the most efficient among the known algorithms. (The averaging variant is sketched below.)

    11/12/2015 ∙ by Mengdi Wang, et al.
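
    A sketch of the averaging variant, specialized to halfspace constraints a_i^T x <= b_i so that each projection has a one-line closed form: take one stochastic gradient step, then average the projections onto a small random batch of constraints. grad_sample and the constants are placeholders; the polyhedral-set scheme the paper reports as fastest is not shown.

        import numpy as np

        def proj_halfspace(x, a, b):
            """Exact Euclidean projection of x onto {z : a @ z <= b}."""
            viol = a @ x - b
            return x if viol <= 0 else x - (viol / (a @ a)) * a

        def sgd_random_projection(grad_sample, A, b, x0, n_iter,
                                  alpha=0.01, batch=5, seed=0):
            """Averaging scheme: SGD step, then the mean of projections onto
            a randomly sampled subset of the constraints."""
            rng = np.random.default_rng(seed)
            x = x0.copy()
            for _ in range(n_iter):
                x = x - alpha * grad_sample(x)        # stochastic gradient step
                idx = rng.choice(len(b), size=batch, replace=False)
                x = np.mean([proj_halfspace(x, A[i], b[i]) for i in idx], axis=0)
            return x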

  • Stochastic Compositional Gradient Descent: Algorithms for Minimizing Compositions of Expected-Value Functions

    Classical stochastic gradient methods are well suited for minimizing expected-value objective functions. However, they do not apply to the minimization of a nonlinear function of expected values, i.e., a composition of two expected-value functions: problems of the form min_x E_v[f_v(E_w[g_w(x)])]. To solve this stochastic composition problem, we propose a class of stochastic compositional gradient descent (SCGD) algorithms that can be viewed as stochastic versions of the quasi-gradient method. SCGD updates the solution based on noisy sample gradients of f_v and g_w, and uses an auxiliary variable to track the unknown quantity E_w[g_w(x)]. We prove that SCGD converges almost surely to an optimal solution for convex optimization problems, as long as such a solution exists. The convergence involves the interplay of two iterations with different timescales. For nonsmooth convex problems, SCGD achieves a convergence rate of O(k^-1/4) in the general case and O(k^-2/3) in the strongly convex case, after taking k samples. For smooth convex problems, SCGD can be accelerated to converge at a rate of O(k^-2/7) in the general case and O(k^-4/5) in the strongly convex case. For nonconvex problems, we prove that any limit point generated by SCGD is a stationary point, for which we also provide a convergence rate analysis. The stochastic setting of optimizing compositions of expected-value functions is very common in practice, and the proposed SCGD methods find wide applications in learning, estimation, dynamic programming, etc. (The two-timescale update is sketched below.)

    11/14/2014 ∙ by Mengdi Wang, et al.
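
    The core two-timescale update is easy to transcribe from the abstract: the auxiliary variable y tracks E_w[g_w(x)] on a faster timescale than the iterate x. Oracle names and the constant step sizes are placeholders; the paper prescribes specific decaying schedules to obtain the quoted rates.

        import numpy as np

        def scgd(sample_g, sample_jac_g, sample_grad_f, x0, n_iter,
                 alpha=0.01, beta=0.1):
            """SCGD sketch for min_x E_v f_v(E_w g_w(x)): y tracks the inner
            expectation on a faster timescale than the iterate x."""
            x = x0.copy()
            y = sample_g(x)
            for _ in range(n_iter):
                y = (1 - beta) * y + beta * sample_g(x)      # fast: track E_w g_w(x)
                grad = sample_jac_g(x).T @ sample_grad_f(y)  # quasi-gradient
                x = x - alpha * grad                         # slow: descent step
            return x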

  • Improved Incremental First-Order Oracle Complexity of Variance Reduced Methods for Nonsmooth Convex Stochastic Composition Optimization

    We consider the nonsmooth convex composition optimization problem, where the objective is a composition of two finite-sum functions, and analyze stochastic compositional variance reduced gradient (SCVRG) methods for it. SCVRG and its variants have recently drawn much attention given their edge over stochastic compositional gradient descent (SCGD), but the theoretical analysis exclusively assumes strong convexity of the objective, which excludes several important examples such as Lasso, logistic regression, principal component analysis, and deep neural nets. In contrast, we prove non-asymptotic incremental first-order oracle (IFO) complexity bounds for SCVRG and its novel variants for nonsmooth convex composition optimization, and show that they are provably faster than SCGD and gradient descent. More specifically, our method achieves a total IFO complexity of O((m+n)(1/ϵ)+1/ϵ^3), which improves on the O(1/ϵ^3.5) and O((m+n)/√(ϵ)) obtained by SCGD and accelerated gradient descent, respectively. Experiments on a sparse mean-variance optimization problem demonstrate that our method outperforms the competing methods. (One variance-reduced epoch is sketched below.)

    02/07/2018 ∙ by Tianyi Lin, et al.
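
    A rough sketch of one variance-reduced epoch for the finite-sum composition min_x (1/m) sum_j f_j((1/n) sum_i g_i(x)) + R(x): take a snapshot, compute the full inner value and Jacobian there, and use them as control variates inside the epoch. The per-component oracles gs, jacs, grad_fs, the prox(z, t) map, and the schedules are all placeholders, not the paper's exact mini-batch design.

        import numpy as np

        def scvrg_sketch(gs, jacs, grad_fs, prox, x0, n_epochs, inner,
                         alpha=0.01, seed=0):
            """Compositional SVRG sketch: snapshot control variates for both
            the inner value g(x) and its Jacobian, then proximal steps."""
            rng = np.random.default_rng(seed)
            n, m = len(gs), len(grad_fs)
            x = x0.copy()
            for _ in range(n_epochs):
                xs = x.copy()                                    # epoch snapshot
                g_full = np.mean([g(xs) for g in gs], axis=0)    # full inner value
                J_full = np.mean([J(xs) for J in jacs], axis=0)  # full Jacobian
                for _ in range(inner):
                    i, j = rng.integers(n), rng.integers(m)
                    g_hat = gs[i](x) - gs[i](xs) + g_full        # VR inner value
                    J_hat = jacs[i](x) - jacs[i](xs) + J_full    # VR Jacobian
                    grad = J_hat.T @ grad_fs[j](g_hat)
                    x = prox(x - alpha * grad, alpha)            # proximal step
            return x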

  • Dimensionality Reduction for Stationary Time Series via Stochastic Nonconvex Optimization

    Stochastic optimization arises naturally in machine learning. Efficient algorithms with provable guarantees, however, are still largely missing when the objective function is nonconvex and the data points are dependent. This paper studies this fundamental challenge through a streaming PCA problem for stationary time series data. Specifically, our goal is to estimate the principal component of the time series data with respect to the covariance matrix of the stationary distribution. Computationally, we propose a variant of Oja's algorithm combined with downsampling to control the bias of the stochastic gradient caused by the data dependency. Theoretically, we quantify the uncertainty of our proposed stochastic algorithm based on diffusion approximations. This allows us to prove global convergence in terms of the continuous-time limiting solution trajectory, and further implies a near-optimal sample complexity. Numerical experiments are provided to support our analysis. (The downsampling idea is sketched below.)

    03/06/2018 ∙ by Minshuo Chen, et al.
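
    The downsampling idea is simple to sketch: only every gap-th sample of the dependent stream feeds an Oja step, so the retained samples are nearly independent and the bias of the stochastic gradient is controlled, as the abstract describes. The gap and step size below are placeholders; the paper's choices depend on the data dependency and are not reproduced here.

        import numpy as np

        def downsampled_oja(chain, d, gap, eta=0.01, seed=0):
            """Oja's update applied to every gap-th sample of a dependent
            stream, discarding the correlated samples in between."""
            rng = np.random.default_rng(seed)
            w = rng.standard_normal(d)
            w /= np.linalg.norm(w)
            for t, x in enumerate(chain):
                if t % gap:                   # skip correlated samples
                    continue
                w += eta * x * (x @ w)        # rank-one Oja step
                w /= np.linalg.norm(w)        # back to the unit sphere
            return w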