Zhuoran Yang

is this you? claim profile

0 followers

Research Internt at Princeton University

  • More Supervision, Less Computation: Statistical-Computational Tradeoffs in Weakly Supervised Learning

    We consider the weakly supervised binary classification problem where the labels are randomly flipped with probability 1- α. Although there exist numerous algorithms for this problem, it remains theoretically unexplored how the statistical accuracies and computational efficiency of these algorithms depend on the degree of supervision, which is quantified by α. In this paper, we characterize the effect of α by establishing the information-theoretic and computational boundaries, namely, the minimax-optimal statistical accuracy that can be achieved by all algorithms, and polynomial-time algorithms under an oracle computational model. For small α, our result shows a gap between these two boundaries, which represents the computational price of achieving the information-theoretic boundary due to the lack of supervision. Interestingly, we also show that this gap narrows as α increases. In other words, having more supervision, i.e., more correct labels, not only improves the optimal statistical accuracy as expected, but also enhances the computational efficiency for achieving such accuracy.

    07/14/2019 ∙ by Xinyang Yi, et al. ∙ 6 share

    read it

  • Provably Efficient Reinforcement Learning with Linear Function Approximation

    Modern Reinforcement Learning (RL) is commonly applied to practical problems with an enormous number of states, where function approximation must be deployed to approximate either the value function or the policy. The introduction of function approximation raises a fundamental set of challenges involving computational and statistical efficiency, especially given the need to manage the exploration/exploitation tradeoff. As a result, a core RL question remains open: how can we design provably efficient RL algorithms that incorporate function approximation? This question persists even in a basic setting with linear dynamics and linear rewards, for which only linear function approximation is needed. This paper presents the first provable RL algorithm with both polynomial runtime and polynomial sample complexity in this linear setting, without requiring a "simulator" or additional assumptions. Concretely, we prove that an optimistic modification of Least-Squares Value Iteration (LSVI)---a classical algorithm frequently studied in the linear setting---achieves Õ(√(d^3H^3T)) regret, where d is the ambient dimension of feature space, H is the length of each episode, and T is the total number of steps. Importantly, such regret is independent of the number of states and actions.

    07/11/2019 ∙ by Chi Jin, et al. ∙ 3 share

    read it

  • Finite-Sample Analyses for Fully Decentralized Multi-Agent Reinforcement Learning

    Despite the increasing interest in multi-agent reinforcement learning (MARL) in the community, understanding its theoretical foundation has long been recognized as a challenging problem. In this work, we make an attempt towards addressing this problem, by providing finite-sample analyses for fully decentralized MARL. Specifically, we consider two fully decentralized MARL settings, where teams of agents are connected by time-varying communication networks, and either collaborate or compete in a zero-sum game, without the absence of any central controller. These settings cover many conventional MARL settings in the literature. For both settings, we develop batch MARL algorithms that can be implemented in a fully decentralized fashion, and quantify the finite-sample errors of the estimated action-value functions. Our error analyses characterize how the function class, the number of samples within each iteration, and the number of iterations determine the statistical accuracy of the proposed algorithms. Our results, compared to the finite-sample bounds for single-agent RL, identify the involvement of additional error terms caused by decentralized computation, which is inherent in our decentralized MARL setting. To our knowledge, our work appears to be the first finite-sample analyses for MARL, which sheds light on understanding both the sample and computational efficiency of MARL algorithms.

    12/06/2018 ∙ by Kaiqing Zhang, et al. ∙ 2 share

    read it

  • Robust One-Bit Recovery via ReLU Generative Networks: Improved Statistical Rates and Global Landscape Analysis

    We study the robust one-bit compressed sensing problem whose goal is to design an algorithm that faithfully recovers any sparse target vector θ_0∈R^d uniformly m quantized noisy measurements. Under the assumption that the measurements are sub-Gaussian random vectors, to recover any k-sparse θ_0 (k≪ d) uniformly up to an error ε with high probability, the best known computationally tractable algorithm requires m≥Õ(k d/ε^4) measurements. In this paper, we consider a new framework for the one-bit sensing problem where the sparsity is implicitly enforced via mapping a low dimensional representation x_0 ∈R^k through a known n-layer ReLU generative network G:R^k→R^d. Such a framework poses low-dimensional priors on θ_0 without a known basis. We propose to recover the target G(x_0) via an unconstrained empirical risk minimization (ERM) problem under a much weaker sub-exponential measurement assumption. For such a problem, we establish a joint statistical and computational analysis. In particular, we prove that the ERM estimator in this new framework achieves an improved statistical rate of m=Õ(kn d /ε^2) recovering any G(x_0) uniformly up to an error ε. Moreover, from the lens of computation, despite non-convexity, we prove that the objective of our ERM problem has no spurious stationary point, that is, any stationary point are equally good for recovering the true target up to scaling with a certain accuracy. Furthermore, our analysis also shed lights on the possibility of inverting a deep generative model under partial and quantized measurements, complementing the recent success of using deep generative models for inverse problems.

    08/14/2019 ∙ by Shuang Qiu, et al. ∙ 2 share

    read it

  • On Stein's Identity and Near-Optimal Estimation in High-dimensional Index Models

    We consider estimating the parametric components of semi-parametric multiple index models in a high-dimensional non-Gaussian setting. Our estimators leverage the score function based second-order Stein's lemma and do not require Gaussian or elliptical symmetry assumptions made in the literature. Moreover, to handle score functions and response variables that are heavy-tailed, our estimators are constructed via carefully thresholding their empirical counterparts. We show that our estimator achieves near- optimal statistical rate of convergence in several settings. We supplement our theoretical results via simulation experiments that confirm the theory.

    09/26/2017 ∙ by Zhuoran Yang, et al. ∙ 0 share

    read it

  • Sparse Nonlinear Regression: Parameter Estimation and Asymptotic Inference

    We study parameter estimation and asymptotic inference for sparse nonlinear regression. More specifically, we assume the data are given by y = f( x^β^* ) + ϵ, where f is nonlinear. To recover β^*, we propose an ℓ_1-regularized least-squares estimator. Unlike classical linear regression, the corresponding optimization problem is nonconvex because of the nonlinearity of f. In spite of the nonconvexity, we prove that under mild conditions, every stationary point of the objective enjoys an optimal statistical rate of convergence. In addition, we provide an efficient algorithm that provably converges to a stationary point. We also access the uncertainty of the obtained estimator. Specifically, based on any stationary point of the objective, we construct valid hypothesis tests and confidence intervals for the low dimensional components of the high-dimensional parameter β^*. Detailed numerical results are provided to back up our theory.

    11/14/2015 ∙ by Zhuoran Yang, et al. ∙ 0 share

    read it

  • On Semiparametric Exponential Family Graphical Models

    We propose a new class of semiparametric exponential family graphical models for the analysis of high dimensional mixed data. Different from the existing mixed graphical models, we allow the nodewise conditional distributions to be semiparametric generalized linear models with unspecified base measure functions. Thus, one advantage of our method is that it is unnecessary to specify the type of each node and the method is more convenient to apply in practice. Under the proposed model, we consider both problems of parameter estimation and hypothesis testing in high dimensions. In particular, we propose a symmetric pairwise score test for the presence of a single edge in the graph. Compared to the existing methods for hypothesis tests, our approach takes into account of the symmetry of the parameters, such that the inferential results are invariant with respect to the different parametrizations of the same edge. Thorough numerical simulations and a real data example are provided to back up our results.

    12/30/2014 ∙ by Zhuoran Yang, et al. ∙ 0 share

    read it

  • Misspecified Nonconvex Statistical Optimization for Phase Retrieval

    Existing nonconvex statistical optimization theory and methods crucially rely on the correct specification of the underlying "true" statistical models. To address this issue, we take a first step towards taming model misspecification by studying the high-dimensional sparse phase retrieval problem with misspecified link functions. In particular, we propose a simple variant of the thresholded Wirtinger flow algorithm that, given a proper initialization, linearly converges to an estimator with optimal statistical accuracy for a broad family of unknown link functions. We further provide extensive numerical experiments to support our theoretical findings.

    12/18/2017 ∙ by Zhuoran Yang, et al. ∙ 0 share

    read it

  • Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents

    We consider the problem of fully decentralized multi-agent reinforcement learning (MARL), where the agents are located at the nodes of a time-varying communication network. Specifically, we assume that the reward functions of the agents might correspond to different tasks, and are only known to the corresponding agent. Moreover, each agent makes individual decisions based on both the information observed locally and the messages received from its neighbors over the network. Within this setting, the collective goal of the agents is to maximize the globally averaged return over the network through exchanging information with their neighbors. To this end, we propose two decentralized actor-critic algorithms with function approximation, which are applicable to large-scale MARL problems where both the number of states and the number of agents are massively large. Under the decentralized structure, the actor step is performed individually by each agent with no need to infer the policies of others. For the critic step, we propose a consensus update via communication over the network. Our algorithms are fully incremental and can be implemented in an online fashion. Convergence analyses of the algorithms are provided when the value functions are approximated within the class of linear functions. Extensive simulation results with both linear and nonlinear function approximations are presented to validate the proposed algorithms. Our work appears to be the first study of fully decentralized MARL algorithms for networked agents with function approximation, with provable convergence guarantees.

    02/23/2018 ∙ by Kaiqing Zhang, et al. ∙ 0 share

    read it

  • Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization

    Despite the success of single-agent reinforcement learning, multi-agent reinforcement learning (MARL) remains challenging due to complex interactions between agents. Motivated by decentralized applications such as sensor networks, swarm robotics, and power grids, we study policy evaluation in MARL, where agents with jointly observed state-action pairs and private local rewards collaborate to learn the value of a given policy. In this paper, we propose a double averaging scheme, where each agent iteratively performs averaging over both space and time to incorporate neighboring gradient information and local reward information, respectively. We prove that the proposed algorithm converges to the optimal solution at a global geometric rate. In particular, such an algorithm is built upon a primal-dual reformulation of the mean squared projected Bellman error minimization problem, which gives rise to a decentralized convex-concave saddle-point problem. To the best of our knowledge, the proposed double averaging primal-dual optimization algorithm is the first to achieve fast finite-time convergence on decentralized convex-concave saddle-point problems.

    06/03/2018 ∙ by Hoi-To Wai, et al. ∙ 0 share

    read it

  • Tensor Methods for Additive Index Models under Discordance and Heterogeneity

    Motivated by the sampling problems and heterogeneity issues common in high- dimensional big datasets, we consider a class of discordant additive index models. We propose method of moments based procedures for estimating the indices of such discordant additive index models in both low and high-dimensional settings. Our estimators are based on factorizing certain moment tensors and are also applicable in the overcomplete setting, where the number of indices is more than the dimensionality of the datasets. Furthermore, we provide rates of convergence of our estimator in both high and low-dimensional setting. Establishing such results requires deriving tensor operator norm concentration inequalities that might be of independent interest. Finally, we provide simulation results supporting our theory. Our contributions extend the applicability of tensor methods for novel models in addition to making progress on understanding theoretical properties of such tensor methods.

    07/17/2018 ∙ by Krishnakumar Balasubramanian, et al. ∙ 0 share

    read it