
More Supervision, Less Computation: Statistical-Computational Tradeoffs in Weakly Supervised Learning
We consider the weakly supervised binary classification problem where the labels are randomly flipped with probability 1 − α. Although there exist numerous algorithms for this problem, it remains theoretically unexplored how the statistical accuracy and computational efficiency of these algorithms depend on the degree of supervision, which is quantified by α. In this paper, we characterize the effect of α by establishing the information-theoretic and computational boundaries, namely, the minimax-optimal statistical accuracy that can be achieved by all algorithms, and by polynomial-time algorithms under an oracle computational model. For small α, our result shows a gap between these two boundaries, which represents the computational price of achieving the information-theoretic boundary due to the lack of supervision. Interestingly, we also show that this gap narrows as α increases. In other words, having more supervision, i.e., more correct labels, not only improves the optimal statistical accuracy as expected, but also enhances the computational efficiency for achieving such accuracy.
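The noise model in this abstract is easy to simulate. The sketch below (plain NumPy; the helper name `flip_labels` is mine) only illustrates how the flip probability 1 − α governs the fraction of correct labels, i.e., the degree of supervision; it does not implement the paper's algorithms or bounds.

```python
import numpy as np

def flip_labels(y, alpha, rng):
    """Corrupt binary labels y (entries +/-1): each label is flipped
    independently with probability 1 - alpha, so alpha = 1 means full
    supervision and smaller alpha means weaker supervision."""
    flips = rng.random(y.shape) < (1.0 - alpha)
    return np.where(flips, -y, y)

rng = np.random.default_rng(0)
y = np.ones(10_000)
for alpha in (0.9, 0.5):
    noisy = flip_labels(y, alpha, rng)
    # the fraction of labels that survive intact concentrates around alpha
    print(alpha, np.mean(noisy == y))
```

With α = 1 no label is flipped; as α shrinks toward 0, almost every label is corrupted and supervision vanishes.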
07/14/2019 ∙ by Xinyang Yi, et al.

Provably Efficient Reinforcement Learning with Linear Function Approximation
Modern Reinforcement Learning (RL) is commonly applied to practical problems with an enormous number of states, where function approximation must be deployed to approximate either the value function or the policy. The introduction of function approximation raises a fundamental set of challenges involving computational and statistical efficiency, especially given the need to manage the exploration/exploitation tradeoff. As a result, a core RL question remains open: how can we design provably efficient RL algorithms that incorporate function approximation? This question persists even in a basic setting with linear dynamics and linear rewards, for which only linear function approximation is needed. This paper presents the first provable RL algorithm with both polynomial runtime and polynomial sample complexity in this linear setting, without requiring a "simulator" or additional assumptions. Concretely, we prove that an optimistic modification of Least-Squares Value Iteration (LSVI), a classical algorithm frequently studied in the linear setting, achieves Õ(√(d^3H^3T)) regret, where d is the ambient dimension of the feature space, H is the length of each episode, and T is the total number of steps. Importantly, such regret is independent of the number of states and actions.
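The optimistic modification referred to above adds an exploration bonus to a ridge-regression value estimate. A minimal sketch of that single regression-plus-bonus step follows; the function name `lsvi_ucb_step` and the toy inputs are mine, and the constant β is treated as a free parameter rather than the paper's theoretical choice.

```python
import numpy as np

def lsvi_ucb_step(phis, targets, beta, lam=1.0):
    """One regression step of optimistic LSVI (sketch).
    phis: (n, d) features phi(s_k, a_k); targets: (n,) regression targets
    r_k + max_a Q_next(s'_k, a). Returns the ridge weights w and an optimistic
    Q-value function q(phi) = phi @ w + beta * sqrt(phi^T Lambda^{-1} phi)."""
    n, d = phis.shape
    Lambda = lam * np.eye(d) + phis.T @ phis       # regularized Gram matrix
    w = np.linalg.solve(Lambda, phis.T @ targets)  # ridge regression weights
    Lambda_inv = np.linalg.inv(Lambda)

    def q(phi):
        bonus = beta * np.sqrt(phi @ Lambda_inv @ phi)  # exploration bonus
        return phi @ w + bonus

    return w, q
```

With orthonormal features the arithmetic is transparent: for `phis = np.eye(2)` and `lam = 1`, `Lambda = 2I`, so each weight is half its regression target and the bonus at a basis vector is `beta / sqrt(2)`; directions with little data get a larger bonus, which drives exploration.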
07/11/2019 ∙ by Chi Jin, et al.

Finite-Sample Analyses for Fully Decentralized Multi-Agent Reinforcement Learning
Despite the increasing interest in multi-agent reinforcement learning (MARL) in the community, understanding its theoretical foundation has long been recognized as a challenging problem. In this work, we make an attempt towards addressing this problem, by providing finite-sample analyses for fully decentralized MARL. Specifically, we consider two fully decentralized MARL settings, where teams of agents are connected by time-varying communication networks, and either collaborate or compete in a zero-sum game, in the absence of any central controller. These settings cover many conventional MARL settings in the literature. For both settings, we develop batch MARL algorithms that can be implemented in a fully decentralized fashion, and quantify the finite-sample errors of the estimated action-value functions. Our error analyses characterize how the function class, the number of samples within each iteration, and the number of iterations determine the statistical accuracy of the proposed algorithms. Our results, compared to the finite-sample bounds for single-agent RL, identify additional error terms caused by decentralized computation, which is inherent in our decentralized MARL setting. To our knowledge, this work provides the first finite-sample analyses for MARL, which shed light on understanding both the sample and computational efficiency of MARL algorithms.
12/06/2018 ∙ by Kaiqing Zhang, et al.

Robust One-Bit Recovery via ReLU Generative Networks: Improved Statistical Rates and Global Landscape Analysis
We study the robust one-bit compressed sensing problem whose goal is to design an algorithm that faithfully recovers any sparse target vector θ_0 ∈ R^d uniformly from m quantized noisy measurements. Under the assumption that the measurements are sub-Gaussian random vectors, to recover any k-sparse θ_0 (k ≪ d) uniformly up to an error ε with high probability, the best known computationally tractable algorithm requires m ≥ Õ(kd/ε^4) measurements. In this paper, we consider a new framework for the one-bit sensing problem where the sparsity is implicitly enforced via mapping a low-dimensional representation x_0 ∈ R^k through a known n-layer ReLU generative network G: R^k → R^d. Such a framework poses low-dimensional priors on θ_0 without a known basis. We propose to recover the target G(x_0) via an unconstrained empirical risk minimization (ERM) problem under a much weaker sub-exponential measurement assumption. For such a problem, we establish a joint statistical and computational analysis. In particular, we prove that the ERM estimator in this new framework achieves an improved statistical rate of m = Õ(knd/ε^2) for recovering any G(x_0) uniformly up to an error ε. Moreover, from the lens of computation, despite nonconvexity, we prove that the objective of our ERM problem has no spurious stationary point, that is, all stationary points are equally good for recovering the true target up to scaling, with a certain accuracy. Furthermore, our analysis also sheds light on the possibility of inverting a deep generative model under partial and quantized measurements, complementing the recent success of using deep generative models for inverse problems.
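The measurement model above is straightforward to write down. The sketch below (NumPy; the helper names and the tiny weight matrices are mine) builds a known ReLU generator G and takes one-bit measurements of G(x_0); it does not implement the paper's ERM estimator or its landscape analysis.

```python
import numpy as np

def relu_generator(x, weights):
    """Known n-layer ReLU network G: R^k -> R^d; ReLU on all but the last layer."""
    h = x
    for W in weights[:-1]:
        h = np.maximum(W @ h, 0.0)
    return weights[-1] @ h

def one_bit_measurements(A, theta):
    """m quantized (one-bit) measurements y_i = sign(a_i^T theta)."""
    return np.sign(A @ theta)

# Tiny 2-layer example: k = 2 latent dimensions, d = 2 ambient dimensions.
weights = [np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]),
           np.array([[1.0, 1.0, 1.0], [1.0, -1.0, 0.0]])]
x0 = np.array([1.0, 2.0])
theta = relu_generator(x0, weights)          # the target G(x_0)
y = one_bit_measurements(np.eye(2), theta)   # sign pattern of the target
```

Only the signs of the linear measurements survive quantization, which is what makes uniform recovery up to scaling the natural guarantee.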
08/14/2019 ∙ by Shuang Qiu, et al.

On Stein's Identity and Near-Optimal Estimation in High-dimensional Index Models
We consider estimating the parametric components of semiparametric multiple index models in a high-dimensional non-Gaussian setting. Our estimators leverage the score-function-based second-order Stein's lemma and do not require the Gaussian or elliptical symmetry assumptions made in the literature. Moreover, to handle score functions and response variables that are heavy-tailed, our estimators are constructed by carefully thresholding their empirical counterparts. We show that our estimator achieves a near-optimal statistical rate of convergence in several settings. We supplement our theoretical results with simulation experiments that confirm the theory.
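A minimal version of the second-order Stein construction, under a standard Gaussian design, estimates E[y (x x^T − I)] and hard-thresholds the entries; for a single-index model y = f(β^T x) this targets the Hessian E[f''] ββ^T. The function name and the threshold choice below are mine, and the entrywise truncation is a crude stand-in for the paper's careful thresholding of heavy-tailed terms.

```python
import numpy as np

def stein_second_order(X, y, tau):
    """Estimate E[y (x x^T - I)] for Gaussian x, then hard-threshold entries.
    For a single-index model y = f(beta^T x) this targets E[f''] * beta beta^T,
    so the top eigenvector estimates the index direction beta."""
    n, d = X.shape
    M = (X * y[:, None]).T @ X / n - np.mean(y) * np.eye(d)
    M[np.abs(M) < tau] = 0.0  # crude entrywise truncation (sketch)
    return M

rng = np.random.default_rng(1)
n, d = 20_000, 5
X = rng.standard_normal((n, d))
beta = np.zeros(d); beta[0] = 1.0
y = (X @ beta) ** 2                     # f(u) = u^2, so E[f''] = 2
M = stein_second_order(X, y, tau=0.1)
v = np.linalg.eigh(M)[1][:, -1]         # eigenvector of the largest eigenvalue
```

In this quadratic example M concentrates around diag(2, 0, …, 0), so its leading eigenvector aligns with β even though f itself is never used by the estimator.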
09/26/2017 ∙ by Zhuoran Yang, et al.

Sparse Nonlinear Regression: Parameter Estimation and Asymptotic Inference
We study parameter estimation and asymptotic inference for sparse nonlinear regression. More specifically, we assume the data are given by y = f(x^⊤β^*) + ε, where f is nonlinear. To recover β^*, we propose an ℓ_1-regularized least-squares estimator. Unlike classical linear regression, the corresponding optimization problem is nonconvex because of the nonlinearity of f. In spite of the nonconvexity, we prove that under mild conditions, every stationary point of the objective enjoys an optimal statistical rate of convergence. In addition, we provide an efficient algorithm that provably converges to a stationary point. We also assess the uncertainty of the obtained estimator. Specifically, based on any stationary point of the objective, we construct valid hypothesis tests and confidence intervals for the low-dimensional components of the high-dimensional parameter β^*. Detailed numerical results are provided to back up our theory.
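The ℓ_1-regularized least-squares estimator can be attacked with proximal gradient descent even though the loss is nonconvex in β; the abstract's point is that stationary points of this objective are statistically good. A minimal ISTA-style sketch (function names mine; f and its derivative are assumed known):

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_nonlinear(X, y, f, fprime, lam, step, iters=500):
    """Proximal gradient on (1/2n) ||y - f(X beta)||^2 + lam * ||beta||_1,
    with f applied elementwise and fprime its derivative."""
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(iters):
        u = X @ beta
        grad = -X.T @ ((y - f(u)) * fprime(u)) / n   # gradient of the smooth part
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta
```

Sanity check: with f the identity and X = I_n the objective separates per coordinate, so the iteration lands on the closed-form soft-threshold of y.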
11/14/2015 ∙ by Zhuoran Yang, et al.

On Semiparametric Exponential Family Graphical Models
We propose a new class of semiparametric exponential family graphical models for the analysis of high-dimensional mixed data. Different from existing mixed graphical models, we allow the node-wise conditional distributions to be semiparametric generalized linear models with unspecified base measure functions. Thus, one advantage of our method is that it is unnecessary to specify the type of each node, which makes the method more convenient to apply in practice. Under the proposed model, we consider both parameter estimation and hypothesis testing in high dimensions. In particular, we propose a symmetric pairwise score test for the presence of a single edge in the graph. Compared to existing methods for hypothesis tests, our approach takes into account the symmetry of the parameters, so that the inferential results are invariant to the different parametrizations of the same edge. Thorough numerical simulations and a real data example are provided to back up our results.
12/30/2014 ∙ by Zhuoran Yang, et al.

Misspecified Nonconvex Statistical Optimization for Phase Retrieval
Existing nonconvex statistical optimization theory and methods crucially rely on the correct specification of the underlying "true" statistical models. To address this issue, we take a first step towards taming model misspecification by studying the high-dimensional sparse phase retrieval problem with misspecified link functions. In particular, we propose a simple variant of the thresholded Wirtinger flow algorithm that, given a proper initialization, linearly converges to an estimator with optimal statistical accuracy for a broad family of unknown link functions. We further provide extensive numerical experiments to support our theoretical findings.
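One iteration of a thresholded Wirtinger-flow-style update is short to state: a gradient step on the quadratic phase-retrieval loss, then keep the k largest coordinates. This sketch (real-valued, correctly specified quadratic link, helper names mine) shows only the update's shape; the paper's contribution is handling a misspecified link together with the initialization.

```python
import numpy as np

def hard_threshold(v, k):
    """Keep the k largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def twf_step(z, A, y, step, k):
    """One gradient step on (1/2m) sum_i ((a_i^T z)^2 - y_i)^2, then threshold."""
    m = len(y)
    Az = A @ z
    grad = (2.0 / m) * A.T @ ((Az ** 2 - y) * Az)
    return hard_threshold(z - step * grad, k)
```

At the true sparse signal the gradient vanishes, so the update is a fixed point; a proper initialization is what places the iterate in the basin where this fixed point attracts.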
12/18/2017 ∙ by Zhuoran Yang, et al.

Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents
We consider the problem of fully decentralized multi-agent reinforcement learning (MARL), where the agents are located at the nodes of a time-varying communication network. Specifically, we assume that the reward functions of the agents might correspond to different tasks, and are only known to the corresponding agent. Moreover, each agent makes individual decisions based on both the information observed locally and the messages received from its neighbors over the network. Within this setting, the collective goal of the agents is to maximize the globally averaged return over the network through exchanging information with their neighbors. To this end, we propose two decentralized actor-critic algorithms with function approximation, which are applicable to large-scale MARL problems where both the number of states and the number of agents are massively large. Under the decentralized structure, the actor step is performed individually by each agent with no need to infer the policies of others. For the critic step, we propose a consensus update via communication over the network. Our algorithms are fully incremental and can be implemented in an online fashion. Convergence analyses of the algorithms are provided when the value functions are approximated within the class of linear functions. Extensive simulation results with both linear and nonlinear function approximations are presented to validate the proposed algorithms. Our work appears to be the first study of fully decentralized MARL algorithms for networked agents with function approximation, with provable convergence guarantees.
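The critic's consensus update has a compact form: each agent takes a local TD step with its private reward, then averages parameters with its neighbors through a doubly stochastic mixing matrix. A minimal sketch with linear features (the matrix layout and function name are mine):

```python
import numpy as np

def critic_consensus_step(W, phi, deltas, alpha, C):
    """W: (n_agents, d) local critic weights; phi: (d,) shared feature vector;
    deltas: (n_agents,) local TD errors computed from private rewards;
    C: (n_agents, n_agents) doubly stochastic mixing matrix encoding the
    communication network. Each agent moves along its own TD direction,
    then mixes its weights with those of its neighbors."""
    W_td = W + alpha * np.outer(deltas, phi)  # local TD step per agent
    return C @ W_td                           # consensus averaging over the network
```

With a complete network (C uniform) a single round of mixing already replaces every agent's weights by the network average, which is the quantity the globally averaged return cares about; sparser time-varying networks reach the same consensus over repeated rounds.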
02/23/2018 ∙ by Kaiqing Zhang, et al.

Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization
Despite the success of single-agent reinforcement learning, multi-agent reinforcement learning (MARL) remains challenging due to complex interactions between agents. Motivated by decentralized applications such as sensor networks, swarm robotics, and power grids, we study policy evaluation in MARL, where agents with jointly observed state-action pairs and private local rewards collaborate to learn the value of a given policy. In this paper, we propose a double averaging scheme, where each agent iteratively performs averaging over both space and time to incorporate neighboring gradient information and local reward information, respectively. We prove that the proposed algorithm converges to the optimal solution at a global geometric rate. In particular, such an algorithm is built upon a primal-dual reformulation of the mean squared projected Bellman error minimization problem, which gives rise to a decentralized convex-concave saddle-point problem. To the best of our knowledge, the proposed double averaging primal-dual optimization algorithm is the first to achieve fast finite-time convergence on decentralized convex-concave saddle-point problems.
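The double averaging structure itself is simple to sketch: each agent keeps a running average of its local gradient information over time, and a round of network mixing averages over space. The sketch below shows only this averaging skeleton (the array layout and names are mine), not the primal-dual saddle-point reformulation that drives the geometric rate.

```python
import numpy as np

def double_average(grad_hist, C):
    """grad_hist: (T, n_agents, d) local gradient estimates over T rounds;
    C: (n_agents, n_agents) doubly stochastic mixing matrix.
    Average over time per agent, then mix once over the network."""
    time_avg = grad_hist.mean(axis=0)  # averaging over time (local rewards)
    return C @ time_avg                # averaging over space (neighbors)
```

Time averaging reduces the variance of each agent's private-reward information, while spatial mixing spreads it across the network; the two together are what "double averaging" refers to.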
06/03/2018 ∙ by Hoi-To Wai, et al.

Tensor Methods for Additive Index Models under Discordance and Heterogeneity
Motivated by the sampling problems and heterogeneity issues common in high-dimensional big datasets, we consider a class of discordant additive index models. We propose method-of-moments-based procedures for estimating the indices of such discordant additive index models in both low- and high-dimensional settings. Our estimators are based on factorizing certain moment tensors and are also applicable in the overcomplete setting, where the number of indices exceeds the dimensionality of the dataset. Furthermore, we provide rates of convergence of our estimator in both the high- and low-dimensional settings. Establishing such results requires deriving tensor operator norm concentration inequalities that might be of independent interest. Finally, we provide simulation results supporting our theory. Our contributions extend the applicability of tensor methods to novel models, in addition to making progress on understanding the theoretical properties of such tensor methods.
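Factorizing a moment tensor can be sketched with the standard tensor power method: repeatedly apply the map v ↦ T(I, v, v) and normalize. The toy below recovers the single component of a rank-one symmetric third-order tensor (names mine; the paper's setting involves noisy, possibly overcomplete moment tensors, which this sketch does not address).

```python
import numpy as np

def tensor_power_iteration(T, v0, iters=20):
    """Power iteration for a symmetric third-order tensor T:
    v <- T(I, v, v) / ||T(I, v, v)||, converging to a tensor eigenvector."""
    v = v0 / np.linalg.norm(v0)
    for _ in range(iters):
        w = np.einsum('kij,i,j->k', T, v, v)  # contract T against v twice
        v = w / np.linalg.norm(w)
    return v

b = np.array([0.6, 0.8])               # unit-norm index direction
T = np.einsum('i,j,k->ijk', b, b, b)   # rank-one moment tensor b (x) b (x) b
v = tensor_power_iteration(T, np.array([1.0, 0.0]))
```

For this rank-one tensor a single step already lands on b, since T(I, v, v) = (b^T v)^2 b; with several components and sampling noise, deflation and the concentration inequalities the abstract mentions become necessary.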
07/17/2018 ∙ by Krishnakumar Balasubramanian, et al.