Amy Zhang

is this you? claim profile

0

  • Learning Causal State Representations of Partially Observable Environments

    Intelligent agents can cope with sensory-rich environments by learning task-agnostic state abstractions. In this paper, we propose mechanisms to approximate causal states, which optimally compress the joint history of actions and observations in partially-observable Markov decision processes. Our proposed algorithm extracts causal state representations from RNNs that are trained to predict subsequent observations given the history. We demonstrate that these learned task-agnostic state abstractions can be used to efficiently learn policies for reinforcement learning problems with rich observation spaces. We evaluate agents using multiple partially observable navigation tasks with both discrete (GridWorld) and continuous (VizDoom, ALE) observation processes that cannot be solved by traditional memory-limited methods. Our experiments demonstrate systematic improvement of the DQN and tabular models using approximate causal state representations with respect to recurrent-DQN baselines trained with raw inputs.

    06/25/2019 ∙ by Amy Zhang, et al. ∙ 68 share

    read it

  • Natural Environment Benchmarks for Reinforcement Learning

    While current benchmark reinforcement learning (RL) tasks have been useful to drive progress in the field, they are in many ways poor substitutes for learning with real-world data. By testing increasingly complex RL algorithms on low-complexity simulation environments, we often end up with brittle RL policies that generalize poorly beyond the very specific domain. To combat this, we propose three new families of benchmark RL domains that contain some of the complexity of the natural world, while still supporting fast and extensive data acquisition. The proposed domains also permit a characterization of generalization through fair train/test separation, and easy comparison and replication of results. Through this work, we challenge the RL research community to develop more robust algorithms that meet high standards of evaluation.

    11/14/2018 ∙ by Amy Zhang, et al. ∙ 6 share

    read it

  • Building Detection from Satellite Images on a Global Scale

    In the last several years, remote sensing technology has opened up the possibility of performing large scale building detection from satellite imagery. Our work is some of the first to create population density maps from building detection on a large scale. The scale of our work on population density estimation via high resolution satellite images raises many issues, that we will address in this paper. The first was data acquisition. Labeling buildings from satellite images is a hard problem, one where we found our labelers to only be about 85 quality of labels, so we designed two separate policies for labels meant for training sets and those meant for test sets, since our requirements of the two set types are quite different. We also trained weakly supervised footprint detection models with the classification labels, and semi-supervised approaches with a small number of pixel-level labels, which are very expensive to procure.

    07/27/2017 ∙ by Amy Zhang, et al. ∙ 0 share

    read it

  • Feedback Neural Network for Weakly Supervised Geo-Semantic Segmentation

    Learning from weakly-supervised data is one of the main challenges in machine learning and computer vision, especially for tasks such as image semantic segmentation where labeling is extremely expensive and subjective. In this paper, we propose a novel neural network architecture to perform weakly-supervised learning by suppressing irrelevant neuron activations. It localizes objects of interest by learning from image-level categorical labels in an end-to-end manner. We apply this algorithm to a practical challenge of transforming satellite images into a map of settlements and individual buildings. Experimental results show that the proposed algorithm achieves superior performance and efficiency when compared with various baseline models.

    12/08/2016 ∙ by Xianming Liu, et al. ∙ 0 share

    read it

  • Guess Who Rated This Movie: Identifying Users Through Subspace Clustering

    It is often the case that, within an online recommender system, multiple users share a common account. Can such shared accounts be identified solely on the basis of the userprovided ratings? Once a shared account is identified, can the different users sharing it be identified as well? Whenever such user identification is feasible, it opens the way to possible improvements in personalized recommendations, but also raises privacy concerns. We develop a model for composite accounts based on unions of linear subspaces, and use subspace clustering for carrying out the identification task. We show that a significant fraction of such accounts is identifiable in a reliable manner, and illustrate potential uses for personalized recommendation.

    08/09/2014 ∙ by Amy Zhang, et al. ∙ 0 share

    read it

  • Mapping the world population one building at a time

    High resolution datasets of population density which accurately map sparsely-distributed human populations do not exist at a global scale. Typically, population data is obtained using censuses and statistical modeling. More recently, methods using remotely-sensed data have emerged, capable of effectively identifying urbanized areas. Obtaining high accuracy in estimation of population distribution in rural areas remains a very challenging task due to the simultaneous requirements of sufficient sensitivity and resolution to detect very sparse populations through remote sensing as well as reliable performance at a global scale. Here, we present a computer vision method based on machine learning to create population maps from satellite imagery at a global scale, with a spatial sensitivity corresponding to individual buildings and suitable for global deployment. By combining this settlement data with census data, we create population maps with 30 meter resolution for 18 countries. We validate our method, and find that the building identification has an average precision and recall of 0.95 and 0.91, respectively and that the population estimates have a standard error of a factor 2 or less. Based on our data, we analyze 29 percent of the world population, and show that 99 percent lives within 36 km of the nearest urban cluster. The resulting high-resolution population datasets have applications in infrastructure planning, vaccination campaign planning, disaster response efforts and risk analysis such as high accuracy flood risk analysis.

    12/15/2017 ∙ by Tobias G. Tiecke, et al. ∙ 0 share

    read it

  • Composable Planning with Attributes

    The tasks that an agent will need to solve often are not known during training. However, if the agent knows which properties of the environment are important then, after learning how its actions affect those properties, it may be able to use this knowledge to solve complex tasks without training specifically for them. Towards this end, we consider a setup in which an environment is augmented with a set of user defined attributes that parameterize the features of interest. We propose a method that learns a policy for transitioning between "nearby" sets of attributes, and maintains a graph of possible transitions. Given a task at test time that can be expressed in terms of a target set of attributes, and a current state, our model infers the attributes of the current state and searches over paths through attribute space to get a high level plan, and then uses its low level policy to execute the plan. We show in 3D block stacking, grid-world games, and StarCraft that our model is able to generalize to longer, more complex tasks at test time by composing simpler learned policies.

    03/01/2018 ∙ by Amy Zhang, et al. ∙ 0 share

    read it

  • Decoupling Dynamics and Reward for Transfer Learning

    Current reinforcement learning (RL) methods can successfully learn single tasks but often generalize poorly to modest perturbations in task domain or training procedure. In this work, we present a decoupled learning strategy for RL that creates a shared representation space where knowledge can be robustly transferred. We separate learning the task representation, the forward dynamics, the inverse dynamics and the reward function of the domain, and show that this decoupling improves performance within the task, transfers well to changes in dynamics and reward, and can be effectively used for online planning. Empirical results show good performance in both continuous and discrete RL domains.

    04/27/2018 ∙ by Amy Zhang, et al. ∙ 0 share

    read it

  • A Dissection of Overfitting and Generalization in Continuous Reinforcement Learning

    The risks and perils of overfitting in machine learning are well known. However most of the treatment of this, including diagnostic tools and remedies, was developed for the supervised learning case. In this work, we aim to offer new perspectives on the characterization and prevention of overfitting in deep Reinforcement Learning (RL) methods, with a particular focus on continuous domains. We examine several aspects, such as how to define and diagnose overfitting in MDPs, and how to reduce risks by injecting sufficient training diversity. This work complements recent findings on the brittleness of deep RL methods and offers practical observations for RL researchers and practitioners.

    06/20/2018 ∙ by Amy Zhang, et al. ∙ 0 share

    read it