
Learning body-affordances to simplify action spaces
Controlling embodied agents with many actuated degrees of freedom is a challenging task. We propose a method that can discover and interpolate between context-dependent high-level actions, or body-affordances. These provide an abstract, low-dimensional interface indexing high-dimensional and time-extended action policies. Our method is related to recent approaches in the machine learning literature but is conceptually simpler and easier to implement. More specifically, our method requires the choice of an n-dimensional target sensor space that is endowed with a distance metric. The method then learns an embedding, also n-dimensional, of possibly reactive body-affordances that spread as far as possible throughout the target sensor space.
08/15/2017 ∙ by Nicholas Guttenberg, et al.

Counterfactual Control for Free from Generative Models
We introduce a method by which a generative model learning the joint distribution between actions and future states can be used to automatically infer a control scheme for any desired reward function, which may be altered on the fly without retraining the model. In this method, the problem of action selection is reduced to one of gradient descent on the latent space of the generative model, with the model itself providing the means of evaluating outcomes and finding the gradient, much like how the reward network in Deep Q-Networks (DQN) provides gradient information for the action generator. Unlike DQN or Actor-Critic, which are conditional models for a specific reward, using a generative model of the full joint distribution permits the reward to be changed on the fly. In addition, the generated futures can be inspected to gain insight into what the network 'thinks' will happen, and into what went wrong when the outcomes deviate from prediction.
02/22/2017 ∙ by Nicholas Guttenberg, et al.
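
The action-selection idea in the abstract above (gradient descent on the latent space of a generative model) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the "generative model" is a fixed linear decoder rather than a trained network, the names `W_ACTION`, `W_STATE`, `decode`, and `select_action` are hypothetical, and the reward is taken to be the negative squared distance between the predicted future state and a goal. It is not the authors' implementation.

```python
import numpy as np

# Hypothetical stand-in for a trained generative model: a fixed linear decoder
# that maps a latent vector z to an (action, predicted future state) pair.
W_ACTION = np.array([[1.0, 0.0], [0.0, 1.0]])
W_STATE = np.array([[1.0, 0.5], [0.0, 1.0]])


def decode(z):
    """Return the action and the predicted future state generated from z."""
    return W_ACTION @ z, W_STATE @ z


def select_action(goal, steps=500, lr=0.1):
    """Gradient descent on z to minimise ||predicted_state - goal||^2."""
    z = np.zeros(2)
    for _ in range(steps):
        _, state = decode(z)
        # Analytic gradient of the squared distance with respect to z.
        grad = 2.0 * W_STATE.T @ (state - goal)
        z -= lr * grad
    return decode(z)


# The goal (and hence the reward) can be swapped without touching the model.
action, predicted_state = select_action(np.array([1.0, -1.0]))
action_b, predicted_state_b = select_action(np.array([0.0, 2.0]))
```

Because the reward enters only through the gradient step, calling `select_action` with a different goal reuses the same decoder unchanged, which is the property the abstract describes as changing the reward on the fly.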

Neural Coarse-Graining: Extracting slowly-varying latent degrees of freedom with neural networks
We present a loss function for neural networks that encompasses an idea of trivial versus non-trivial predictions, such that the network jointly determines its own prediction goals and learns to satisfy them. This permits the network to choose subsets of a problem which are most amenable to its abilities to focus on solving, while discarding 'distracting' elements that interfere with its learning. To do this, the network first transforms the raw data into a higher-level categorical representation, and then trains a predictor from that new time series to its future. To prevent a trivial solution of mapping the signal to zero, we introduce a measure of non-triviality via a contrast between the prediction error of the learned model with a naive model of the overall signal statistics. The transform can learn to discard uninformative and unpredictable components of the signal in favor of the features which are both highly predictive and highly predictable. This creates a coarse-grained model of the time-series dynamics, focusing on predicting the slowly varying latent parameters which control the statistics of the time series, rather than predicting the fast details directly. The result is a semi-supervised algorithm which is capable of extracting latent parameters, segmenting sections of time series with differing statistics, and building a higher-level representation of the underlying dynamics from unlabeled data.
09/01/2016 ∙ by Nicholas Guttenberg, et al.

Permutation-equivariant neural networks applied to dynamics prediction
The introduction of convolutional layers greatly advanced the performance of neural networks on image tasks due to innately capturing a way of encoding and learning translation-invariant operations, matching one of the underlying symmetries of the image domain. In comparison, there are a number of problems in which there are a number of different inputs which are all 'of the same type': multiple particles, multiple agents, multiple stock prices, etc. The corresponding symmetry to this is permutation symmetry, in that the algorithm should not depend on the specific ordering of the input data. We discuss a permutation-invariant neural network layer in analogy to convolutional layers, and show the ability of this architecture to learn to predict the motion of a variable number of interacting hard discs in 2D. In the same way that convolutional layers can generalize to different image sizes, the permutation layer we describe generalizes to different numbers of objects.
12/14/2016 ∙ by Nicholas Guttenberg, et al.
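
The permutation symmetry described in the abstract above can be made concrete with a minimal sketch. This is a simplified variant assumed for illustration, in which each object is combined with a mean over all objects; the layer in the paper may be constructed differently. The function name `permutation_layer` and the weight names are hypothetical.

```python
import numpy as np


def permutation_layer(x, w_self, w_pool, bias):
    """Apply one layer to x of shape (n_objects, d_in).

    Each object's output depends on its own features plus a mean over all
    objects, so reordering the rows of x only reorders the rows of the output.
    """
    pooled = x.mean(axis=0, keepdims=True)  # (1, d_in), independent of row order
    return np.tanh(x @ w_self + pooled @ w_pool + bias)


rng = np.random.default_rng(0)
d_in, d_out = 4, 3
w_self = rng.normal(size=(d_in, d_out))
w_pool = rng.normal(size=(d_in, d_out))
bias = np.zeros(d_out)

x = rng.normal(size=(5, d_in))
perm = rng.permutation(5)

out = permutation_layer(x, w_self, w_pool, bias)
out_perm = permutation_layer(x[perm], w_self, w_pool, bias)

# The same weights also accept a different number of objects.
out_nine = permutation_layer(rng.normal(size=(9, d_in)), w_self, w_pool, bias)
```

Since the weights are shared across objects and the pooling does not depend on how many rows there are, the same parameters apply to 5 or 9 objects, mirroring the abstract's point about generalizing to different numbers of objects.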

Learning to generate classifiers
We train a network to generate mappings between training sets and classification policies (a 'classifier generator') by conditioning on the entire training set via an attentional mechanism. The network is directly optimized for test set performance on a training set of related tasks, which is then transferred to unseen 'test' tasks. We use this to optimize for performance in the low-data and unsupervised learning regimes, and obtain significantly better performance in the 10-50 datapoint regime than support vector classifiers, random forests, XGBoost, and k-nearest neighbors on a range of small datasets.
03/30/2018 ∙ by Nicholas Guttenberg, et al.

Being curious about the answers to questions: novelty search with learned attention
We investigate the use of attentional neural network layers in order to learn a 'behavior characterization' which can be used to drive novelty search and curiosity-based policies. The space is structured towards answering a particular distribution of questions, which are used in a supervised way to train the attentional neural network. We find that in a 2D exploration task, the structure of the space successfully encodes local sensory-motor contingencies such that even a greedy local 'do the most novel action' policy with no reinforcement learning or evolution can explore the space quickly. We also apply this to a high/low number guessing game task, and find that guessing according to the learned attention profile performs active inference and can discover the correct number more quickly than an exact but passive approach.
06/01/2018 ∙ by Nicholas Guttenberg, et al.

On the potential for open-endedness in neural networks
Natural evolution gives the impression of leading to an open-ended process of increasing diversity and complexity. If our goal is to produce such open-endedness artificially, this suggests an approach driven by evolutionary metaphor. On the other hand, techniques from machine learning and artificial intelligence are often considered too narrow to provide the sort of exploratory dynamics associated with evolution. In this paper, we hope to bridge that gap by reviewing common barriers to open-endedness in the evolution-inspired approach and how they are dealt with in the evolutionary case: collapse of diversity, saturation of complexity, and failure to form new kinds of individuality. We then show how these problems map onto similar issues in the machine learning approach, and discuss how the same insights and solutions which alleviated those barriers in evolutionary approaches can be ported over. At the same time, the form these issues take in the machine learning formulation suggests new ways to analyze and resolve barriers to open-endedness. Ultimately, we hope to inspire researchers to be able to interchangeably use evolutionary and gradient-descent-based machine learning methods to approach the design and creation of open-ended systems.
12/12/2018 ∙ by Nicholas Guttenberg, et al.

Generating the support with extreme value losses
When optimizing against the mean loss over a distribution of predictions in the context of a regression task, even if there is a distribution of targets, the optimal prediction distribution is always a delta function at a single value. Methods of constructing generative models need to overcome this tendency. We consider a simple method of summarizing the prediction error, such that the optimal strategy corresponds to outputting a distribution of predictions with a support that matches the support of the distribution of targets: optimizing against the minimal value of the loss given a set of samples from the prediction distribution, rather than the mean. We show that models trained against this loss learn to capture the support of the target distribution and, when combined with an auxiliary classifier-like prediction task, can be projected via rejection sampling to reproduce the full distribution of targets. The resulting method works well compared to other generative modeling approaches, particularly in low-dimensional spaces with highly non-trivial distributions, due to mode collapse solutions being globally suboptimal with respect to the extreme value loss. However, the method is less suited to high-dimensional spaces such as images due to the scaling of the number of samples needed in order to accurately estimate the extreme value loss when the dimension of the data manifold becomes large.
02/08/2019 ∙ by Nicholas Guttenberg, et al.
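
The contrast drawn in the abstract above, between taking the mean and taking the minimum of the loss over a set of predictions, can be checked with a small worked example. This is a sketch under stated assumptions: squared error as the per-sample loss, two equally likely scalar targets at -1 and +1, and hand-picked prediction sets instead of a trained model. The helper names `mean_loss`, `min_loss`, and `average` are hypothetical.

```python
import numpy as np


def mean_loss(predictions, target):
    """Squared error averaged over all sampled predictions."""
    return float(np.mean((predictions - target) ** 2))


def min_loss(predictions, target):
    """Squared error of the single best sampled prediction (extreme value)."""
    return float(np.min((predictions - target) ** 2))


def average(loss_fn, predictions, targets):
    """Average a loss over the (equally likely) targets."""
    return float(np.mean([loss_fn(predictions, t) for t in targets]))


targets = np.array([-1.0, 1.0])    # two equally likely target values
collapsed = np.array([0.0, 0.0])   # every prediction sits at the target mean
spread = np.array([-1.0, 1.0])     # predictions cover the support of the targets

mean_collapsed = average(mean_loss, collapsed, targets)
mean_spread = average(mean_loss, spread, targets)
min_collapsed = average(min_loss, collapsed, targets)
min_spread = average(min_loss, spread, targets)
```

In this toy setting the mean loss is lower for the collapsed predictions (1.0 versus 2.0), while the minimum loss is lower for the spread predictions (0.0 versus 1.0), which matches the abstract's claim that the minimum rewards covering the support of the targets.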
Nicholas Guttenberg