
Stable Opponent Shaping in Differentiable Games
A growing number of learning methods are actually games which optimise multiple, interdependent objectives in parallel, from GANs and intrinsic curiosity to multi-agent RL. Opponent shaping is a powerful approach to improve learning dynamics in such games, accounting for the fact that the 'environment' includes agents adapting to one another's updates. Learning with Opponent-Learning Awareness (LOLA) is a recent algorithm which exploits this dynamic response and encourages cooperation in settings like the Iterated Prisoner's Dilemma. Although experimentally successful, we show that LOLA can exhibit 'arrogant' behaviour directly at odds with convergence. In fact, remarkably few algorithms have theoretical guarantees applying across all differentiable games. In this paper we present Stable Opponent Shaping (SOS), a new method that interpolates between LOLA and a stable variant named LookAhead. We prove that LookAhead locally converges and avoids strict saddles in all differentiable games, the strongest results in the field so far. SOS inherits these desirable guarantees, while also shaping the learning of opponents and consistently either matching or outperforming LOLA experimentally.
11/20/2018 ∙ by Alistair Letcher, et al.
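The instability that LookAhead repairs can be illustrated on a toy two-player bilinear game, where naive simultaneous gradient descent spirals away from the equilibrium while anticipating the opponent's update converges. This is a minimal hand-rolled sketch, not the authors' implementation; the game, learning rate and step count are illustrative assumptions.

```python
# Bilinear zero-sum game: L1(x, y) = x * y, L2(x, y) = -x * y,
# with unique equilibrium at (0, 0).

def naive_step(x, y, lr):
    # Simultaneous gradient descent: dL1/dx = y, dL2/dy = -x.
    return x - lr * y, y + lr * x

def lookahead_step(x, y, lr):
    # Each player evaluates its gradient after the opponent's anticipated
    # naive update, treating that update as fixed (no shaping term).
    x_pred = x - lr * y
    y_pred = y + lr * x
    return x - lr * y_pred, y + lr * x_pred

x = y = 1.0
for _ in range(1000):
    x, y = lookahead_step(x, y, lr=0.1)
# the iterates contract towards (0, 0); iterating naive_step from the
# same start instead spirals outwards
```

On this game the naive iterate grows by a factor sqrt(1 + lr^2) > 1 per step, while the LookAhead iterate contracts for small learning rates, which is the intuition behind its local convergence guarantee.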

A Survey of Reinforcement Learning Informed by Natural Language
To be successful in real-world tasks, Reinforcement Learning (RL) needs to exploit the compositional, relational, and hierarchical structure of the world, and learn to transfer it to the task at hand. Recent advances in representation learning for language make it possible to build models that acquire world knowledge from text corpora and integrate this knowledge into downstream decision-making problems. We thus argue that the time is right to investigate a tight integration of natural language understanding into RL in particular. We survey the state of the field, including work on instruction following, text games, and learning from textual domain knowledge. Finally, we call for the development of new environments as well as further investigation into the potential uses of recent Natural Language Processing (NLP) techniques for such tasks.
06/10/2019 ∙ by Jelena Luketina, et al.

Neural Variational Inference For Estimating Uncertainty in Knowledge Graph Embeddings
Recent advances in Neural Variational Inference have allowed for a renaissance in latent variable models in a variety of domains involving high-dimensional data. While traditional variational methods derive an analytical approximation for the intractable distribution over the latent variables, here we construct an inference network conditioned on the symbolic representation of entities and relation types in the Knowledge Graph, to provide the variational distributions. The new framework results in a highly scalable method. Under a Bernoulli sampling framework, we provide an alternative justification for commonly used techniques in large-scale stochastic variational inference, which drastically reduce training time at the cost of an additional approximation to the variational lower bound. We introduce two models from this highly scalable probabilistic framework, namely the Latent Information and Latent Fact models, for reasoning over knowledge graph-based representations. Our Latent Information and Latent Fact models improve upon baseline performance under certain conditions. We use the learnt embedding variance to estimate predictive uncertainty during link prediction, and discuss the quality of these learnt uncertainty estimates. Our source code and datasets are publicly available online at https://github.com/alexanderimanicowenrivers/NeuralVariationalKnowledgeGraphs.
06/12/2019 ∙ by Alexander I. Cowen-Rivers, et al.

Adversarial Sets for Regularising Neural Link Predictors
In adversarial training, a set of models learn together by pursuing competing goals, usually defined on single data instances. However, in relational learning and other non-i.i.d. domains, goals can also be defined over sets of instances. For example, a link predictor for the is-a relation needs to be consistent with the transitivity property: if is-a(x_1, x_2) and is-a(x_2, x_3) hold, is-a(x_1, x_3) needs to hold as well. Here we use such assumptions for deriving an inconsistency loss, measuring the degree to which the model violates the assumptions on an adversarially-generated set of examples. The training objective is defined as a minimax problem, where an adversary finds the most offending adversarial examples by maximising the inconsistency loss, and the model is trained by jointly minimising a supervised loss and the inconsistency loss on the adversarial examples. This yields the first method that can use function-free Horn clauses (as in Datalog) to regularise any neural link predictor, with complexity independent of the domain size. We show that for several link prediction models, the optimisation problem faced by the adversary has efficient closed-form solutions. Experiments on link prediction benchmarks indicate that given suitable prior knowledge, our method can significantly improve neural link predictors on all relevant metrics.
07/24/2017 ∙ by Pasquale Minervini, et al.
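The transitivity example can be turned into a concrete inconsistency loss. The bilinear scorer and all names below are illustrative assumptions, not the paper's exact model:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
R = rng.normal(size=(d, d)) * 0.1    # bilinear parameters for one relation

def score(x, y):
    """Bilinear truth score of the relation holding between x and y."""
    return float(x @ R @ y)

def inconsistency(x1, x2, x3):
    # Transitivity demands: if both premises hold, the conclusion holds,
    # so the conclusion's score should not fall below the weaker premise.
    premise = min(score(x1, x2), score(x2, x3))
    return max(0.0, premise - score(x1, x3))

# The adversary maximises this loss over (x1, x2, x3); the model then
# minimises it jointly with the supervised loss.
x1, x2, x3 = (rng.normal(size=d) for _ in range(3))
loss = inconsistency(x1, x2, x3)
```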

Programming with a Differentiable Forth Interpreter
Given that in practice training data is scarce for all but a small set of problems, a core question is how to incorporate prior knowledge into a model. In this paper, we consider the case of prior procedural knowledge for neural networks, such as knowing how a program should traverse a sequence, but not what local actions should be performed at each step. To this end, we present an end-to-end differentiable interpreter for the programming language Forth which enables programmers to write program sketches with slots that can be filled with behaviour trained from program input-output data. We can optimise this behaviour directly through gradient descent techniques on user-specified objectives, and also integrate the program into any larger neural computation graph. We show empirically that our interpreter is able to effectively leverage different levels of prior program structure and learn complex behaviours such as sequence sorting and addition. When connected to outputs of an LSTM and trained jointly, our interpreter achieves state-of-the-art accuracy for end-to-end reasoning about quantities expressed in natural language stories.
05/21/2016 ∙ by Matko Bošnjak, et al.

End-to-End Differentiable Proving
We introduce neural networks for end-to-end differentiable proving of queries to knowledge bases by operating on dense vector representations of symbols. These neural networks are constructed recursively by taking inspiration from the backward chaining algorithm as used in Prolog. Specifically, we replace symbolic unification with a differentiable computation on vector representations of symbols using a radial basis function kernel, thereby combining symbolic reasoning with learning sub-symbolic vector representations. By using gradient descent, the resulting neural network can be trained to infer facts from a given incomplete knowledge base. It learns to (i) place representations of similar symbols in close proximity in a vector space, (ii) make use of such similarities to prove queries, (iii) induce logical rules, and (iv) use provided and induced logical rules for multi-hop reasoning. We demonstrate that this architecture outperforms ComplEx, a state-of-the-art neural link prediction model, on three out of four benchmark knowledge bases while at the same time inducing interpretable function-free first-order logic rules.
05/31/2017 ∙ by Tim Rocktäschel, et al.
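The differentiable unification step can be sketched as an RBF kernel over symbol embeddings; the embeddings and bandwidth below are illustrative assumptions:

```python
import numpy as np

def rbf_unify(u, v, bandwidth=1.0):
    """Soft unification score in (0, 1]; equals 1 iff the embeddings coincide."""
    return np.exp(-np.sum((u - v) ** 2) / (2.0 * bandwidth ** 2))

# Illustrative symbol embeddings: similar symbols unify with a high score,
# dissimilar ones with a low score, so proof scores stay differentiable.
grandpa = np.array([0.9, 0.1, 0.0])
grandfather = np.array([0.85, 0.15, 0.05])
parent = np.array([0.0, 1.0, 0.7])

assert rbf_unify(grandpa, grandfather) > rbf_unify(grandpa, parent)
```

Because the score is differentiable in both arguments, gradients from a proof's success can pull the representations of symbols that should unify closer together.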

Frustratingly Short Attention Spans in Neural Language Modeling
Neural language models predict the next token using a latent representation of the immediate token history. Recently, various methods for augmenting neural language models with an attention mechanism over a differentiable memory have been proposed. For predicting the next token, these models query information from a memory of the recent history which can facilitate learning mid- and long-range dependencies. However, conventional attention mechanisms used in memory-augmented neural language models produce a single output vector per time step. This vector is used both for predicting the next token and as the key and value of a differentiable memory of the token history. In this paper, we propose a neural language model with a key-value attention mechanism that outputs separate representations for the key and value of a differentiable memory, as well as for encoding the next-word distribution. This model outperforms existing memory-augmented neural language models on two corpora. Yet, we found that our method mainly utilizes a memory of the five most recent output representations. This led to the unexpected main finding that a much simpler model based only on the concatenation of recent output representations from previous time steps is on par with more sophisticated memory-augmented neural language models.
02/15/2017 ∙ by Michał Daniluk, et al.
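A minimal sketch of key-value attention over a short memory, assuming random stand-ins for the recurrent outputs; all dimensions are illustrative:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Separate key and value roles: in the memory-augmented LM the keys and
# values would be distinct projections of past hidden states; here random
# vectors stand in for them.
rng = np.random.default_rng(0)
d, T = 8, 5                          # hidden size, memory length
keys = rng.normal(size=(T, d))       # keys emitted at the last T steps
values = rng.normal(size=(T, d))     # values emitted at the last T steps
query = rng.normal(size=d)           # current step's attention query

weights = softmax(keys @ query)      # attention over the short memory
context = weights @ values           # retrieved content vector
# `context` would then be combined with a third, separate prediction
# vector to form the next-word distribution.
```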

Learning Python Code Suggestion with a Sparse Pointer Network
To enhance developer productivity, all modern integrated development environments (IDEs) include code suggestion functionality that proposes likely next tokens at the cursor. While current IDEs work well for statically-typed languages, their reliance on type annotations means that they do not provide the same level of support for dynamic programming languages. Moreover, suggestion engines in modern IDEs do not propose expressions or multi-statement idiomatic code. Recent work has shown that language models can improve code suggestion systems by learning from software repositories. This paper introduces a neural language model with a sparse pointer network aimed at capturing very long-range dependencies. We release a large-scale code suggestion corpus of 41M lines of Python code crawled from GitHub. On this corpus, we found standard neural language models to perform well at suggesting local phenomena, but to struggle to refer to identifiers introduced many tokens in the past. By augmenting a neural language model with a pointer network specialized in referring to predefined classes of identifiers, we obtain a much lower perplexity and a 5 percentage point increase in code suggestion accuracy compared to an LSTM baseline. In fact, this increase in accuracy is due to 13-times more accurate prediction of identifiers. Furthermore, a qualitative analysis shows that this model indeed captures interesting long-range dependencies, like referring to a class member defined over 60 tokens in the past.
11/24/2016 ∙ by Avishkar Bhoopchand, et al.
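One way to combine a vocabulary distribution with a pointer over past identifiers is a learned gate that splits probability mass between the two sources; the toy sizes and fixed gate value below are assumptions, not the paper's exact parameterisation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy sizes: a 4-token vocabulary and 2 previously seen identifiers,
# whose vocabulary ids are given by `pointer_to_vocab`. `gate` would be
# predicted by the network; here it is fixed for illustration.
vocab_logits = np.array([1.0, 0.2, -0.5, 0.1])
pointer_logits = np.array([0.3, 2.0])
pointer_to_vocab = np.array([2, 3])

gate = 0.6                                   # mass kept on the LM softmax
p = gate * softmax(vocab_logits)
# scatter the pointer distribution's mass onto the identifiers' token ids
np.add.at(p, pointer_to_vocab, (1 - gate) * softmax(pointer_logits))
# `p` is a proper distribution mixing both sources.
```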

Lifted Rule Injection for Relation Embeddings
Methods based on representation learning currently hold the state-of-the-art in many natural language processing and knowledge base inference tasks. Yet, a major challenge is how to efficiently incorporate commonsense knowledge into such models. A recent approach regularizes relation and entity representations by propositionalization of first-order logic rules. However, propositionalization does not scale beyond domains with only few entities and rules. In this paper we present a highly efficient method for incorporating implication rules into distributed representations for automated knowledge base construction. We map entity-tuple embeddings into an approximately Boolean space and encourage a partial ordering over relation embeddings based on implication rules mined from WordNet. Surprisingly, we find that the strong restriction of the entity-tuple embedding space does not hurt the expressiveness of the model and even acts as a regularizer that improves generalization. By incorporating a few commonsense rules, we achieve an increase of 2 percentage points in mean average precision over a matrix factorization baseline, while observing a negligible increase in runtime.
06/27/2016 ∙ by Thomas Demeester, et al.
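The lifted ordering can be sketched as a component-wise implication loss on relation embeddings. Under the assumption that entity-tuple embeddings lie in [0, 1], the ordering r_body <= r_head ensures the body relation never scores higher than the head relation on any tuple; the relation names and vectors below are illustrative:

```python
import numpy as np

def implication_loss(r_body, r_head):
    """Penalise every component where the body embedding exceeds the head's."""
    return np.maximum(0.0, r_body - r_head).sum()

# Illustrative embeddings for a mined rule body => head.
r_is_a = np.array([0.2, 0.9, 0.1])        # body: is-a
r_related_to = np.array([0.5, 1.0, 0.4])  # head: related-to
loss = implication_loss(r_is_a, r_related_to)   # 0.0, ordering satisfied
```

Because the loss involves only the two relation embeddings, not any entity tuples, the rule is enforced in a "lifted" way whose cost is independent of the number of entities.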

Stance Detection with Bidirectional Conditional Encoding
Stance detection is the task of classifying the attitude expressed in a text towards a target such as Hillary Clinton as "positive", "negative" or "neutral". Previous work has assumed that either the target is mentioned in the text or that training data for every target is given. This paper considers the more challenging version of this task, where targets are not always mentioned and no training data is available for the test targets. We experiment with conditional LSTM encoding, which builds a representation of the tweet that is dependent on the target, and demonstrate that it outperforms encoding the tweet and the target independently. Performance is improved further when the conditional model is augmented with bidirectional encoding. We evaluate our approach on the SemEval 2016 Task 6 Twitter Stance Detection corpus, achieving performance second best only to a system trained on semi-automatically labelled tweets for the test target. When such weak supervision is added, our approach achieves state-of-the-art results.
06/17/2016 ∙ by Isabelle Augenstein, et al.
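Conditional encoding can be sketched with a plain tanh RNN standing in for the LSTM: the target is encoded first, and its final state initialises the tweet encoder, so the tweet representation depends on the target. Weights and inputs below are random placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6
# Recurrent weights (small random placeholders).
W_x = rng.normal(size=(d, d)) * 0.1
W_h = rng.normal(size=(d, d)) * 0.1

def encode(tokens, h0):
    """Run a simple tanh RNN over token vectors from initial state h0."""
    h = h0
    for x in tokens:
        h = np.tanh(W_x @ x + W_h @ h)
    return h

target_tokens = [rng.normal(size=d) for _ in range(3)]
tweet_tokens = [rng.normal(size=d) for _ in range(5)]

h_target = encode(target_tokens, np.zeros(d))
h_tweet = encode(tweet_tokens, h_target)      # conditioned on the target
h_indep = encode(tweet_tokens, np.zeros(d))   # target-independent baseline
# The two tweet encodings differ: the same tweet yields a different
# representation for a different target, which is the point of the model.
```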

MuFuRU: The Multi-Function Recurrent Unit
Recurrent neural networks such as the GRU and LSTM have found wide adoption in natural language processing and achieve state-of-the-art results for many tasks. These models are characterized by a memory state that can be written to and read from by applying gated composition operations to the current input and the previous state. However, they only cover a small subset of potentially useful compositions. We propose Multi-Function Recurrent Units (MuFuRUs) that allow for arbitrary differentiable functions as composition operations. Furthermore, MuFuRUs allow for a learned, input- and state-dependent choice of these composition operations. Our experiments demonstrate that the additional functionality helps in different sequence modeling tasks, including the evaluation of propositional logic formulae, language modeling and sentiment analysis.
06/09/2016 ∙ by Dirk Weissenborn, et al.
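A MuFuRU-style update can be sketched as a soft, learned choice among several composition operations; the operation set and the gating logits here are illustrative assumptions:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Candidate composition operations over previous state s and input feature v.
ops = [
    lambda s, v: s,                  # keep
    lambda s, v: v,                  # replace
    lambda s, v: np.maximum(s, v),   # max
    lambda s, v: s * v,              # mul
]

def mufuru_step(s, v, op_logits):
    # In the model the logits are a function of input and state; softmax
    # turns them into a soft, differentiable choice among the operations
    # (in the same order as `ops`).
    weights = softmax(op_logits)
    outs = np.stack([op(s, v) for op in ops])
    return weights @ outs

s = np.array([0.5, -0.2])
v = np.array([0.1, 0.4])
new_s = mufuru_step(s, v, np.array([2.0, 0.0, 0.0, 0.0]))
# logits peaked on "keep", so the new state stays close to s
```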
Tim Rocktäschel
Postdoctoral Researcher at University of Oxford; Junior Research Fellow at Jesus College; Stipendiary Lecturer at Hertford College; PhD from UCL. Deep & Reinforcement Learning, NLP.