Context Aware Machine Learning

01/10/2019
by Yun Zeng, et al.

We propose a principle for exploring context in machine learning models. Starting with the simple assumption that each observation may or may not depend on its context, a conditional probability distribution is decomposed into two parts: context-free and context-sensitive. Then, by employing the log-linear word production model to relate random variables to their embedding-space representations and by making use of the convexity of the natural exponential function, we show that the embedding of an observation can likewise be decomposed into a weighted sum of two vectors representing its context-free and context-sensitive parts, respectively. This simple treatment of context provides a unified view of almost all existing deep learning models and leads to revisions of these models that achieve significant performance gains. Specifically, our upgraded version of a recent sentence embedding model not only outperforms the original by a large margin, but also yields a new, principled approach for composing the embeddings of bag-of-words features, as well as a new architecture for modeling attention in deep neural networks. More surprisingly, our new principle provides a novel understanding of the gates and equations defined by the long short-term memory (LSTM) model, which in turn leads to a new model that converges significantly faster and achieves much lower prediction errors. Furthermore, our principle inspires a new type of generic neural network layer that resembles real biological neurons more closely than the traditional architecture of a linear mapping followed by a nonlinear activation. Its multi-layer extension provides a new principle for deep neural networks that subsumes the residual network (ResNet) as a special case, and its extension to the convolutional neural network model accounts for irrelevant input (e.g., background in an image) in addition to filtering.
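To make the decomposition concrete, the following is a minimal worked sketch of the two-part factorization described above. The mixture weight \(\lambda\) and the symbols \(w\) (observation), \(c\) (context), and \(v\) (embedding) are notation chosen here for illustration and are not necessarily those used in the paper.

% Hedged sketch: context-free / context-sensitive mixture of the conditional distribution.
\[
  p(w \mid c) \;=\; (1-\lambda)\, p_{\mathrm{free}}(w) \;+\; \lambda\, p_{\mathrm{ctx}}(w \mid c),
  \qquad 0 \le \lambda \le 1 .
\]
% Under a log-linear production model, p(w | c) is proportional to exp(<v, u_c>);
% using the convexity of the exponential function, the embedding of w then admits
% an analogous weighted-sum decomposition in embedding space:
\[
  v(w, c) \;\approx\; (1-\lambda)\, v_{\mathrm{free}}(w) \;+\; \lambda\, v_{\mathrm{ctx}}(w, c).
\]

Under this reading, the revised sentence-embedding, attention, and LSTM variants mentioned in the abstract can be seen as different choices of how the weight \(\lambda\) and the context-sensitive term are computed.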


