Context is Everything: Finding Meaning Statistically in Semantic Spaces

03/22/2018
by Eric Zelikman, et al.

This paper introduces a simple and explicit measure of word importance in a global context, which also works for very small contexts (10+ sentences). In a word-vector space containing both 2-gram clauses and single tokens, more contextually significant words were found to disproportionately define clause meanings. Using this relationship in a weighted bag-of-words sentence embedding model yields sentence vectors that outperform the state of the art for subjectivity/objectivity analysis and for paraphrase detection, and that fall within the range of results produced by state-of-the-art models on six other transfer learning tests. The metric is then extended to a sentence/document summarizer, an improved (and context-aware) cosine distance, and a simple document stop word identifier. The sigmoid-global context weighted bag of words is presented as a new baseline for sentence embeddings.
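
To make the idea of a sigmoid-weighted bag of words concrete, here is a minimal sketch in plain Python. The abstract does not state the importance formula, so the score used here is an assumption: each word's cosine distance from the average context vector, standardized over the vocabulary and passed through a sigmoid. The function names (`context_weights`, `sentence_embedding`) and the toy two-dimensional vectors are hypothetical and are not taken from the paper.

```python
import math
import statistics


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def context_weights(embeddings, context_tokens):
    """Assign each known word in the context a weight in (0, 1).

    ASSUMPTION (not the paper's stated formula): importance is the word's
    cosine distance from the mean context vector, z-scored across the
    vocabulary and squashed with a sigmoid.
    """
    known = [t for t in context_tokens if t in embeddings]
    if not known:
        raise ValueError("no context token has an embedding")
    # Mean over token occurrences, so frequent words pull the average toward them.
    context_mean = [sum(col) / len(known)
                    for col in zip(*(embeddings[t] for t in known))]
    vocab = sorted(set(known))
    distance = {t: 1.0 - cosine_similarity(embeddings[t], context_mean)
                for t in vocab}
    mu = statistics.mean(distance.values())
    sd = statistics.pstdev(distance.values())
    if sd == 0.0:
        # All words are equally far from the context mean: no preference.
        return {t: 0.5 for t in vocab}
    return {t: sigmoid((distance[t] - mu) / sd) for t in vocab}


def sentence_embedding(tokens, embeddings, weights):
    """Weighted average of word vectors; tokens without a vector or weight are skipped."""
    known = [t for t in tokens if t in embeddings and t in weights]
    if not known:
        raise ValueError("no token in the sentence has both a vector and a weight")
    total = sum(weights[t] for t in known)
    dim = len(embeddings[known[0]])
    return [sum(weights[t] * embeddings[t][i] for t in known) / total
            for i in range(dim)]


if __name__ == "__main__":
    # Toy 2-dimensional vectors, invented purely for illustration.
    toy_embeddings = {"the": [1.0, 0.0], "cat": [0.0, 1.0], "sat": [0.6, 0.8]}
    toy_context = ["the", "the", "the", "cat", "sat"]
    toy_weights = context_weights(toy_embeddings, toy_context)
    print(toy_weights)
    print(sentence_embedding(["the", "cat"], toy_embeddings, toy_weights))
```

With uniform weights, `sentence_embedding` reduces to a plain averaged bag of words, so the weighting is the only thing that changes. Swapping in a different importance score only requires replacing `context_weights`.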
