Firearms and Tigers are Dangerous, Kitchen Knives and Zebras are Not: Testing whether Word Embeddings Can Tell

09/05/2018
by Pia Sommerauer, et al.

This paper presents an approach for investigating the nature of semantic information captured by word embeddings. We propose a method that extends an existing human-elicited semantic property dataset with gold negative examples using crowd judgments. Our experimental approach tests the ability of supervised classifiers to identify semantic features in word embedding vectors and compares this to a feature-identification method based on full-vector cosine similarity. The idea behind this method is that properties identified by classifiers, but not through full-vector comparison, are captured by embeddings. Properties that cannot be identified by either method are not captured. Our results provide an initial indication that semantic properties relevant to the way entities interact (e.g. dangerous) are captured, while perceptual information (e.g. colors) is not represented. We conclude that, though preliminary, these results show that our method is suitable for identifying which properties are captured by embeddings.
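
The comparison the abstract describes, a supervised classifier versus a full-vector cosine-similarity method, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the toy "embeddings" are random vectors with the property planted in one dimension, the classifier is a plain logistic regression, and the full-vector baseline labels a word by comparing its cosine similarity to the centroids of positive and negative training words. All function names are hypothetical.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def centroid_cosine_predict(train_X, train_y, test_X):
    """Full-vector baseline (an assumed variant): label a word positive if it is
    closer, by cosine, to the positive centroid than to the negative centroid."""
    pos_centroid = train_X[train_y == 1].mean(axis=0)
    neg_centroid = train_X[train_y == 0].mean(axis=0)
    return np.array([
        int(cosine_similarity(x, pos_centroid) > cosine_similarity(x, neg_centroid))
        for x in test_X
    ])

def train_logistic_regression(X, y, lr=0.1, steps=500):
    """Fit weights and bias by batch gradient descent on the logistic loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        grad = p - y                             # gradient w.r.t. the logits
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def logistic_predict(X, w, b):
    """Predict 1 where the linear score is positive, else 0."""
    return (X @ w + b > 0).astype(int)

def make_toy_embeddings(n, dim, rng):
    """Toy stand-in for real embeddings: Gaussian noise, with the property
    (e.g. 'dangerous') shifted along dimension 0. Purely an assumption."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, dim))
    X[:, 0] += np.where(y == 1, 2.0, -2.0)
    return X, y

rng = np.random.default_rng(0)
train_X, train_y = make_toy_embeddings(200, 50, rng)
test_X, test_y = make_toy_embeddings(100, 50, rng)

w, b = train_logistic_regression(train_X, train_y)
clf_acc = float((logistic_predict(test_X, w, b) == test_y).mean())
cos_acc = float((centroid_cosine_predict(train_X, train_y, test_X) == test_y).mean())
print(f"classifier accuracy: {clf_acc:.2f}, cosine baseline accuracy: {cos_acc:.2f}")
```

With real embeddings and the crowd-annotated positive and negative examples, the two accuracies would be computed per property and then compared, following the reasoning in the abstract.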


research
04/05/2023

PWESuite: Phonetic Word Embeddings and Tasks They Facilitate

Word embeddings that map words into a fixed-dimensional vector space are...
research
03/13/2018

Enhanced Word Representations for Bridging Anaphora Resolution

Most current models of word representations (e.g., GloVe) have successfull...
research
10/04/2016

Are Word Embedding-based Features Useful for Sarcasm Detection?

This paper makes a simple increment to state-of-the-art in sarcasm detec...
research
08/18/2018

SeVeN: Augmenting Word Embeddings with Unsupervised Relation Vectors

We present SeVeN (Semantic Vector Networks), a hybrid resource that enco...
research
07/19/2018

Imparting Interpretability to Word Embeddings

As a ubiquitous method in natural language processing, word embeddings ...
research
10/07/2019

Correlations between Word Vector Sets

Similarity measures based purely on word embeddings are comfortably comp...
research
05/30/2018

What the Vec? Towards Probabilistically Grounded Embeddings

Vector representation, or embedding, of words is commonly achieved with ...
