Semantic projection: recovering human knowledge of multiple, distinct object features from word embeddings

02/05/2018
by Gabriel Grand et al.

The words of a language reflect the structure of the human mind, allowing us to transmit thoughts between individuals. However, language can represent only a subset of our rich and detailed cognitive architecture. Here, we ask what kinds of common knowledge (semantic memory) are captured by word meanings (lexical semantics). We examine a prominent computational model that represents words as vectors in a multidimensional space, such that proximity between word-vectors approximates semantic relatedness. Because related words appear in similar contexts, such spaces - called "word embeddings" - can be learned from patterns of lexical co-occurrences in natural language. Despite their popularity, a fundamental concern about word embeddings is that they appear to be semantically "rigid": inter-word proximity captures only overall similarity, yet human judgments about object similarities are highly context-dependent and involve multiple, distinct semantic features. For example, dolphins and alligators appear similar in size, but differ in intelligence and aggressiveness. Could such context-dependent relationships be recovered from word embeddings? To address this issue, we introduce a powerful, domain-general solution: "semantic projection" of word-vectors onto lines that represent various object features, like size (the line extending from the word "small" to "big"), intelligence (from "dumb" to "smart"), or danger (from "safe" to "dangerous"). This method, which is intuitively analogous to placing objects "on a mental scale" between two extremes, recovers human judgments across a range of object categories and properties. We thus show that word embeddings inherit a wealth of common knowledge from word co-occurrence statistics and can be flexibly manipulated to express context-dependent meanings.
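The projection step described in the abstract can be sketched in a few lines of code. The snippet below is a minimal illustration, not the authors' implementation: it assumes pre-trained word vectors (e.g., GloVe or word2vec) are available in a dictionary-like object, and the names embeddings, get_vector, and semantic_projection are introduced here purely for illustration. A feature line is defined by subtracting the vector for one pole (e.g., "small") from the vector for the other (e.g., "big"), and each object word is scored by its dot product with that normalized direction.

    import numpy as np

    def get_vector(word, embeddings):
        # Hypothetical lookup: maps a word to its embedding vector
        # (e.g., loaded from pre-trained GloVe or word2vec files).
        return embeddings[word]

    def semantic_projection(object_words, neg_word, pos_word, embeddings):
        """Project object word-vectors onto the line from neg_word to pos_word.

        Returns a score per object; larger scores place the object closer to
        the pos_word end of the feature scale (e.g., "small" -> "big").
        """
        # Feature axis: the difference between the two pole vectors,
        # normalized to unit length so scores are comparable across features.
        direction = get_vector(pos_word, embeddings) - get_vector(neg_word, embeddings)
        direction = direction / np.linalg.norm(direction)
        # Score each object by its projection onto the feature axis.
        return {w: float(np.dot(get_vector(w, embeddings), direction))
                for w in object_words}

    # Example usage (assuming `emb` holds real word vectors):
    # size_scores = semantic_projection(["dolphin", "alligator", "whale"],
    #                                   "small", "big", emb)
    # sorted(size_scores, key=size_scores.get)  # objects ordered by implied size

In the paper, such projections onto feature lines are compared against human ratings of the same objects along the corresponding scales (size, intelligence, danger, and so on).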

