Comparative study of LSA vs Word2vec embeddings in small corpora: a case study in dreams database

10/05/2016
by Edgar Altszyler, et al.

Word embeddings have been extensively studied in large text datasets. However, only a few studies analyze semantic representations of small corpora, which are particularly relevant in studies of single-person text production. In the present paper, we compare the capabilities of Skip-gram and LSA in this scenario, and we test both techniques for extracting relevant semantic patterns from single-series dream reports. LSA performed better than Skip-gram on small training corpora in two semantic tests. As a case study, we show that LSA can capture relevant word associations in dream report series, even when the number of dreams is small or the words are infrequent. We propose that LSA can be used to explore word associations in dream reports, which could bring new insight into this classic research area of psychology.
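
To make the LSA side of the comparison concrete, the following is a minimal sketch of how LSA word vectors can be derived from a very small corpus: build a term-document count matrix, take a truncated SVD, and compare words by cosine similarity. It is not the authors' pipeline. The four toy "dream reports", the function names `lsa_word_vectors` and `cosine`, the use of raw counts without any term weighting, and the choice of `k=2` dimensions are all assumptions made for illustration.

```python
import numpy as np


def lsa_word_vectors(docs, k=2):
    """Return (vocab, vectors); each row of vectors is a k-dimensional LSA embedding."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({w for doc in tokenized for w in doc})
    index = {w: i for i, w in enumerate(vocab)}

    # Term-document matrix: rows are words, columns are documents (raw counts,
    # an assumption; weighting schemes are often applied at this step).
    counts = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(tokenized):
        for w in doc:
            counts[index[w], j] += 1

    # Truncated SVD: keep only the k largest singular directions.
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    k = min(k, len(s))
    return vocab, u[:, :k] * s[:k]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


# Hypothetical toy corpus, invented for this sketch (not data from the paper).
docs = [
    "i swam in the sea and the waves were high",
    "the sea was calm and the waves were small",
    "i was late for school and the teacher was angry",
    "the teacher gave me a test at school",
]

vocab, vectors = lsa_word_vectors(docs, k=2)
sea, waves, school = (vectors[vocab.index(w)] for w in ("sea", "waves", "school"))
print("sea ~ waves :", round(cosine(sea, waves), 3))
print("sea ~ school:", round(cosine(sea, school), 3))
```

In this toy corpus "sea" and "waves" occur in exactly the same documents with the same counts, so their LSA vectors coincide. Even a handful of documents therefore yields a usable association signal, which is the regime the abstract is concerned with. A Skip-gram model would instead be trained on local context windows, and this sketch does not attempt to reproduce that side of the comparison.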

research
11/15/2022

The Dependence on Frequency of Word Embedding Similarity Measures

Recent research has shown that static word embeddings can encode word fr...
research
02/25/2015

Breaking Sticks and Ambiguities with Adaptive Skip-gram

Recently proposed Skip-gram model is a powerful method for learning high...
research
01/02/2023

The Undesirable Dependence on Frequency of Gender Bias Metrics Based on Word Embeddings

Numerous works use word embedding-based metrics to quantify societal bia...
research
02/14/2020

Semantic Relatedness and Taxonomic Word Embeddings

This paper connects a series of papers dealing with taxonomic word embed...
research
03/19/2021

TextEssence: A Tool for Interactive Analysis of Semantic Shifts Between Corpora

Embeddings of words and concepts capture syntactic and semantic regulari...
research
05/23/2019

Exploring Diseases and Syndromes in Neurology Case Reports from 1955 to 2017 with Text Mining

Background: A large number of neurology case reports have been published...
research
10/19/2021

Inter-Sense: An Investigation of Sensory Blending in Fiction

This study reports on the semantic organization of English sensory descr...
