Evaluation of Word Embeddings for the Social Sciences

02/13/2023
by Ricardo Schiffers, et al.

Word embeddings are an essential instrument in many NLP tasks. Most available resources are trained on general language from Web corpora or Wikipedia dumps. However, word embeddings for domain-specific language are rare, in particular for the social science domain. Therefore, in this work, we describe the creation and evaluation of word embedding models based on 37,604 open-access social science research papers. In the evaluation, we compare domain-specific and general language models for (i) language coverage, (ii) diversity, and (iii) semantic relationships. We found that the created domain-specific model, even with a relatively small vocabulary size, covers a large part of social science concepts, and that their neighborhoods are diverse in comparison to more general models. Across all relation types, we found a more extensive coverage of semantic relationships.
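
As an illustration of what a language coverage comparison can look like, the following is a minimal sketch and not the procedure used in the paper. It assumes coverage is measured as the share of a concept list that appears in a model's vocabulary after lowercasing; the function name `coverage`, the toy vocabularies, and the concept list are all hypothetical.

```python
def coverage(concepts, vocabulary):
    """Return the fraction of concepts found in the vocabulary.

    Assumption: matching is exact after lowercasing; multi-word
    concepts and lemmatization are not handled in this sketch.
    """
    vocab = {word.lower() for word in vocabulary}
    normalized = [concept.lower() for concept in concepts]
    if not normalized:
        return 0.0
    hits = sum(1 for concept in normalized if concept in vocab)
    return hits / len(normalized)


# Hypothetical toy data standing in for real model vocabularies.
domain_vocab = ["survey", "respondent", "migration", "inequality"]
general_vocab = ["survey", "football", "migration", "weather"]
concepts = ["Survey", "Migration", "Inequality", "Respondent"]

domain_cov = coverage(concepts, domain_vocab)
general_cov = coverage(concepts, general_vocab)
print(f"domain: {domain_cov:.2f}, general: {general_cov:.2f}")
```

With real models, the vocabularies would come from the trained embeddings and the concept list from a domain resource such as a thesaurus; both choices are left open here.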


research
03/26/2019

Deep Learning and Word Embeddings for Tweet Classification for Crisis Response

Tradition tweet classification models for crisis response focus on convo...
research
10/06/2022

Domain-Specific Word Embeddings with Structure Prediction

Complementary to finding good general word embeddings, an important ques...
research
06/15/2023

Domain-specific ChatBots for Science using Embeddings

Large language models (LLMs) have emerged as powerful machine-learning s...
research
10/09/2020

Top-Rank-Focused Adaptive Vote Collection for the Evaluation of Domain-Specific Semantic Models

The growth of domain-specific applications of semantic models, boosted b...
research
02/25/2020

Language-Independent Tokenisation Rivals Language-Specific Tokenisation for Word Similarity Prediction

Language-independent tokenisation (LIT) methods that do not require labe...
research
12/16/2021

Unsupervised Matching of Data and Text

Entity resolution is a widely studied problem with several proposals to ...
research
02/01/2021

Automatic Expansion of Domain-Specific Affective Models for Web Intelligence Applications

Sentic computing relies on well-defined affective models of different co...
