Insights into Analogy Completion from the Biomedical Domain

06/07/2017
by Denis Newman-Griffis, et al.

Analogy completion has been a popular task in recent years for evaluating the semantic properties of word embeddings, but the standard methodology makes a number of assumptions about analogies that do not always hold, either in recent benchmark datasets or when expanding into other domains. Through an analysis of analogies in the biomedical domain, we identify three assumptions: that of a Single Answer for any given analogy, that the pairs involved describe the Same Relationship, and that each pair is Informative with respect to the other. We propose modifying the standard methodology to relax these assumptions by allowing for multiple correct answers, reporting MAP and MRR in addition to accuracy, and using multiple example pairs. We further present BMASS, a novel dataset for evaluating linguistic regularities in biomedical embeddings, and demonstrate that the relationships described in the dataset pose significant semantic challenges to current word embedding methods.
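To make the proposed evaluation change concrete, the following is a minimal sketch of how accuracy, MRR, and MAP could be computed once an analogy is allowed to have several correct answers. It is not the authors' implementation: the function names, the input layout (a ranked candidate list paired with a set of correct answers), and the choice to normalize average precision by the number of correct answers are all assumptions made for illustration, and the tokens in the usage example are placeholders rather than data from BMASS.

```python
from typing import Dict, List, Sequence, Set, Tuple

# One query = (candidates ranked best-first, set of all correct answers).
# This layout is an illustrative assumption, not the paper's data format.
Query = Tuple[Sequence[str], Set[str]]


def reciprocal_rank(ranked: Sequence[str], correct: Set[str]) -> float:
    """Return 1/rank of the first correct answer, or 0.0 if none is ranked."""
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in correct:
            return 1.0 / rank
    return 0.0


def average_precision(ranked: Sequence[str], correct: Set[str]) -> float:
    """Average of precision@k over the ranks k that hold a correct answer.

    Normalized by the total number of correct answers (an assumption),
    so correct answers missing from the ranking lower the score.
    """
    if not correct:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in correct:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(correct)


def evaluate(queries: List[Query]) -> Dict[str, float]:
    """Compute accuracy (top guess is any correct answer), MRR, and MAP."""
    n = len(queries)
    if n == 0:
        raise ValueError("need at least one query")
    accuracy = sum(1.0 for r, c in queries if r and r[0] in c) / n
    mrr = sum(reciprocal_rank(r, c) for r, c in queries) / n
    mean_ap = sum(average_precision(r, c) for r, c in queries) / n
    return {"accuracy": accuracy, "mrr": mrr, "map": mean_ap}


if __name__ == "__main__":
    # Placeholder tokens only; the first query has two correct answers.
    toy_queries: List[Query] = [
        (["a", "b", "c"], {"a", "c"}),
        (["x", "y", "z"], {"y"}),
    ]
    print(evaluate(toy_queries))
```

Under a single-answer accuracy metric, the second toy query would simply count as a miss. MRR and MAP still credit the correct answer found at rank two, which is the kind of partial credit the relaxed methodology is meant to capture.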

