Improving Correlation with Human Judgments by Integrating Semantic Similarity with Second-Order Vectors

09/02/2016
by Bridget T. McInnes, et al.

Vector space methods that measure semantic similarity and relatedness often rely on distributional information, such as co-occurrence frequencies or statistical measures of association, to weight the importance of particular co-occurrences. In this paper, we extend these methods by incorporating a measure of semantic similarity based on a human-curated taxonomy into a second-order vector representation. This results in a measure of semantic relatedness that combines the contextual information available in a corpus-based vector space representation with the semantic knowledge found in a biomedical ontology. Our results show that incorporating semantic similarity into second-order co-occurrence matrices improves correlation with human judgments for both similarity and relatedness, and that our method compares favorably to several word embedding methods that have recently been evaluated on the same reference standards we have used.
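The abstract does not say which similarity measure is used or exactly how it enters the matrix, so the following is only a minimal sketch of one possible reading: the cells of a first-order co-occurrence matrix that hold a non-zero count are replaced by a taxonomy-based similarity score, and a second-order vector is the average of the first-order rows of a concept's context terms. The toy taxonomy, the co-occurrence counts, the path-based similarity, and every function name below are hypothetical illustrations, not taken from the paper.

```python
import math

# Hypothetical toy is-a taxonomy (child -> parent); not from the paper.
PARENT = {
    "disease": None,
    "heart_disease": "disease",
    "lung_disease": "disease",
    "angina": "heart_disease",
    "asthma": "lung_disease",
}

# Hypothetical corpus co-occurrence counts for unordered term pairs.
COOCCUR = {
    ("angina", "heart_disease"): 4,
    ("angina", "asthma"): 1,
    ("asthma", "lung_disease"): 3,
    ("heart_disease", "disease"): 2,
    ("lung_disease", "disease"): 2,
}

VOCAB = sorted(PARENT)


def path_to_root(node):
    """Return the list of nodes from `node` up to the taxonomy root."""
    path = [node]
    while PARENT[node] is not None:
        node = PARENT[node]
        path.append(node)
    return path


def path_similarity(a, b):
    """Assumed measure: 1 / (number of nodes on the shortest is-a path)."""
    steps_from_b = {n: i for i, n in enumerate(path_to_root(b))}
    for steps_from_a, n in enumerate(path_to_root(a)):
        if n in steps_from_b:
            return 1.0 / (steps_from_a + steps_from_b[n] + 1)
    return 0.0


def first_order_vector(term, use_similarity=True):
    """Row of the co-occurrence matrix for `term`.

    With use_similarity=True, each non-zero count is replaced by the
    taxonomy similarity of the co-occurring pair (an assumption here).
    """
    vec = []
    for other in VOCAB:
        count = COOCCUR.get((term, other), 0) + COOCCUR.get((other, term), 0)
        if count == 0:
            vec.append(0.0)
        elif use_similarity:
            vec.append(path_similarity(term, other))
        else:
            vec.append(float(count))
    return vec


def second_order_vector(context_terms, use_similarity=True):
    """Average the first-order rows of the terms describing a concept."""
    rows = [first_order_vector(t, use_similarity) for t in context_terms]
    return [sum(col) / len(rows) for col in zip(*rows)]


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


# Hypothetical context terms (e.g. words from a definition) for two concepts.
concept_a = second_order_vector(["angina", "heart_disease"])
concept_b = second_order_vector(["asthma", "lung_disease"])
relatedness = cosine(concept_a, concept_b)
```

Passing `use_similarity=False` gives the plain frequency-weighted baseline, which makes it easy to compare the two weightings on the same toy data.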

research
10/21/2019

A Comparison of Semantic Similarity Methods for Maximum Human Interpretability

The inclusion of semantic information in any similarity measures improve...
research
05/12/2018

Weight Initialization in Neural Language Models

Semantic Similarity is an important application which finds its use in m...
research
09/21/2017

Retrofitting Concept Vector Representations of Medical Concepts to Improve Estimates of Semantic Similarity and Relatedness

Estimation of semantic similarity and relatedness between biomedical con...
research
09/30/2022

Evaluation of taxonomic and neural embedding methods for calculating semantic similarity

Modelling semantic similarity plays a fundamental role in lexical semant...
research
09/01/2020

Document Similarity from Vector Space Densities

We propose a computationally light method for estimating similarities be...
research
02/08/2023

A Parametric Similarity Method: Comparative Experiments based on Semantically Annotated Large Datasets

We present the parametric method SemSimp aimed at measuring semantic sim...
research
04/14/2020

Multi-Ontology Refined Embeddings (MORE): A Hybrid Multi-Ontology and Corpus-based Semantic Representation for Biomedical Concepts

Objective: Currently, a major limitation for natural language processing...
