Combining Contrastive Learning and Knowledge Graph Embeddings to develop medical word embeddings for the Italian language

11/09/2022
by Denys Amore Bondarenko, et al.
Word embeddings play a significant role in today's Natural Language Processing tasks and applications. While pre-trained models can be employed directly and integrated into existing pipelines, they are often fine-tuned to better fit specific languages or domains. In this paper, we attempt to improve the available embeddings in the still uncovered niche of the Italian medical domain by combining Contrastive Learning (CL) and Knowledge Graph Embedding (KGE). The main objective is to improve the accuracy of semantic similarity between medical terms, which also serves as the evaluation task. Since medical texts and controlled vocabularies are scarce for the Italian language, we have developed a specific solution that combines preexisting CL methods (multi-similarity loss, contextualization, dynamic sampling) with the integration of KGEs, creating a new variant of the loss. Although the approach does not outperform the state of the art, represented by multilingual models, the results are encouraging: they show a significant leap in performance over the starting model while using a significantly smaller amount of data.
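For readers unfamiliar with the multi-similarity loss mentioned in the abstract, the following is a minimal sketch of its generic form (as published by Wang et al., 2019), not the KGE-augmented variant this paper proposes, whose details are not given here. The function names, the default values of `alpha`, `beta` and `lam`, the omission of pair mining, and the toy vectors are all assumptions made for illustration; labels stand in for hypothetical concept identifiers shared by synonymous terms.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def multi_similarity_loss(embeddings, labels, alpha=2.0, beta=50.0, lam=0.5):
    """Generic multi-similarity loss, without the pair-mining step.

    alpha, beta and lam are illustrative defaults (an assumption),
    not values taken from the paper described above.
    """
    n = len(embeddings)
    total = 0.0
    for i in range(n):
        pos_sum = 0.0  # terms sharing the anchor's label
        neg_sum = 0.0  # terms with a different label
        for k in range(n):
            if k == i:
                continue
            s = cosine(embeddings[i], embeddings[k])
            if labels[k] == labels[i]:
                # Positives are penalized when similarity falls below lam.
                pos_sum += math.exp(-alpha * (s - lam))
            else:
                # Negatives are penalized when similarity rises above lam.
                neg_sum += math.exp(beta * (s - lam))
        total += math.log1p(pos_sum) / alpha + math.log1p(neg_sum) / beta
    return total / n

# Toy 2-D "term embeddings": two tight clusters.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
consistent = [0, 0, 1, 1]    # labels agree with the clusters
inconsistent = [0, 1, 0, 1]  # labels cut across the clusters

loss_consistent = multi_similarity_loss(vectors, consistent)
loss_inconsistent = multi_similarity_loss(vectors, inconsistent)
print(loss_consistent < loss_inconsistent)
```

In this toy setup the loss is lower when terms with the same label sit close together, which is the behavior a contrastive objective of this kind rewards during fine-tuning.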

research
11/05/2020

CODER: Knowledge infused cross-lingual medical term embedding for term normalization

We propose a novel medical term embedding method named CODER, which stan...
research
01/05/2021

Integration of Domain Knowledge using Medical Knowledge Graph Deep Learning for Cancer Phenotyping

A key component of deep learning (DL) for natural language processing (N...
research
08/14/2017

Improved Answer Selection with Pre-Trained Word Embeddings

This paper evaluates existing and newly proposed answer selection method...
research
08/22/2022

Repurposing Knowledge Graph Embeddings for Triple Representation via Weak Supervision

The majority of knowledge graph embedding techniques treat entities and ...
research
03/24/2020

Can Embeddings Adequately Represent Medical Terminology? New Large-Scale Medical Term Similarity Datasets Have the Answer!

A large number of embeddings trained on medical data have emerged, but i...
research
05/09/2018

Adversarial Contrastive Estimation

Learning by contrasting positive and negative samples is a general strat...
research
09/19/2019

Extracting Conceptual Knowledge from Natural Language Text Using Maximum Likelihood Principle

Domain-specific knowledge graphs constructed from natural language text ...
