SMedBERT: A Knowledge-Enhanced Pre-trained Language Model with Structured Semantics for Medical Text Mining

08/20/2021 · by Taolin Zhang, et al.

Recently, the performance of Pre-trained Language Models (PLMs) has been significantly improved by injecting knowledge facts that enhance their language understanding abilities. For the medical domain, background knowledge sources are especially useful, since the massive number of medical terms and their complicated relations are difficult to understand from text alone. In this work, we introduce SMedBERT, a medical PLM trained on large-scale medical corpora that incorporates deep structured semantic knowledge from the neighbors of linked entities. In SMedBERT, we propose a mention-neighbor hybrid attention to learn heterogeneous-entity information, which infuses the semantic representations of entity types into the homogeneous neighboring entity structure. Beyond integrating knowledge as external features, we propose to employ the neighbors of linked entities in the knowledge graph as additional global contexts for text mentions, allowing mentions to communicate via shared neighbors and thus enriching their semantic representations. Experiments demonstrate that SMedBERT significantly outperforms strong baselines on various knowledge-intensive Chinese medical tasks. It also improves performance on other tasks such as question answering, question matching, and natural language inference.
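To make the mechanism concrete, below is a minimal PyTorch sketch of how a mention-neighbor attention of this kind could be wired: a text mention attends over the knowledge-graph neighbors of its linked entity, with entity-type embeddings concatenated into the keys and values so that type semantics shape the attention weights. The module name, tensor shapes, and the residual fusion step are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MentionNeighborAttention(nn.Module):
    """Illustrative sketch: a mention representation attends over the KG
    neighbors of its linked entity, gated by entity-type embeddings."""

    def __init__(self, hidden_dim: int, type_dim: int):
        super().__init__()
        self.q_proj = nn.Linear(hidden_dim, hidden_dim)
        # Keys/values see both the neighbor embedding and its type embedding,
        # so heterogeneous type information influences the attention scores.
        self.k_proj = nn.Linear(hidden_dim + type_dim, hidden_dim)
        self.v_proj = nn.Linear(hidden_dim + type_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, mention, neighbor_emb, neighbor_type_emb):
        # mention:           (batch, hidden)  contextual mention representation
        # neighbor_emb:      (batch, n_neighbors, hidden)  KG neighbor embeddings
        # neighbor_type_emb: (batch, n_neighbors, type_dim) entity-type embeddings
        kv_in = torch.cat([neighbor_emb, neighbor_type_emb], dim=-1)
        q = self.q_proj(mention).unsqueeze(1)        # (batch, 1, hidden)
        k = self.k_proj(kv_in)                       # (batch, n, hidden)
        v = self.v_proj(kv_in)                       # (batch, n, hidden)
        scores = (q @ k.transpose(1, 2)) / k.size(-1) ** 0.5
        attn = F.softmax(scores, dim=-1)             # weights over neighbors
        pooled = (attn @ v).squeeze(1)               # (batch, hidden)
        # Residual fusion: enrich the mention with structured neighbor semantics.
        return mention + self.out(pooled)

# Toy usage with random tensors (batch of 2 mentions, 5 neighbors each).
module = MentionNeighborAttention(hidden_dim=768, type_dim=100)
mention = torch.randn(2, 768)
neighbors = torch.randn(2, 5, 768)
types = torch.randn(2, 5, 100)
enriched = module(mention, neighbors, types)  # -> shape (2, 768)
```

Two mentions that link to entities with overlapping neighbor sets will pool from shared neighbor vectors, which is one simple way the "communication via shared neighbors" described above can arise.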


Related research

08/16/2019 · Learning Conceptual-Contextual Embeddings for Medical Text
External knowledge is often useful for natural language understanding ta...

04/29/2020 · Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning
In this work, we aim at equipping pre-trained language models with struc...

02/27/2022 · A Simple but Effective Pluggable Entity Lookup Table for Pre-trained Language Models
Pre-trained language models (PLMs) cannot well recall rich factual knowl...

11/20/2022 · Embracing Ambiguity: Improving Similarity-oriented Tasks with Contextual Synonym Knowledge
Contextual synonym knowledge is crucial for those similarity-oriented ta...

09/01/2021 · Does Knowledge Help General NLU? An Empirical Study
It is often observed in knowledge-centric tasks (e.g., common sense ques...

05/02/2023 · Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge
Pre-trained language models (LMs) are used for knowledge intensive tasks...
09/02/2019 · Enriching Medical Terminology Knowledge Bases via Pre-trained Language Model and Graph Convolutional Network
Enriching existing medical terminology knowledge bases (KBs) is an impor...
