Knowledge-Empowered Representation Learning for Chinese Medical Reading Comprehension: Task, Model and Resources

08/24/2020
by Taolin Zhang, et al.

Machine Reading Comprehension (MRC) aims to extract answers to questions given a passage. It has been widely studied recently, especially in open domains. However, few efforts have been made on closed-domain MRC, mainly due to the lack of large-scale training data. In this paper, we introduce a multi-target MRC task for the medical domain, whose goal is to simultaneously predict answers to medical questions and the corresponding support sentences from medical information sources, so that the medical knowledge served is highly reliable. For this purpose, we manually construct a high-quality dataset, the Multi-task Chinese Medical MRC dataset (CMedMRC), and analyze it in detail. We further propose a Chinese medical BERT model for the task (CMedBERT), which fuses medical knowledge into pre-trained language models through a dynamic fusion mechanism over heterogeneous features and a multi-task learning strategy. Experiments show that CMedBERT consistently outperforms strong baselines by fusing context-aware and knowledge-aware token representations.
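To make the multi-target setup concrete, the sketch below shows one common way to train answer extraction and support-sentence prediction jointly: a weighted sum of two cross-entropy losses. This is an illustration only, not the CMedBERT objective. The abstract does not specify the loss, so the function names, the `weight` parameter, and the choice of a single gold support sentence are all assumptions.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def span_loss(start_scores, end_scores, gold_start, gold_end):
    """Average cross-entropy over the answer's start and end token positions."""
    p_start = softmax(start_scores)[gold_start]
    p_end = softmax(end_scores)[gold_end]
    return -(math.log(p_start) + math.log(p_end)) / 2


def support_loss(sentence_scores, gold_sentence):
    """Cross-entropy over candidate sentences (assumes one gold support sentence)."""
    return -math.log(softmax(sentence_scores)[gold_sentence])


def multi_task_loss(start_scores, end_scores, sentence_scores,
                    gold_start, gold_end, gold_sentence, weight=0.5):
    """Hypothetical joint objective: weighted sum of the two task losses.

    `weight` is an assumed hyperparameter balancing answer extraction
    against support-sentence prediction; it is not taken from the paper.
    """
    answer = span_loss(start_scores, end_scores, gold_start, gold_end)
    support = support_loss(sentence_scores, gold_sentence)
    return weight * answer + (1 - weight) * support


# Toy example: 4 passage tokens, 2 candidate sentences.
confident = multi_task_loss([5.0, 0.0, 0.0, 0.0], [0.0, 5.0, 0.0, 0.0],
                            [4.0, 0.0], gold_start=0, gold_end=1,
                            gold_sentence=0)
uniform = multi_task_loss([0.0] * 4, [0.0] * 4, [0.0, 0.0],
                          gold_start=0, gold_end=1, gold_sentence=0)
print(confident < uniform)
```

Because both losses share one scalar objective, gradients from the support-sentence task would also update the shared encoder in a real model, which is the usual motivation for training the two targets together.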


