Developing and Using Special-Purpose Lexicons for Cohort Selection from Clinical Notes

02/26/2019
by Samarth Rawal, et al.

Background and Significance: Selecting cohorts for a clinical trial typically requires costly and time-consuming manual chart reviews, resulting in poor participation. To help automate the process, National NLP Clinical Challenges (N2C2) conducted a shared challenge, defining 13 criteria for clinical trial cohort selection and providing training and test datasets. This research was motivated by the N2C2 challenge. Methods: We broke the task down into 13 independent subtasks, one per criterion, and implemented each subtask using either rules or a supervised machine learning model. Each subtask depended critically on knowledge resources in the form of task-specific lexicons, for which we developed a novel model-driven approach. The approach allowed us to first expand the lexicon from a seed set and then remove noise from the resulting list, thus improving accuracy. Results: Our system achieved an overall F-measure of 0.9003 at the challenge and was statistically tied for first place out of 45 participants. Model-driven lexicon development and further debugging of the rules and code on the training set improved the overall F-measure to 0.9140, overtaking the best numerical result at the challenge. Discussion: Cohort selection, like phenotype extraction and classification, is amenable to rule-based or simple machine learning methods; however, the lexicons involved, such as medication names or medical terms referring to a medical problem, critically determine the overall accuracy. Automated lexicon development has the potential for scalability and accuracy.
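The expand-then-prune idea described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the model used for expansion or the noise-removal procedure, so the function names (`expand_lexicon`, `prune_lexicon`, `meets_criterion`), the pluggable `similarity` callable, the thresholds, and the toy data below are all assumptions made purely for illustration.

```python
import re


def expand_lexicon(seeds, candidates, similarity, threshold=0.5):
    """Add each candidate whose similarity to at least one seed meets the threshold.

    `similarity` is a hypothetical stand-in for whatever model scores term relatedness.
    """
    expanded = set(seeds)
    for term in candidates:
        if any(similarity(term, seed) >= threshold for seed in seeds):
            expanded.add(term)
    return expanded


def _term_pattern(term):
    # Whole-word, case-insensitive match for a lexicon term.
    return re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)


def prune_lexicon(lexicon, labeled_notes, min_precision=0.5):
    """Drop terms that mostly fire on notes labeled as NOT meeting the criterion.

    `labeled_notes` is a list of (text, label) pairs with boolean labels.
    Terms that never match any note are kept, since there is no evidence against them.
    """
    kept = set()
    for term in lexicon:
        pattern = _term_pattern(term)
        hits = [label for text, label in labeled_notes if pattern.search(text)]
        if not hits or sum(hits) / len(hits) >= min_precision:
            kept.add(term)
    return kept


def meets_criterion(note, lexicon):
    """Simple rule: the criterion is met if any lexicon term appears in the note."""
    return any(_term_pattern(term).search(note) for term in lexicon)


# Toy data (hypothetical): hand-written similarity scores and labeled notes.
toy_scores = {
    ("asa", "aspirin"): 0.9,
    ("aspiration", "aspirin"): 0.8,   # spelling-similar noise
    ("metformin", "aspirin"): 0.1,
}


def toy_similarity(term, seed):
    return toy_scores.get((term, seed), 0.0)


seeds = {"aspirin"}
candidates = ["asa", "aspiration", "metformin"]
training_notes = [
    ("Patient takes ASA daily.", True),
    ("Continue aspirin 81 mg.", True),
    ("Admitted for aspiration pneumonia.", False),
    ("Aspiration risk noted.", False),
]

expanded = expand_lexicon(seeds, candidates, toy_similarity)
lexicon = prune_lexicon(expanded, training_notes)
```

In this toy run, expansion admits the noisy term "aspiration" alongside "asa", and pruning against the labeled notes removes it again, which mirrors the expand-then-denoise sequence the abstract describes. A real system would replace `toy_similarity` with a trained model and would likely need more careful handling of negation and context than the single-term rule in `meets_criterion`.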

