ODD: A Benchmark Dataset for the NLP-based Opioid Related Aberrant Behavior Detection

by Sunjae Kwon, et al.

Opioid-related aberrant behaviors (ORAB) present novel risk factors for opioid overdose. Previously, ORAB have mainly been assessed through survey results and by monitoring drug administrations. Such methods, however, cannot scale up and do not cover the entire spectrum of aberrant behaviors. On the other hand, ORAB are widely documented in electronic health record (EHR) notes. This paper introduces a novel biomedical natural language processing benchmark dataset named ODD, for ORAB Detection Dataset. ODD is an expert-annotated dataset comprising more than 750 publicly available EHR notes. ODD has been designed to identify ORAB from patients' EHR notes and classify them into nine categories: 1) Confirmed Aberrant Behavior, 2) Suggested Aberrant Behavior, 3) Opioids, 4) Indication, 5) Diagnosed Opioid Dependency, 6) Benzodiazepines, 7) Medication Changes, 8) Central Nervous System-related, and 9) Social Determinants of Health. We explored two state-of-the-art natural language processing (NLP) approaches (fine-tuning pretrained language models and prompt-tuning) to identify ORAB. Experimental results show that the prompt-tuning models outperformed the fine-tuning models in most categories, and the gains were especially large among uncommon categories (Suggested Aberrant Behavior, Diagnosed Opioid Dependency, and Medication Changes). Although the best model achieved an area under the precision-recall curve of 83.92%, the uncommon classes (Suggested Aberrant Behavior, Diagnosed Opioid Dependency, and Medication Changes) still leave considerable room for performance improvement.



Natural Language Processing Methods to Identify Oncology Patients at High Risk for Acute Care with Clinical Notes

Clinical notes are an essential component of a health record. This paper...

HYPE: A High Performing NLP System for Automatically Detecting Hypoglycemia Events from Electronic Health Record Notes

Hypoglycemia is common and potentially dangerous among those treated for...

MedJEx: A Medical Jargon Extraction Model with Wiki's Hyperlink Span and Contextualized Masked Language Model Score

This paper proposes a new natural language processing (NLP) application ...

Automated Identification of Eviction Status from Electronic Health Record Notes

Objective: Evictions are involved in a cascade of negative events that c...

Unsupervised Ensemble Ranking of Terms in Electronic Health Record Notes Based on Their Importance to Patients

Background: Electronic health record (EHR) notes contain abundant medica...

Associations Between Natural Language Processing (NLP) Enriched Social Determinants of Health and Suicide Death among US Veterans

Importance: Social determinants of health (SDOH) are known to be associa...
