DDXPlus: A new Dataset for Medical Automatic Diagnosis

by Arsene Fansi Tchango, et al.

There has been rapidly growing interest in Automatic Diagnosis (AD) and Automatic Symptom Detection (ASD) systems in the machine learning research literature, with the aim of assisting doctors in telemedicine services. These systems are designed to interact with patients, collect evidence relevant to their concerns, and make predictions about the underlying diseases. Doctors would then review the interaction, including the evidence and the predictions, before making their final decisions. Despite recent progress, an important piece of doctors' interactions with patients is missing from the design of AD and ASD systems, namely the differential diagnosis. Its absence is largely due to the lack of datasets that include such information for models to train on. In this work, we present a large-scale synthetic dataset that includes a differential diagnosis, along with the ground truth pathology, for each patient. In addition, this dataset includes more pathologies, as well as more types of symptoms and antecedents. As a proof of concept, we extend several existing AD and ASD systems to incorporate differential diagnosis, and provide empirical evidence that using differentials as training signals is essential for such systems to learn to predict differentials. The dataset is available at https://github.com/bruzwen/ddxplus




