Guided Deep List: Automating the Generation of Epidemiological Line Lists from Open Sources

by Saurav Ghosh, et al.

Real-time monitoring of and response to emerging public health threats rely on the availability of timely surveillance data. During the early stages of an epidemic, the ready availability of line lists with detailed tabular information about laboratory-confirmed cases can assist epidemiologists in making reliable inferences and forecasts. Such inferences are crucial to understanding the epidemiology of a specific disease early enough to stop or control the outbreak. However, constructing such line lists requires considerable human supervision, and they are therefore difficult to generate in real time. In this paper, we motivate Guided Deep List, the first tool for building automated line lists (in near real time) from open-source reports of emerging disease outbreaks. Specifically, we focus on deriving epidemiological characteristics of an emerging disease and the affected population from reports of illness. Guided Deep List uses distributed vector representations (à la word2vec) to discover a set of indicators for each line list feature. This discovery of indicators is followed by the use of dependency-parsing-based techniques for final extraction in tabular form. We evaluate the performance of Guided Deep List against a human-annotated line list provided by HealthMap corresponding to MERS outbreaks in Saudi Arabia. We demonstrate that Guided Deep List extracts line list features with increased accuracy compared to a baseline method. We further show how these automatically extracted line list features can be used for making epidemiological inferences, such as inferring demographics and the symptoms-to-hospitalization period of affected individuals.
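
As a rough illustration of the indicator-discovery idea described above, the following minimal sketch ranks vocabulary words by cosine similarity to a seed word for one line list feature. It is not the authors' implementation: the toy three-dimensional vectors, the seed word, and the names `cosine`, `discover_indicators`, and `TOY_EMBEDDINGS` are all hypothetical, and a real system would use vectors trained on a corpus of outbreak reports. The subsequent dependency-parsing step is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def discover_indicators(seed, embeddings, top_k=2):
    """Return the top_k words most similar to the seed word (seed excluded)."""
    seed_vec = embeddings[seed]
    scored = [
        (word, cosine(seed_vec, vec))
        for word, vec in embeddings.items()
        if word != seed
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in scored[:top_k]]


# Hypothetical hand-made vectors standing in for trained word embeddings.
TOY_EMBEDDINGS = {
    "hospitalized": [0.9, 0.1, 0.0],
    "admitted": [0.8, 0.2, 0.1],
    "ward": [0.7, 0.1, 0.3],
    "fever": [0.1, 0.9, 0.1],
    "cough": [0.2, 0.8, 0.0],
    "riyadh": [0.0, 0.1, 0.9],
}

# Candidate indicators for a hypothetical "hospitalization" feature.
indicators = discover_indicators("hospitalized", TOY_EMBEDDINGS, top_k=2)
print(indicators)
```

With these toy vectors, the words closest to the seed are the ones that share its dominant dimension, which is the intuition behind using embedding neighbors as indicators for a feature.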




