GenSpectrum Chat: Data Exploration in Public Health Using Large Language Models

05/23/2023
by Chaoran Chen, et al.

Introduction: The COVID-19 pandemic highlighted the importance of making epidemiological data and scientific insights easily accessible and explorable for public health agencies, the general public, and researchers. State-of-the-art approaches for sharing data and insights include regularly updated reports and web dashboards. However, they face a trade-off between the simplicity and the flexibility of data exploration. With the capabilities of recent large language models (LLMs) such as GPT-4, this trade-off can be overcome.

Results: We developed the chatbot "GenSpectrum Chat" (https://cov-spectrum.org/chat), which uses GPT-4 as the underlying LLM to explore SARS-CoV-2 genomic sequencing data. Out of 500 inputs from real-world users, the chatbot provided a correct answer for 453 prompts, an incorrect answer for 13 prompts, and no answer, although the question was within scope, for 34 prompts. We also tested the chatbot with inputs in 10 different languages, and despite being provided solely with English instructions and examples, it successfully processed prompts in all tested languages.

Conclusion: LLMs enable new ways of interacting with information systems. In the field of public health, GenSpectrum Chat can facilitate the analysis of real-time pathogen genomic data. With our chatbot supporting interactive exploration in different languages, we envision quick and direct access to the latest evidence for policymakers around the world.
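To make the reported evaluation counts easier to compare, the short Python sketch below converts them into percentages of the 500 real-world inputs. The counts come from the abstract; the variable and function names are illustrative assumptions, and the "share correct among answered prompts" figure is derived here from those counts rather than quoted from the paper.

```python
# Counts as reported in the abstract (500 real-world user inputs in total).
counts = {"correct": 453, "incorrect": 13, "no_answer_in_scope": 34}


def percentage_shares(outcome_counts):
    """Return each outcome as a percentage of all inputs, rounded to one decimal."""
    total = sum(outcome_counts.values())
    return {name: round(100 * n / total, 1) for name, n in outcome_counts.items()}


total_inputs = sum(counts.values())
shares = percentage_shares(counts)

# Derived figure (not stated in the abstract): share of correct answers
# among the prompts for which the chatbot gave any answer.
answered = counts["correct"] + counts["incorrect"]
correct_among_answered = round(100 * counts["correct"] / answered, 1)

for name, pct in shares.items():
    print(f"{name}: {pct}%")
print(f"correct among answered: {correct_among_answered}%")
```

Under these counts, roughly nine in ten inputs received a correct answer, and most of the remainder were unanswered in-scope questions rather than wrong answers.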


