Coswara: A respiratory sounds and symptoms dataset for remote screening of SARS-CoV-2 infection

05/22/2023
by Debarpan Bhattacharya, et al.

This paper presents the Coswara dataset, which contains a diverse set of respiratory sounds and rich metadata recorded between April 2020 and February 2022 from 2635 individuals (1819 SARS-CoV-2 negative, 674 positive, and 142 recovered subjects). The respiratory sounds comprise nine sound categories associated with variants of breathing, cough, and speech. The metadata comprises demographic information (age, gender, and geographic location) as well as health information relating to symptoms, pre-existing respiratory ailments, comorbidity, and SARS-CoV-2 test status. Our study is the first of its kind to annotate the audio quality of the entire dataset (amounting to 65 hours) through manual listening. The paper summarizes the data collection procedure and the demographic, symptom, and audio data information. A COVID-19 classifier based on a bi-directional long short-term memory (BLSTM) architecture is trained and evaluated on the different population sub-groups contained in the dataset to understand the bias/fairness of the model. This enables an analysis of the impact of gender, geographic location, date of recording, and language proficiency on COVID-19 detection performance.
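
The sub-group evaluation described above can be illustrated with a minimal sketch. It assumes that each test recording carries a classifier score, a binary label, and a metadata field, and it uses the area under the ROC curve (AUC) as the performance measure. The abstract names neither a metric nor a data layout, so the field names (`score`, `label`, `gender`), the choice of AUC, and the toy records below are all hypothetical, not the authors' pipeline.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """AUC via the Mann-Whitney U statistic: the fraction of
    (positive, negative) pairs ranked correctly, ties counting 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined when one class is missing
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_subgroup(records, key):
    """Group records by one metadata field and compute AUC per group."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r)
    return {
        g: roc_auc([r["label"] for r in rs], [r["score"] for r in rs])
        for g, rs in groups.items()
    }


# Toy, made-up records (label 1 = positive, 0 = negative); not Coswara data.
records = [
    {"gender": "female", "label": 1, "score": 0.9},
    {"gender": "female", "label": 0, "score": 0.2},
    {"gender": "female", "label": 0, "score": 0.4},
    {"gender": "male", "label": 1, "score": 0.6},
    {"gender": "male", "label": 1, "score": 0.3},
    {"gender": "male", "label": 0, "score": 0.5},
    {"gender": "male", "label": 0, "score": 0.1},
]

print(auc_by_subgroup(records, "gender"))
```

A large gap between the per-group values would point to a possible bias of the model with respect to that field. The same function can be applied to any other metadata key, such as location or recording period, provided each group contains both classes.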


research
12/15/2022

A large-scale and PCR-referenced vocal audio dataset for COVID-19

The UK COVID-19 Vocal Audio Dataset is designed for the training and eva...
research
10/04/2021

The Second DiCOVA Challenge: Dataset and performance analysis for COVID-19 diagnosis using acoustics

The Second Diagnosis of COVID-19 using Acoustics (DiCOVA) Challenge aime...
research
03/08/2023

The Casual Conversations v2 Dataset

This paper introduces a new large consent-driven dataset aimed at assist...
research
12/15/2022

Audio-based AI classifiers show no evidence of improved COVID-19 screening over simple symptoms checkers

Recent work has reported that AI classifiers trained on audio recordings...
research
03/10/2021

Understanding the Representation and Representativeness of Age in AI Data Sets

A diverse representation of different demographic groups in AI training ...
research
10/23/2020

Impact of (SARS-CoV-2) COVID 19 on the indigenous language-speaking population in Mexico

The importance of the working document is that it allows the analysis of...
research
11/26/2020

Neural Networks for Pulmonary Disease Diagnosis using Auditory and Demographic Information

Pulmonary diseases impact millions of lives globally and annually. The r...
