Visualizations of Complex Sequences of Family-Infant Vocalizations Using Bag-of-Audio-Words Approach Based on Wav2vec 2.0 Features

03/29/2022
by Jialu Li, et al.

In the U.S., approximately 15-17% of children are estimated to have at least one diagnosed mental, behavioral, or developmental disorder. However, such disorders often go undiagnosed, and the ability to evaluate and treat disorders in the first years of life is limited. To analyze infant developmental changes, previous studies have shown that advanced ML models excel at classifying infant and/or parent vocalizations collected using cell phones, video, or audio-only recording devices such as LENA. In this study, we pilot test the audio component of a new infant wearable multi-modal device that we have developed, called LittleBeats (LB). The LB audio pipeline is advanced in that it provides reliable labels for both speaker diarization and vocalization classification tasks, whereas other platforms only record audio and/or provide speaker diarization labels. We leverage wav2vec 2.0 to obtain superior and more nuanced results with the LB family audio stream. We use a bag-of-audio-words method with wav2vec 2.0 features to create high-level visualizations to understand family-infant vocalization interactions. We demonstrate that our high-quality visualizations capture major types of family vocalization interactions, in categories indicative of mental, behavioral, and developmental health, for both labeled and unlabeled LB audio.
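
The abstract does not spell out how the bag-of-audio-words representation is built, so the following is only a minimal sketch of the general technique, not the authors' pipeline. It assumes frame-level feature vectors are already available (in the paper these would come from wav2vec 2.0; here they are hypothetical toy vectors), learns a codebook with a plain k-means loop, and summarizes an audio segment as a normalized histogram of codeword assignments. The function names `fit_codebook` and `bag_of_audio_words`, the codebook size, and the iteration count are all illustrative assumptions.

```python
import random
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(frame, codebook):
    """Index of the codeword closest to a single feature frame."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(frame, codebook[k]))


def fit_codebook(frames, num_words, iterations=10, seed=0):
    """Learn `num_words` codewords with a basic k-means loop (assumed clustering choice)."""
    rng = random.Random(seed)
    codebook = [list(f) for f in rng.sample(frames, num_words)]
    for _ in range(iterations):
        groups = [[] for _ in range(num_words)]
        for frame in frames:
            groups[nearest(frame, codebook)].append(frame)
        for k, members in enumerate(groups):
            if members:  # keep the previous centroid if a cluster is empty
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def bag_of_audio_words(frames, codebook):
    """Normalized histogram of codeword assignments for one audio segment."""
    counts = Counter(nearest(f, codebook) for f in frames)
    return [counts.get(k, 0) / len(frames) for k in range(len(codebook))]


# Hypothetical stand-in for frame-level features: two toy 4-dimensional clusters.
rng = random.Random(1)
training_frames = [[rng.gauss(center, 0.1) for _ in range(4)]
                   for center in (0.0, 5.0) for _ in range(50)]
codebook = fit_codebook(training_frames, num_words=2)

# One "segment" of 20 frames, represented as a fixed-length histogram.
segment = [[rng.gauss(5.0, 0.1) for _ in range(4)] for _ in range(20)]
histogram = bag_of_audio_words(segment, codebook)
```

Each segment becomes a fixed-length vector regardless of its duration, which is what makes histograms of this kind convenient inputs for plotting or comparing interaction patterns across recordings.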

