
DEPA: Self-supervised audio embedding for depression detection

10/29/2019
by Heinrich Dinkel, et al.

Research on depression detection has grown over the last few decades as the disease has become a societal problem. One major bottleneck in developing automatic depression detection methods is limited data availability. Recently, pretrained text embeddings have seen success in sparse-data scenarios, while pretrained audio embeddings have rarely been investigated. This paper proposes DEPA, a self-supervised, Word2Vec-like pretrained depression audio embedding method for depression detection. An encoder-decoder network is used to extract DEPA on a sparse-data in-domain dataset (DAIC) and on large-data out-of-domain datasets (Switchboard, Alzheimer's). With DEPA as the audio embedding, detection significantly outperforms traditional audio features on both classification and regression metrics. Moreover, we show that large-data out-of-domain pretraining is beneficial to depression detection performance.
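The abstract does not spell out the pretext task behind "Word2Vec-like". One plausible reading, sketched below purely as an assumption, is a CBOW-style setup: hide the center segment of a window of spectrogram frames and train an encoder-decoder to reconstruct it from the surrounding context. The function names, the window layout, and the mean-squared-error objective are all hypothetical and are not taken from the paper; the sketch only shows how such training pairs could be built.

```python
from typing import List, Tuple

Frame = List[float]  # one spectrogram frame (hypothetical feature vector)


def make_context_center_pairs(
    frames: List[Frame], context: int, center: int
) -> List[Tuple[List[Frame], List[Frame]]]:
    """Cut non-overlapping windows and split each into (context, hidden center).

    Assumed layout per window: `context` frames, `center` frames, `context` frames.
    The context frames would be the model input, the center frames the target.
    """
    window = 2 * context + center
    pairs = []
    for start in range(0, len(frames) - window + 1, window):
        left = frames[start : start + context]
        mid = frames[start + context : start + context + center]
        right = frames[start + context + center : start + window]
        pairs.append((left + right, mid))
    return pairs


def mean_squared_error(pred: List[Frame], target: List[Frame]) -> float:
    """Reconstruction loss between predicted and true center frames (assumed objective)."""
    diffs = [
        (p - t) ** 2
        for pred_frame, target_frame in zip(pred, target)
        for p, t in zip(pred_frame, target_frame)
    ]
    return sum(diffs) / len(diffs)


# Toy usage: ten one-dimensional "frames", two context frames per side, one hidden frame.
toy_frames = [[float(i)] for i in range(10)]
toy_pairs = make_context_center_pairs(toy_frames, context=2, center=1)
for context_frames, center_frames in toy_pairs:
    print(context_frames, "->", center_frames)
```

In a setup like this, the encoder output for each context window would serve as the embedding once pretraining is done; whether DEPA is extracted exactly this way is not stated in the abstract.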

