Text-based classification of interviews for mental health – juxtaposing the state of the art

07/29/2020
by Joppe Valentijn Wouts, et al.

Currently, the state of the art for classification of psychiatric illness is audio-based. This thesis aims to design and evaluate a state-of-the-art text classification network for this task. The hypothesis is that a well-designed text-based approach can compete strongly with the state-of-the-art audio-based approaches. Dutch natural language models are limited by the scarcity of pre-trained monolingual NLP models; as a result, they capture long-range semantic dependencies across sentences poorly. To address this, the thesis presents belabBERT, a new Dutch language model extending the RoBERTa[15] architecture. belabBERT is trained on a large Dutch corpus (+32GB) of web-crawled texts. After evaluating the strength of text-based classification, the thesis briefly explores extending the framework to hybrid text- and audio-based classification. The goal of this hybrid framework is to demonstrate the principle of hybridisation with a very basic audio-classification network. The overall goal is to lay the foundations for hybrid psychiatric illness classification by showing that the new text-based classification is already a strong stand-alone solution.

