Unsupervised Speech Representation Learning for Behavior Modeling using Triplet Enhanced Contextualized Networks

04/01/2021
by Haoqi Li, et al.

Speech encodes a wealth of information related to human behavior and has been used in a variety of automated behavior recognition tasks. However, extracting behavioral information from speech remains challenging, in part because training data are scarce: specific behavioral patterns often occur infrequently. Moreover, supervised behavioral modeling typically relies on domain-specific construct definitions and corresponding manually annotated data, which makes generalization across domains difficult. In this paper, we exploit the stationary properties of human behavior within an interaction and present a representation learning method that captures behavioral information from speech in an unsupervised way. We hypothesize that nearby segments of speech share the same behavioral context and hence map onto similar underlying behavioral representations. We present an encoder-decoder based Deep Contextualized Network (DCN) as well as a Triplet-Enhanced DCN (TE-DCN) framework to capture the behavioral context and derive a manifold representation in which speech frames with similar behaviors lie closer together while frames with different behaviors remain farther apart. The models are trained on movie audio data and validated on diverse domains, including a couples therapy corpus and other publicly collected data (e.g., stand-up comedy). The encouraging results indicate that unsupervised learning is feasible for cross-domain behavioral modeling.
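The triplet idea in the abstract (nearby segments pulled together, other segments pushed apart) can be illustrated with a minimal sketch. The abstract does not give the loss or the sampling scheme, so everything here is an assumption made for illustration: the standard margin-based triplet loss on squared Euclidean distance, the hypothetical helper names `triplet_loss` and `sample_triplet_indices`, the parameters `margin`, `max_gap`, and `min_gap`, and the choice to draw negatives from temporally distant segments of the same recording.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss over batches of embeddings.

    Each argument is an array of shape (N, D). This generic form is an
    assumption; the abstract does not state the exact loss used.
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=1)  # anchor-positive distance
    d_neg = np.sum((anchor - negative) ** 2, axis=1)  # anchor-negative distance
    # Penalize triplets where the positive is not closer than the negative
    # by at least `margin`.
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))


def sample_triplet_indices(num_segments, max_gap, min_gap, rng):
    """Pick (anchor, positive, negative) segment indices from one recording.

    Hypothetical sampling rule: the positive lies within `max_gap` segments
    of the anchor (assumed to share its behavioral context), and the
    negative lies at least `min_gap` segments away.
    """
    anchor = int(rng.integers(num_segments))
    near = [i for i in range(num_segments) if 0 < abs(i - anchor) <= max_gap]
    far = [i for i in range(num_segments) if abs(i - anchor) >= min_gap]
    if not near or not far:
        raise ValueError("recording too short for the requested gaps")
    return anchor, int(rng.choice(near)), int(rng.choice(far))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    embeddings = rng.normal(size=(100, 8))  # stand-in for encoder outputs
    a, p, n = sample_triplet_indices(100, max_gap=2, min_gap=20, rng=rng)
    loss = triplet_loss(embeddings[[a]], embeddings[[p]], embeddings[[n]])
    print(a, p, n, loss)
```

In a trained model the embeddings would come from the encoder rather than from random draws, and the loss would be minimized by gradient descent alongside any reconstruction objective; the sketch only shows how temporal proximity can stand in for labels when forming triplets.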


