An analysis of observation length requirements for machine understanding of human behaviors in spoken language

Machine learning-based human behavior modeling, often at the level of characterizing an entire clinical encounter such as a therapy session, has been shown to be useful across a range of domains in psychological research and practice, from relationship and family studies to cancer care. Existing approaches typically first quantify the target behavior construct from cues in an observation window, such as a fixed number of words, and then aggregate the estimates over all the windows in the session. The window is chosen to be long enough that it contains adequate information to estimate the construct accurately. The link between behavior modeling and observation length, however, has not been well studied, especially for spoken language. In this paper, we analyze the effect of observation window length on the quality of behavior quantification and present a framework for determining appropriate windows for a wide range of behaviors. Our analysis method employs two levels of evaluation: (a) extrinsic similarity between machine predictions and human expert annotations, and (b) intrinsic consistency between intra-machine and intra-human behavior relations. We apply our analysis to a dataset of real-life married couple interactions that are annotated for a large and diverse set of behavior codes, and we test the robustness of our findings to different machine learning models. We find that negative constructs such as blame can be accurately identified from short expressions, while those pertaining to positive affect, such as satisfaction, tend to require slightly longer observation windows. Behaviors that describe more complex personality traits, such as negotiation and avoidance, are found to require very long observations and are difficult to quantify from language alone. Our findings are in agreement with similar work on acoustic cues, thin slices and human emotion perception.
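
The window-then-aggregate scheme described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the cue lexicon `CUE_WORDS`, the toy scorer `score_window`, the use of non-overlapping windows, and the choice of the mean as the aggregation function. A real system would replace the scorer with a trained model.

```python
from statistics import mean

# Hypothetical cue lexicon for a "blame"-like construct (illustrative only).
CUE_WORDS = {"you", "always", "never", "fault"}


def split_into_windows(words, window_len):
    """Split a word list into consecutive, non-overlapping windows.

    The last window may be shorter than window_len.
    """
    if window_len <= 0:
        raise ValueError("window_len must be positive")
    return [words[i:i + window_len] for i in range(0, len(words), window_len)]


def score_window(window):
    """Toy scorer: fraction of words in the window that are cue words."""
    return sum(w.lower() in CUE_WORDS for w in window) / len(window)


def session_score(words, window_len):
    """Score every window, then aggregate over the session with the mean."""
    windows = split_into_windows(words, window_len)
    return mean(score_window(w) for w in windows)


if __name__ == "__main__":
    transcript = "you never listen and it is always your fault".split()
    # Vary the observation length to see how the session-level estimate changes.
    for window_len in (3, 4, 9):
        print(window_len, session_score(transcript, window_len))
```

Sweeping `window_len` as in the usage loop mirrors the question the paper asks: how the session-level estimate depends on the observation length. Comparing each setting against human annotations would require labeled data that this sketch does not include.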


