An Open-source Benchmark of Deep Learning Models for Audio-visual Apparent and Self-reported Personality Recognition
Personality is crucial to understanding humans' internal and external states. Most existing personality computing approaches rely on complex, dataset-specific pre-processing steps and model training tricks. In the absence of a standardized benchmark with consistent experimental settings, it is not only impossible to fairly compare the real performance of these personality computing models, but their results are also difficult to reproduce. In this paper, we present the first reproducible audio-visual benchmarking framework that provides a fair and consistent evaluation of eight existing personality computing models (covering audio, visual, and audio-visual modalities) and seven standard deep learning models on both self-reported and apparent personality recognition tasks. We conduct a comprehensive investigation of all the benchmarked models to demonstrate their capability to model personality traits on two publicly available datasets: the ChaLearn First Impression dataset (audio-visual apparent personality) and the UDIVA dataset (self-reported personality). The experimental results show that: (i) apparent personality traits, as inferred from facial behaviours by most benchmarked deep learning models, are recognised more reliably than self-reported ones; (ii) visual models frequently achieve better performance than audio models on personality recognition; and (iii) non-verbal behaviours contribute differently to the prediction of different personality traits. We make the code publicly available at https://github.com/liaorongfan/DeepPersonality.