Continuous Affect Prediction Using Eye Gaze and Speech
Affective computing research has traditionally focused on labeling a person's emotion as one of a discrete number of classes, e.g. happy or sad. More recently, attention has shifted to continuous affect prediction across dimensions of the emotional space, e.g. arousal and valence. Continuous affect prediction is the task of predicting a numerical value for each emotion dimension. It is particularly useful in domains involving real-time audio-visual communication, which could include remote or assistive technologies for psychological assessment of subjects. Modalities used for continuous affect prediction may include speech, facial expressions and physiological responses. Rather than analyzing a single modality, the research community has combined multiple modalities to improve the accuracy of continuous affect prediction. In this context, this paper investigates a continuous affect prediction system using the novel combination of speech and eye gaze. A new eye gaze feature set is proposed. This novel approach uses open source software for real-time affect prediction in audio-visual communication environments. A unique advantage of the human-computer interface used here is that it does not require the subject to wear specialized and expensive eye-tracking headsets or intrusive devices. The results indicate that the combination of speech and eye gaze improves arousal prediction by 3.5 compared with using speech alone.