Learning Pain from Action Unit Combinations: A Weakly Supervised Approach via Multiple Instance Learning
Facial pain expression is an important modality for assessing pain, especially when a patient's ability to communicate verbally is impaired. A set of eight facial-muscle-based action units (AUs), defined by the Facial Action Coding System (FACS), has been widely studied and is highly reliable for detecting pain from facial expressions. However, manual FACS coding is so time-consuming that its clinical use is prohibitive. An automated facial expression recognition (AFER) system that reliably detects pain-related AUs would therefore be highly beneficial for efficient and practical pain monitoring. Automated pain detection in clinical settings is a weakly supervised problem, which is not well suited to a general AFER system trained on fully labeled data. Existing pain-oriented AFER research either focuses on recognizing individual pain-related AUs or bypasses AU detection altogether by training a binary pain classifier from pain intensity data. In this paper, we decouple pain detection into two consecutive tasks: AFER-based AU labeling at the video-frame level and a probabilistic measure of pain at the sequence level derived from AU combination scores. Our work is distinguished in the following aspects: 1) the state-of-the-art AFER tool Emotient is applied to pain-oriented datasets for single-AU labeling; 2) two data structures are proposed to encode AU combinations from single-AU scores, forming low-dimensional feature vectors for the learning framework; 3) two weakly supervised learning frameworks, multiple instance learning (MIL) and multiple clustered instance learning (MCIL), are employed, one for each feature structure, to learn pain from video sequences. The results show an AUC of 87% on the UNBC-McMaster dataset. Tests on Wilkie's dataset suggest the potential value of the proposed system for pain monitoring in clinical settings.
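To make the weakly supervised formulation concrete, the following is a minimal sketch (not the authors' code) of max-pooling multiple instance learning over per-frame AU scores: a video is treated as a bag of frames, and the bag is scored by the most pain-indicative frame. The pairwise-product encoding of AU combinations, the linear scoring function, and all names below are illustrative assumptions, not the feature structures or learners described in the paper.

```python
# Hypothetical MIL sketch: a video (bag) of frames (instances) is scored
# by max pooling over per-frame scores of AU combination features.
import numpy as np

N_AUS = 8  # eight pain-related AUs per FACS; the specific set follows the paper


def au_combination_features(frame_scores):
    """Encode each frame's single-AU scores (n_frames x N_AUS) as pairwise
    co-occurrence products -- a simple stand-in for the paper's AU
    combination encodings."""
    _, k = frame_scores.shape
    i, j = np.triu_indices(k, k=1)
    return frame_scores[:, i] * frame_scores[:, j]  # n_frames x C(k, 2)


def bag_score(frame_scores, w, b=0.0):
    """MIL max pooling: the video-level score is the maximum frame-level
    score under a linear scoring function (w, b)."""
    feats = au_combination_features(frame_scores)
    instance_scores = feats @ w + b
    return instance_scores.max()


# Toy usage: random AU scores for one 30-frame video.
rng = np.random.default_rng(0)
video = rng.random((30, N_AUS))
w = rng.normal(size=N_AUS * (N_AUS - 1) // 2)
print("video-level pain score:", bag_score(video, w))
```

In a full MIL pipeline, w and b would be learned from sequence-level pain labels only, with the max pooling propagating the bag label down to the unlabeled frames.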