Multiresolution Match Kernels for Gesture Video Classification
The emergence of depth imaging technologies like the Microsoft Kinect has renewed interest in computational methods for gesture classification based on videos. For several years now, researchers have used the Bag-of-Features (BoF) as a primary method for generation of feature vectors from video data for recognition of gestures. However, the BoF method is a coarse representation of the information in a video, which often leads to poor similarity measures between videos. Besides, when features extracted from different spatio-temporal locations in the video are pooled to create histogram vectors in the BoF method, there is an intrinsic loss of their original locations in space and time. In this paper, we propose a new Multiresolution Match Kernel (MMK) for video classification, which can be considered as a generalization of the BoF method. We apply this procedure to hand gesture classification based on RGB-D videos of the American Sign Language(ASL) hand gestures and our results show promise and usefulness of this new method.
READ FULL TEXT