Creating a Large-scale Synthetic Dataset for Human Activity Recognition
Creating and labelling video datasets for training Human Activity Recognition models is an arduous task. In this paper, we approach this problem by using 3D rendering tools to generate a synthetic dataset of videos, and show that a classifier trained on these videos can generalise to real videos. We use five different augmentation techniques to generate the videos, leading to a wide variety of accurately labelled unique videos. We fine-tune a pre-trained I3D model on our videos, and find that the model is able to achieve a high accuracy of 73%. We also find that augmenting the HMDB training set with our dataset provides a 2% improvement in the performance of the classifier. Finally, we discuss possible extensions to the dataset, including virtual try-on and modelling the motion of people.