SampleAhead: Online Classifier-Sampler Communication for Learning from Synthesized Data
State-of-the-art techniques in artificial intelligence, in particular deep learning, are mostly data-driven. However, collecting and manually labeling a large-scale dataset is both difficult and expensive. A promising alternative is to introduce synthesized training data, so that the dataset size can be significantly enlarged with little human labor. This, however, raises an important problem in active vision: given an infinite data space, how can we effectively sample a finite subset to train a visual classifier? This paper presents an approach for learning from synthesized data effectively. The motivation is straightforward: increase the probability of seeing difficult training data. We introduce a module named SampleAhead that formulates the learning process as an online communication between a classifier and a sampler, and updates them iteratively. In each round, we adjust the sampling distribution according to the classification results, and train the classifier using data sampled from the updated distribution. Experiments are performed by introducing synthesized images rendered from ShapeNet models to assist PASCAL3D+ classification. Our approach achieves higher classification accuracy, especially when the number of training samples is limited. This demonstrates its efficiency in exploring the infinite data space.
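The round-based communication described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the update rule, so this version assumes the data space is discretized into a finite number of bins (for example, rendering viewpoints) and blends the current distribution with the normalized per-bin error rates. The names `update_distribution`, `sample_bins`, `run_rounds`, `mix`, and `floor` are all hypothetical, and the `evaluate` and `train` callbacks stand in for a real classifier.

```python
import random


def update_distribution(probs, error_rates, mix=0.5, floor=1e-3):
    """Shift sampling mass toward bins where the classifier errs more.

    Assumed rule (not from the paper): blend the current distribution
    with the normalized error rates, then renormalize. `floor` keeps
    every bin reachable so no region of the data space is abandoned.
    """
    total_err = sum(error_rates)
    if total_err == 0:
        return list(probs)  # nothing is difficult; keep the distribution
    target = [e / total_err for e in error_rates]
    blended = [max((1 - mix) * p + mix * t, floor)
               for p, t in zip(probs, target)]
    norm = sum(blended)
    return [b / norm for b in blended]


def sample_bins(probs, n, rng):
    """Draw n bin indices according to the sampling distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=n)


def run_rounds(evaluate, train, num_bins, rounds, batch_size, seed=0):
    """Alternate between updating the sampler and training the classifier.

    evaluate() returns one error rate per bin (classifier to sampler);
    train(batch) consumes sampled bin indices (sampler to classifier).
    """
    rng = random.Random(seed)
    probs = [1.0 / num_bins] * num_bins  # start from a uniform sampler
    for _ in range(rounds):
        error_rates = evaluate()
        probs = update_distribution(probs, error_rates)
        batch = sample_bins(probs, batch_size, rng)
        train(batch)
    return probs
```

In a real pipeline, `train` would render synthesized images for the sampled bins and run a training step, and `evaluate` would measure the classifier's current error on each bin; both are left as callbacks here because the abstract does not describe them.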