Time Matters in Using Data Augmentation for Vision-based Deep Reinforcement Learning
Data augmentation technique from computer vision has been widely considered as a regularization method to improve data efficiency and generalization performance in vision-based reinforcement learning. We variate the timing of using augmentation, which is, in turn, critical depending on tasks to be solved in training and testing. According to our experiments on Open AI Procgen Benchmark, if the regularization imposed by augmentation is helpful only in testing, it is better to procrastinate the augmentation after training than to use it during training in terms of sample and computation complexity. We note that some of such augmentations can disturb the training process. Conversely, an augmentation providing regularization useful in training needs to be used during the whole training period to fully utilize its benefit in terms of not only generalization but also data efficiency. These phenomena suggest a useful timing control of data augmentation in reinforcement learning.
READ FULL TEXT