Towards an Understanding of Our World by GANing Videos in the Wild
Existing generative video models work well only for videos with a static background. For dynamic scenes, applying these models requires an extra pre-processing step of background stabilization, and for videos in the wild such stabilization is often impossible. To the best of our knowledge, we present the first video generation framework that works in the wild, without making any assumption about the videos' content, which lets us drop the background stabilization step entirely. The proposed method also outperforms state-of-the-art methods even when the static-background assumption holds. This is achieved by designing a robust one-stream video generation architecture that exploits the Wasserstein GAN framework for better convergence. Because the architecture is one-stream and does not formally distinguish between foreground and background, it can generate, and learn from, videos with dynamic backgrounds. We demonstrate the strength of our model by successfully applying it to three challenging problems: video colorization, video inpainting, and future prediction.
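To make the one-stream Wasserstein GAN setup concrete, the sketch below shows a minimal video GAN of this kind in PyTorch: a single 3D-deconvolution generator with no separate foreground/background streams, a spatio-temporal critic, and one WGAN-GP training step. All layer sizes, clip dimensions, and hyper-parameters are illustrative assumptions and do not reproduce the authors' implementation.

```python
# Minimal sketch (not the authors' code) of a one-stream video GAN trained with
# a Wasserstein objective. Every size and constant here is an assumption.
import torch
import torch.nn as nn

class VideoGenerator(nn.Module):
    """Single-stream generator: one 3D-deconv pipeline emits the whole clip,
    with no separate foreground/background branches or masks."""
    def __init__(self, z_dim=100, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose3d(z_dim, ch * 8, kernel_size=(2, 4, 4)),      # -> 2 x 4 x 4
            nn.BatchNorm3d(ch * 8), nn.ReLU(inplace=True),
            nn.ConvTranspose3d(ch * 8, ch * 4, 4, stride=2, padding=1),    # -> 4 x 8 x 8
            nn.BatchNorm3d(ch * 4), nn.ReLU(inplace=True),
            nn.ConvTranspose3d(ch * 4, ch * 2, 4, stride=2, padding=1),    # -> 8 x 16 x 16
            nn.BatchNorm3d(ch * 2), nn.ReLU(inplace=True),
            nn.ConvTranspose3d(ch * 2, ch, 4, stride=2, padding=1),        # -> 16 x 32 x 32
            nn.BatchNorm3d(ch), nn.ReLU(inplace=True),
            nn.ConvTranspose3d(ch, 3, 4, stride=2, padding=1), nn.Tanh(),  # -> 32 x 64 x 64
        )

    def forward(self, z):
        return self.net(z.view(z.size(0), -1, 1, 1, 1))

class VideoCritic(nn.Module):
    """Spatio-temporal critic; no sigmoid, it outputs a Wasserstein score."""
    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(3, ch, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv3d(ch, ch * 2, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv3d(ch * 2, ch * 4, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv3d(ch * 4, ch * 8, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv3d(ch * 8, 1, kernel_size=(2, 4, 4)),
        )

    def forward(self, video):
        return self.net(video).view(video.size(0))

def gradient_penalty(critic, real, fake):
    """WGAN-GP term: push the critic's gradient norm on interpolated clips to 1."""
    eps = torch.rand(real.size(0), 1, 1, 1, 1, device=real.device)
    mixed = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grads = torch.autograd.grad(critic(mixed).sum(), mixed, create_graph=True)[0]
    return ((grads.view(grads.size(0), -1).norm(2, dim=1) - 1) ** 2).mean()

# One critic update and one generator update on a dummy batch of RGB clips.
G, D = VideoGenerator(), VideoCritic()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4, betas=(0.5, 0.9))
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4, betas=(0.5, 0.9))

real = torch.randn(4, 3, 32, 64, 64)   # stand-in for real "in the wild" clips
z = torch.randn(4, 100)

fake = G(z)
d_loss = (D(fake.detach()).mean() - D(real).mean()
          + 10.0 * gradient_penalty(D, real, fake.detach()))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

g_loss = -D(G(z)).mean()
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

Because the generator is a single stream, nothing in this sketch needs a stabilized or static background; the critic simply scores whole spatio-temporal volumes, which is what allows training directly on unconstrained footage.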