We propose NeRF-VAE, a 3D scene generative model that incorporates geome...
Recently multimodal transformer models have gained popularity because th...
Videos are a rich source of multi-modal supervision. In this work, we le...
Recent work has shown how predictive modeling can endow agents with rich...
The question of whether deep neural networks are good at generalising be...