Recently, various methods for representation learning on Knowledge Bases...
While today's video recognition systems parse snapshots or short clips a...
In this paper, we study Multiscale Vision Transformers (MViT) as a unifi...
We introduce PyTorchVideo, an open-source deep-learning library that pro...
Graph Convolutional Networks (GCNs) are typically studied through the le...
We present a large-scale study on unsupervised spatiotemporal representa...
We present Multiscale Vision Transformers (MViT) for video and image rec...
We introduce an approach for pre-training egocentric video models using...
We present a multiview pseudo-labeling approach to video learning, a nov...
Automated hyperparameter optimization (HPO) has shown great power in man...
Highlight detection has the potential to significantly ease video browsi...
We propose an end-to-end learning framework for segmenting generic objec...
360° panoramas are a rich medium, yet notoriously difficult to visualiz...
Existing methods to recognize actions in static images take the images a...
We propose an end-to-end learning framework for segmenting generic objec...
We propose an end-to-end learning framework for generating foreground ob...