PyTorchVideo: A Deep Learning Library for Video Understanding

by Haoqi Fan, et al.

We introduce PyTorchVideo, an open-source deep-learning library that provides a rich set of modular, efficient, and reproducible components for a variety of video understanding tasks, including classification, detection, self-supervised learning, and low-level processing. The library covers a full stack of video understanding tools including multimodal data loading, transformations, and models that reproduce state-of-the-art performance. PyTorchVideo further supports hardware acceleration that enables real-time inference on mobile devices. The library is based on PyTorch and can be used by any training framework; for example, PyTorchLightning, PySlowFast, or Classy Vision. PyTorchVideo is available at




