LAVIS: A Library for Language-Vision Intelligence

09/15/2022
by Dongxu Li, et al.

We introduce LAVIS, an open-source deep learning library for LAnguage-VISion research and applications. LAVIS aims to serve as a one-stop, comprehensive library that makes recent advancements in the language-vision field accessible to researchers and practitioners and that fosters future research and development. It features a unified interface for accessing state-of-the-art image-language and video-language models and common datasets. LAVIS supports training, evaluation, and benchmarking on a rich variety of tasks, including multimodal classification, retrieval, captioning, visual question answering, dialogue, and pre-training. The library is also highly extensible and configurable, which facilitates future development and customization. In this technical report, we describe the design principles, key components, and functionalities of the library, and present benchmarking results across common language-vision tasks. The library is available at: https://github.com/salesforce/LAVIS.

