PiggyBack: Pretrained Visual Question Answering Environment for Backing up Non-deep Learning Professionals

11/29/2022
by Zhihao Zhang, et al.

We propose PiggyBack, a Visual Question Answering platform that lets users easily apply state-of-the-art visual-language pretrained models. PiggyBack supports the full stack of visual question answering tasks: data processing, model fine-tuning, and result visualisation. We integrate visual-language models pretrained by HuggingFace, an open-source API platform for deep learning technologies; however, these models cannot be run without programming skills or an understanding of deep learning. Hence, PiggyBack provides an easy-to-use browser-based user interface to several deep learning visual-language pretrained models, aimed at general users and domain experts. PiggyBack offers the following benefits: free availability under the MIT License; portability, since it is web-based and thus runs on almost any platform; a comprehensive data creation and processing technique; and easy use of deep learning-based visual-language pretrained models. The demo video is available on YouTube at https://youtu.be/iz44RZ1lF4s.


