ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering

06/06/2019
by Zhou Yu, et al.

Recent developments in modeling language and vision have been successfully applied to image question answering. It is both crucial and natural to extend this research direction to the video domain for video question answering (VideoQA). Compared to the image domain, where large-scale and fully annotated benchmark datasets exist, VideoQA datasets tend to be small in scale or automatically generated. These limitations restrict their applicability in practice. Here we introduce ActivityNet-QA, a fully annotated and large-scale VideoQA dataset. The dataset consists of 58,000 QA pairs on 5,800 complex web videos derived from the popular ActivityNet dataset. We present a statistical analysis of our ActivityNet-QA dataset and conduct extensive experiments on it by comparing existing VideoQA baselines. Moreover, we explore various video representation strategies to improve VideoQA performance, especially for long videos. The dataset is available at https://github.com/MILVLG/activitynet-qa


