Building a Video-and-Language Dataset with Human Actions for Multimodal Logical Inference

06/27/2021
by Riko Suzuki, et al.

This paper introduces a new video-and-language dataset with human actions for multimodal logical inference, focusing on intentional and aspectual expressions that describe dynamic human actions. The dataset consists of 200 videos, 5,554 action labels, and 1,942 action triplets of the form <subject, predicate, object> that can be translated into logical semantic representations. The dataset is expected to be useful for evaluating systems that perform multimodal inference between videos and semantically complex sentences, including those involving negation and quantification.
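To make the triplet-to-logic idea concrete, here is a minimal Python sketch of how a <subject, predicate, object> action triplet could be rendered as an event-semantics (neo-Davidsonian) style formula. The class and function names (ActionTriplet, to_formula) and the exact formula shape are assumptions for illustration only; the paper's actual semantic representation may differ.

```python
from dataclasses import dataclass

# Hypothetical sketch, not the paper's actual format or API:
# one <subject, predicate, object> action triplet and a translation
# into a neo-Davidsonian first-order formula with an event variable.

@dataclass
class ActionTriplet:
    subject: str    # e.g. "person"
    predicate: str  # e.g. "open"
    obj: str        # e.g. "door"

def to_formula(t: ActionTriplet) -> str:
    """Render the triplet as a first-order formula over an event variable e."""
    return (
        f"exists e x y. ({t.predicate.capitalize()}(e) "
        f"& {t.subject.capitalize()}(x) & {t.obj.capitalize()}(y) "
        f"& Subj(e, x) & Obj(e, y))"
    )

if __name__ == "__main__":
    triplet = ActionTriplet("person", "open", "door")
    print(to_formula(triplet))
    # exists e x y. (Open(e) & Person(x) & Door(y) & Subj(e, x) & Obj(e, y))
```

Representations of this kind make it straightforward to compose the triplet with negation or quantification at the sentence level, which is the sort of inference the dataset is intended to evaluate.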


Related research

- Multimodal Logical Inference System for Visual-Textual Entailment (06/10/2019): A large amount of research about multimodal inference across text and vi...
- UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (12/03/2012): We introduce UCF101 which is currently the largest dataset of human acti...
- Video Caption Dataset for Describing Human Actions in Japanese (03/10/2020): In recent years, automatic video caption generation has attracted consid...
- Identifying Visible Actions in Lifestyle Vlogs (06/10/2019): We consider the task of identifying human actions visible in online vide...
- EEV Dataset: Predicting Expressions Evoked by Diverse Videos (01/15/2020): When we watch videos, the visual and auditory information we experience ...
- Muscles in Action (12/05/2022): Small differences in a person's motion can engage drastically different ...
- Learning and Predicting Multimodal Vehicle Action Distributions in a Unified Probabilistic Model Without Labels (12/14/2022): We present a unified probabilistic model that learns a representative se...
