An Analysis of Action Recognition Datasets for Language and Vision Tasks

04/24/2017
by Spandana Gella, et al.

A large amount of recent research has focused on tasks that combine language and vision, resulting in a proliferation of datasets and methods. One such task is action recognition, whose applications include image annotation, scene understanding and image retrieval. In this survey, we categorize the existing approaches based on how they conceptualize this problem and provide a detailed review of existing datasets, highlighting their diversity as well as advantages and disadvantages. We focus on recently developed datasets which link visual information with linguistic resources and provide a fine-grained syntactic and semantic analysis of actions in images.