Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods

07/22/2019
by Aditya Mogadala, et al.

Integration of vision and language tasks has seen significant growth in recent times due to a surge of interest from multi-disciplinary communities such as deep learning, computer vision, and natural language processing. In this survey, we focus on ten different vision and language integration tasks in terms of their problem formulation, methods, existing datasets, evaluation measures, and comparison of results achieved with the corresponding state-of-the-art methods. This goes beyond earlier surveys, which are either task-specific or concentrate on only one type of visual content, i.e., image or video. We then conclude the survey by discussing some possible future directions for integration of vision and language research.

research
03/22/2022

Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions

A long-term goal of AI research is to build intelligent agents that can ...
research
12/26/2022

VQA and Visual Reasoning: An Overview of Recent Datasets, Methods and Challenges

Artificial Intelligence (AI) and its applications have sparked extraordi...
research
01/15/2016

Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures

Automatic description generation from natural images is a challenging pr...
research
02/16/2023

GLUECons: A Generic Benchmark for Learning Under Constraints

Recent research has shown that integrating domain knowledge into deep le...
research
09/18/2017

Normal Integration: A Survey

The need for efficient normal integration methods is driven by several c...
research
07/15/2022

Reasoning about Actions over Visual and Linguistic Modalities: A Survey

'Actions' play a vital role in how humans interact with the world and en...
research
06/01/2018

Video Description: A Survey of Methods, Datasets and Evaluation Metrics

Automatic video description is useful for assisting the visually impaire...
