Challenges and Prospects in Vision and Language Research

04/19/2019
by Kushal Kafle, et al.

Language-grounded image understanding tasks have often been proposed as a method for evaluating progress in artificial intelligence. Ideally, these tasks should test a broad range of capabilities that integrate computer vision, reasoning, and natural language understanding. However, recent studies have demonstrated that, rather than behaving as visual Turing tests, these tasks allow state-of-the-art systems to achieve good performance by exploiting flaws in datasets and evaluation procedures. We review the current state of affairs and outline a path forward.
