CLEAR: A Dataset for Compositional Language and Elementary Acoustic Reasoning

11/26/2018
by Jerome Abdelnour, et al.

We introduce the task of acoustic question answering (AQA) in the area of acoustic reasoning. In this task, an agent learns to answer questions on the basis of an acoustic context. To promote research in this area, we propose a data generation paradigm adapted from CLEVR (Johnson et al. 2017). We generate acoustic scenes by leveraging a bank of elementary sounds. We also provide a number of functional programs that can be used to compose questions and answers that exploit the relationships between the attributes of the elementary sounds in each scene. We provide AQA datasets of various sizes as well as the data generation code. As a preliminary experiment to validate our data, we report the accuracy of current state-of-the-art visual question answering models when they are applied to the AQA task without modification. Although there is a plethora of question answering tasks based on text, image, or video data, to our knowledge we are the first to propose answering questions directly on audio streams. We hope this contribution will facilitate the development of research in the area.
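To make the generation paradigm concrete, here is a minimal Python sketch of how a scene might be assembled from a bank of elementary sounds and how a small functional program could produce the answer to a counting question. Everything in it is an assumption made for illustration: the `Sound` attributes, the contents of `SOUND_BANK`, and the helper names `generate_scene`, `filter_by`, `count`, and `how_many` are hypothetical and are not taken from the released generation code. Scenes are represented symbolically here, without any audio.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Sound:
    """Hypothetical elementary sound; the real attribute set may differ."""
    instrument: str
    note: str
    loudness: str


# Hypothetical bank of elementary sounds (illustrative values only).
SOUND_BANK = [
    Sound("cello", "C", "quiet"),
    Sound("cello", "A", "loud"),
    Sound("flute", "C", "loud"),
    Sound("trumpet", "F", "quiet"),
    Sound("flute", "A", "quiet"),
]


def generate_scene(bank, length, rng):
    """Build a scene as an ordered sequence of sounds drawn from the bank."""
    return [rng.choice(bank) for _ in range(length)]


def filter_by(scene, attribute, value):
    """Keep only the sounds whose attribute equals the given value."""
    return [s for s in scene if getattr(s, attribute) == value]


def count(sounds):
    """Return how many sounds are in the given collection."""
    return len(sounds)


def how_many(scene, attribute, value):
    """Functional program for 'How many <value> sounds are there?'."""
    return count(filter_by(scene, attribute, value))


# A fixed scene, so the answers below are deterministic.
fixed_scene = [SOUND_BANK[0], SOUND_BANK[2], SOUND_BANK[1]]
print(how_many(fixed_scene, "loudness", "loud"))     # 2
print(how_many(fixed_scene, "instrument", "flute"))  # 1

# A randomly generated scene; its contents depend on the seed.
random_scene = generate_scene(SOUND_BANK, 4, random.Random(0))
print(how_many(random_scene, "instrument", "cello"))
```

Composing small primitives such as `filter_by` and `count` mirrors the CLEVR-style idea that each question is paired with a program whose execution on the scene description yields the ground-truth answer.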


