Language Models are Causal Knowledge Extractors for Zero-shot Video Question Answering

04/07/2023
by   Hung-Ting Su, et al.

Causal Video Question Answering (CVidQA) queries not only association or temporal relations but also causal relations in a video. Existing question synthesis methods pre-train question generation (QG) systems on reading comprehension datasets with text descriptions as inputs. However, QG models only learn to ask association questions (e.g., “what is someone doing...”) and therefore transfer poorly to CVidQA, which focuses on causal questions like “why is someone doing...”. Observing this, we propose to exploit causal knowledge to generate question-answer pairs, and introduce a novel framework, Causal Knowledge Extraction from Language Models (CaKE-LM), which leverages causal commonsense knowledge from language models to tackle CVidQA. To extract knowledge from LMs, CaKE-LM generates causal questions containing two events, one triggering the other (e.g., “score a goal” triggers “soccer player kicking ball”), by prompting the LM with the action (“soccer player kicking ball”) to retrieve the intention (“to score a goal”). CaKE-LM significantly outperforms conventional methods by 4% to 6% of zero-shot CVidQA accuracy on the NExT-QA and Causal-VidQA datasets. We also conduct comprehensive analyses and provide key findings for future research.
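The extraction step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the prompt template, function names, and QA format are assumptions, and the LM call is left abstract (any completion-style model could fill the slot).

```python
# Hedged sketch of CaKE-LM-style causal QA generation.
# The prompt template and helper names are illustrative assumptions,
# not the exact prompts used in the paper.

def build_intention_prompt(action: str) -> str:
    """Build a cloze-style prompt asking an LM for the intention behind an action.
    The LM's completion (e.g., "to score a goal") is taken as the intention."""
    return f"A {action} in order to"

def make_causal_qa_pair(action: str, intention: str) -> dict:
    """Turn an (action, intention) event pair into a causal 'why' question,
    with the intention serving as the answer."""
    return {
        "question": f"Why is the {action}?",
        "answer": intention,
    }

# Example from the abstract: the action prompts the LM for the intention.
prompt = build_intention_prompt("soccer player kicking ball")
# prompt == "A soccer player kicking ball in order to"
# Suppose the LM completes this prompt with "to score a goal"; then:
qa = make_causal_qa_pair("soccer player kicking ball", "to score a goal")
# qa == {"question": "Why is the soccer player kicking ball?",
#        "answer": "to score a goal"}
```

The resulting question-answer pairs can then supervise a zero-shot CVidQA model, replacing the association-style questions produced by reading-comprehension-trained QG systems.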


