Conventional Transformer-based Video Question Answering (VideoQA) approa...
Event-Level Video Question Answering (EVQA) requires complex reasoning a...
Large language models (LLMs) such as ChatGPT can produce coherent, cohe...
Question Generation (QG) aims to automate the task of composing question...