Traffic-Domain Video Question Answering with Automatic Captioning

07/18/2023
by   Ehsan Qasemi, et al.
0

Video Question Answering (VidQA) exhibits remarkable potential in facilitating advanced machine reasoning capabilities within the domains of Intelligent Traffic Monitoring and Intelligent Transportation Systems. Nevertheless, the integration of urban traffic scene knowledge into VidQA systems has received limited attention in previous research endeavors. In this work, we present a novel approach termed Traffic-domain Video Question Answering with Automatic Captioning (TRIVIA), which serves as a weak-supervision technique for infusing traffic-domain knowledge into large video-language models. Empirical findings obtained from the SUTD-TrafficQA task highlight the substantial enhancements achieved by TRIVIA, elevating the accuracy of representative video-language models by a remarkable 6.5 points (19.88 promise for driving advancements in the field, inspiring researchers and practitioners alike to unlock the full potential of emerging video-language models in traffic-related applications.

READ FULL TEXT

page 2

page 4

research
12/04/2022

Utilizing Background Knowledge for Robust Reasoning over Traffic Situations

Understanding novel situations in the traffic domain requires an intrica...
research
09/13/2023

TrafficGPT: Viewing, Processing and Interacting with Traffic Foundation Models

With the promotion of chatgpt to the public, Large language models indee...
research
08/07/2023

KITLM: Domain-Specific Knowledge InTegration into Language Models for Question Answering

Large language models (LLMs) have demonstrated remarkable performance in...
research
03/29/2021

TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events

Traffic event cognition and reasoning in videos is an important task tha...
research
09/05/2023

Augmenting Black-box LLMs with Medical Textbooks for Clinical Question Answering

Large-scale language models (LLMs), such as ChatGPT, are capable of gene...
research
05/25/2023

The Dangers of trusting Stochastic Parrots: Faithfulness and Trust in Open-domain Conversational Question Answering

Large language models are known to produce output which sounds fluent an...
research
07/17/2023

Harnessing Scalable Transactional Stream Processing for Managing Large Language Models [Vision]

Large Language Models (LLMs) have demonstrated extraordinary performance...

Please sign up or login with your details

Forgot password? Click here to reset