A Solution to CVPR'2023 AQTC Challenge: Video Alignment for Multi-Step Inference

06/26/2023
by Chao Zhang, et al.

Affordance-centric Question-driven Task Completion (AQTC) for Egocentric Assistant introduces a groundbreaking scenario in which AI assistants learn from instructional videos and give users step-by-step guidance on operating devices. In this paper, we present a solution that enhances video alignment to improve multi-step inference. Specifically, we first use VideoCLIP to generate video-script alignment features. We then ground the question-relevant content in the instructional videos and reweight the multimodal context to emphasize prominent features. Finally, we adopt a GRU to conduct multi-step inference. Through comprehensive experiments, we demonstrate the effectiveness and superiority of our method, which secured 2nd place in the CVPR'2023 AQTC challenge. Our code is available at https://github.com/zcfinal/LOVEU-CVPR23-AQTC.
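To make the data flow of the four stages concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that video segments, the question, and candidate answers have already been encoded as fixed-size vectors (standing in for VideoCLIP features), that grounding and reweighting can be approximated by softmax attention over question-segment similarity, and that a bias-free GRU cell with random weights is enough for illustration. All names (`ground`, `GRUCell`, `infer_steps`) are hypothetical.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ground(question, segments):
    """Assumed grounding + reweighting: attention-pool segment features,
    weighting each segment by its similarity to the question."""
    weights = softmax([dot(question, s) for s in segments])
    dim = len(question)
    pooled = [sum(w * s[i] for w, s in zip(weights, segments)) for i in range(dim)]
    return pooled, weights


def matvec(matrix, vec):
    return [dot(row, vec) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class GRUCell:
    """Bias-free GRU cell with small random weights (illustration only)."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)

        def mat():
            return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]

        self.Wz, self.Uz, self.Wr, self.Ur, self.Wh, self.Uh = (mat() for _ in range(6))

    def step(self, x, h):
        # Update gate z and reset gate r.
        z = [sigmoid(a + b) for a, b in zip(matvec(self.Wz, x), matvec(self.Uz, h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.Wr, x), matvec(self.Ur, h))]
        # Candidate state computed from the input and the reset-gated history.
        rh = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in zip(matvec(self.Wh, x), matvec(self.Uh, rh))]
        # Convention used here: h' = (1 - z) * h + z * candidate.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def infer_steps(question, segments, answers_per_step, cell):
    """Pick one answer per step; the GRU state carries earlier choices forward."""
    h, _ = ground(question, segments)
    chosen = []
    for candidates in answers_per_step:
        scores = [dot(h, c) for c in candidates]
        best = max(range(len(candidates)), key=scores.__getitem__)
        chosen.append(best)
        h = cell.step(candidates[best], h)
    return chosen
```

A call such as `infer_steps(question, segments, answers_per_step, GRUCell(dim))` returns one selected answer index per step. In a trained system the GRU weights and the scoring function would be learned; here they are random and serve only to show how the hidden state links consecutive steps.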


