Towards Parameter-Efficient Integration of Pre-Trained Language Models In Temporal Video Grounding

09/26/2022
by Erica K. Shimomoto, et al.

This paper explores the task of Temporal Video Grounding (TVG), where, given an untrimmed video and a natural language query, the goal is to recognize and localize the temporal boundaries of the action instance the query describes. Recent works solve this task by directly encoding the query with large pre-trained language models (PLMs). However, isolating the effect of the improved language representations is difficult, as these works also propose improvements to the visual inputs. Furthermore, such PLMs significantly increase the computational cost of training TVG models. Therefore, this paper studies the effect of PLMs on the TVG task and assesses the applicability of adapter-based parameter-efficient training alternatives from NLP. We couple popular PLMs with a selection of existing approaches and test different adapters to reduce the impact of the additional parameters. Our results on three challenging datasets show that TVG models can greatly benefit from PLMs when these are fine-tuned for the task, and that adapters are an effective alternative to full fine-tuning even though they were not tailored to our task. Concretely, adapters reduce the computational cost, enabling PLM integration into larger TVG models while delivering results comparable to state-of-the-art models. Finally, by benchmarking different types of adapters on TVG, our results shed light on which kinds of adapters work best in each studied setting.
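To make the adapter idea concrete, below is a minimal, hypothetical PyTorch sketch of the general recipe the abstract describes: the query sentence is encoded by a frozen PLM, and only small bottleneck adapter modules (in the style of Houlsby et al.) receive gradients. The class names `BottleneckAdapter` and `AdapterQueryEncoder`, the choice of `bert-base-uncased`, and the bottleneck width of 64 are illustrative assumptions, not the paper's implementation; for simplicity the adapters are applied to each layer's output hidden states rather than inserted inside the transformer layers, as a full adapter implementation would do.

```python
# Hypothetical sketch of parameter-efficient query encoding for TVG:
# a frozen PLM plus trainable bottleneck adapters. Not the paper's code.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class BottleneckAdapter(nn.Module):
    """Down-project -> nonlinearity -> up-project, with a residual connection."""

    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()
        # Near-identity initialization so training starts close to the frozen PLM.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))


class AdapterQueryEncoder(nn.Module):
    """Encodes a query with a frozen PLM; only the adapters are trained."""

    def __init__(self, plm_name: str = "bert-base-uncased", bottleneck: int = 64):
        super().__init__()
        self.plm = AutoModel.from_pretrained(plm_name)
        for p in self.plm.parameters():
            p.requires_grad = False  # freeze the PLM: parameter-efficient training
        hidden = self.plm.config.hidden_size
        self.adapters = nn.ModuleList(
            [BottleneckAdapter(hidden, bottleneck)
             for _ in range(self.plm.config.num_hidden_layers)]
        )

    def forward(self, input_ids, attention_mask):
        out = self.plm(input_ids, attention_mask=attention_mask,
                       output_hidden_states=True)
        # Apply one adapter per transformer layer output and average the results;
        # a full implementation would insert the adapters inside each layer.
        layers = out.hidden_states[1:]  # skip the embedding-layer output
        adapted = [ad(h) for ad, h in zip(self.adapters, layers)]
        return torch.stack(adapted).mean(dim=0)  # (batch, seq_len, hidden)


if __name__ == "__main__":
    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    enc = AdapterQueryEncoder()
    batch = tok(["person opens the fridge door"], return_tensors="pt")
    feats = enc(batch["input_ids"], batch["attention_mask"])
    trainable = sum(p.numel() for p in enc.parameters() if p.requires_grad)
    print(feats.shape, f"trainable params: {trainable:,}")
```

With this configuration, only about 1.2M adapter parameters are trained against roughly 110M frozen BERT-base parameters (around 1%), which illustrates how adapters can cut the training cost enough to integrate a PLM into larger TVG models.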


