Semantic-Aware Pretraining for Dense Video Captioning

04/13/2022
by Teng Wang, et al.

This report describes our approach to the event dense-captioning task in the ActivityNet Challenge 2021. We present a semantic-aware pretraining method for dense video captioning that enables the learned features to recognize high-level semantic concepts. Diverse video features from different modalities are fed into an event captioning module to generate accurate and meaningful sentences. Our final ensemble model achieves a METEOR score of 10.00 on the test set.
