- Local-Global Video-Text Interactions for Temporal Grounding
This paper addresses the problem of text-to-video temporal grounding, wh...
- Towards Oracle Knowledge Distillation with Neural Architecture Search
We present a novel framework of knowledge distillation that is capable o...
- Streamlined Dense Video Captioning
Dense video captioning is an extremely challenging task since accurate a...
- Transfer Learning via Unsupervised Task Discovery for Visual Question Answering
We study how to leverage off-the-shelf visual and linguistic data to cop...
- Regularizing Deep Neural Networks by Noise: Its Interpretation and Optimization
Overfitting is one of the most critical challenges in deep neural networ...
- Text-guided Attention Model for Image Captioning
Visual attention plays an important role to understand images and demons...
- MarioQA: Answering Questions by Watching Gameplay Videos
We present a framework to analyze various aspects of models for video qu...

Jonghwan Mun