VRFP: On-the-fly Video Retrieval using Web Images and Fast Fisher Vector Products

12/10/2015
by Xintong Han, et al.

VRFP is a real-time video retrieval framework driven by short text queries. It obtains weakly labeled training images from the web only after the query is known. The web images retrieved for the query and the frames of each database video are treated as unordered collections of images, and each collection is represented by a single Fisher Vector built on CNN features. Our experiments show that a Fisher Vector is robust to the noise present in web images and compares favorably in accuracy to other standard representations. While a Fisher Vector can be constructed efficiently for a new query, matching it against the test set is slow because of its high dimensionality. To perform matching in real time, we present a lossless algorithm that accelerates the inner product computation between high-dimensional Fisher Vectors. We prove that the expected number of multiplications required decreases quadratically with the sparsity of the Fisher Vectors. We are not only able to construct and apply query models in real time, but with the help of a simple re-ranking scheme, we also outperform state-of-the-art automatic retrieval methods by a significant margin on TRECVID MED13 (3.5 [...] (5.2 [...] different paradigms for automatic video retrieval: zero-shot learning and on-the-fly retrieval.
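As a rough illustration of why sparsity helps, the sketch below computes an exact inner product while multiplying only at positions where both vectors are nonzero. It is not the algorithm from the paper, whose details are not given in this abstract; the function names `to_sparse` and `sparse_dot` and the toy vectors are assumptions made for this example. If each vector has a fraction p of nonzero entries at independent positions, roughly a fraction p squared of the positions require a multiplication, which is one way to read the quadratic dependence on sparsity.

```python
from typing import Dict, List, Tuple


def to_sparse(vec: List[float]) -> Dict[int, float]:
    """Keep only the nonzero entries of a dense vector, keyed by index."""
    return {i: v for i, v in enumerate(vec) if v != 0.0}


def sparse_dot(a: Dict[int, float], b: Dict[int, float]) -> Tuple[float, int]:
    """Exact inner product of two sparse vectors (hypothetical helper).

    Returns the inner product and the number of multiplications performed.
    """
    # Iterate over the vector with fewer nonzeros and look up matches
    # in the other one, so zero entries are never multiplied.
    if len(a) > len(b):
        a, b = b, a
    total = 0.0
    mults = 0
    for i, v in a.items():
        w = b.get(i)
        if w is not None:
            total += v * w
            mults += 1
    return total, mults


# Toy vectors standing in for a query Fisher Vector and a video Fisher Vector.
query = [0.0, 0.5, 0.0, 0.0, -1.0, 0.0, 2.0, 0.0]
video = [1.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.5, 0.0]

dense = sum(x * y for x, y in zip(query, video))
value, mults = sparse_dot(to_sparse(query), to_sparse(video))
```

Here the dense computation performs eight multiplications, while the sparse version multiplies only at the two indices where both vectors are nonzero and returns the same value, which is what "lossless" means in this context.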


research
04/12/2016

GPU-FV: Realtime Fisher Vector and Its Applications in Video Monitoring

Fisher vector has been widely used in many multimedia retrieval and visu...
research
07/17/2014

Efficient On-the-fly Category Retrieval using ConvNets and GPUs

We investigate the gains in precision and speed, that can be obtained by...
research
03/18/2017

Deep Tensor Encoding

Learning an encoding of feature vectors in terms of an over-complete dic...
research
02/01/2017

Siamese Network of Deep Fisher-Vector Descriptors for Image Retrieval

This paper addresses the problem of large scale image retrieval, with th...
research
09/14/2023

Zero-shot Audio Topic Reranking using Large Language Models

The Multimodal Video Search by Examples (MVSE) project investigates usin...
research
10/27/2022

DESSERT: An Efficient Algorithm for Vector Set Search with Vector Set Queries

We study the problem of vector set search with vector set queries. This ...
