05/04/2021 ∙ by Luca Rossetto, et al.

For research results to be comparable, it is important to have common datasets for experimentation and evaluation. The size of such datasets, however, can be an obstacle to their use. The Vimeo Creative Commons Collection (V3C) is a video dataset designed to be representative of video content found on the web, containing roughly 3800 hours of video in total, split into three shards. In this paper, we present insights on the second of these shards (V3C2) and discuss their implications for research areas, such as video retrieval, for which the dataset might be particularly useful. We also provide all the extracted data in order to simplify the use of the dataset.



