CVB: A Video Dataset of Cattle Visual Behaviors

by   Ali Zia, et al.

Existing image/video datasets for cattle behavior recognition are mostly small, lack well-defined labels, or are collected in unrealistic controlled environments. This limits the utility of machine learning (ML) models learned from them. Therefore, we introduce a new dataset, called Cattle Visual Behaviors (CVB), that consists of 502 video clips, each fifteen seconds long, captured in natural lighting conditions, and annotated with eleven visually perceptible behaviors of grazing cattle. We use the Computer Vision Annotation Tool (CVAT) to collect our annotations. To make the procedure more efficient, we perform an initial detection and tracking of cattle in the videos using appropriate pre-trained models. The results are corrected by domain experts along with cattle behavior labeling in CVAT. The pre-hoc detection and tracking step significantly reduces the manual annotation time and effort. Moreover, we convert CVB to the atomic visual action (AVA) format and train and evaluate the popular SlowFast action recognition model on it. The associated preliminary results confirm that we can localize the cattle and recognize their frequently occurring behaviors with confidence. By creating and sharing CVB, our aim is to develop improved models capable of recognizing all important behaviors accurately and to assist other researchers and practitioners in developing and evaluating new ML models for cattle behavior classification using video data.


page 2

page 4


Animal Kingdom: A Large and Diverse Dataset for Animal Behavior Understanding

Understanding animals' behaviors is significant for a wide range of appl...

SLAC: A Sparsely Labeled Dataset for Action Classification and Localization

This paper describes a procedure for the creation of large-scale video d...

AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions

This paper introduces a video dataset of spatio-temporally localized Ato...

Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding

Computer vision has a great potential to help our daily lives by searchi...

Action Class Relation Detection and Classification Across Multiple Video Datasets

The Meta Video Dataset (MetaVD) provides annotated relations between act...

SPIN: A High Speed, High Resolution Vision Dataset for Tracking and Action Recognition in Ping Pong

We introduce a new high resolution, high frame rate stereo video dataset...

A Human-ML Collaboration Framework for Improving Video Content Reviews

We deal with the problem of localized in-video taxonomic human annotatio...

Please sign up or login with your details

Forgot password? Click here to reset