Mid-level Representation for Visual Recognition

12/23/2015
by Moin Nabi, et al.
Visual recognition is one of the fundamental challenges in AI, where the goal is to understand the semantics of visual data. Employing mid-level representations, in particular, has shifted the paradigm in visual recognition. A mid-level image/video representation involves discovering and training a set of mid-level visual patterns (e.g., parts and attributes) and representing a given image/video in terms of them. The mid-level patterns can be extracted from images and videos using the motion and appearance information of visual phenomena. This thesis targets employing mid-level representations for different high-level visual recognition tasks, namely (i) image understanding and (ii) video understanding. In the case of image understanding, we focus on the object detection/recognition task. We investigate discovering and learning a set of mid-level patches to be used for representing the images of an object category. Specifically, we employ discriminative patches in a subcategory-aware, webly-supervised fashion. Additionally, we study the outcomes of employing subcategory-based models for undoing dataset bias.
