Approximate Policy Iteration for Budgeted Semantic Video Segmentation

07/26/2016
by Behrooz Mahasseni, et al.

This paper formulates and presents a solution to the new problem of budgeted semantic video segmentation. Given a video, the goal is to accurately assign a semantic class label to every pixel in the video within a specified time budget. Typical approaches to such labeling problems, such as Conditional Random Fields (CRFs), focus on maximizing accuracy but do not provide a principled method for satisfying a time budget. For video data, the time required by CRFs and related methods is often dominated by the time needed to compute low-level descriptors of supervoxels across the video. Our key contribution is a new budgeted inference framework for CRF models that intelligently selects the most useful subsets of descriptors to run on subsets of supervoxels within the time budget. The objective is to keep accuracy as close as possible to that of the CRF model with no time bound, while remaining within the time budget. Our second contribution is an algorithm for learning a policy for the sparse selection of supervoxels and their descriptors for budgeted CRF inference. This learning algorithm is derived by casting our problem in the framework of Markov Decision Processes, and then instantiating a state-of-the-art policy learning algorithm known as Classification-Based Approximate Policy Iteration. Our experiments on multiple video datasets show that our learning approach and framework significantly reduce computation time while maintaining competitive accuracy under varying budgets.
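
As a rough illustration of the budgeted selection idea described in the abstract, the following Python sketch runs one episode in which a policy repeatedly picks a (supervoxel, descriptor) pair to compute until no affordable pair remains. It is not the paper's implementation: the function name `budgeted_selection`, the `(supervoxel id, descriptor name)` action encoding, the per-descriptor cost table, and the toy scoring policy are all assumptions made for this example. In the paper the policy is learned with Classification-Based Approximate Policy Iteration, whereas here it is simply a hand-written scoring function.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical action encoding: (supervoxel id, descriptor name).
Action = Tuple[int, str]


def budgeted_selection(
    num_supervoxels: int,
    descriptor_costs: Dict[str, float],
    budget: float,
    policy_score: Callable[[Set[Action], float, Action], float],
) -> List[Action]:
    """Greedily pick (supervoxel, descriptor) pairs until the budget runs out.

    The state is (pairs computed so far, remaining time). At each step the
    policy scores every affordable action and the best-scoring one is taken.
    """
    computed: Set[Action] = set()
    order: List[Action] = []
    remaining = budget
    while True:
        # Actions not yet taken whose cost still fits in the remaining time.
        affordable = [
            (v, d)
            for v in range(num_supervoxels)
            for d, cost in descriptor_costs.items()
            if (v, d) not in computed and cost <= remaining
        ]
        if not affordable:
            break
        best = max(affordable, key=lambda a: policy_score(computed, remaining, a))
        computed.add(best)
        order.append(best)
        remaining -= descriptor_costs[best[1]]
    return order


# Toy setup (assumed values): two descriptors with different costs and gains.
toy_costs = {"color": 1.0, "flow": 3.0}
toy_gains = {"color": 1.0, "flow": 2.0}


def toy_policy(computed: Set[Action], remaining: float, action: Action) -> float:
    # Stand-in for a learned policy: prefer high gain per unit of time.
    return toy_gains[action[1]] / toy_costs[action[1]]


selected = budgeted_selection(2, toy_costs, budget=5.0, policy_score=toy_policy)
total_cost = sum(toy_costs[d] for _, d in selected)
```

The selected pairs would then be the only descriptors computed before running CRF inference, so the total descriptor time never exceeds the budget by construction.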

research
12/11/2019

Bipartite Conditional Random Fields for Panoptic Segmentation

We tackle the panoptic segmentation problem with a conditional random fi...
research
11/30/2017

Budget-Aware Activity Detection with A Recurrent Policy Network

In this paper, we address the challenging problem of efficient tempora...
research
08/11/2016

Learning Dynamic Hierarchical Models for Anytime Scene Labeling

With increasing demand for efficient image and video analysis, test-time...
research
09/17/2020

Fast and Accurate Sequence Labeling with Approximate Inference Network

The linear-chain Conditional Random Field (CRF) model is one of the most...
research
10/01/2018

Learnable Pooling Methods for Video Classification

We introduce modifications to state-of-the-art approaches to aggregating...
research
02/13/2023

Joint Span Segmentation and Rhetorical Role Labeling with Data Augmentation for Legal Documents

Segmentation and Rhetorical Role Labeling of legal judgements play a cru...
research
11/22/2020

Video SemNet: Memory-Augmented Video Semantic Network

Stories are a very compelling medium to convey ideas, experiences, socia...
