PIC: Permutation Invariant Convolution for Recognizing Long-range Activities

03/18/2020
by Noureldien Hussein, et al.

Neural operations such as convolutions, self-attention, and vector aggregation are the go-to choices for recognizing short-range actions. However, they have three limitations in modeling long-range activities. This paper presents PIC, Permutation Invariant Convolution, a novel neural layer to model the temporal structure of long-range activities. It has three desirable properties. (i) Unlike standard convolution, PIC is invariant to temporal permutations of the features within its receptive field, which qualifies it to model weak temporal structures. (ii) Unlike vector aggregation, PIC respects local connectivity, enabling it to learn long-range temporal abstractions using cascaded layers. (iii) In contrast to self-attention, PIC uses shared weights, making it more capable of detecting the most discriminant visual evidence across long and noisy videos. We study the three properties of PIC and demonstrate its effectiveness in recognizing the long-range activities of Charades, Breakfast, and MultiThumos.
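The three properties listed in the abstract can be illustrated with a toy layer. The sketch below is not the authors' formulation of PIC. It assumes a simplified operation (dot-product similarity between each feature and a set of shared kernels, followed by max-pooling over a local temporal window), and the function name `perm_invariant_conv` and all sizes are hypothetical. It only shows that such an operation is local, reuses the same weights at every window, and is unaffected by shuffling features inside a window.

```python
import random


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def perm_invariant_conv(features, kernels, window, stride=1):
    """Toy permutation-invariant temporal layer (illustrative assumption,
    not the layer defined in the paper).

    features: list of T feature vectors, one per timestep.
    kernels:  list of K kernel vectors, shared by every window.
    Returns one K-dimensional output per window position.
    """
    outputs = []
    for start in range(0, len(features) - window + 1, stride):
        local = features[start:start + window]  # local receptive field
        # Max over the timesteps of the window does not depend on their
        # order, so permuting `local` leaves the output unchanged.
        outputs.append([max(dot(f, k) for f in local) for k in kernels])
    return outputs


random.seed(0)
T, C, K, W = 8, 4, 3, 4  # timesteps, channels, kernels, window size
features = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(T)]
kernels = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(K)]

out = perm_invariant_conv(features, kernels, window=W, stride=W)

# Shuffle the features inside each non-overlapping window.
shuffled = []
for s in range(0, T, W):
    chunk = features[s:s + W]
    random.shuffle(chunk)
    shuffled.extend(chunk)

out_shuffled = perm_invariant_conv(shuffled, kernels, window=W, stride=W)
assert out == out_shuffled  # invariant to permutation within each window
```

Because each output depends only on one window, such layers can be stacked so that deeper layers cover longer temporal extents, which is the role local connectivity plays in the abstract's second property.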

