Video In Sentences Out

04/12/2012
by Andrei Barbu, et al.

We present a system that produces sentential descriptions of video: who did what to whom, and where and how they did it. Action class is rendered as a verb, participant objects as noun phrases, properties of those objects as adjectival modifiers in those noun phrases, spatial relations between those participants as prepositional phrases, and characteristics of the event as prepositional-phrase adjuncts and adverbial modifiers. Extracting the information needed to render these linguistic entities requires an approach to event recognition that recovers object tracks, the track-to-role assignments, and changing body posture.
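
The mapping from recognized event structure to sentence parts can be illustrated with a small sketch. This is not the authors' implementation: the `Participant` and `Event` classes, their field names, and the `render` function are hypothetical, and the sketch assumes that recognition has already produced an inflected verb, the role assignments, and the modifiers. It only shows how each kind of information fills its linguistic slot.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Participant:
    """Hypothetical container for one tracked object."""
    noun: str                                             # object class -> head noun
    adjectives: List[str] = field(default_factory=list)   # object properties -> adjectival modifiers
    spatial_pp: Optional[str] = None                      # spatial relation -> prepositional phrase

    def noun_phrase(self) -> str:
        words = ["the", *self.adjectives, self.noun]
        if self.spatial_pp:
            words.append(self.spatial_pp)
        return " ".join(words)


@dataclass
class Event:
    """Hypothetical container for one recognized event."""
    verb: str                             # action class -> verb (assumed already inflected)
    agent: Participant                    # track assigned to the agent role
    patient: Optional[Participant] = None # track assigned to the patient role, if any
    adverb: Optional[str] = None          # event characteristic -> adverbial modifier
    adjunct_pp: Optional[str] = None      # event characteristic -> prepositional-phrase adjunct


def render(event: Event) -> str:
    """Assemble a sentence: agent NP, adverb, verb, patient NP, adjunct."""
    parts = [event.agent.noun_phrase()]
    if event.adverb:
        parts.append(event.adverb)
    parts.append(event.verb)
    if event.patient:
        parts.append(event.patient.noun_phrase())
    if event.adjunct_pp:
        parts.append(event.adjunct_pp)
    sentence = " ".join(parts)
    return sentence[0].upper() + sentence[1:] + "."


example = Event(
    verb="approached",
    agent=Participant("person", adjectives=["upright"]),
    patient=Participant("chair", adjectives=["red"]),
    adverb="slowly",
    adjunct_pp="from the left",
)
print(render(example))
```

Under these assumptions, the example prints "The upright person slowly approached the red chair from the left." The hard part described in the abstract, recovering tracks, roles, and posture from video, is outside the scope of this sketch.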


