Du Tran

research

∙ 06/16/2023

Learning Space-Time Semantic Correspondences

We propose a new task of space-time semantic correspondence prediction i...

0 Du Tran, et al. ∙

research

∙ 03/09/2023

Open-world Instance Segmentation: Top-down Learning with Bottom-up Supervision

Many top-down architectures for instance segmentation achieve significan...

3 Tarun Kalluri, et al. ∙

research

∙ 02/16/2023

MINOTAUR: Multi-task Video Grounding From Multimodal Queries

Video understanding tasks take many forms, from action detection to visu...

0 Raghav Goyal, et al. ∙

research

∙ 04/12/2022

Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity

Open-world instance segmentation is the task of grouping pixels into obj...

2 Weiyao Wang, et al. ∙

research

∙ 06/17/2021

Long-Short Temporal Contrastive Learning of Video Transformers

Video transformers have recently emerged as a competitive alternative to...

0 Jue Wang, et al. ∙

research

∙ 04/10/2021

Unidentified Video Objects: A Benchmark for Dense, Open-World Segmentation

Current state-of-the-art object detection and segmentation methods work ...

0 Weiyao Wang, et al. ∙

research

∙ 12/15/2020

FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation

A majority of approaches solve the problem of video frame interpolation ...

7 Tarun Kalluri, et al. ∙

research

∙ 11/28/2019

Self-Supervised Learning by Cross-Modal Audio-Video Clustering

The visual and audio modalities are highly correlated yet they contain d...

25 Humam Alwassel, et al. ∙

research

∙ 06/10/2019

FASTER Recurrent Networks for Video Classification

Video classification methods often divide the video into short clips, do...

0 Linchao Zhu, et al. ∙

research

∙ 06/10/2019

UniDual: A Unified Model for Image and Video Understanding

Although a video is effectively a sequence of images, visual perception ...

4 Yufei Wang, et al. ∙

research

∙ 06/07/2019

Video Modeling with Correlation Networks

Motion is a salient cue to recognize actions in video. Modern action rec...

0 Heng Wang, et al. ∙

research

∙ 06/06/2019

Learning Temporal Pose Estimation from Sparsely-Labeled Videos

Modern approaches for multi-person pose estimation in video require larg...

0 Gedas Bertasius, et al. ∙

research

∙ 05/29/2019

What Makes Training Multi-Modal Networks Hard?

Consider end-to-end training of a multi-modal vs. a single-modal network...

0 Weiyao Wang, et al. ∙

research

∙ 05/02/2019

Large-scale weakly-supervised pre-training for video action recognition

Current fully-supervised video datasets consist of only a few hundred th...

0 Deepti Ghadiyaram, et al. ∙

research

∙ 04/08/2019

SCSampler: Sampling Salient Clips from Video for Efficient Action Recognition

While many action recognition datasets consist of collections of brief, ...

0 Bruno Korbar, et al. ∙

research

∙ 04/04/2019

Video Classification with Channel-Separated Convolutional Networks

Group convolution has been shown to offer great computational savings in...

0 Du Tran, et al. ∙

research

∙ 01/26/2019

DistInit: Learning Video Representations without a Single Labeled Video

Video recognition models have progressed significantly over the past few...

0 Rohit Girdhar, et al. ∙

research

∙ 12/11/2018

Learning Discriminative Motion Features Through Detection

Despite huge success in the image domain, modern detection models such a...

0 Gedas Bertasius, et al. ∙

research

∙ 06/30/2018

Co-Training of Audio and Video Representations from Self-Supervised Temporal Synchronization

There is a natural correlation between the visual and auditive elements ...

0 Bruno Korbar, et al. ∙

research

∙ 12/26/2017

Detect-and-Track: Efficient Pose Estimation in Videos

This paper addresses the problem of estimating and tracking human body k...

0 Rohit Girdhar, et al. ∙

research

∙ 11/30/2017

A Closer Look at Spatiotemporal Convolutions for Action Recognition

In this paper we discuss several forms of spatiotemporal convolutions fo...

0 Du Tran, et al. ∙

research

∙ 01/29/2017

Transformation-Based Models of Video Sequences

In this work we propose a simple unsupervised approach for next frame pr...

0 Joost van Amersfoort, et al. ∙

research

∙ 06/23/2016

VideoMCC: a New Benchmark for Video Comprehension

While there is overall agreement that future technology for organizing, ...

0 Du Tran, et al. ∙

research

∙ 12/02/2014

Learning Spatiotemporal Features with 3D Convolutional Networks

We propose a simple, yet effective approach for spatiotemporal feature l...

0 Du Tran, et al. ∙

research

∙ 12/20/2013

EXMOVES: Classifier-based Features for Scalable Action Recognition

This paper introduces EXMOVES, learned exemplar-based features for effic...

0 Du Tran, et al. ∙
