Junting Pan

research

∙ 07/03/2023

JourneyDB: A Benchmark for Generative Image Understanding

While recent advancements in vision-language models have revolutionized ...

0 Junting Pan, et al. ∙

research

∙ 06/15/2023

Retrieving-to-Answer: Zero-Shot Video Question Answering with Frozen Large Language Models

Video Question Answering (VideoQA) has been significantly advanced from ...

0 Junting Pan, et al. ∙

research

∙ 05/22/2023

VideoLLM: Modeling Video Sequence with Large Language Models

With the exponential growth of video data, there is an urgent need for a...

0 Guo Chen, et al. ∙

research

∙ 05/04/2023

Personalize Segment Anything Model with One Shot

Driven by large-data pre-training, Segment Anything Model (SAM) has been...

4 Renrui Zhang, et al. ∙

research

∙ 12/06/2022

InternVideo: General Video Foundation Models via Generative and Discriminative Learning

The foundation models have recently shown excellent performance on a var...

4 Yi Wang, et al. ∙

research

∙ 11/17/2022

InternVideo-Ego4D: A Pack of Champion Solutions to Ego4D Challenges

In this report, we present our champion solutions to five tracks at Ego4...

0 Guo Chen, et al. ∙

research

∙ 06/27/2022

ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning for Action Recognition

Capitalizing on large pre-trained models for various downstream tasks of...

7 Junting Pan, et al. ∙

research

∙ 05/06/2022

EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers

Self-attention based models such as vision transformers (ViTs) have emer...

18 Junting Pan, et al. ∙

research

∙ 06/16/2020

1st place solution for AVA-Kinetics Crossover in ActivityNet Challenge 2020

This technical report introduces our winning solution to the spatio-temp...

0 Siyu Chen, et al. ∙

research

∙ 06/14/2020

Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization

Localizing persons and recognizing their actions from videos is a challe...

7 Junting Pan, et al. ∙

research

∙ 03/11/2019

Video Generation from Single Semantic Label Map

This paper proposes the novel task of video generation conditioned on a ...

16 Junting Pan, et al. ∙

research

∙ 03/03/2019

Unsupervised Bi-directional Flow-based Video Generation from one Snapshot

Imagining multiple consecutive frames given one single snapshot is chall...

6 Lu Sheng, et al. ∙

research

∙ 02/19/2018

Online Action Detection in Untrimmed, Streaming Videos - Modeling and Evaluation

The goal of Online Action Detection (OAD) is to detect action in a timel...

0 Zheng Shou, et al. ∙

research

∙ 01/04/2017

SalGAN: Visual Saliency Prediction with Generative Adversarial Networks

We introduce SalGAN, a deep convolutional neural network for visual sali...

0 Junting Pan, et al. ∙

research

∙ 03/02/2016

Shallow and Deep Convolutional Networks for Saliency Prediction

The prediction of salient areas in images has been traditionally address...

0 Junting Pan, et al. ∙

research

∙ 07/06/2015

End-to-end Convolutional Network for Saliency Prediction

The prediction of saliency areas in images has been traditionally addres...

0 Junting Pan, et al. ∙

