Sharky.TV

research

∙ 09/08/2023

MaskDiffusion: Boosting Text-to-Image Consistency with Conditional Mask

Recent advancements in diffusion models have showcased their impressive ...

0 Yupeng Zhou, et al. ∙

research

∙ 09/02/2023

MagicProp: Diffusion-based Video Editing via Motion-aware Appearance Propagation

This paper addresses the issue of modifying the visual appearance of vid...

0 Hanshu Yan, et al. ∙

research

∙ 08/28/2023

MagicEdit: High-Fidelity and Temporally Coherent Video Editing

In this report, we present MagicEdit, a surprisingly simple yet effectiv...

0 Jun Hao Liew, et al. ∙

research

∙ 08/28/2023

MagicAvatar: Multimodal Avatar Generation and Animation

This report presents MagicAvatar, a framework for multimodal video gener...

0 Jianfeng Zhang, et al. ∙

research

∙ 08/21/2023

Dataset Quantization

State-of-the-art deep neural networks are trained with large amounts (mi...

0 Daquan Zhou, et al. ∙

research

∙ 07/20/2023

AdjointDPM: Adjoint Sensitivity Method for Gradient Backpropagation of Diffusion Probabilistic Models

Existing customization methods require access to multiple reference exam...

0 Jiachun Pan, et al. ∙

research

∙ 07/17/2023

BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs

LLMs have demonstrated remarkable abilities at interacting with humans t...

0 fcq, et al. ∙

research

∙ 06/15/2023

COSA: Concatenated Sample Pretrained Vision-Language Foundation Model

Due to the limited scale and quality of video-text training corpus, most...

0 Sihan Chen, et al. ∙

research

∙ 05/24/2023

Delving Deeper into Data Scaling in Masked Image Modeling

Understanding whether self-supervised learning methods can scale with un...

0 Cheng-Ze Lu, et al. ∙

research

∙ 05/22/2023

VLAB: Enhancing Video Language Pre-training by Feature Adapting and Blending

Large-scale image-text contrastive pre-training models, such as CLIP, ha...

0 Xingjian He, et al. ∙

research

∙ 04/03/2023

Associating Spatially-Consistent Grouping with Text-supervised Semantic Segmentation

In this work, we investigate performing semantic segmentation solely thr...

0 Yabo Zhang, et al. ∙

research

∙ 04/01/2023

DOAD: Decoupled One Stage Action Detection Network

Localizing people and recognizing their actions from videos is a challen...

0 Shuning Chang, et al. ∙

research

∙ 03/27/2023

OmniAvatar: Geometry-Guided Controllable 3D Head Synthesis

We present OmniAvatar, a novel geometry-guided 3D head synthesis model t...

0 Hongyi Xu, et al. ∙

research

∙ 03/24/2023

AgileGAN3D: Few-Shot 3D Portrait Stylization by Augmented Transfer Learning

While substantial progresses have been made in automated 2D portrait sty...

0 Guoxian Song, et al. ∙

research

∙ 03/23/2023

TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision

In this paper, we investigate an open research task of generating contro...

0 Jiacheng Wei, et al. ∙

research

∙ 01/26/2023

Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring

Image-text pretrained models, e.g., CLIP, have shown impressive general ...

0 Ruyang Liu, et al. ∙

research

∙ 01/19/2023

Multimodal Video Adapter for Parameter Efficient Video Text Retrieval

State-of-the-art video-text retrieval (VTR) methods usually fully fine-t...

0 Bowen Zhang, et al. ∙

research

∙ 01/18/2023

Temporal Perceiving Video-Language Pre-training

Video-Language Pre-training models have recently significantly improved ...

0 Fan Ma, et al. ∙

research

∙ 01/15/2023

CMAE-V: Contrastive Masked Autoencoders for Video Action Recognition

Contrastive Masked Autoencoder (CMAE), as a new self-supervised framewor...

0 Cheng-Ze Lu, et al. ∙

research

∙ 12/21/2022

Class Prototype-based Cleaner for Label Noise Learning

Semi-supervised learning based methods are current SOTA solutions to the...

0 Jingjia Huang, et al. ∙

research

∙ 12/13/2022

PV3D: A 3D Generative Model for Portrait Video Generation

Recent advances in generative adversarial networks (GANs) have demonstra...

4 Eric Zhongcong Xu, et al. ∙

research

∙ 11/27/2022

Diffusion Probabilistic Model Made Slim

Despite the recent visually-pleasing results achieved, the massive compu...

0 Xingyi Yang, et al. ∙

research

∙ 11/25/2022

Expanding Small-Scale Datasets with Guided Imagination

The power of Deep Neural Networks (DNNs) depends heavily on the training...

0 Yifan Zhang, et al. ∙

research

∙ 11/22/2022

Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition

This paper does not attempt to design a state-of-the-art method for visu...

0 Qibin Hou, et al. ∙

research

∙ 11/20/2022

MagicVideo: Efficient Video Generation With Latent Diffusion Models

We present an efficient text-to-video generation framework based on late...

0 Daquan Zhou, et al. ∙

research

∙ 10/28/2022

MagicMix: Semantic Mixing with Diffusion Models

Have you ever imagined what a corgi-alike coffee machine or a tiger-alik...

0 Jun Hao Liew, et al. ∙

research

∙ 10/24/2022

Reachability-Aware Laplacian Representation in Reinforcement Learning

In Reinforcement Learning (RL), Laplacian Representation (LapRep) is a t...

0 Kaixin Wang, et al. ∙

research

∙ 10/17/2022

Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning

Existing fine-tuning methods either tune all parameters of the pre-train...

0 Dongze Lian, et al. ∙

research

∙ 10/02/2022

ManiCLIP: Multi-Attribute Face Manipulation from Text

In this paper we present a novel multi-attribute face manipulation metho...

0 Hao Wang, et al. ∙

research

∙ 08/01/2022

AvatarGen: a 3D Generative Model for Animatable Human Avatars

Unsupervised generation of clothed virtual humans with various appearanc...

0 Jianfeng Zhang, et al. ∙

research

∙ 07/27/2022

Contrastive Masked Autoencoders are Stronger Vision Learners

Masked image modeling (MIM) has achieved promising results on various vi...

0 Zhicheng Huang, et al. ∙

research

∙ 07/16/2022

Clover: Towards A Unified Video-Language Alignment and Fusion Model

Building a universal video-language model for solving various video unde...

0 Jingjia Huang, et al. ∙

research

∙ 05/28/2022

Divide to Adapt: Mitigating Confirmation Bias for Domain Adaptation of Black-Box Predictors

Domain Adaptation of Black-box Predictors (DABP) aims to learn a model o...

0 Jianfei Yang, et al. ∙

research

∙ 05/27/2022

Sharpness-Aware Training for Free

Modern deep neural networks (DNNs) have achieved state-of-the-art perfor...

0 Jiawei Du, et al. ∙

research

∙ 05/23/2022

Tyger: Task-Type-Generic Active Learning for Molecular Property Prediction

How to accurately predict the properties of molecules is an essential pr...

8 Kuangqi Zhou, et al. ∙

research

∙ 04/26/2022

Understanding The Robustness in Vision Transformers

Recent studies show that Vision Transformers(ViTs) exhibit strong robust...

13 Daquan Zhou, et al. ∙

research

∙ 03/29/2022

PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision

Existing self-supervised 3D human pose estimation schemes have largely r...

0 Kehong Gong, et al. ∙

research

∙ 03/29/2022

Generalizing Few-Shot NAS with Gradient Matching

Efficient performance estimation of architectures drawn from large searc...

0 Shoukang Hu, et al. ∙

research

∙ 02/15/2022

SODAR: Segmenting Objects by Dynamically Aggregating Neighboring Mask Representations

Recent state-of-the-art one-stage instance segmentation model SOLO divid...

8 PetsTime, et al. ∙

research

∙ 01/30/2022

The Geometry of Robust Value Functions

The space of value functions is a fundamental concept in reinforcement l...

0 Kaixin Wang, et al. ∙

research

∙ 01/12/2022

Towards Adversarially Robust Deep Image Denoising

This work systematically investigates the adversarial robustness of deep...

10 Hanshu Yan, et al. ∙

research

∙ 12/16/2021

UMAD: Universal Model Adaptation under Domain and Category Shift

Learning to reject unknown samples (not present in the source classes) i...

0 Jian Liang, et al. ∙

research

∙ 12/09/2021

Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning

Class Incremental Learning (CIL) aims at learning a multi-class classifi...

0 Yujun Shi, et al. ∙

research

∙ 12/08/2021

Geometry-Guided Progressive NeRF for Generalizable and Efficient Neural Human Rendering

In this work we develop a generalizable and efficient Neural Radiance Fi...

0 Mingfei Chen, et al. ∙

research

∙ 11/30/2021

Shunted Self-Attention via Multi-Scale Token Aggregation

Recent Vision Transformer (ViT) models have demonstrated encouraging res...

0 Sucheng Ren, et al. ∙

research

∙ 11/22/2021

MetaFormer is Actually What You Need for Vision

Transformers have shown great potential in computer vision tasks. A comm...

20 Weihao Yu, et al. ∙

research

∙ 11/07/2021

Direct Multi-view Multi-person 3D Pose Estimation

We present Multi-view Pose transformer (MvP) for estimating multi-person...

4 PetsTime, et al. ∙

research

∙ 10/09/2021

Deep Long-Tailed Learning: A Survey

Deep long-tailed learning, one of the most challenging problems in visua...

0 Yifan Zhang, et al. ∙

research

∙ 10/07/2021

Efficient Sharpness-aware Minimization for Improved Training of Neural Networks

Overparametrized Deep Neural Networks (DNNs) often achieve astounding pe...

0 Jiawei Du, et al. ∙

research

∙ 09/15/2021

PnP-DETR: Towards Efficient Visual Analysis with Transformers

Recently, DETR pioneered the solution of vision tasks with transformers,...

10 PetsTime, et al. ∙
