Baining Guo

research

∙ 09/07/2023

InstructDiffusion: A Generalist Modeling Interface for Vision Tasks

We present InstructDiffusion, a unifying and generic framework for align...

0 Zigang Geng, et al. ∙

research

∙ 07/26/2023

Adaptive Frequency Filters As Efficient Global Token Mixers

Recent vision transformers, large-kernel CNNs and MLPs have attained rem...

0 Zhipeng Huang, et al. ∙

research

∙ 03/17/2023

IRGen: Generative Modeling for Image Retrieval

While generative modeling has been ubiquitous in natural language proces...

0 Yidan Zhang, et al. ∙

research

∙ 03/16/2023

Efficient Diffusion Training via Min-SNR Weighting Strategy

Denoising diffusion models have been a mainstream approach for image gen...

0 Tiankai Hang, et al. ∙

research

∙ 05/27/2022

Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation

Masked image modeling (MIM) learns representations with remarkably good ...

4 Yixuan Wei, et al. ∙

research

∙ 04/22/2022

iCAR: Bridging Image Classification and Image-text Alignment for Visual Recognition

Image classification, which classifies images by pre-defined categories,...

2 Yixuan Wei, et al. ∙

research

∙ 03/02/2022

Protecting Celebrities with Identity Consistency Transformer

In this work we propose Identity Consistency Transformer, a novel face f...

9 Xiaoyi Dong, et al. ∙

research

∙ 12/20/2021

StyleSwin: Transformer-based GAN for High-resolution Image Generation

Despite the tantalizing success in a broad range of vision tasks, transformers...

15 Bowen Zhang, et al. ∙

research

∙ 11/29/2021

Vector Quantized Diffusion Model for Text-to-Image Synthesis

We present the vector quantized diffusion (VQ-Diffusion) model for text-...

10 Shuyang Gu, et al. ∙

research

∙ 11/19/2021

Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions

We study joint video and language (VL) pre-training to enable cross-moda...

0 Hongwei Xue, et al. ∙

research

∙ 11/18/2021

Swin Transformer V2: Scaling Up Capacity and Resolution

We present techniques for scaling Swin Transformer up to 3 billion param...

0 Ze Liu, et al. ∙

research

∙ 07/01/2021

CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows

We present CSWin Transformer, an efficient and effective Transformer-bas...

2 Xiaoyi Dong, et al. ∙

research

∙ 04/03/2021

Aggregated Contextual Transformations for High-Resolution Image Inpainting

State-of-the-art image inpainting approaches can suffer from generating ...

5 Yanhong Zeng, et al. ∙

research

∙ 03/25/2021

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

This paper presents a new vision Transformer, called Swin Transformer, t...

0 Ze Liu, et al. ∙

research

∙ 12/07/2020

Identity-Driven DeepFake Detection

DeepFake detection has so far been dominated by “artifact-driven” method...

9 Xiaoyi Dong, et al. ∙

research

∙ 06/07/2020

Learning Texture Transformer Network for Image Super-Resolution

We study image super-resolution (SR), which aims to recover realistic...

0 Fuzhi Yang, et al. ∙

research

∙ 12/31/2019

Face X-ray for More General Face Forgery Detection

In this paper we propose a novel image representation called face X-ray ...

14 Lingzhi Li, et al. ∙

research

∙ 04/16/2019

Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting

High-quality image inpainting requires filling missing regions in a dama...

32 Yanhong Zeng, et al. ∙

research

∙ 10/06/2015

Unsupervised Extraction of Video Highlights Via Robust Recurrent Auto-encoders

With the growing popularity of short-form video sharing platforms such a...

0 Huan Yang, et al. ∙

research

∙ 12/11/2013

Fast Neighborhood Graph Search using Cartesian Concatenation

In this paper, we propose a new data structure for approximate nearest n...

0 Jingdong Wang, et al. ∙

research

∙ 01/10/2013

Planning and Acting under Uncertainty: A New Model for Spoken Dialogue Systems

Uncertainty plays a central role in spoken dialogue systems. Some stocha...

0 Bo Zhang, et al. ∙

