In So Kweon

research

∙ 09/21/2023

MoDA: Leveraging Motion Priors from Videos for Advancing Unsupervised Domain Adaptation in Semantic Segmentation

Unsupervised domain adaptation (UDA) is an effective approach to handle ...

0 Fei Pan, et al. ∙

research

∙ 09/05/2023

NICE 2023 Zero-shot Image Captioning Challenge

In this report, we introduce NICE project[<https://nice.lgresearch.ai/>]...

0 Taehoon Kim, et al. ∙

research

∙ 08/18/2023

Long-range Multimodal Pretraining for Movie Understanding

Learning computer vision models from (and for) movies has a long-standin...

0 Dawit Mureja Argaw, et al. ∙

research

∙ 07/03/2023

ACDMSR: Accelerated Conditional Diffusion Models for Single Image Super-Resolution

Diffusion models have gained significant popularity in the field of imag...

0 Axi Niu, et al. ∙

research

∙ 05/01/2023

Attack-SAM: Towards Evaluating Adversarial Robustness of Segment Anything Model

Segment Anything Model (SAM) has attracted significant attention recentl...

0 Chenshuang Zhang, et al. ∙

research

∙ 04/10/2023

Video-kMaX: A Simple Unified Approach for Online and Near-Online Video Panoptic Segmentation

Video Panoptic Segmentation (VPS) aims to achieve comprehensive pixel-le...

0 Inkyu Shin, et al. ∙

research

∙ 04/04/2023

One Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC Era

OpenAI has recently released GPT-4 (a.k.a. ChatGPT plus), which is demon...

0 Chaoning Zhang, et al. ∙

research

∙ 03/30/2023

Hindi as a Second Language: Improving Visually Grounded Speech with Semantically Similar Samples

The objective of this work is to explore the learning of visually ground...

0 Hyeonggon Ryu, et al. ∙

research

∙ 03/30/2023

Complementary Random Masking for RGB-Thermal Semantic Segmentation

RGB-thermal semantic segmentation is one potential solution to achieve r...

0 Ukcheol Shin, et al. ∙

research

∙ 03/29/2023

TTA-COPE: Test-Time Adaptation for Category-Level Object Pose Estimation

Test-time adaptation methods have been gaining attention recently as a p...

0 Taeyeop Lee, et al. ∙

research

∙ 03/23/2023

A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI

Generative AI has demonstrated impressive performance in various fields,...

0 Chenshuang Zhang, et al. ∙

research

∙ 03/21/2023

Self-Sufficient Framework for Continuous Sign Language Recognition

The goal of this work is to develop a self-sufficient framework for Contin...

0 Youngjoon Jang, et al. ∙

research

∙ 03/21/2023

A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 All You Need?

As ChatGPT goes viral, generative AI (AIGC, a.k.a AI-generated content) ...

0 Chaoning Zhang, et al. ∙

research

∙ 03/17/2023

Bidirectional Domain Mixup for Domain Adaptive Semantic Segmentation

Mixup provides interpolated training samples and allows the model to obt...

0 Daehan Kim, et al. ∙

research

∙ 03/14/2023

Text-to-image Diffusion Model in Generative AI: A Survey

This survey reviews text-to-image diffusion models in the context that d...

0 Chenshuang Zhang, et al. ∙

research

∙ 03/03/2023

EcoTTA: Memory-Efficient Continual Test-time Adaptation via Self-distilled Regularization

This paper presents a simple yet effective approach that improves contin...

0 Junha Song, et al. ∙

research

∙ 01/26/2023

Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data

We present a novel data-efficient semi-supervised framework to improve t...

0 Dong-Jin Kim, et al. ∙

research

∙ 01/02/2023

ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders

Driven by improved architectures and better representation learning fram...

0 Sanghyun Woo, et al. ∙

research

∙ 12/20/2022

Tracking by Associating Clips

The tracking-by-detection paradigm today has become the dominant method ...

0 Sanghyun Woo, et al. ∙

research

∙ 12/20/2022

Bridging Images and Videos: A Simple Learning Framework for Large Vocabulary Video Object Detection

Scaling object taxonomies is one of the important steps toward a robust ...

0 Sanghyun Woo, et al. ∙

research

∙ 12/16/2022

CD-TTA: Compound Domain Test-time Adaptation for Semantic Segmentation

Test-time adaptation (TTA) has attracted significant attention due to it...

0 Junha Song, et al. ∙

research

∙ 12/16/2022

Learning Classifiers of Prototypes and Reciprocal Points for Universal Domain Adaptation

Universal Domain Adaptation aims to transfer the knowledge between the d...

0 Sungsu Hur, et al. ∙

research

∙ 11/21/2022

MATE: Masked Autoencoders are Online 3D Test-Time Learners

We propose MATE, the first Test-Time-Training (TTT) method designed for ...

0 M. Jehanzeb Mirza, et al. ∙

research

∙ 11/01/2022

Signing Outside the Studio: Benchmarking Background Robustness for Continuous Sign Language Recognition

The goal of this work is background-robust continuous sign language reco...

0 Youngjoon Jang, et al. ∙

research

∙ 10/21/2022

Neural Fields for Robotic Object Manipulation from a Single Image

We present a unified and compact representation for object rendering, 3D...

0 Valts Blukis, et al. ∙

research

∙ 09/13/2022

Moving from 2D to 3D: volumetric medical image classification for rectal cancer staging

Volumetric images from Magnetic Resonance Imaging (MRI) provide invaluab...

0 Joohyung Lee, et al. ∙

research

∙ 08/03/2022

Per-Clip Video Object Segmentation

Recently, memory-based approaches show promising results on semi-supervi...

0 KwanYong Park, et al. ∙

research

∙ 08/01/2022

Generative Bias for Visual Question Answering

The task of Visual Question Answering (VQA) is known to be plagued by th...

0 Jae Won Cho, et al. ∙

research

∙ 07/30/2022

A Survey on Masked Autoencoder for Self-supervised Learning in Vision and Beyond

Masked autoencoders are scalable vision learners, as the title of MAE <c...

0 Chaoning Zhang, et al. ∙

research

∙ 07/22/2022

Decoupled Adversarial Contrastive Learning for Self-supervised Adversarial Robustness

Adversarial training (AT) for robust representation learning and self-su...

0 Chaoning Zhang, et al. ∙

research

∙ 07/20/2022

The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing

Machine learning is transforming the video editing industry. Recent adva...

0 Dawit Mureja Argaw, et al. ∙

research

∙ 07/19/2022

ML-BPM: Multi-teacher Learning with Bidirectional Photometric Mixing for Open Compound Domain Adaptation in Semantic Segmentation

Open compound domain adaptation (OCDA) considers the target domain as th...

0 Fei Pan, et al. ∙

research

∙ 07/07/2022

DRL-ISP: Multi-Objective Camera ISP with Deep Reinforcement Learning

In this paper, we propose a multi-objective camera ISP framework that ut...

2 Ukcheol Shin, et al. ∙

research

∙ 06/01/2022

Labeling Where Adapting Fails: Cross-Domain Semantic Segmentation with Point Supervision via Active Selection

Training models dedicated to semantic segmentation requires a large amou...

0 Fei Pan, et al. ∙

research

∙ 05/30/2022

TubeFormer-DeepLab: Video Mask Transformer

We present TubeFormer-DeepLab, the first attempt to tackle multiple core...

0 Dahun Kim, et al. ∙

research

∙ 03/30/2022

Investigating Top-k White-Box and Transferable Black-box Attack

Existing works have identified the limitation of top-1 attack success ra...

0 Chaoning Zhang, et al. ∙

research

∙ 03/30/2022

Dual Temperature Helps Contrastive Learning Without Many Negative Samples: Towards Understanding and Simplifying MoCo

Contrastive learning (CL) is widely known to require many negative sampl...

0 Chaoning Zhang, et al. ∙

research

∙ 03/30/2022

How Does SimSiam Avoid Collapse Without Negative Samples? A Unified Understanding with Self-supervised Contrastive Learning

To avoid collapse in self-supervised learning (SSL), a contrastive loss ...

0 Chaoning Zhang, et al. ∙

research

∙ 03/29/2022

Long-term Video Frame Interpolation via Feature Propagation

Video frame interpolation (VFI) works generally predict intermediate fra...

0 Dawit Mureja Argaw, et al. ∙

research

∙ 02/12/2022

Audio-Visual Fusion Layers for Event Type Aware Video Recognition

The human brain is continuously inundated with multisensory information ...

0 Arda Senocak, et al. ∙

research

∙ 02/11/2022

Noise Augmentation Is All You Need For FGSM Fast Adversarial Training: Catastrophic Overfitting And Robust Overfitting Require Different Augmentation

Adversarial training (AT) and its variants are the most effective approa...

0 Chaoning Zhang, et al. ∙

research

∙ 02/07/2022

Learning Sound Localization Better From Semantically Similar Samples

The objective of this work is to localize the sound sources in visual sc...

0 Arda Senocak, et al. ∙

research

∙ 01/12/2022

Maximizing Self-supervision from Thermal Image for Effective Self-supervised Learning of Depth and Ego-motion

Recently, self-supervised learning of depth and ego-motion from thermal ...

14 Ukcheol Shin, et al. ∙

research

∙ 11/25/2021

Facial Depth and Normal Estimation using Single Dual-Pixel Camera

Many mobile manufacturers recently have adopted Dual-Pixel (DP) sensors ...

0 Minjun Kang, et al. ∙

research

∙ 11/24/2021

UDA-COPE: Unsupervised Domain Adaptation for Category-level Object Pose Estimation

Learning to estimate object pose often requires ground-truth (GT) labels...

0 Taeyeop Lee, et al. ∙

research

∙ 11/23/2021

Deep Point Cloud Reconstruction

Point cloud obtained from 3D scanning is often sparse, noisy, and irregu...

0 Jaesung Choe, et al. ∙

research

∙ 11/22/2021

PointMixer: MLP-Mixer for Point Cloud Understanding

MLP-Mixer has newly appeared as a new challenger against the realm of CN...

0 Jaesung Choe, et al. ∙

research

∙ 11/10/2021

Self-Supervised Real-time Video Stabilization

Videos are a popular media form, where online video streaming has recent...

17 Jinsoo Choi, et al. ∙

research

∙ 10/21/2021

Single-Modal Entropy based Active Learning for Visual Question Answering

Constructing a large-scale labeled dataset in the real world, especially...

0 Dong-Jin Kim, et al. ∙

research

∙ 10/13/2021

Attentive and Contrastive Learning for Joint Depth and Motion Field Estimation

Estimating the motion of the camera together with the 3D structure of th...

0 Seokju Lee, et al. ∙
