We study the challenging problem of unsupervised discovery of object landm...
Federated learning (FL) is a distributed learning paradigm that enables ...
Benefiting from prompt tuning, recent years have witnessed the promising...
Google's Bard has emerged as a formidable competitor to OpenAI's ChatGPT...
It is imperative to ensure the robustness of deep learning models in cri...
Recent video recognition models utilize Transformer models for long-rang...
Image restoration involves recovering a high-quality clean image from it...
Hybrid volumetric medical image segmentation models, combining the advan...
The latest breakthroughs in large vision-language models, such as Bard a...
Conversation agents fueled by Large Language Models (LLMs) are providing...
Whitening loss provides theoretical guarantee in avoiding feature collap...
Most previous co-salient object detection works mainly focus on extracti...
Current transformer-based change detection (CD) approaches either employ...
Burst image processing has become increasingly popular in recent years....
Adopting contrastive image-text pretrained models like CLIP towards vide...
Existing video instance segmentation (VIS) approaches generally follow a...
Deep neural networks (DNNs) have enabled astounding progress in several ...
The transferability of adversarial perturbations between image models ha...
Adversarial training is an effective approach to make deep neural networ...
Although existing semi-supervised learning models achieve remarkable suc...
Large-scale multi-modal training with image-text pairs imparts strong ge...
The pose-guided person image generation task requires synthesizing photo...
A desirable objective in self-supervised learning (SSL) is to avoid feat...
Pre-trained vision-language (V-L) models such as CLIP have shown excelle...
The continual learning setting aims to learn new tasks over time without...
Neural networks have proven to be very powerful at computer vision tasks...
The operating room (OR) is a dynamic and complex environment consisting ...
Nowadays, more surgical procedures are being performed us...
Deep learning-based algorithms have gained massive popularity in differe...
One of the key factors behind the recent success in visual tracking is t...
In the recent past, several domain generalization (DG) methods have been pro...
Transferable adversarial attacks optimize adversaries from a pretrained ...
Existing open-vocabulary object detectors typically enlarge their vocabu...
Semi-supervised learning (SSL) is one of the dominant approaches to addr...
In the pursuit of achieving ever-increasing accuracy, large and complex ...
We propose a novel self-supervised Video Object Segmentation (VOS) appro...
Deep learning models tend to forget their earlier knowledge while increm...
State-of-the-art transformer-based video instance segmentation (VIS) app...
Following unprecedented success on natural language tasks, Transform...
In an autonomous driving system, perception - identification of features...
Screening cluttered and occluded contraband items from baggage X-ray sca...
We propose a novel few-shot action recognition framework, STRM, which en...
Creative sketching or doodling is an expressive activity, where imaginat...
In this paper, we propose self-supervised training for video transformer...
Open-world object detection (OWOD) is a challenging computer vision prob...
The aim of this paper is to formalize a new continual semi-supervised le...
Automated systems designed for screening contraband items from the X-ray...
Multi-label zero-shot learning (ZSL) is a more realistic counterpart of...
Multi-label recognition is a fundamental yet challenging task ...
Identifying potential threats concealed within the baggage is of prime c...