Spurious correlation caused by subgroup underrepresentation has received...
Momentum is known to accelerate the convergence of gradient descent in s...
Training an image captioner without annotated image-sentence pairs has g...
For multi-modal magnetic resonance (MR) brain tumor image segmentation, ...
Despite extensive studies, the underlying reason as to why overparameter...
Recent works on over-parameterized neural networks have shown that the s...
Given the massive cost of language model pre-training, a non-trivial imp...
Deep metric learning techniques have been used for visual representation...
The computer-aided disease diagnosis from radiomic data is important in ...
Recently, deep metric learning techniques received attention, as the lea...
It is believed that Gradient Descent (GD) induces an implicit bias towar...
Sharpness-Aware Minimization (SAM) is a highly effective regularization ...
Prior knowledge about the imaging physics provides a mechanistic forward...
Language modeling on large-scale datasets leads to impressive performanc...
Clinical adoption of personalized virtual heart simulations faces challe...
As part of the effort to understand implicit bias of gradient descent in...
Normalization layers (e.g., Batch Normalization, Layer Normalization) we...
Deep learning experiments in Cohen et al. (2021) using deterministic Gra...
In this paper, a modified intermediately homogenized peridynamic (IH-PD)...
Deep learning continues to serve as a powerful state-of-the-art technique tha...
Structural magnetic resonance imaging studies have shown that brain anat...
In contrast to SGD, adaptive gradient methods like Adam allow robust tra...
A fractal mobile-immobile (MIM in short) solute transport model in porou...
The generalization mystery of overparametrized deep nets has motivated e...
Understanding the implicit bias of Stochastic Gradient Descent (SGD) is ...
Few-shot object detection, which aims at detecting novel objects rapidly...
In the present paper we propose a reduced temperature non-equilibrium mo...
This paper considers batch Reinforcement Learning (RL) with general valu...
When fitting statistical models, some predictors are often found to be c...
It is generally recognized that finite learning rate (LR), in contrast t...
Matrix factorization is a simple and natural test-bed to investigate the...
Convolutional neural networks often dominate fully-connected counterpart...
Recent works (e.g., (Li and Arora, 2020)) suggest that the use of popula...
Deep neural networks have shown great potential in image reconstruction ...
Particle filtering is a popular method for inferring latent states in st...
Computer-aided diagnosis via deep learning relies on large-scale annotat...
Learning rich representation from data is an important task for deep gen...
Normalization methods such as batch normalization are commonly used in o...
Recent research shows that for training with ℓ_2 loss, convolutional neu...
Intriguing empirical evidence exists that deep learning can work well wi...
Recent research shows that the following two models are equivalent: (a) ...
To improve the ability of VAE to disentangle in the latent space, existi...
The success of deep learning in medical imaging is mostly achieved at th...
Mode connectivity is a surprising phenomenon in the loss landscape of de...
Emotion recognition plays an important role in human-computer interactio...
Over-parameterized deep neural networks trained by simple first-order me...
How well does a classic deep net architecture like AlexNet or VGG19 clas...
In this paper, we proposed a novel Identity-free conditional Generative ...
Recent works have cast some light on the mystery of why deep nets fit an...
In this paper, we proposed a novel Probabilistic Attribute Tree-CNN (PAT...