
Meta-learning PINN loss functions
We propose a meta-learning technique for offline discovery of physics-in...
Discrete-Valued Neural Communication
Deep learning has advanced from fully connected architectures to structu...
Understanding Dynamics of Nonlinear Representation Learning and Its Application
Representations of the world environment play a crucial role in machine ...
Adversarial Training Helps Transfer Learning via Better Representations
Transfer learning aims to leverage models pretrained on source data to ...
MemStream: Memory-Based Anomaly Detection in Multi-Aspect Streams with Concept Drift
Given a stream of entries over time in a multi-aspect data setting where...
Deep Kronecker neural networks: A general framework for neural networks with adaptive activation functions
We propose a new type of neural networks, Kronecker neural networks (KNN...
Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth
Graph Neural Networks (GNNs) have been studied through the lens of expre...
A Recipe for Global Convergence Guarantee in Deep Neural Networks
Existing global convergence guarantees of (stochastic) gradient descent ...
CAC: A Clustering Based Framework for Classification
In data containing heterogeneous subpopulations, classification performa...
On the Theory of Implicit Deep Learning: Global Convergence with Implicit Layers
A deep equilibrium model uses implicit layers, which are implicitly defi...
When and How Mixup Improves Calibration
In many machine learning applications, it is important for the model to ...
Towards Domain-Agnostic Contrastive Learning
Despite recent success, most contrastive self-supervised learning method...
How Does Mixup Help With Robustness and Generalization?
Mixup is a popular data augmentation technique based on taking convex co...
Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time
From CNNs to attention mechanisms, encoding inductive biases into neural...
Locally adaptive activation functions with slope recovery term for deep and physics-informed neural networks
We propose two approaches of locally adaptive activation functions namel...
Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes
In this paper, we theoretically prove that gradient descent can find a g...
A Stochastic First-Order Method for Ordered Empirical Risk Minimization
We propose a new stochastic first-order method for empirical risk minimi...
Every Local Minimum is a Global Minimum of an Induced Model
For nonconvex optimization in machine learning, this paper proves that ...
Eliminating all bad Local Minima from Loss Landscapes without even adding an Extra Unit
Recent work has noted that all bad local minima can be removed from neur...
Elimination of All Bad Local Minima in Deep Learning
In this paper, we theoretically prove that we can eliminate all suboptim...
Effect of Depth and Width on Local Minima in Deep Learning
In this paper, we analyze the effects of depth and width on the quality ...
Depth with Nonlinearity Creates No Bad Local Minima in ResNets
In this paper, we prove that depth with nonlinearity creates no bad loca...
Generalization in Machine Learning via Analytical Learning Theory
This paper introduces a novel measure-theoretic learning theory to analy...
Theory of Deep Learning III: explaining the non-overfitting puzzle
A main puzzle of deep networks revolves around the absence of overfittin...
Generalization in Deep Learning
This paper explains why deep learning can generalize well, despite large...
Deep Semi-Random Features for Nonlinear Function Approximation
We propose semi-random features for nonlinear function approximation. Th...
Depth Creates No Bad Local Minima
In deep learning, depth, as well as nonlinearity, create nonconvex loss...
Streaming Normalization: Towards Simpler and More Biologically-plausible Normalizations for Online and Recurrent Learning
We systematically explored a spectrum of normalization algorithms relate...
Global Continuous Optimization with Error Bound and Fast Convergence
This paper considers global optimization with a black-box unknown object...
Deep Learning without Poor Local Minima
In this paper, we prove a conjecture published in 1989 and also partiall...
Bounded Optimal Exploration in MDP
Within the framework of probably approximately correct Markov decision p...
Bayesian Optimization with Exponential Convergence
This paper presents a Bayesian optimization method with exponential conv...
A Greedy Approximation of Bayesian Reinforcement Learning with Probably Optimistic Transition Model
Bayesian Reinforcement Learning (RL) is capable of not only incorporatin...
