
Meta-learning PINN loss functions
We propose a meta-learning technique for offline discovery of physics-in...

Discrete-Valued Neural Communication
Deep learning has advanced from fully connected architectures to structu...

Understanding Dynamics of Nonlinear Representation Learning and Its Application
Representations of the world environment play a crucial role in machine ...

Adversarial Training Helps Transfer Learning via Better Representations
Transfer learning aims to leverage models pretrained on source data to ...

MemStream: Memory-Based Anomaly Detection in Multi-Aspect Streams with Concept Drift
Given a stream of entries over time in a multi-aspect data setting where...

Deep Kronecker neural networks: A general framework for neural networks with adaptive activation functions
We propose a new type of neural networks, Kronecker neural networks (KNN...

Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth
Graph Neural Networks (GNNs) have been studied through the lens of expre...

A Recipe for Global Convergence Guarantee in Deep Neural Networks
Existing global convergence guarantees of (stochastic) gradient descent ...

CAC: A Clustering Based Framework for Classification
In data containing heterogeneous subpopulations, classification performa...

On the Theory of Implicit Deep Learning: Global Convergence with Implicit Layers
A deep equilibrium model uses implicit layers, which are implicitly defi...

When and How Mixup Improves Calibration
In many machine learning applications, it is important for the model to ...

Towards Domain-Agnostic Contrastive Learning
Despite recent success, most contrastive self-supervised learning method...

How Does Mixup Help With Robustness and Generalization?
Mixup is a popular data augmentation technique based on taking convex co...

Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time
From CNNs to attention mechanisms, encoding inductive biases into neural...

Locally adaptive activation functions with slope recovery term for deep and physics-informed neural networks
We propose two approaches to locally adaptive activation functions, namel...

Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes
In this paper, we theoretically prove that gradient descent can find a g...

A Stochastic First-Order Method for Ordered Empirical Risk Minimization
We propose a new stochastic first-order method for empirical risk minimi...

Every Local Minimum is a Global Minimum of an Induced Model
For nonconvex optimization in machine learning, this paper proves that ...

Eliminating All Bad Local Minima from Loss Landscapes Without Even Adding an Extra Unit
Recent work has noted that all bad local minima can be removed from neur...

Elimination of All Bad Local Minima in Deep Learning
In this paper, we theoretically prove that we can eliminate all suboptim...

Effect of Depth and Width on Local Minima in Deep Learning
In this paper, we analyze the effects of depth and width on the quality ...

Depth with Nonlinearity Creates No Bad Local Minima in ResNets
In this paper, we prove that depth with nonlinearity creates no bad loca...

Generalization in Machine Learning via Analytical Learning Theory
This paper introduces a novel measure-theoretic learning theory to analy...

Theory of Deep Learning III: explaining the non-overfitting puzzle
A main puzzle of deep networks revolves around the absence of overfittin...

Generalization in Deep Learning
This paper explains why deep learning can generalize well, despite large...

Deep Semi-Random Features for Nonlinear Function Approximation
We propose semi-random features for nonlinear function approximation. Th...

Depth Creates No Bad Local Minima
In deep learning, depth, as well as nonlinearity, creates non-convex loss...

Streaming Normalization: Towards Simpler and More Biologically-plausible Normalizations for Online and Recurrent Learning
We systematically explored a spectrum of normalization algorithms relate...

Global Continuous Optimization with Error Bound and Fast Convergence
This paper considers global optimization with a black-box unknown object...

Deep Learning without Poor Local Minima
In this paper, we prove a conjecture published in 1989 and also partiall...

Bounded Optimal Exploration in MDPs
Within the framework of probably approximately correct Markov decision p...

Bayesian Optimization with Exponential Convergence
This paper presents a Bayesian optimization method with exponential conv...

A Greedy Approximation of Bayesian Reinforcement Learning with Probably Optimistic Transition Model
Bayesian Reinforcement Learning (RL) is capable of not only incorporatin...