Representation learning plays a crucial role in automated feature select...
Algorithm- and data-dependent generalization bounds are required to expl...
Neural network compression has been an increasingly important subject, d...
Minimising upper bounds on the population risk or the generalisation gap...
Algorithmic stability is an important notion that has proven powerful fo...
This paper deals with the problem of efficient sampling from a stochasti...
Cyclic and randomized stepsizes are widely used in the deep learning pra...
Providing generalization guarantees for modern neural networks has been ...
Heavy-tail phenomena in stochastic gradient descent (SGD) have been repo...
In this paper, we propose a new covering technique localized for the tra...
Recent studies have shown that heavy tails can emerge in stochastic opti...
Recent studies have shown that gradient descent (GD) can achieve improve...
Recent theoretical studies have shown that heavy-tails can emerge in sto...
Understanding generalization in modern machine learning settings has bee...
Disobeying the classical wisdom of statistical learning theory, modern d...
Despite the ubiquitous use of stochastic optimization algorithms in mach...
The Sliced-Wasserstein distance (SW) is being increasingly used in machi...
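As background for the entry above, the following is a minimal NumPy sketch of the standard Monte Carlo estimator of the Sliced-Wasserstein distance between two equal-sized empirical samples; the arrays X and Y, the number of random projections, and the order p are illustrative choices, not anything taken from the paper.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_projections=100, p=2, rng=None):
    """Monte Carlo estimate of the order-p Sliced-Wasserstein distance between
    two empirical distributions with equal sample sizes and uniform weights.

    Each random direction projects both samples onto the real line, where the
    order-p Wasserstein distance reduces to comparing sorted projections.
    """
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    total = 0.0
    for _ in range(n_projections):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)      # uniform direction on the sphere
        x_proj = np.sort(X @ theta)         # sorted 1D projections
        y_proj = np.sort(Y @ theta)
        total += np.mean(np.abs(x_proj - y_proj) ** p)
    return (total / n_projections) ** (1.0 / p)

# Illustrative usage with two equal-sized Gaussian samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
Y = rng.normal(loc=1.0, size=(500, 3))
print(sliced_wasserstein(X, Y, n_projections=200))
```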
Understanding generalization in deep learning has been one of the major ...
Neural network compression techniques have become increasingly popular a...
Recent advances in Transformer models allow for unprecedented sequence l...
Recent studies have provided both empirical and theoretical evidence ill...
Gaussian noise injections (GNIs) are a family of simple and widely-used ...
Neural style transfer, which allows applying the artistic style of one image...
We study the regularisation induced in neural networks by Gaussian noise...
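To make the Gaussian noise injection setting of the entry above concrete, here is a minimal sketch of noise injected into the hidden activations of a small MLP during training; the layer sizes, the ReLU nonlinearity, and the noise level sigma are assumed for illustration only and are not from the paper.

```python
import numpy as np

def forward_with_gni(x, W1, b1, W2, b2, sigma=0.1, rng=None, train=True):
    """Forward pass of a small MLP with Gaussian noise injected into the
    hidden activations at training time (illustrative sketch only)."""
    rng = np.random.default_rng(rng)
    h = np.maximum(x @ W1 + b1, 0.0)              # ReLU hidden layer
    if train:
        h = h + sigma * rng.normal(size=h.shape)  # Gaussian noise injection
    return h @ W2 + b2

# Illustrative usage: random weights for a 4 -> 8 -> 2 network.
rng = np.random.default_rng(0)
W1, b1 = 0.1 * rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = 0.1 * rng.normal(size=(8, 2)), np.zeros(2)
x = rng.normal(size=(16, 4))                      # a batch of 16 inputs
print(forward_with_gni(x, W1, b1, W2, b2, sigma=0.1, rng=1).shape)
```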
In this paper, we investigate the limiting behavior of a continuous-time...
Despite its success in a wide range of applications, characterizing the ...
In recent years, various notions of capacity and complexity have been pr...
We introduce a new paradigm, measure synchronization, for synchronizing ...
The idea of slicing divergences has been proven to be successful when co...
Probability metrics have become an indispensable part of modern statisti...
Stochastic gradient descent with momentum (SGDm) is one of the most popu...
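For reference, the entry above concerns SGD with momentum; a minimal sketch of the standard heavy-ball form of the update follows, with the step size, momentum parameter, and quadratic test objective chosen purely for illustration (other parameterisations of SGDm exist).

```python
import numpy as np

def sgd_momentum(grad_fn, w0, lr=0.01, beta=0.9, n_steps=100):
    """Heavy-ball SGD with momentum:

        m_{t+1} = beta * m_t + grad(w_t)
        w_{t+1} = w_t - lr * m_{t+1}
    """
    w = np.array(w0, dtype=float)
    m = np.zeros_like(w)
    for _ in range(n_steps):
        m = beta * m + grad_fn(w)
        w = w - lr * m
    return w

# Illustrative usage: minimise f(w) = 0.5 * ||w||^2, whose gradient is w.
print(sgd_momentum(lambda w: w, w0=[5.0, -3.0]))
```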
The gradient noise (GN) in the stochastic gradient descent (SGD) algorit...
Approximate Bayesian Computation (ABC) is a popular method for approxima...
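As a reminder of the basic mechanism behind ABC, here is a minimal rejection-ABC sketch; the Gaussian toy model, the mean summary statistic, the tolerance eps, and the prior are illustrative placeholders rather than the setup studied in the paper.

```python
import numpy as np

def rejection_abc(observed_stat, simulate, summarise, prior_sample,
                  n_draws=10_000, eps=0.1, rng=None):
    """Basic rejection ABC: draw parameters from the prior, simulate data,
    and keep parameters whose summary statistic lands within eps of the
    observed summary."""
    rng = np.random.default_rng(rng)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        x = simulate(theta, rng)
        if abs(summarise(x) - observed_stat) < eps:
            accepted.append(theta)
    return np.array(accepted)

# Illustrative usage: infer the mean of a Gaussian with known variance.
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=100)
posterior = rejection_abc(
    observed_stat=data.mean(),
    simulate=lambda th, r: r.normal(loc=th, scale=1.0, size=100),
    summarise=np.mean,
    prior_sample=lambda r: r.normal(loc=0.0, scale=5.0),
)
print(posterior.mean(), len(posterior))
```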
Research on style transfer and domain translation has clearly demonstrat...
Stochastic gradient descent (SGD) has been widely used in machine learni...
Minimum expected distance estimation (MEDE) algorithms have been widely ...
We present an entirely new geometric and probabilistic approach to synch...
We introduce a dynamic generative model, Bayesian allocation model (BAM)...
This paper focuses on single-channel semi-supervised speech enhancement....
The Wasserstein distance and its variations, e.g., the sliced-Wasserstei...
Recent studies on diffusion-based sampling methods have shown that Lange...
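The entry above refers to Langevin-type samplers; as a point of reference, here is a minimal sketch of the unadjusted Langevin algorithm (the Euler-Maruyama discretisation of the Langevin diffusion), with an assumed Gaussian target and step size that are not taken from the paper.

```python
import numpy as np

def ula(grad_log_density, x0, step=0.01, n_steps=5000, rng=None):
    """Unadjusted Langevin algorithm targeting a density with the given
    log-density gradient:

        x_{k+1} = x_k + step * grad_log_density(x_k)
                      + sqrt(2 * step) * standard Gaussian noise
    """
    rng = np.random.default_rng(rng)
    x = np.array(x0, dtype=float)
    samples = np.empty((n_steps, x.size))
    for k in range(n_steps):
        noise = rng.normal(size=x.shape)
        x = x + step * grad_log_density(x) + np.sqrt(2 * step) * noise
        samples[k] = x
    return samples

# Illustrative usage: sample a standard Gaussian, where grad log p(x) = -x.
chain = ula(lambda x: -x, x0=np.zeros(2), rng=0)
print(chain.mean(axis=0), chain.var(axis=0))
```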
The gradient noise (GN) in the stochastic gradient descent (SGD) algorit...
By building up on the recent theory that established the connection betw...
Recent studies have illustrated that stochastic gradient Markov Chain Mo...
We introduce Tempered Geodesic Markov Chain Monte Carlo (TG-MCMC) algori...
In recent years, there has been an increasing academic and industria...
Along with the recent advances in scalable Markov Chain Monte Carlo meth...
Neural time-series data contain a wide variety of prototypical signal wa...
Recently, Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) methods...
We propose HAMSI (Hessian Approximated Multiple Subsets Iteration), whic...