Umut Şimşekli

research

∙ 07/24/2023

Nonparametric Linear Feature Learning in Regression Through Regularisation

Representation learning plays a crucial role in automated feature select...

0 Bertille Follain, et al. ∙

research

∙ 07/04/2023

Generalization Guarantees via Algorithm-dependent Rademacher Complexity

Algorithm- and data-dependent generalization bounds are required to expl...

0 Sarah Sachs, et al. ∙

research

∙ 06/13/2023

Implicit Compressibility of Overparametrized Neural Networks Trained with Heavy-Tailed SGD

Neural network compression has been an increasingly important subject, d...

0 Yijun Wan, et al. ∙

research

∙ 06/07/2023

Learning via Wasserstein-Based High Probability Generalisation Bounds

Minimising upper bounds on the population risk or the generalisation gap...

16 Paul Viallard, et al. ∙

research

∙ 05/20/2023

Uniform-in-Time Wasserstein Stability Bounds for (Noisy) Stochastic Gradient Descent

Algorithmic stability is an important notion that has proven powerful fo...

0 Lingjiong Zhu, et al. ∙

research

∙ 03/30/2023

Efficient Sampling of Stochastic Differential Equations with Positive Semi-Definite Models

This paper deals with the problem of efficient sampling from a stochasti...

0 Anant Raj, et al. ∙

research

∙ 02/10/2023

Cyclic and Randomized Stepsizes Invoke Heavier Tails in SGD

Cyclic and randomized stepsizes are widely used in the deep learning pra...

0 Mert Gurbuzbalaban, et al. ∙

research

∙ 02/06/2023

Generalization Bounds with Data-dependent Fractal Dimensions

Providing generalization guarantees for modern neural networks has been ...

0 Benjamin Dupuis, et al. ∙

research

∙ 01/27/2023

Algorithmic Stability of Heavy-Tailed SGD with General Loss Functions

Heavy-tail phenomena in stochastic gradient descent (SGD) have been repo...

0 Anant Raj, et al. ∙

research

∙ 09/19/2022

Generalization Bounds for Stochastic Gradient Descent via Localized ε-Covers

In this paper, we propose a new covering technique localized for the tra...

0 Sejun Park, et al. ∙

research

∙ 06/02/2022

Algorithmic Stability of Heavy-Tailed Stochastic Gradient Descent on Least Squares

Recent studies have shown that heavy tails can emerge in stochastic opti...

0 Anant Raj, et al. ∙

research

∙ 05/23/2022

Chaotic Regularization and Heavy-Tailed Limits for Deterministic Gradient Descent

Recent studies have shown that gradient descent (GD) can achieve improve...

7 Soon Hoe Lim, et al. ∙

research

∙ 05/13/2022

Heavy-Tail Phenomenon in Decentralized SGD

Recent theoretical studies have shown that heavy-tails can emerge in sto...

0 Mert Gurbuzbalaban, et al. ∙

research

∙ 03/04/2022

Rate-Distortion Theoretic Generalization Bounds for Stochastic Learning Algorithms

Understanding generalization in modern machine learning settings has bee...

0 Milad Sefidgaran, et al. ∙

research

∙ 11/25/2021

Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks

Disobeying the classical wisdom of statistical learning theory, modern d...

3 Tolga Birdal, et al. ∙

research

∙ 08/02/2021

Generalization Properties of Stochastic Optimizers via Trajectory Analysis

Despite the ubiquitous use of stochastic optimization algorithms in mach...

0 Liam Hodgkinson, et al. ∙

research

∙ 06/29/2021

Fast Approximation of the Sliced-Wasserstein Distance Using Concentration of Random Projections

The Sliced-Wasserstein distance (SW) is being increasingly used in machi...

0 Kimia Nadjahi, et al. ∙

research

∙ 06/09/2021

Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms

Understanding generalization in deep learning has been one of the major ...

0 Alexander Camuto, et al. ∙

research

∙ 06/07/2021

Heavy Tails in SGD and Compressibility of Overparametrized Neural Networks

Neural network compression techniques have become increasingly popular a...

0 Melih Barsbey, et al. ∙

research

∙ 05/18/2021

Relative Positional Encoding for Transformers with Linear Complexity

Recent advances in Transformer models allow for unprecedented sequence l...

0 Antoine Liutkus, et al. ∙

research

∙ 02/20/2021

Convergence Rates of Stochastic Gradient Descent under Infinite Noise Variance

Recent studies have provided both empirical and theoretical evidence ill...

0 Hongjian Wang, et al. ∙

research

∙ 02/13/2021

Asymmetric Heavy Tails and Implicit Bias in Gaussian Noise Injections

Gaussian noise injections (GNIs) are a family of simple and widely-used ...

5 Alexander Camuto, et al. ∙

research

∙ 02/10/2021

Self-Supervised VQ-VAE For One-Shot Music Style Transfer

Neural style transfer, allowing to apply the artistic style of one image...

0 Ondřej Cífka, et al. ∙

research

∙ 07/14/2020

Explicit Regularisation in Gaussian Noise Injections

We study the regularisation induced in neural networks by Gaussian noise...

0 Alexander Camuto, et al. ∙

research

∙ 07/13/2020

Quantitative Propagation of Chaos for SGD in Wide Neural Networks

In this paper, we investigate the limiting behavior of a continuous-time...

0 Valentin De Bortoli, et al. ∙

research

∙ 06/16/2020

Hausdorff Dimension, Stochastic Differential Equations, and Generalization in Neural Networks

Despite its success in a wide range of applications, characterizing the ...

1 Umut Şimşekli, et al. ∙

research

∙ 06/08/2020

The Heavy-Tail Phenomenon in SGD

In recent years, various notions of capacity and complexity have been pr...

0 Mert Gurbuzbalaban, et al. ∙

research

∙ 04/01/2020

Synchronizing Probability Measures on Rotations via Optimal Transport

We introduce a new paradigm, measure synchronization, for synchronizing ...

3 Tolga Birdal, et al. ∙

research

∙ 03/12/2020

Statistical and Topological Properties of Sliced Probability Divergences

The idea of slicing divergences has been proven to be successful when co...

0 Kimia Nadjahi, et al. ∙

research

∙ 02/28/2020

Generalized Sliced Distances for Probability Distributions

Probability metrics have become an indispensable part of modern statisti...

0 Soheil Kolouri, et al. ∙

research

∙ 02/13/2020

Fractional Underdamped Langevin Dynamics: Retargeting SGD with Momentum under Heavy-Tailed Gradient Noise

Stochastic gradient descent with momentum (SGDm) is one of the most popu...

5 Umut Şimşekli, et al. ∙

research

∙ 11/29/2019

On the Heavy-Tailed Theory of Stochastic Gradient Descent for Deep Neural Networks

The gradient noise (GN) in the stochastic gradient descent (SGD) algorit...

0 Umut Şimşekli, et al. ∙

research

∙ 10/28/2019

Approximate Bayesian Computation with the Sliced-Wasserstein Distance

Approximate Bayesian Computation (ABC) is a popular method for approxima...

0 Kimia Nadjahi, et al. ∙

research

∙ 07/04/2019

Supervised Symbolic Music Style Translation Using Synthetic Data

Research on style transfer and domain translation has clearly demonstrat...

0 Ondřej Cífka, et al. ∙

research

∙ 06/21/2019

First Exit Time Analysis of Stochastic Gradient Descent Under Heavy-Tailed Gradient Noise

Stochastic gradient descent (SGD) has been widely used in machine learni...

12 Thanh Huy Nguyen, et al. ∙

research

∙ 06/11/2019

Asymptotic Guarantees for Learning Generative Models with the Sliced-Wasserstein Distance

Minimum expected distance estimation (MEDE) algorithms have been widely ...

0 Kimia Nadjahi, et al. ∙

research

∙ 04/11/2019

Probabilistic Permutation Synchronization using the Riemannian Structure of the Birkhoff Polytope

We present an entirely new geometric and probabilistic approach to synch...

16 Tolga Birdal, et al. ∙

research

∙ 03/11/2019

Bayesian Allocation Model: Inference by Sequential Monte Carlo for Nonnegative Tensor Factorizations and Topic Models using Polya Urns

We introduce a dynamic generative model, Bayesian allocation model (BAM)...

0 Ali Taylan Cemgil, et al. ∙

research

∙ 02/08/2019

Speech enhancement with variational autoencoders and alpha-stable distributions

This paper focuses on single-channel semi-supervised speech enhancement....

0 Simon Leglaive, et al. ∙

research

∙ 02/01/2019

Generalized Sliced Wasserstein Distances

The Wasserstein distance and its variations, e.g., the sliced-Wasserstei...

0 Soheil Kolouri, et al. ∙

research

∙ 01/22/2019

Non-Asymptotic Analysis of Fractional Langevin Monte Carlo for Non-Convex Optimization

Recent studies on diffusion-based sampling methods have shown that Lange...

0 Thanh Huy Nguyen, et al. ∙

research

∙ 01/18/2019

A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks

The gradient noise (GN) in the stochastic gradient descent (SGD) algorit...

0 Umut Şimşekli, et al. ∙

research

∙ 06/21/2018

Sliced-Wasserstein Flows: Nonparametric Generative Modeling via Optimal Transport and Diffusions

By building up on the recent theory that established the connection betw...

0 Umut Şimşekli, et al. ∙

research

∙ 06/07/2018

Asynchronous Stochastic Quasi-Newton MCMC for Non-Convex Optimization

Recent studies have illustrated that stochastic gradient Markov Chain Mo...

0 Umut Şimşekli, et al. ∙

research

∙ 05/31/2018

Bayesian Pose Graph Optimization via Bingham Distributions and Tempered Geodesic MCMC

We introduce Tempered Geodesic Markov Chain Monte Carlo (TG-MCMC) algori...

2 Tolga Birdal, et al. ∙

research

∙ 02/26/2018

A Generative Model for Non-Intrusive Load Monitoring in Commercial Buildings

In the recent years, there has been an increasing academic and industria...

0 Simon Henriet, et al. ∙

research

∙ 06/12/2017

Fractional Langevin Monte Carlo: Exploring Lévy Driven Stochastic Differential Equations for Markov Chain Monte Carlo

Along with the recent advances in scalable Markov Chain Monte Carlo meth...

0 Umut Şimşekli, et al. ∙

research

∙ 05/22/2017

Learning the Morphology of Brain Signals Using Alpha-Stable Convolutional Sparse Coding

Neural time-series data contain a wide variety of prototypical signal wa...

0 Mainak Jas, et al. ∙

research

∙ 02/10/2016

Stochastic Quasi-Newton Langevin Monte Carlo

Recently, Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) methods...

0 Umut Şimşekli, et al. ∙

research

∙ 09/05/2015

HAMSI: A Parallel Incremental Optimization Algorithm Using Quadratic Approximations for Solving Partially Separable Problems

We propose HAMSI (Hessian Approximated Multiple Subsets Iteration), whic...

0 Kamer Kaya, et al. ∙
