Levent Sagun

research

∙ 07/11/2023

Weisfeiler and Lehman Go Measurement Modeling: Probing the Validity of the WL Test

The expressive power of graph neural networks is usually measured by com...

0 Arjun Subramonian, et al. ∙

research

∙ 12/13/2022

Simplicity Bias Leads to Amplified Performance Disparities

The simple idea that not all things are equally difficult has surprising...

0 Samuel J. Bell, et al. ∙

research

∙ 07/20/2022

Measuring and signing fairness as performance under multiple stakeholder distributions

As learning machines increase their influence on decisions concerning hu...

0 David Lopez-Paz, et al. ∙

research

∙ 03/28/2022

Understanding out-of-distribution accuracies through quantifying difficulty of test samples

Existing works show that although modern neural networks achieve remarka...

0 Berfin Simsek, et al. ∙

research

∙ 02/16/2022

Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision

Discriminative self-supervised learning allows training models on any ra...

70 Priya Goyal, et al. ∙

research

∙ 02/15/2022

Fairness Indicators for Systematic Assessments of Visual Feature Extractors

Does everyone equally benefit from computer vision systems? Answers to t...

0 Priya Goyal, et al. ∙

research

∙ 06/10/2021

Transformed CNNs: recasting pre-trained convolutional layers with self-attention

Vision Transformers (ViT) have recently emerged as a powerful alternativ...

0 Stéphane d'Ascoli, et al. ∙

research

∙ 03/19/2021

ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases

Convolutional architectures have proven extremely successful for vision ...

5 Stéphane d'Ascoli, et al. ∙

research

∙ 03/09/2021

More data or more parameters? Investigating the effect of data structure on generalization

One of the central features of deep learning is the generalization abili...

0 Stéphane d'Ascoli, et al. ∙

research

∙ 06/05/2020

Triple descent and the two kinds of overfitting: Where & why do they appear?

A recent line of research has highlighted the existence of a double desc...

19 Stéphane d'Ascoli, et al. ∙

research

∙ 11/29/2019

On the Heavy-Tailed Theory of Stochastic Gradient Descent for Deep Neural Networks

The gradient noise (GN) in the stochastic gradient descent (SGD) algorit...

0 Umut Şimşekli, et al. ∙

research

∙ 06/16/2019

Finding the Needle in the Haystack with Convolutions: on the benefits of architectural bias

Despite the phenomenal success of deep neural networks in a broad range ...

0 Stéphane d'Ascoli, et al. ∙

research

∙ 01/18/2019

A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks

The gradient noise (GN) in the stochastic gradient descent (SGD) algorit...

0 Umut Şimşekli, et al. ∙

research

∙ 01/06/2019

Scaling description of generalization with number of parameters in deep learning

We provide a description for the evolution of the generalization perform...

0 Mario Geiger, et al. ∙

research

∙ 10/22/2018

A jamming transition from under- to over-parametrization affects loss landscape and generalization

We argue that in fully-connected networks a phase transition delimits th...

0 Stefano Spigler, et al. ∙

research

∙ 09/25/2018

The jamming transition as a paradigm to understand the loss landscape of deep neural networks

Deep learning has been immensely successful at a variety of tasks, rangi...

0 Mario Geiger, et al. ∙

research

∙ 04/18/2017

SearchQA: A New Q&A Dataset Augmented with Context from a Search Engine

We publicly release a new large-scale dataset, called SearchQA, for mach...

0 Matthew Dunn, et al. ∙

research

∙ 03/23/2017

Perspective: Energy Landscapes for Machine Learning

Machine learning techniques are being increasingly used as flexible non-...

0 Andrew J. Ballard, et al. ∙

research

∙ 11/22/2016

Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond

We look at the eigenvalues of the Hessian of a loss function before and ...

0 Levent Sagun, et al. ∙

research

∙ 11/06/2016

Entropy-SGD: Biasing Gradient Descent Into Wide Valleys

This paper proposes a new optimization algorithm called Entropy-SGD for ...

0 Pratik Chaudhari, et al. ∙

research

∙ 11/19/2015

Universal halting times in optimization and machine learning

The authors present empirical distributions for the halting time (measur...

0 Levent Sagun, et al. ∙

research

∙ 12/20/2014

Explorations on high dimensional landscapes

Finding minima of a real valued non-convex function over a high dimensio...

0 Levent Sagun, et al. ∙

