
ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases
Convolutional architectures have proven extremely successful for vision ...

More data or more parameters? Investigating the effect of data structure on generalization
One of the central features of deep learning is the generalization abili...

An analytic theory of shallow networks dynamics for hinge loss classification
Neural networks have been shown to perform incredibly well in classifica...

Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval
Despite the widespread use of gradient-based algorithms for optimizing h...

Triple descent and the two kinds of overfitting: Where &amp; why do they appear?
A recent line of research has highlighted the existence of a double desc...

Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime
Deep neural networks can achieve remarkable generalization performances ...

Landscape Complexity for the Empirical Risk of Generalized Linear Models
We present a method to obtain the average and the typical value of the n...

Who is Afraid of Big Bad Minima? Analysis of Gradient-Flow in a Spiked Matrix-Tensor Model
Gradient-based algorithms are effective for many machine learning tasks,...

Finding the Needle in the Haystack with Convolutions: on the benefits of architectural bias
Despite the phenomenal success of deep neural networks in a broad range ...

Attractive vs. truncated repulsive supercooled liquids: dynamics is encoded in the pair correlation function
We compare glassy dynamics in two liquids that differ in the form of the...

How to iron out rough landscapes and get optimal performances: Replicated Gradient Descent and its application to tensor PCA
In many high-dimensional estimation problems the main task consists in m...

Large deviations for the largest eigenvalues and eigenvectors of spiked random matrices
We consider matrices formed by a random N × N matrix drawn from the Gauss...

Scaling description of generalization with number of parameters in deep learning
We provide a description for the evolution of the generalization perform...

Marvels and Pitfalls of the Langevin Algorithm in Noisy High-dimensional Inference
Gradient-descent-based algorithms and their stochastic versions have wid...

A jamming transition from under- to over-parametrization affects loss landscape and generalization
We argue that in fully-connected networks a phase transition delimits th...

The jamming transition as a paradigm to understand the loss landscape of deep neural networks
Deep learning has been immensely successful at a variety of tasks, rangi...

Complex energy landscapes in spiked-tensor and simple glassy models: ruggedness, arrangements of local minima and phase transitions
We study rough high-dimensional landscapes in which an increasingly stro...
Giulio Biroli
Professor of Theoretical Physics at Ecole Normale Supérieure (Paris), working on statistical physics, quantum physics, complex systems, and AI.