We study the type of solutions to which stochastic gradient descent conv...
Recent research shows that when Gradient Descent (GD) is applied to neur...
Recent work has highlighted the role of initialization scale in determin...
Background: Recent developments have made it possible to accelerate neur...
With an eye toward understanding complexity control in deep learning, we...
Stochastic Gradient Descent (SGD) is a central tool in machine learning....
The implicit bias of gradient descent is not fully understood even in si...