Lipschitz continuity of the gradient mapping of a continuously different...
A typical assumption for the analysis of first order optimization method...
Matrix Factorization is a popular non-convex objective, for which altern...
Backtracking line-search is an old yet powerful strategy for finding bet...
We identify a class of over-parameterized deep neural networks with stan...
Adaptive gradient methods have become recently very popular, in particul...