Angxiu Ni



  • Linear Range in Gradient Descent

    This paper defines linear range as the range of parameter perturbations that lead to approximately linear perturbations in states. We compute linear range by comparing the actual perturbations in states with the tangent solution of a network. Linear range is a new criterion for gradients to be meaningful, and thus has many possible applications. In particular, we propose that the optimal learning rate at the beginning of training can be found automatically, by selecting a stepsize such that all minibatches are within linear range. We demonstrate our algorithm on a network with a canonical architecture and on a ResNet.

    05/11/2019 ∙ by Angxiu Ni, et al.

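The comparison described in the abstract (actual perturbation in states versus the tangent solution) can be illustrated with a small sketch. This is not the paper's algorithm: the toy two-layer tanh network, the choice to compare only the output state, the tolerance `tol=0.1`, the doubling search, and every function name below are assumptions made for illustration.

```python
import math

def forward(w1, w2, x):
    """Toy two-layer tanh network (hypothetical); returns hidden states and output."""
    h = [math.tanh(sum(w * xj for w, xj in zip(row, x))) for row in w1]
    y = sum(w * hi for w, hi in zip(w2, h))
    return h, y

def tangent(w1, w2, dw1, dw2, x):
    """Tangent solution: derivative of the output along the direction (dw1, dw2)."""
    h, _ = forward(w1, w2, x)
    dy = 0.0
    for i, drow in enumerate(dw1):
        da = sum(d * xj for d, xj in zip(drow, x))   # perturbation of pre-activation
        dh = (1.0 - h[i] ** 2) * da                  # tanh'(a) = 1 - tanh(a)^2
        dy += w2[i] * dh + dw2[i] * h[i]
    return dy

def relative_error(w1, w2, dw1, dw2, x, eta):
    """|actual perturbation - linear prediction| / |linear prediction| at stepsize eta."""
    p1 = [[w + eta * d for w, d in zip(r, dr)] for r, dr in zip(w1, dw1)]
    p2 = [w + eta * d for w, d in zip(w2, dw2)]
    actual = forward(p1, p2, x)[1] - forward(w1, w2, x)[1]
    linear = eta * tangent(w1, w2, dw1, dw2, x)
    return abs(actual - linear) / abs(linear)

def linear_range(w1, w2, dw1, dw2, x, tol=0.1, eta0=1e-4, growth=2.0, max_steps=40):
    """Largest eta on a geometric grid, reached from eta0, whose error stays within tol.

    tol, eta0, growth and max_steps are illustrative assumptions, not values from the paper.
    """
    if relative_error(w1, w2, dw1, dw2, x, eta0) > tol:
        return 0.0
    eta = eta0
    for _ in range(max_steps):
        if relative_error(w1, w2, dw1, dw2, x, eta * growth) > tol:
            break
        eta *= growth
    return eta

# Made-up parameters, perturbation direction, and input.
W1 = [[0.5, -0.3], [0.8, 0.2]]
W2 = [1.0, -0.7]
DW1 = [[0.1, 0.2], [-0.3, 0.1]]
DW2 = [0.05, -0.1]
X = [1.0, 2.0]

eta_star = linear_range(W1, W2, DW1, DW2, X)
print("estimated linear range along this direction:", eta_star)
```

In a training setting, the perturbation direction would typically be the negative gradient, and one reading of the abstract is that the stepsize is taken small enough for every minibatch to pass such a check. That last step is not shown here.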