Stochastic Gradient Descent with Exponential Convergence Rates of Expected Classification Errors
We consider stochastic gradient descent for binary classification problems in a reproducing kernel Hilbert space. In traditional analysis, the expected classification error is known to converge more slowly than the expected risk, even under a low-noise condition on the conditional label probabilities; consequently, the resulting rate is sublinear. It is therefore important to ask whether much faster convergence of the expected classification error can be achieved. Recent work established an exponential convergence rate for stochastic gradient descent under a strong low-noise condition, but the theoretical analysis was limited to the square loss function, which is somewhat inadequate for binary classification tasks. In this paper, we show an exponential convergence rate of the expected classification error in the final phase of learning for a wide class of differentiable convex loss functions under similar assumptions.
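The following is a minimal sketch of the setting the abstract describes: stochastic gradient descent run directly in an RKHS for binary classification, using the logistic loss as one example of a differentiable convex loss, with the resulting classifier evaluated through its sign. The Gaussian kernel, the constant step size, and the synthetic low-noise data are illustrative assumptions and are not parameters or algorithms taken from the paper itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_kernel(X1, x2, sigma=1.0):
    """Kernel values k(x_i, x2) for every row x_i of X1 (Gaussian/RBF kernel)."""
    return np.exp(-np.sum((X1 - x2) ** 2, axis=1) / (2.0 * sigma ** 2))

def logistic_grad(margin):
    """Derivative of the logistic loss log(1 + exp(-m)) with respect to the margin m."""
    return -1.0 / (1.0 + np.exp(margin))

# Synthetic, nearly separable two-class data with labels in {-1, +1}
# (a rough stand-in for the strong low-noise condition; purely illustrative).
n, d = 500, 2
X = rng.normal(size=(n, d))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=n))

# The iterate f_t is stored as a kernel expansion f_t(.) = sum_i alpha_i k(x_i, .).
support = np.zeros((0, d))
alpha = np.zeros(0)
eta = 0.5  # constant step size (an assumption; the paper analyzes specific schedules)

for t in range(n):
    x_t, y_t = X[t], y[t]
    f_xt = alpha @ gaussian_kernel(support, x_t) if len(alpha) else 0.0
    # Functional gradient step in the RKHS:
    #   f <- f - eta * l'(y_t f(x_t)) * y_t * k(x_t, .)
    g = logistic_grad(y_t * f_xt)
    support = np.vstack([support, x_t])
    alpha = np.append(alpha, -eta * g * y_t)

# The expected classification error is approximated on held-out data via sign(f).
X_te = rng.normal(size=(1000, d))
y_te = np.sign(X_te[:, 0])
pred = np.array([np.sign(alpha @ gaussian_kernel(support, x)) for x in X_te])
print("test classification error:", np.mean(pred != y_te))
```

Because each SGD step adds one kernel term, the iterate grows with the number of samples processed; this naive expansion is kept here only to make the functional-gradient update explicit.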