BISTA: a Bregmanian proximal gradient method without the global Lipschitz continuity assumption
Minimizing a separable convex objective function arises in numerous theoretical and real-world applications. One of the popular methods for solving this problem is the proximal gradient method (proximal forward-backward algorithm). A very common assumption when using this method is that the gradient of the smooth term in the objective function is globally Lipschitz continuous. However, this assumption is not always satisfied in practice, which limits the method's applicability. In this paper we discuss, in a wide class of finite- and infinite-dimensional spaces, a new variant (BISTA) of the proximal gradient method that does not impose the above-mentioned global Lipschitz continuity assumption. A key contribution of the method is the dependence of the iterative steps on a certain decomposition of the objective set into subsets. Moreover, we use a Bregman divergence in the proximal forward-backward operation. Under certain practical conditions, a non-asymptotic rate of convergence (that is, in the function values) is established, as well as the weak convergence of the whole sequence to a minimizer. We also obtain a few auxiliary results of independent interest, among them a general and useful stability principle which, roughly speaking, says that for a uniformly continuous function defined on an arbitrary metric space, if the objective set over which the optimal (extreme) values are computed is changed slightly, then these values vary only slightly. This principle suggests a general scheme for tackling a wide class of non-convex and non-smooth optimization problems.
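To make the underlying iteration concrete, the following is a minimal, hedged sketch of a generic Bregman proximal gradient (forward-backward) step for F(x) = f(x) + g(x), with g smooth and f nonsmooth; it uses the Euclidean kernel h(x) = 0.5*||x||^2, under which the Bregman step reduces to the classical ISTA update, and it does not reproduce BISTA's actual step rule, which additionally depends on the decomposition of the objective set described in the paper. The function and variable names are illustrative only.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau*||.||_1 (closed form for the l1 term f)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def bregman_prox_grad_step(x, grad_g, step, lam):
    """One forward-backward step:
    x_{k+1} = argmin_x { f(x) + <grad_g(x_k), x - x_k> + (1/step) * D_h(x, x_k) },
    written here for h(x) = 0.5*||x||^2 and f = lam*||.||_1."""
    return soft_threshold(x - step * grad_g(x), step * lam)

# Toy usage: least squares + l1 penalty (LASSO-type problem).
rng = np.random.default_rng(0)
A, b, lam = rng.normal(size=(20, 5)), rng.normal(size=20), 0.1
grad_g = lambda x: A.T @ (A @ x - b)  # gradient of g(x) = 0.5*||Ax - b||^2
x = np.zeros(5)
step = 1.0 / np.linalg.norm(A, 2) ** 2  # a simple (Lipschitz-based) step size
for _ in range(200):
    x = bregman_prox_grad_step(x, grad_g, step, lam)
```

With a non-Euclidean kernel h (for instance, an entropy-type function), the same template replaces the explicit gradient step by the minimization involving the Bregman distance D_h, which is precisely where the method avoids relying on a globally Lipschitz continuous gradient.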