Maximizing the Margin in the Input Space
We propose a novel criterion for support vector machine learning: maximizing the margin in the input space, not in the feature (Hilbert) space. This criterion is a discriminative version of the principal curve proposed by Hastie et al. The criterion is particularly appropriate when the input space is already a well-designed feature space of relatively small dimensionality. The definition of the margin is generalized in order to represent prior knowledge. The derived algorithm consists of two alternating steps to estimate the dual parameters. First, the parameters are initialized by the original SVM. Then one set of parameters is updated by a Newton-like procedure, and the other set is updated by solving a quadratic programming problem. Under mild conditions, the algorithm converges to a local optimum in a few steps while preserving the sparsity of the support vectors. Although the cost of computing intermediate variables increases, the complexity of the quadratic programming problem solved at each step does not change. It is also shown that the original SVM can be seen as a special case. We further derive a simplified algorithm that allows existing code for the original SVM to be reused.
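To make the alternating structure concrete, here is a minimal sketch of the control flow described above, not the paper's actual equations: the routines `fit_standard_svm`, `newton_like_update`, and `solve_qp_subproblem`, their signatures, the second parameter set `beta`, and the convergence test are all hypothetical placeholders standing in for the paper's procedures.

```python
import numpy as np

def input_space_margin_svm(K, y, C, fit_standard_svm,
                           newton_like_update, solve_qp_subproblem,
                           n_iters=10, tol=1e-6):
    """Hypothetical skeleton of the alternating scheme sketched in the
    abstract; the three callables are assumed interfaces, not the
    paper's actual update rules."""
    # Step 0: initialize the dual parameters with the original SVM.
    alpha = fit_standard_svm(K, y, C)
    beta = np.zeros_like(alpha)  # assumed second parameter set

    for _ in range(n_iters):
        alpha_old = alpha.copy()
        # Step 1: Newton-like update of one set of parameters.
        beta = newton_like_update(alpha, beta, K, y)
        # Step 2: quadratic programming problem over the other set;
        # per the abstract, its per-step complexity matches the
        # original SVM's QP even though extra intermediate variables
        # must be computed.
        alpha = solve_qp_subproblem(beta, K, y, C)
        # Stop once the dual parameters stabilize (local optimum).
        if np.linalg.norm(alpha - alpha_old) < tol:
            break
    return alpha, beta
```

In this sketch, supplying an identity `newton_like_update` and the standard SVM QP for `solve_qp_subproblem` would collapse the loop to the original SVM, mirroring the claim that the original SVM is a special case.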