A New Angle on L2 Regularization
Imagine two high-dimensional clusters and a hyperplane separating them. Consider in particular the angle between: the direction joining the two clusters' centroids and the normal to the hyperplane. In linear classification, this angle depends on the level of L2 regularization used. Can you explain why?
READ FULL TEXT