Robust convex clustering: How does fusion penalty enhance robustness?
Convex clustering has gained popularity recently due to its desirable performance in empirical studies. It involves solving a convex optimization problem with the cost function being a squared error loss plus a fusion penalty that encourages the estimated centroids for observations in the same cluster to be identical. However, when data are contaminated, convex clustering with a squared error loss will fail to identify correct cluster memberships even when there is only one arbitrary outlier. To address this challenge, we propose a robust convex clustering method. Theoretically, we show that the new estimator is resistant to arbitrary outliers: it does not break down until more than half of the observations are arbitrary outliers. In particular, we observe a new phenomenon that the fusion penalty can help enhance robustness. Numerical studies are performed to demonstrate the competitive performance of the proposed method.
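To make the structure of the cost function concrete, the following is a minimal sketch of a convex clustering objective: a loss term measuring how far each estimated centroid is from its observation, plus a fusion penalty on pairwise differences between centroids. The abstract does not state which robust loss the proposed method uses, so the Huber loss here is an assumption chosen for illustration. The function names, the unweighted pairwise penalty, and the parameters `lam` and `tau` are likewise hypothetical.

```python
import numpy as np


def fusion_penalty(U, lam):
    """lam times the sum of Euclidean distances between all pairs of centroid rows."""
    n = U.shape[0]
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += np.linalg.norm(U[i] - U[j])
    return lam * total


def squared_loss(X, U):
    """Squared error loss: grows quadratically, so one outlier can dominate."""
    return 0.5 * np.sum((X - U) ** 2)


def huber_loss(X, U, tau):
    """Elementwise Huber loss (assumed here as the robust loss, not taken from the source).

    Quadratic for residuals up to tau, linear beyond that.
    """
    r = np.abs(X - U)
    quadratic = 0.5 * r ** 2
    linear = tau * r - 0.5 * tau ** 2
    return np.sum(np.where(r <= tau, quadratic, linear))


def objective(X, U, lam, tau=None):
    """Loss plus fusion penalty; uses squared loss when tau is None, Huber otherwise."""
    loss = squared_loss(X, U) if tau is None else huber_loss(X, U, tau)
    return loss + fusion_penalty(U, lam)


# Three one-dimensional observations, the last one an outlier.
X = np.array([[0.0], [1.0], [100.0]])
# Candidate centroids fused to a single value, so the penalty is zero.
U = np.zeros((3, 1))

print(objective(X, U, lam=1.0))           # squared error loss
print(objective(X, U, lam=1.0, tau=1.0))  # Huber loss with threshold 1
```

With all centroids fused, the fusion penalty vanishes and only the loss differs: the outlier contributes quadratically under the squared error loss but only linearly under the Huber loss, which is one way to see why a single contaminated observation can pull the squared-loss solution away from the correct clustering.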