Sample Complexity of Adversarially Robust Linear Classification on Separated Data
We consider the sample complexity of learning with adversarial robustness. Most prior theoretical results for this problem have considered a setting where different classes in the data are close together or overlapping. Motivated by some real applications, we consider, in contrast, the well-separated case where there exists a classifier with perfect accuracy and robustness, and show that the sample complexity narrates an entirely different story. Specifically, for linear classifiers, we show a large class of well-separated distributions where the expected robust loss of any algorithm is at least Ω(d/n), whereas the max margin algorithm has expected standard loss O(1/n). This shows a gap in the standard and robust losses that cannot be obtained via prior techniques. Additionally, we present an algorithm that, given an instance where the robustness radius is much smaller than the gap between the classes, gives a solution with expected robust loss is O(1/n). This shows that for very well-separated data, convergence rates of O(1/n) are achievable, which is not the case otherwise. Our results apply to robustness measured in any ℓ_p norm with p > 1 (including p = ∞).
READ FULL TEXT