ℓ_p-Regression in the Arbitrary Partition Model of Communication

07/11/2023
by   Yi Li, et al.

We consider the randomized communication complexity of the distributed ℓ_p-regression problem in the coordinator model, for p ∈ (0,2]. In this problem, there is a coordinator and s servers. The i-th server receives A^i ∈ {-M, -M+1, …, M}^{n×d} and b^i ∈ {-M, -M+1, …, M}^n, and the coordinator would like to find a (1+ϵ)-approximate solution to min_{x∈ℝ^d} ‖(∑_i A^i)x - (∑_i b^i)‖_p. Here M ≤ poly(nd) for convenience. This model, where the data is additively shared across servers, is commonly referred to as the arbitrary partition model. We obtain significantly improved bounds for this problem. For p = 2, i.e., least squares regression, we give the first optimal bound of Θ̃(sd^2 + sd/ϵ) bits. For p ∈ (1,2), we obtain an Õ(sd^2/ϵ + sd/poly(ϵ)) upper bound. Notably, for d sufficiently large, our leading order term depends only linearly on 1/ϵ rather than quadratically. We also show communication lower bounds of Ω(sd^2 + sd/ϵ^2) for p ∈ (0,1] and Ω(sd^2 + sd/ϵ) for p ∈ (1,2]. Our bounds considerably improve previous bounds due to (Woodruff et al., COLT, 2013) and (Vempala et al., SODA, 2020).
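To make the arbitrary partition model concrete, the following is a minimal sketch for p = 2, assuming NumPy. It is not the protocol from the paper: it uses a generic Gaussian sketch-and-solve baseline, and the helper names (`make_additive_shares`, `sketch_and_solve`), the sketch size `m`, and the shared seed are all hypothetical choices made for illustration. It only shows why additively shared data works well with linear sketches: since S(∑_i A^i) = ∑_i S A^i, each server can sketch its own share and the coordinator just adds the messages.

```python
import numpy as np

def make_additive_shares(X, s, M, rng):
    """Split X into s integer shares that sum to X (illustrative only).

    The last share is whatever remains, so its entries may fall outside
    [-M, M]; the model in the abstract assumes bounded entries per server.
    """
    shares = [rng.integers(-M, M + 1, size=X.shape) for _ in range(s - 1)]
    shares.append(X - sum(shares))
    return shares

def sketch_and_solve(A_shares, b_shares, m, seed):
    """Generic sketch-and-solve for least squares over additive shares."""
    n = A_shares[0].shape[0]
    # Shared randomness: every server derives the same m x n sketch S.
    S = np.random.default_rng(seed).standard_normal((m, n)) / np.sqrt(m)
    # Each server sends S @ A^i (m x d) and S @ b^i (length m).
    # By linearity, the coordinator's sums equal S @ A and S @ b.
    SA = sum(S @ Ai for Ai in A_shares)
    Sb = sum(S @ bi for bi in b_shares)
    x, *_ = np.linalg.lstsq(SA, Sb, rcond=None)
    return x

rng = np.random.default_rng(0)
n, d, s, M = 2000, 5, 4, 10
A = rng.integers(-M, M + 1, size=(n, d))
b = rng.integers(-M, M + 1, size=n)
A_shares = make_additive_shares(A, s, M, rng)
b_shares = make_additive_shares(b, s, M, rng)

x_hat = sketch_and_solve(A_shares, b_shares, m=200, seed=1)
x_opt, *_ = np.linalg.lstsq(A, b, rcond=None)

# Ratio of the sketched solution's cost to the optimal cost (always >= 1).
ratio = np.linalg.norm(A @ x_hat - b) / np.linalg.norm(A @ x_opt - b)
print(f"cost ratio: {ratio:.4f}")
```

In this baseline each server sends m(d+1) real numbers instead of its full n × (d+1) share. How many bits those numbers need, and how m must scale with d and ϵ, is exactly the kind of question the bounds above address; the sketch here makes no claim to match them.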
