Entangled Polynomial Codes for Secure, Private, and Batch Distributed Matrix Multiplication: Breaking the "Cubic" Barrier

01/15/2020, by Qian Yu, et al.

In distributed matrix multiplication, a common scenario is to assign each worker a fraction of the multiplication task, by partitioning the input matrices into smaller submatrices. In particular, by dividing two input matrices into m-by-p and p-by-n subblocks, a single multiplication task can be viewed as computing linear combinations of pmn submatrix products, which can be assigned to pmn workers. Such block-partitioning based designs have been widely studied under the topics of secure, private, and batch computation, where the state-of-the-art schemes all require computing at least a "cubic" (pmn) number of submatrix multiplications. Entangled polynomial codes, first presented for straggler mitigation, provide a powerful method for breaking this cubic barrier. They achieve a subcubic recovery threshold, meaning that the final product can be recovered from any subset of multiplication results with a size order-wise smaller than pmn. In this work, we show that entangled polynomial codes can be further extended to these three important settings, providing a unified framework that order-wise reduces the total computational cost compared with the state of the art by achieving subcubic recovery thresholds.


I Introduction

Large scale distributed computing faces several modern challenges, in particular, to provide resiliency against stragglers, robustness against computing errors, security against Byzantine and eavesdropping adversaries, privacy of sensitive information, and to efficiently handle repetitive computation [1, 2, 3, 4, 5, 6, 7]. Coded computing is an emerging field that resolves these issues by introducing and developing new coding theoretic concepts, starting with a focus on straggler mitigation [8, 9, 10], and later extending to secure and private computation [11, 12, 6, 13, 14, 7].

Coding for straggler mitigation was first studied in [8] for linear computation, where classical linear codes can be directly applied to achieve the same performance. For computation beyond linear functions, new classes of coding designs are needed to achieve optimality. In [10], we studied matrix-by-matrix multiplication and introduced the polynomial coded computing (PCC) framework. The main coding idea is to jointly encode the input variables into univariate polynomials whose coefficients are functions of the inputs. By assigning each worker evaluations of these polynomials as coded variables, each worker essentially evaluates, at its own point, a new polynomial composed of the worker's computation and the encoding functions. As long as the needed final results can be recovered from the coefficients of the composed polynomial, the master can decode the final output once sufficiently many workers complete their computation. PCC thus reduces the design problem of coded computation to finding polynomials that satisfy the above decodability constraint while minimizing the degree of the composed polynomial. PCC has achieved great success in providing exactly optimal coding constructions for large classes of computation tasks, including pairwise products [10], convolution [10, 15], inner products [16, 15], element-wise products [15], and general batch multivariate-polynomial evaluation [6].
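As a concrete illustration of the PCC recipe, the sketch below encodes two length-R vectors with Lagrange polynomials over a prime field, has each "worker" multiply its two evaluations, and decodes all element-wise products from any 2R - 1 results. The element-wise product task is the building block that entangled polynomial codes reduce matrix multiplication to; the field size, parameters, and variable names here are illustrative choices, not taken from the paper.

```python
# Minimal PCC sketch for the element-wise product task over GF(P).
# All parameters and names are illustrative.
import random

P = 65537  # prime; all arithmetic is over the field GF(P)

def lagrange_eval(xs, ys, x):
    """Evaluate at x the unique degree-(len(xs)-1) polynomial
    passing through the points (xs[i], ys[i]), over GF(P)."""
    acc = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num = den = 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        acc = (acc + yi * num * pow(den, -1, P)) % P  # modular inverse
    return acc

R, N = 4, 10                          # vector length, number of workers
a = [random.randrange(P) for _ in range(R)]
b = [random.randrange(P) for _ in range(R)]

xs = list(range(1, R + 1))            # interpolation points x_1..x_R
zs = list(range(R + 1, R + N + 1))    # distinct worker evaluation points

# Encoding: worker i receives f(z_i) and g(z_i), where f, g are the
# degree-(R-1) polynomials with f(x_j) = a_j and g(x_j) = b_j.
# Each worker returns the product, i.e., an evaluation of h = f*g.
results = [(z, lagrange_eval(xs, a, z) * lagrange_eval(xs, b, z) % P)
           for z in zs]

# h has degree 2R-2, so any 2R-1 = 7 worker results suffice to decode.
subset = random.sample(results, 2 * R - 1)
sub_z, sub_y = zip(*subset)
decoded = [lagrange_eval(list(sub_z), list(sub_y), x) for x in xs]
assert decoded == [ai * bi % P for ai, bi in zip(a, b)]
```

Interpolating h from the surviving workers and re-evaluating it at the x_j's yields exactly the products a_j b_j, which is the decodability constraint PCC asks for.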

An important problem in distributed matrix multiplication is the case where the inputs are encoded and multiplied in a block-wise manner. This setup generalizes the problem formulated in [10] to enable a more flexible tradeoff between resources such as storage, computation, and communication, and has been studied in [15, 16, 17, 18, 19, 20, 21]. For straggler mitigation, the state of the art is achieved by two versions of the entangled polynomial code, both first presented in [15], which characterize the optimum recovery threshold within a factor of 2. For brevity, we refer to them collectively as entangled polynomial codes. One significance of entangled polynomial codes is that they map non-straggler-mitigating linear coded computing schemes to bilinear-complexity decompositions, which bridges the areas of fast matrix multiplication and coded computation and enables utilizing techniques developed in the rich literature (e.g., [22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41]). Moreover, this connection reduces block-wise matrix multiplication to computing element-wise products, for which we developed the optimal strategy for straggler mitigation. The coding gain achieved by entangled polynomial codes extends to fault-tolerant computing [42], and it is shown in [6] that security against Byzantine adversaries can be provided in the same way.

The goal of this paper is to extend entangled polynomial codes to three main problems: secure, private, and batch distributed matrix multiplication. In secure distributed matrix multiplication [11, 12, 43, 44, 19, 17, 13, 45, 14, 46, 47, 48, 49, 18, 50], the goal is to compute a single matrix product while preserving the privacy of input matrices against eavesdropping adversaries; in private distributed matrix multiplication [13, 14, 18, 51], the goal is to multiply a single pair from two lists of matrices while keeping the request (indices) private; batch distributed matrix multiplication [48, 20, 21] considers a scenario where more than one pair of matrices are to be multiplied.

There are recent works on each of these problems that considered general block-wise partitioning of input matrices [17, 18, 19, 20, 21]. However, all results presented in prior works are limited by a "cubic" barrier. Explicitly, when the input matrices to be multiplied are partitioned into m-by-p and p-by-n subblocks, all state-of-the-art schemes require the workers to compute at least pmn products of coded submatrices per multiplication task.

We demonstrate how entangled polynomial codes can be extended to break the cubic barrier in all three problems. We show that the coding ideas of entangled polynomial codes and PCC can be applied to provide unified solutions with the needed security and privacy, as well as to efficiently handle batch evaluation. Moreover, we achieve order-wise improvements over the state of the art with explicit coding constructions.

II Preliminaries

For block-partitioned coded matrix multiplication, the goal is to distributedly multiply two matrices A and B with sizes s-by-r and r-by-t, over a sufficiently large field F, with a set of N workers that can each multiply a pair of possibly coded matrices with sizes (s/m)-by-(r/p) and (r/p)-by-(t/n).

Explicitly, in a basic setting, given a pair of input matrices A and B, each worker i is assigned a pair of possibly coded matrices, which are encoded based on some (possibly random) functions of the input matrices respectively. The workers each compute the product of their two coded matrices and return it to the master. The master tries to recover the final product based on results from possibly a subset of workers using some decoding functions. We say a coded computing scheme achieves a recovery threshold of K if the master can correctly decode the final output given the computing results from any subset of K workers.

The state of the art for straggler mitigation is entangled polynomial codes, which achieve a recovery threshold of 2R(p,m,n) - 1. Here R(p,m,n) denotes the bilinear complexity [52] of multiplying two matrices of sizes m-by-p and p-by-n. It is well known that R(p,m,n) is subcubic, i.e., R(p,m,n) = o(pmn) when p, m, and n are large. Hence, it order-wise outperforms other block-partitioning based schemes in related works for straggler mitigation (e.g., [16]).

Note that even for cases where R(p,m,n) is not yet known, one can still obtain explicit coding constructions by swapping in any upper bound construction (e.g., [23, 24, 25, 27, 28, 29, 30, 31, 32, 34, 35, 36, 38, 39, 40]). Subcubic recovery thresholds can still be achieved for any sufficiently large p, m, and n even if one only applies the well-known Strassen construction [23]. Hence, for simplicity, in this work we present all results in terms of R(p,m,n), and explicit subcubic constructions can be obtained in the same way.
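For instance, Strassen's construction certifies R(2,2,2) <= 7: a 2x2-block matrix product can be assembled from 7 submatrix multiplications instead of 8. The following numerical check (an illustrative sketch using numpy, not code from the paper) verifies the construction:

```python
# Strassen's rank-7 bilinear construction for 2x2-block multiplication,
# verified numerically on integer matrices.
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(0, 100, (4, 4))
B = rng.integers(0, 100, (4, 4))
A11, A12, A21, A22 = A[:2, :2], A[:2, 2:], A[2:, :2], A[2:, 2:]
B11, B12, B21, B22 = B[:2, :2], B[:2, 2:], B[2:, :2], B[2:, 2:]

# The 7 bilinear products, each a single 2x2 submatrix multiplication
# of linear combinations of the input blocks.
M1 = (A11 + A22) @ (B11 + B22)
M2 = (A21 + A22) @ B11
M3 = A11 @ (B12 - B22)
M4 = A22 @ (B21 - B11)
M5 = (A11 + A12) @ B22
M6 = (A21 - A11) @ (B11 + B12)
M7 = (A12 - A22) @ (B21 + B22)

# The output blocks are fixed linear combinations of M1..M7.
C = np.block([[M1 + M4 - M5 + M7, M3 + M5],
              [M2 + M4,           M1 - M2 + M3 + M6]])
assert (C == A @ B).all()
```

Applied recursively, this rank-7 decomposition is what yields the subcubic exponent log2(7), and in the coded-computation setting the three families of linear combinations above play exactly the role of the tensor tuples used for pre-encoding.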

We focus on linear codes, defined similarly as in [42, 6], which guarantee linear coding complexity w.r.t. the sizes of the input matrices and are dimension independent. Precisely, in a linear coding design, the input matrix A (or each input matrix in more general settings) is partitioned into m-by-p subblocks of equal sizes (and possibly padded with a list of i.i.d. uniformly random matrices of the same size, referred to as random keys).(Footnote 1) Matrix B (or matrices) is partitioned similarly into p-by-n subblocks. Each worker is then assigned a pair of linear combinations of these two lists of submatrices as coded inputs. Moreover, the master uses decoding functions that compute linear combinations of the received computing results.(Footnote 2)

Footnote 1: To make sure the setting is well defined, we assume F is finite whenever data-security or privacy is taken into account.

Footnote 2: Note that by relaxing certain assumptions made in the paper, such as allowing the decoder to access inputs and random keys, or allowing extra computational cost at the workers or the master, one can further reduce the recovery threshold (e.g., see discussions in [48, 53]).

All results presented in this paper for distributed matrix multiplication directly extend to general codes with possibly non-linear constructions, by swapping any upper bound of R(p,m,n) into the number of workers required by any computing scheme, as we illustrated in [42].

III Main Results

We show that entangled polynomial codes can be adapted to the settings of secure, private, and batch distributed matrix multiplication, achieving order-wise improvements with subcubic recovery thresholds while meeting the systems' requirements. To demonstrate the coding gain, we focus on applying the second version of the entangled polynomial code, the one that achieves a recovery threshold of 2R(p,m,n) - 1 for straggler mitigation.

III-A Secure Distributed Matrix Multiplication

Secure distributed matrix multiplication follows a setup similar to that discussed in Section II, where the goal is to multiply a single pair of matrices, with the additional constraint that either one or both of the input matrices are kept information-theoretically private from the workers, even if up to a certain number T of them collude. In particular, we say an encoding scheme is one-sided T-secure if

I(A; \tilde{A}_{\mathcal{T}}) = 0    (1)

for any subset \mathcal{T} of workers of size at most T, where \tilde{A}_{\mathcal{T}} denotes the coded matrices assigned to the workers in \mathcal{T} and the random key used for encoding is generated uniformly at random. Similarly, we say an encoding scheme is fully T-secure if instead

I(A, B; \tilde{A}_{\mathcal{T}}, \tilde{B}_{\mathcal{T}}) = 0    (2)

is satisfied for any such \mathcal{T}, with the random keys for encoding A and B generated uniformly at random.

Secure distributed matrix multiplication has been studied in [11, 12, 43, 44, 19, 17, 13, 45, 14, 46, 47, 48, 49, 18, 50]. In particular, [17, 18, 19] presented coded computing designs for general block-wise partitioning of the input matrices, all requiring at least pmn workers' computation.(Footnote 3) Entangled polynomial codes achieve subcubic recovery thresholds for both the one-sided and the fully secure settings, as formally stated in the following theorem.

Footnote 3: In addition, at least T extra workers are needed per input matrix required to be stored securely.

Theorem 1.

For secure distributed matrix multiplication, there are one-sided T-secure linear coding schemes that achieve a recovery threshold of 2R(p,m,n) + T - 1, and fully T-secure linear coding schemes that achieve a recovery threshold of 2R(p,m,n) + 2T - 1.

Remark 1.

Entangled polynomial codes order-wise improve the state of the art for general block-wise partitioning [17, 18, 19], by providing explicit constructions that require a subcubic number of workers. Moreover, entangled polynomial codes simultaneously handle data security and straggler issues, tolerating arbitrarily many stragglers while maintaining the same recovery threshold and privacy level.

Remark 2.

Following similar converse proof steps as we developed in [42, 6], one can show that any linear code that is either one-sided T-secure or fully T-secure requires using at least R(p,m,n) + T workers. Hence, entangled polynomial codes achieve optimal recovery thresholds within a factor of 2 for both settings.

III-B Private Distributed Matrix Multiplication

Private distributed matrix multiplication has been studied in [13, 14, 18, 51], where the goal is instead to multiply a matrix A by one of the matrices from a list B_1, ..., B_M while keeping the request (index) private to the workers. In particular, the master sends a (possibly random) query q_i to each worker i based on the request k. The matrices B_1, ..., B_M are then encoded by each worker into a coded submatrix based on q_i. The matrix A is encoded in the same way as in the basic setting, and each worker computes the product of its coded matrices.

The index k should be kept private to any single worker, in the sense that(Footnote 4)

I(k; q_i) = 0    (3)

for any i in [N], where k is sampled uniformly at random. The master can decode the final output based on the returned results, the request k, and the queries q_i.

Footnote 4: Note that a stronger privacy condition can still be achieved if one uses the scheme for private and secure distributed matrix multiplication presented later in this paper.

Moreover, in some related works [18, 13, 14], the encoding of A is also required to be secure against any single curious worker, i.e.,

I(A; \tilde{A}_i) = 0    (4)

for any i in [N], where the random key used for encoding A is sampled uniformly at random. This setting is referred to as private and secure distributed matrix multiplication.

The state of the art for private and secure distributed matrix multiplication with general block-partitioning based designs was proposed in [18], which requires at least pmn workers. Entangled polynomial codes achieve subcubic recovery thresholds, as formally stated in the following theorem.

Theorem 2.

For private coded matrix multiplication, there are linear coding schemes that achieve a recovery threshold of 2R(p,m,n) - 1. For private and secure distributed matrix multiplication, linear coding schemes can achieve a recovery threshold of 2R(p,m,n).

Remark 3.

Entangled polynomial codes order-wise improve the state of the art for general block-wise partitioning [18], by providing explicit constructions that achieve subcubic recovery thresholds while simultaneously providing straggler-resiliency, data-security, and privacy.

Remark 4.

Similar to the discussion in Remark 1, one can show that any linear code requires at least R(p,m,n) workers for private coded matrix multiplication and R(p,m,n) + 1 workers for private and secure distributed matrix multiplication, even if one ignores the privacy requirement. This indicates a factor-of-2 optimality of entangled polynomial codes for both settings.

Entangled polynomial codes also apply to a more general scenario where the encoding functions for both input matrices are assigned to the workers, which we refer to as fully private coded matrix multiplication and formulate as follows. In fully private coded matrix multiplication, we have two lists of input matrices A_1, ..., A_M and B_1, ..., B_M, and the master aims to compute the product of A_k and B_k given an index k in [M]. We assume M > 1, because otherwise the privacy requirement is trivial.

We aim to find computation designs such that k is private against any single worker. Explicitly, the master sends a (possibly random) query q_i to each worker i based on the demand k. Then worker i encodes both lists of matrices based on q_i. We require the request to be private in the sense that

I(k; q_i) = 0    (5)

for any i in [N], where k is sampled uniformly at random. We summarize the performance of entangled polynomial codes for fully private coded matrix multiplication as follows.

Theorem 3.

For fully private coded matrix multiplication, there are linear coding schemes that achieve a recovery threshold of 2R(p,m,n) + 1.

Remark 5.

Similar to earlier discussions, entangled polynomial codes provide coding constructions for fully private coded matrix multiplication with subcubic recovery thresholds. One can prove that any fully private linear code requires at least R(p,m,n) workers. Hence, the factor-of-2 optimality of entangled polynomial codes also holds true for fully private coded matrix multiplication.

III-C Batch Distributed Matrix Multiplication

The authors of [48, 20, 21] considered a scenario where the goal is to compute L copies of the matrix multiplication task in one round of communication. Formally, in a basic setting for batch distributed matrix multiplication, we have two lists of input matrices A_1, ..., A_L and B_1, ..., B_L, and the master aims to compute their element-wise products A_1 B_1, ..., A_L B_L. Given partitioning parameters p, m, and n, each worker still computes a single multiplication of coded submatrices with sizes (s/m)-by-(r/p) and (r/p)-by-(t/n).

For general block-partitioning based schemes, the state-of-the-art design is provided in [20, 21], where the focus is to reduce the recovery threshold and no security or privacy is required. All known coding constructions presented for batch distributed matrix multiplication require a cubic number of workers per multiplication task even when no stragglers are present (i.e., at least pmnL workers in total).

We show that entangled polynomial codes offer a unified coding framework for batch matrix multiplication, achieving subcubic recovery thresholds while simultaneously handling all security and privacy requirements discussed earlier in this section. We present this result in the following theorem.(Footnote 5) The proofs and detailed formulations can be found in Section VI.

Footnote 5: Similar to [42], in the most basic scenario with no requirements on resiliency, security, and privacy (i.e., requiring a recovery threshold equal to the number of workers), one can directly apply any upper bound construction of the bilinear complexity of batch matrix multiplication to further reduce the number of workers. However, here we focus on demonstrating the coding gain and present the results for general scenarios.

Theorem 4.

For coded distributed batch matrix multiplication with parameters p, m, n, and L, there are linear coding schemes that achieve a recovery threshold of 2R(p,m,n)L - 1. Moreover, for extended settings in batch matrix multiplication, linear coding schemes achieve the following recovery thresholds:

  • One-sided T-security: 2R(p,m,n)L + T - 1,

  • Fully T-security: 2R(p,m,n)L + 2T - 1,

  • Privacy of request: 2R(p,m,n)L - 1,

  • Security and privacy: 2R(p,m,n)L,

  • Full privacy: 2R(p,m,n)L + 1.

Remark 6.

Entangled polynomial codes provide coding schemes that order-wise improve the state-of-the-art schemes in [20, 21] for batch matrix multiplication when the matrices are block-wise partitioned.

Remark 7.

Note that since batch-multiplying L pairs of matrices is still computing a bilinear function, one can simply use similar bilinear-decomposition bounds for this operation as in [42], and all earlier achievability and converse results extend to batch computation. However, to better demonstrate the achievability of subcubic recovery thresholds, we present our results based on a subadditivity upper bound.(Footnote 6) One can similarly prove the factor-of-2 optimalities of the general entangled polynomial codes framework for all settings we presented for batch matrix multiplication.

Footnote 6: Specifically, let R_L(p,m,n) denote the bilinear complexity of batch-multiplying L pairs of m-by-p and p-by-n matrices. We have R_L(p,m,n) <= R(p,m,n)L.

IV Achievability Schemes for Secure Distributed Matrix Multiplication

In this section, we present coding schemes for the simple scenario where the only additional requirement for distributed matrix multiplication is to maintain the security of input matrices. This provides a proof for Theorem 1.

Given parameters p, m, and n, we denote the partitioned uncoded input matrices by {A_{j,k}} for j in [m], k in [p], and {B_{k,l}} for k in [p], l in [n]. The encoding consists of two steps.

In Step 1, given any upper bound construction of R(p,m,n) (e.g., Strassen's construction) with rank R and tensor tuples {a_i}, {b_i}, and {c_i}, we pre-encode the inputs each into a list of R coded submatrices:(Footnote 7)

\tilde{A}_i = \sum_{j,k} a_{i,jk} A_{j,k}, \qquad \tilde{B}_i = \sum_{k,l} b_{i,kl} B_{k,l}    (6)

Footnote 7: For detailed definitions of bilinear complexity and upper bound constructions, see [42].

As we have explained in [42], this pre-encoding essentially provides a linear coding scheme with R workers that provides neither straggler-resiliency nor data-security, both of which we need to take into account in the second part of the encoding.

In Step 2, note that it suffices to recover the element-wise products \tilde{A}_i \tilde{B}_i for i in [R]. We can build upon optimal coding constructions for element-wise multiplication, first presented in [15] for straggler mitigation and then extended in [6] to also provide data-privacy.

We first pad the two vectors (\tilde{A}_1, ..., \tilde{A}_R) and (\tilde{B}_1, ..., \tilde{B}_R) with uniformly random keys. If matrix A needs to be stored securely against up to T colluding workers, we pad the pre-coded matrices of A with T uniformly random matrices Z_1, ..., Z_T. Explicitly, we define

\vec{a} = (\tilde{A}_1, ..., \tilde{A}_R, Z_1, ..., Z_T)    (7)

if A needs to be stored securely; otherwise, we define

\vec{a} = (\tilde{A}_1, ..., \tilde{A}_R)    (8)

Similarly, we define a vector \vec{b} for matrix B in the same way. For brevity, we denote the lengths of \vec{a} and \vec{b} by R_A and R_B.

Then we arbitrarily select max(R_A, R_B) distinct elements from F, denoted x_1, x_2, ..., and N distinct elements from F avoiding the x_j's, denoted z_1, ..., z_N. Denoting the entries of \vec{a} and \vec{b} by a_j and b_j, we encode the inputs for each worker i as evaluations of the corresponding Lagrange polynomials:

f(z_i), where f(z) = \sum_{j=1}^{R_A} a_j \prod_{j' \neq j} \frac{z - x_{j'}}{x_j - x_{j'}}    (9)

g(z_i), where g(z) = \sum_{j=1}^{R_B} b_j \prod_{j' \neq j} \frac{z - x_{j'}}{x_j - x_{j'}}    (10)

As proved in [6], the above encoding scheme satisfies the requirements for both the one-sided and the fully T-secure settings.(Footnote 8)

Footnote 8: Such a property is referred to as T-private in [54, 6].

According to the PCC framework, we have encoded the input matrices using polynomials with degrees R_A - 1 and R_B - 1, where each worker i is assigned their evaluations at z_i. Hence, after the workers multiply their coded matrices, they obtain evaluations of the product of these polynomials, which has degree R_A + R_B - 2. Note that evaluating this composed polynomial at x_1, ..., x_R recovers the needed element-wise products, so the decodability requirement of PCC is satisfied. Consequently, the master can recover the final output by interpolating the composed polynomial after sufficiently many results from the workers are received, achieving a recovery threshold of R_A + R_B - 1.

Recall that for the one-sided T-secure setting we have R_A = R + T and R_B = R, and for the fully T-secure setting we have R_A = R_B = R + T. Hence, we have obtained linear coding schemes with recovery thresholds of 2R + T - 1 and 2R + 2T - 1 for the two settings respectively, given any upper bound construction of R(p,m,n) with rank R. Finally, there exist constructions that exactly achieve the rank R(p,m,n), which proves the existence of the coding schemes stated in Theorem 1.
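The decodability side of this construction can be checked numerically. The sketch below (with scalar "submatrices" over GF(65537) for brevity; all parameters and names are illustrative choices, not from the paper) pads the pre-encoded vector of A with T uniform keys, Lagrange-encodes both sides, and decodes from any 2R + T - 1 worker results. The keys occupy their own interpolation points, which is what masks any T coded shares of A.

```python
# One-sided T-secure element-wise product: a decodability sketch.
# Scalars over GF(P) stand in for the pre-encoded submatrices.
import random

P = 65537

def lagrange_eval(xs, ys, x):
    """Interpolate the points (xs[i], ys[i]) over GF(P), evaluate at x."""
    acc = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num = den = 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        acc = (acc + yi * num * pow(den, -1, P)) % P
    return acc

R, T, N = 7, 2, 20
a = [random.randrange(P) for _ in range(R)]     # pre-encoded vector of A
b = [random.randrange(P) for _ in range(R)]     # pre-encoded vector of B
keys = [random.randrange(P) for _ in range(T)]  # uniform random keys

xs = list(range(1, R + T + 1))                  # points x_1..x_{R+T}
zs = list(range(R + T + 1, R + T + N + 1))      # worker points (disjoint)

# f interpolates the padded vector (a_1..a_R, Z_1..Z_T): degree R+T-1.
# g interpolates (b_1..b_R) on x_1..x_R only: degree R-1.
f_vals = a + keys
results = [(z,
            lagrange_eval(xs, f_vals, z)
            * lagrange_eval(xs[:R], b, z) % P) for z in zs]

# deg(f*g) = 2R+T-2, so any 2R+T-1 = 15 results decode all products.
subset = random.sample(results, 2 * R + T - 1)
sz, sy = zip(*subset)
decoded = [lagrange_eval(list(sz), list(sy), x) for x in xs[:R]]
assert decoded == [ai * bi % P for ai, bi in zip(a, b)]
```

Intuitively, any T evaluations of f are a full-rank linear image of the T uniform keys, so they reveal nothing about a; the formal T-security proof is the one in [6].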

Remark 8.

The coding scheme we presented for computing element-wise products with one-sided privacy naturally extends to provide optimal codes for batch computation of multilinear functions, where each of the input entries is coded to satisfy possibly different security requirements.

V Achievability Schemes for Private Distributed Matrix Multiplication

In this section, we present the coding schemes for proving Theorems 2 and 3. We start with the setting of Theorem 2, where the goal is to multiply matrix A by one of the matrices from the list B_1, ..., B_M.

Similar to Section IV, we first pre-encode the input matrices into lists of vectors of length R, given any upper bound construction of R(p,m,n) with rank R and tensor tuples {a_i}, {b_i}, and {c_i}. In particular, given parameters p, m, and n, we denote the partitioned uncoded input matrices by {A_{j,k}} and {B_{\ell,k,l}} for \ell in [M]. We define

\tilde{A}_i = \sum_{j,k} a_{i,jk} A_{j,k}, \qquad \tilde{B}_{\ell,i} = \sum_{k,l} b_{i,kl} B_{\ell,k,l}    (11)

for each \ell in [M] and i in [R]. Then given any request k, it suffices to compute the element-wise products \tilde{A}_i \tilde{B}_{k,i} while keeping k private.

The second part of the encoding scheme is motivated by coding ideas developed in [15, 13] and in earlier sections. In particular, we first pad the pre-encoded vector of A with a random key for security. We define

\vec{a} = (\tilde{A}_1, ..., \tilde{A}_R, Z)    (12)

if A needs to be stored securely, where Z is a random key sampled from the space of matrices of the same size with a uniform distribution; otherwise,

\vec{a} = (\tilde{A}_1, ..., \tilde{A}_R)    (13)

For brevity, we denote the length of \vec{a} by R_A.

We arbitrarily select R_A distinct elements from F, denoted x_1, ..., x_{R_A}, and encode matrix A by defining the following Lagrange polynomial, where a_j denotes the j-th entry of \vec{a}:

f(z) = \sum_{j=1}^{R_A} a_j \prod_{j' \neq j} \frac{z - x_{j'}}{x_j - x_{j'}}    (14)

We then arbitrarily select a finite subset of F with at least N elements, and let the master uniformly randomly generate N distinct elements from it, denoted z_1, ..., z_N. The master sends f(z_i) to each worker i, which satisfies the security of A when required.

Given a request k, we similarly define

(15)

where

(16)

and the remaining quantity is to be specified later. If the encoding can be designed such that each worker i essentially computes the product of f and this polynomial evaluated at z_i, then we can achieve the recovery thresholds stated in Theorem 2.

To construct a private computing scheme in which the workers' computation is equivalent to the above, we divide the coded variables by a scalar(Footnote 9)

(17)

so that the result can be expressed as an unweighted sum, with the function defined as follows.

Footnote 9: Note that here we are exploiting the fact that each worker computes a function that is multilinear. For more general scenarios (e.g., the general polynomial evaluations we considered in [6]), scaling the coded variables could affect decodability.

We let the master generate N i.i.d. uniformly random variables from F, independent of the z_i's. The master sends a query q_i to each worker i, with entries defined one way for the requested index k and another way for the remaining indices. Because each q_i appears uniformly random to worker i, the presented coding scheme satisfies the privacy requirement.

We let each worker i encode its coded input as a linear combination of the pre-encoded matrices of B_1, ..., B_M, with coefficients given by q_i. Consequently, each encoded variable can be re-expressed as

(18)

with the additional term independent of the request k.

After the workers multiply the coded matrices, each worker i essentially returns the evaluation of the composed polynomial at z_i, up to a scalar factor that is available at the decoder, so the master can decode each worker's contribution by rescaling its returned result. Hence, by receiving results from sufficiently many workers, the master can recover the needed element-wise products by Lagrange-interpolating the composed polynomial, and proceed to compute the final output.

Because the degree of the composed polynomial equals R_A + R - 2, the presented coding scheme achieves a recovery threshold of R_A + R - 1. Note that R_A = R when no security is required and R_A = R + 1 when A is stored securely. We have thus obtained linear coding schemes with recovery thresholds of 2R - 1 for private coded matrix multiplication and 2R for private and secure distributed matrix multiplication, for any upper bound construction of R(p,m,n), which completes the proof of Theorem 2.

Remark 9.

This coding scheme naturally extends to the scenario where the encoding of A is required to be T-secure. A recovery threshold of 2R(p,m,n) + T - 1 can be achieved, which is optimal within a factor of 2.

We now present the coding scheme for the fully private setting. The matrices are pre-encoded in the same way, and we denote the corresponding pre-encoded vectors accordingly. To recover the final output, it suffices to compute the element-wise product of the pre-encoded vectors of A_k and B_k.

We arbitrarily select R distinct elements from F, denoted x_1, ..., x_R, and define the corresponding Lagrange basis functions. We then arbitrarily select a finite subset of F with at least N elements. We let the master uniformly randomly generate N distinct elements from it, denoted z_1, ..., z_N, together with i.i.d. uniformly random variables from F independent of the z_i's. The master sends a query q_i to each worker i, with entries defined one way for the requested index k and another way for the remaining indices. This query is fully private, because for each worker i, q_i appears i.i.d. uniformly random.

Each worker i encodes the input matrices as follows:

(19)
(20)

After the computation result is received from any worker i, by multiplying a scalar factor involving the function defined in equation (17), the master recovers the evaluation of the product of two Lagrange polynomials of degree R at the point z_i. By interpolating this polynomial and re-evaluating it at the x_j's, the master can recover all needed element-wise products. This provides a coding scheme that proves Theorem 3.

VI Achievability Schemes for Batch Distributed Matrix Multiplication

In this section, we present the coding schemes for proving Theorem 4. We start with the basic setting where no security or privacy is required. As mentioned in Section III-C, one can directly decompose the tensor characterizing the L-batch matrix multiplication, and all earlier results, as well as Theorem 3 in [15], extend to batch distributed matrix multiplication. However, we instead present a class of upper bounds based on the subadditivity of tensor rank.

Explicitly, we denote the partitioned uncoded input matrices by {A_{\ell,j,k}} and {B_{\ell,k,l}} for \ell in [L]. Given any upper bound construction of R(p,m,n) with rank R and tensor tuples {a_i}, {b_i}, and {c_i}, we define

\tilde{A}_{\ell,i} = \sum_{j,k} a_{i,jk} A_{\ell,j,k}, \qquad \tilde{B}_{\ell,i} = \sum_{k,l} b_{i,kl} B_{\ell,k,l}    (21)

for each \ell in [L] and i in [R]. Note that the batch product can be recovered from the element-wise product of the two length-RL vectors obtained by concatenating the pre-encoded vectors. One can directly apply the optimal coding scheme presented in [15], which encodes the pre-encoded vectors using Lagrange polynomials. According to Corollary 1 in [15], the resulting scheme achieves a recovery threshold of 2RL - 1, which proves the basic scenario of Theorem 4.

Remark 10.

In [55], Lagrange encoding is also applied to compute inner products (sums of element-wise products) to achieve the same recovery threshold. Remarkably, [55] pointed out that the encoding can be made systematic, as Lagrange polynomials pass through all uncoded inputs, as stated in [56]. It is mentioned in [55] that the main benefit of systematic encoding designs is to enable recovery from the results of a certain smaller subset of "systematic" workers, which provides backward-compatibility and potentially reduces computation and decoding latency. Based on this observation, entangled polynomial codes can be adapted to a "systematic" version that goes beyond inner products and handles general block-wise partitioned matrices, by choosing the same evaluation points as in [56] so that a subset of workers computes all needed "uncoded" products of the pre-encoded matrices; all major benefits of systematic encoding are then provided. This construction gives a practical solution to an open problem stated in [57], in the sense of achieving all major benefits of systematic encoding and improving recovery thresholds for any sufficiently large values of p, m, and n.

Now we formally state the settings with security and privacy requirements. Similar to Section III, for batch matrix multiplication with a security requirement, the formulation is the same as the basic setup for batch distributed matrix multiplication, except that the inputs need to be stored information-theoretically privately even if up to T workers collude. We say a coding scheme is one-sided T-secure if

I(A_1, ..., A_L; \tilde{A}_{\mathcal{T}}) = 0    (22)

for any subset \mathcal{T} with size at most T, where the random key is generated uniformly at random. We say an encoding scheme is fully T-secure if instead

I(A_1, ..., A_L, B_1, ..., B_L; \tilde{A}_{\mathcal{T}}, \tilde{B}_{\mathcal{T}}) = 0    (23)

is satisfied for any such \mathcal{T}, with the random keys for both inputs generated uniformly at random.

When privacy is taken into account, the goal is instead to batch-multiply a list of matrices by one unknown subset of matrices from a set, while keeping the request private to the workers. The master sends a query and a coded version of the A matrices, each of size (s/m)-by-(r/p), to each worker; then each worker encodes the candidate matrices into a coded submatrix of size (r/p)-by-(t/n) based on the query, the same as in private distributed matrix multiplication. We say a computing scheme for batch matrix multiplication is private if

I(k; q_i) = 0    (24)

for any i in [N], where the request k is sampled uniformly at random. Furthermore, we say the computing scheme is private and secure if we also have

I(A_1, ..., A_L; \tilde{A}_i) = 0    (25)

for any i in [N] when the random key is sampled uniformly at random.

Finally, for fully private batch matrix multiplication, the goal is to batch-multiply L pairs of matrices given two lists of inputs, where the master aims to compute the products corresponding to an index k while keeping k private. The rest of the computation follows the fully private and the batch distributed matrix multiplication frameworks. Explicitly, we require that

I(k; q_i) = 0    (26)

for any i in [N], where k is sampled uniformly at random and q_i denotes the query the master sends to worker i.

The achievability schemes for all these settings can be built from the coding ideas presented earlier in this paper. In particular, by first pre-encoding each of the input matrices using any upper bound construction of R(p,m,n), the task of batch multiplication is reduced to computing the element-wise product of two vectors of length at most R(p,m,n)L. Then observe that, in the second parts of all coding schemes we presented in earlier sections for non-batch matrix multiplication, we essentially provided linear codes that compute element-wise products of vectors of arbitrary length. By directly applying those designs to the extended pre-coded vectors for batch multiplication, we obtain the computing schemes needed for proving Theorem 4, where the achieved recovery threshold upper bounds follow by swapping R(p,m,n) into R(p,m,n)L.

References