Differential privacy [2, 3, 4] has become the gold standard for rigorous privacy guarantees, and many differentially-private mechanisms have been developed. Popular mechanisms include the classical Laplace mechanism and the Exponential mechanism. Other mechanisms build upon these two classical ones, such as those based on data partition and aggregation [6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], and those based on adaptive queries [17, 18, 19, 20, 21, 22, 23]. From this observation, differentially-private mechanisms may be categorized into two groups: the basic mechanisms and the derived mechanisms. The privacy guarantee of a basic mechanism is self-contained, whereas the privacy guarantee of a derived mechanism is achieved through a combination of basic mechanisms, composition theorems, and the post-processing invariance property.
In this work, we consider the design of a basic mechanism for matrix-valued query functions. Existing basic mechanisms for differential privacy are usually designed for scalar-valued query functions. However, in many practical settings, the query functions are multi-dimensional and can be succinctly represented as matrix-valued functions. Examples of matrix-valued query functions in real-world applications include the covariance matrix [25, 26, 27], the kernel matrix, the adjacency matrix, the incidence matrix, the rotation matrix, the Hessian matrix, the transition matrix, and the density matrix, which find applications in statistics [35], graph theory, differential equations, computer graphics [32], quantum mechanics, and many other fields.
One property that distinguishes matrix-valued query functions from scalar-valued query functions is the relationship and interconnection among the elements of the matrix. One may naively treat these matrices as merely a collection of scalar values, but that could prove sub-optimal, since the structure and relationships among these scalar values are often informative and essential to understanding and analyzing the system. For example, in graph theory, the adjacency matrix is symmetric for an undirected graph, but not necessarily for a directed graph – an observation that cannot be extracted from the collection of elements alone, without considering how they are arranged in the matrix.
In differential privacy, the traditional method for dealing with a matrix-valued query function is to extend a scalar-valued mechanism by adding independent and identically distributed (i.i.d.) noise to each element of the matrix [3, 2, 37]. However, this method fails to utilize the structural characteristics of the matrix-valued noise and query function. Although some advanced methods have explored this possibility in an iterative/procedural manner [17, 18], the structural characteristics of the matrices are still largely under-investigated. This is partly due to the lack of a basic mechanism that is directly designed for matrix-valued query functions, making the utilization of matrix structures and application of available tools in matrix analysis challenging.
In this work, we formalize the study of matrix-valued differential privacy, and present a new basic mechanism that can readily exploit the structural characteristics of the matrices – the Matrix-Variate Gaussian (MVG) mechanism. The high-level concept of the MVG mechanism is simple – it adds matrix-variate Gaussian noise scaled to the $L_2$-sensitivity of the matrix-valued query function (cf. Figure 1). We rigorously prove that the MVG mechanism guarantees $(\epsilon,\delta)$-differential privacy, and show that, with the MVG mechanism, the structural characteristics of the matrix-valued query functions can readily be incorporated into the mechanism design. Specifically, we present an example of how the MVG mechanism can yield greater utility by exploiting the positive semi-definiteness of the matrix-valued query function. Moreover, due to the multi-dimensional nature of the noise and the query function, the MVG mechanism allows flexibility in the design via the novel notion of directional noise. An important consequence of the concept of directional noise is that the matrix-valued noise in the MVG mechanism can be devised to affect certain parts of the matrix-valued query function less than the others, while providing the same privacy guarantee. In practice, this property could be advantageous as the noise can be tailored to have minimal impact on the intended utility. We present simple algorithms to incorporate the directional noise into the differential privacy mechanism design, and theoretically present the optimal design for the MVG mechanism with directional noise that maximizes the power-to-noise ratio of the mechanism output.
Finally, to illustrate the effectiveness of the MVG mechanism, we conduct experiments on three privacy-sensitive real-world datasets – Liver Disorders [38, 39], Movement Prediction, and Cardiotocography [38, 41]. The experiments include three tasks involving matrix-valued query functions – regression, finding the first principal component, and covariance estimation. The results show that the MVG mechanism can outperform four prior state-of-the-art mechanisms – the Laplace mechanism, the Gaussian mechanism, the Exponential mechanism, and the JL transform – in utility in all experiments, and can provide utility similar to that achieved with the non-private methods, while guaranteeing differential privacy.
To summarize, the main contributions are as follows.
We formalize the study of matrix-valued query functions in differential privacy and introduce the novel Matrix-Variate Gaussian (MVG) mechanism.
We rigorously prove that the MVG mechanism guarantees $(\epsilon,\delta)$-differential privacy.
We show that exploiting the structural characteristic of the matrix-valued query function can improve the utility performance of the MVG mechanism.
We introduce a novel concept of directional noise, and propose two simple algorithms to implement this novel concept with the MVG mechanism.
We theoretically exhibit how the directional noise can be devised to provide the maximum utility from the MVG mechanism.
We evaluate our approach on three real-world datasets and show that our approach can outperform four prior state-of-the-art mechanisms in all experiments, and yields utility performance close to the non-private baseline.
2 Prior Works
Existing mechanisms for differential privacy may be categorized into two types: the basic [3, 2, 42, 5, 37, 43, 44, 45, 46]; and the derived mechanisms [16, 47, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 48, 49, 18, 17, 50, 51, 52, 53, 54, 55, 19, 20, 17, 23, 22]. Since our work concerns the basic mechanism design, we focus our discussion on this type, and provide a general overview of the other.
2.1 Basic Mechanisms
Basic mechanisms are those whose privacy guarantee is self-contained, i.e., it is not deduced from the guarantee of another mechanism. Here, we discuss four popular existing basic mechanisms: the Laplace mechanism, the Gaussian mechanism, the Johnson-Lindenstrauss transform method, and the Exponential mechanism.
2.1.1 Laplace Mechanism
The classical Laplace mechanism adds noise drawn from the Laplace distribution scaled to the $L_1$-sensitivity of the query function. It was initially designed for a scalar-valued query function, but can be extended to a matrix-valued query function by adding i.i.d. Laplace noise to each element of the matrix. The Laplace mechanism provides the strong $\epsilon$-differential privacy guarantee and is relatively simple to implement. However, its generalization to a matrix-valued query function does not automatically utilize the structure of the matrices involved.
2.1.2 Gaussian Mechanism
The Gaussian mechanism [37, 2, 43] uses i.i.d. additive noise drawn from the Gaussian distribution scaled to the $L_2$-sensitivity of the query function. The Gaussian mechanism guarantees $(\epsilon,\delta)$-differential privacy. It suffers from the same limitation as the Laplace mechanism when extended to a matrix-valued query function, i.e., it does not automatically consider the structure of the matrices.
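As a rough illustration of the element-wise extension described for the Laplace and Gaussian mechanisms, the sketch below adds i.i.d. noise to every entry of a matrix-valued query answer. The function names are hypothetical, and the Gaussian noise level uses the classical calibration $\sigma = \sqrt{2\ln(1.25/\delta)}\, s_2/\epsilon$ (valid for $0 < \epsilon < 1$), which is an assumption brought in for illustration rather than a formula stated in this paper.

```python
import numpy as np


def laplace_matrix(answer, l1_sensitivity, epsilon, rng):
    """Add i.i.d. Laplace noise with scale s1/epsilon to every entry."""
    scale = l1_sensitivity / epsilon
    return answer + rng.laplace(loc=0.0, scale=scale, size=answer.shape)


def gaussian_sigma(l2_sensitivity, epsilon, delta):
    """Classical Gaussian-mechanism noise level (assumes 0 < epsilon < 1)."""
    return np.sqrt(2.0 * np.log(1.25 / delta)) * l2_sensitivity / epsilon


def gaussian_matrix(answer, l2_sensitivity, epsilon, delta, rng):
    """Add i.i.d. Gaussian noise to every entry of the matrix answer."""
    sigma = gaussian_sigma(l2_sensitivity, epsilon, delta)
    return answer + rng.normal(loc=0.0, scale=sigma, size=answer.shape)
```

Both functions treat the matrix as a flat collection of scalars: the noise ignores how the entries are arranged, which is the limitation the text points out.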
2.1.3 Johnson-Lindenstrauss Transform
The Johnson-Lindenstrauss (JL) transform method uses multiplicative noise to guarantee $(\epsilon,\delta)$-differential privacy. It is, in fact, a rare basic mechanism designed for a matrix-valued query function. Despite its promise, previous works show that the JL transform method can be applied only to queries with certain properties, as we discuss here.
Blum and Roth use a hash function that implicitly represents the JL transform, and the method is suitable for a sparse query.
Among these methods, Upadhyay’s works [45, 46] stand out as possibly the most general. In our experiments, we show that our approach can yield higher utility for the same privacy budget than these methods.
2.1.4 Exponential Mechanism
In contrast to the additive and multiplicative noise used in the previous approaches, the Exponential mechanism uses noise introduced via the sampling process. The Exponential mechanism draws its query answers from a custom probability density function designed to preserve $\epsilon$-differential privacy. To provide reasonable utility, the Exponential mechanism designs its sampling distribution based on the quality function, which indicates the utility score of each possible sample. Due to its generality, the Exponential mechanism has been utilized for many types of query functions, including matrix-valued query functions. We experimentally compare our approach to the Exponential mechanism, and show that, with a slightly weaker privacy guarantee, our method can yield significant utility improvement.
Finally, we conclude that our method differs from the four existing basic mechanisms as follows. In contrast with the i.i.d. noise in the Laplace and Gaussian mechanisms, the MVG mechanism allows a non-i.i.d. noise (cf. Section 5). As opposed to the multiplicative noise in the JL transform and the sampling noise in the Exponential mechanism, the MVG mechanism uses an additive noise for matrix-valued query functions.
2.2 Derived Mechanisms
Derived mechanisms are those whose privacy guarantee is deduced from other basic mechanisms via the composition theorems and the post-processing invariance property. Derived mechanisms are often designed to provide better utility by exploiting some properties of the query function or of the data. Blocki et al. also define a similar categorization with the term “revised algorithm”.
The general techniques used by derived mechanisms are often translatable among basic mechanisms, including our MVG mechanism. Given our focus on a novel basic mechanism, these techniques are less relevant to our work, and we leave the investigation of integrating them into the MVG framework to future work. Some of the popular techniques used by derived mechanisms are summarized here.
2.2.1 Sensitivity Control
2.2.2 Data Partition and Aggregation
This technique uses data partition and aggregation to produce more accurate query answers [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]. The partition and aggregation processes are done in a differentially-private manner either via the composition theorems and the post-processing invariance property, or with a small extra privacy cost. Hay et al. nicely summarize many works that utilize this concept.
2.2.3 Non-uniform Data Weighting
This technique lowers the level of perturbation required for the privacy protection by weighting each data sample or dataset differently [49, 18, 17, 50]. The rationale is that each sample in a dataset, or each instance of the dataset itself, has a heterogeneous contribution to the query output. Therefore, these mechanisms place a higher weight on the critical samples or instances of the database to provide better utility.
2.2.4 Data Compression
This approach reduces the level of perturbation required for differential privacy via dimensionality reduction. Various dimensionality reduction methods have been proposed. For example, Kenthapadi et al., Xu et al., and Li et al. use random projection; Chanyaswad et al. and Jiang et al. use principal component analysis (PCA); Xiao et al. use the wavelet transform; and Acs et al. use the lossy Fourier transform.
2.2.5 Adaptive Queries
The derived mechanisms based on adaptive queries use prior and/or auxiliary information to improve the utility of the query answers. Examples include the matrix mechanism [19, 20], the multiplicative weights mechanism [17, 18], the low-rank mechanism, boosting, and the sparse vector technique [37, 22].
Finally, we conclude with three main observations. First, the MVG mechanism falls into the category of basic mechanisms. Second, techniques used in derived mechanisms are generally applicable to multiple basic mechanisms, including our novel MVG mechanism. Third, for a fair comparison, we therefore compare the MVG mechanism with the four state-of-the-art basic mechanisms presented in this section.
We begin with a discussion of basic concepts pertaining to the MVG mechanism for matrix-valued query.
3.1 Matrix-Valued Query
In our analysis, we use the term dataset interchangeably with database, and represent it with the matrix $\mathbf{X}$. The matrix-valued query function, $f(\mathbf{X}) \in \mathbb{R}^{m \times n}$, has $m$ rows and $n$ columns. We define neighboring datasets as two datasets that differ by a single record, and denote this relation as $d(\mathbf{X}_1, \mathbf{X}_2) = 1$. We note, however, that although neighboring datasets differ by only a single record, $f(\mathbf{X}_1)$ and $f(\mathbf{X}_2)$ may differ in every element.
We denote a matrix-valued random variable with the calligraphic font, e.g., $\mathcal{Z}$, and its instance with the bold font, e.g., $\mathbf{Z}$. Finally, as will become relevant later, we use the columns of $\mathbf{X}$ to denote the records (samples) in the dataset.
3.2 $(\epsilon,\delta)$-Differential Privacy
In the paradigm of data privacy, differential privacy [4, 2] provides a rigorous privacy guarantee, and has been widely adopted in the community. Differential privacy guarantees that the involvement of any one particular record of the dataset would not drastically change the query answer.
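Definition 1. A mechanism $\mathcal{A}$ on a query function $f(\cdot)$ is $(\epsilon,\delta)$-differentially-private if for all neighboring datasets $\{\mathbf{X}_1, \mathbf{X}_2\}$, and for all possible measurable matrix-valued outputs $\mathbf{S} \subseteq \mathbb{R}^{m \times n}$,
$$\Pr[\mathcal{A}(f(\mathbf{X}_1)) \in \mathbf{S}] \le e^{\epsilon} \Pr[\mathcal{A}(f(\mathbf{X}_2)) \in \mathbf{S}] + \delta.$$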
3.3 Matrix-Variate Gaussian Distribution
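One of our main innovations is the use of noise drawn from a matrix-variate probability distribution. More specifically, in the MVG mechanism, the additive noise is drawn from the matrix-variate Gaussian distribution, defined as follows [56, 57, 58, 59, 60, 61].
Definition 2. An $m \times n$ matrix-valued random variable $\mathcal{X}$ has a matrix-variate Gaussian distribution, $\mathcal{X} \sim \mathcal{MVG}_{m,n}(\mathbf{M}, \boldsymbol{\Sigma}, \boldsymbol{\Psi})$, if it has the probability density function
$$p_{\mathcal{X}}(\mathbf{X}) = \frac{\exp\left\{-\frac{1}{2}\mathrm{tr}\left[\boldsymbol{\Psi}^{-1}(\mathbf{X}-\mathbf{M})^{T}\boldsymbol{\Sigma}^{-1}(\mathbf{X}-\mathbf{M})\right]\right\}}{(2\pi)^{mn/2}\,|\boldsymbol{\Psi}|^{m/2}\,|\boldsymbol{\Sigma}|^{n/2}},$$
where $\mathrm{tr}(\cdot)$ is the matrix trace, $|\cdot|$ is the matrix determinant, $\mathbf{M} \in \mathbb{R}^{m \times n}$ is the mean, $\boldsymbol{\Sigma} \in \mathbb{R}^{m \times m}$ is the row-wise covariance, and $\boldsymbol{\Psi} \in \mathbb{R}^{n \times n}$ is the column-wise covariance.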
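Noticeably, the probability density function (pdf) of $\mathcal{MVG}_{m,n}(\mathbf{M}, \boldsymbol{\Sigma}, \boldsymbol{\Psi})$ looks similar to that of the $m$-dimensional multivariate Gaussian distribution, $\mathcal{N}_m(\boldsymbol{\mu}, \boldsymbol{\Sigma})$. Indeed, $\mathcal{MVG}_{m,n}(\mathbf{M}, \boldsymbol{\Sigma}, \boldsymbol{\Psi})$ is a generalization of $\mathcal{N}_m(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ to a matrix-valued random variable. This leads to a few notable additions. First, the mean vector $\boldsymbol{\mu}$ now becomes the mean matrix $\mathbf{M}$. Second, in addition to the traditional row-wise covariance matrix $\boldsymbol{\Sigma}$, there is also the column-wise covariance matrix $\boldsymbol{\Psi}$. The latter addition is due to the fact that not only the rows of the matrix, but also its columns, could be distributed non-uniformly.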
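We may intuitively explain this addition as follows. If we draw $n$ i.i.d. samples from $\mathcal{N}_m(\mathbf{0}, \boldsymbol{\Sigma})$, denoted as $\mathbf{x}_1, \ldots, \mathbf{x}_n \in \mathbb{R}^{m}$, and concatenate them into a matrix $\mathbf{X} = [\mathbf{x}_1, \ldots, \mathbf{x}_n] \in \mathbb{R}^{m \times n}$, then it can be shown that $\mathbf{X}$ is drawn from $\mathcal{MVG}_{m,n}(\mathbf{0}, \boldsymbol{\Sigma}, \mathbf{I})$, where $\mathbf{I}$ is the identity matrix. However, if we consider the case when the columns of $\mathbf{X}$ are not i.i.d., and are distributed with the covariance $\boldsymbol{\Psi}$ instead, then it can be shown that $\mathbf{X}$ is distributed according to $\mathcal{MVG}_{m,n}(\mathbf{0}, \boldsymbol{\Sigma}, \boldsymbol{\Psi})$.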
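The intuition above can be made concrete with a minimal sampling sketch. It relies on the standard fact that an affine transform of a matrix of i.i.d. standard normal entries by square-root factors of the two covariances yields a zero-mean matrix-variate Gaussian. The function name `sample_mvg` is hypothetical, and the Cholesky factorization assumes both covariances are strictly positive definite.

```python
import numpy as np


def sample_mvg(row_cov, col_cov, rng):
    """Draw one m x n sample from a zero-mean matrix-variate Gaussian.

    row_cov is the m x m row-wise covariance and col_cov is the n x n
    column-wise covariance; both are assumed positive definite.
    """
    m, n = row_cov.shape[0], col_cov.shape[0]
    b_row = np.linalg.cholesky(row_cov)  # b_row @ b_row.T == row_cov
    b_col = np.linalg.cholesky(col_cov)  # b_col @ b_col.T == col_cov
    standard = rng.standard_normal((m, n))  # i.i.d. N(0, 1) entries
    return b_row @ standard @ b_col.T
```

With `col_cov` set to the identity, the columns of the sample are i.i.d. draws from the multivariate Gaussian with covariance `row_cov`, matching the concatenation argument in the text.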
3.4 Relevant Matrix Algebra Theorems
We state the major theorems in matrix algebra that are essential to the subsequent analysis and discussion as follows.
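Theorem 1 (Singular value decomposition (SVD)).
A matrix $\mathbf{A} \in \mathbb{R}^{m \times n}$ can be decomposed into two unitary matrices $\mathbf{W}_1 \in \mathbb{R}^{m \times m}$, $\mathbf{W}_2 \in \mathbb{R}^{n \times n}$, and a diagonal matrix $\boldsymbol{\Lambda}$, whose diagonal elements are ordered non-increasingly downward. These diagonal elements are the singular values of $\mathbf{A}$ denoted as $\sigma_1 \ge \sigma_2 \ge \cdots \ge 0$, and $\mathbf{A} = \mathbf{W}_1 \boldsymbol{\Lambda} \mathbf{W}_2^{T}$.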
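Lemma 1 (Laurent-Massart).
For a matrix-variate random variable $\mathcal{N} \sim \mathcal{MVG}_{m,n}(\mathbf{0}, \mathbf{I}_m, \mathbf{I}_n)$, $\delta \in [0, 1]$, and $\zeta(\delta) = 2\sqrt{-mn \ln \delta} - 2 \ln \delta + mn$, the following inequality holds:
$$\Pr\left[\|\mathcal{N}\|_F^2 \le \zeta(\delta)^2\right] \ge 1 - \delta,$$
where $\|\cdot\|_F$ is the Frobenius norm of a matrix.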
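A quick numerical sanity check of this kind of tail bound can be sketched as follows. The code uses the standard Laurent-Massart chi-square inequality, $\Pr[\chi^2_D \ge D + 2\sqrt{Dx} + 2x] \le e^{-x}$ with $D = mn$ and $x = \ln(1/\delta)$; the function names are hypothetical, and the exact constant used in the paper's lemma may be stated in a different (looser) form.

```python
import numpy as np


def laurent_massart_threshold(m, n, delta):
    """Threshold t such that Pr[||N||_F^2 <= t] >= 1 - delta.

    N is an m x n matrix of i.i.d. N(0, 1) entries, so ||N||_F^2 is
    chi-square distributed with m * n degrees of freedom.
    """
    x = np.log(1.0 / delta)
    dof = m * n
    return dof + 2.0 * np.sqrt(dof * x) + 2.0 * x


def empirical_coverage(m, n, delta, trials, rng):
    """Fraction of simulated noise matrices whose squared norm is below t."""
    noise = rng.standard_normal((trials, m, n))
    squared_norms = np.sum(noise ** 2, axis=(1, 2))
    return np.mean(squared_norms <= laurent_massart_threshold(m, n, delta))
```

The bound is conservative, so the empirical coverage is typically well above $1-\delta$.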
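Lemma 2 (Merikoski-Sarria-Tarazaga).
The non-increasingly ordered singular values of a matrix $\mathbf{A} \in \mathbb{R}^{m \times n}$ have the values of
$$0 \le \sigma_i \le \frac{\|\mathbf{A}\|_F}{\sqrt{i}},$$
where $\|\cdot\|_F$ is the Frobenius norm of a matrix.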
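The singular value bound used throughout the analysis, namely that the $i$-th largest singular value is at most $\|\mathbf{A}\|_F/\sqrt{i}$ (which follows from $i\,\sigma_i^2 \le \sum_{j \le i} \sigma_j^2 \le \|\mathbf{A}\|_F^2$), can be checked numerically. The helper name below is hypothetical.

```python
import numpy as np


def singular_value_bounds(matrix):
    """Return the singular values and the bounds ||A||_F / sqrt(i)."""
    # numpy returns singular values in non-increasing order
    sigma = np.linalg.svd(matrix, compute_uv=False)
    frobenius = np.linalg.norm(matrix, "fro")
    bounds = frobenius / np.sqrt(np.arange(1, sigma.size + 1))
    return sigma, bounds
```

The bound is tight for $i = 1$ only when the matrix has rank one, and it becomes looser for matrices whose energy is spread over many singular values.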
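Lemma 3 (von Neumann).
Let $\mathbf{A}, \mathbf{B} \in \mathbb{R}^{m \times n}$, and let $\sigma_i(\mathbf{A})$ and $\sigma_i(\mathbf{B})$ be the non-increasingly ordered singular values of $\mathbf{A}$ and $\mathbf{B}$, respectively. Then
$$\mathrm{tr}(\mathbf{A}\mathbf{B}^{T}) \le \sum_{i=1}^{r} \sigma_i(\mathbf{A})\,\sigma_i(\mathbf{B}), \quad r = \min\{m, n\}.$$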
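Lemma 4 (Trace magnitude bound).
Let $\mathbf{A} \in \mathbb{R}^{n \times n}$, and let $\sigma_i(\mathbf{A})$ be the non-increasingly ordered singular values of $\mathbf{A}$. Then
$$|\mathrm{tr}(\mathbf{A})| \le \sum_{i=1}^{n} \sigma_i(\mathbf{A}).$$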
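The two trace inequalities can be sanity-checked numerically. The sketch below assumes their standard forms: von Neumann's trace inequality, $\mathrm{tr}(\mathbf{A}\mathbf{B}^T) \le \sum_i \sigma_i(\mathbf{A})\sigma_i(\mathbf{B})$, and the bound $|\mathrm{tr}(\mathbf{A})| \le \sum_i \sigma_i(\mathbf{A})$ for a square matrix. The function names are hypothetical; each returns the slack of the inequality, which should be non-negative.

```python
import numpy as np


def von_neumann_gap(a, b):
    """Slack in tr(A B^T) <= sum_i sigma_i(A) sigma_i(B) for same-shape A, B."""
    sigma_a = np.linalg.svd(a, compute_uv=False)
    sigma_b = np.linalg.svd(b, compute_uv=False)
    return float(np.sum(sigma_a * sigma_b) - np.trace(a @ b.T))


def trace_magnitude_gap(a):
    """Slack in |tr(A)| <= sum_i sigma_i(A) for a square matrix A."""
    sigma = np.linalg.svd(a, compute_uv=False)
    return float(np.sum(sigma) - abs(np.trace(a)))
```

For a symmetric PSD matrix the singular values coincide with the eigenvalues, so the second slack is zero up to rounding.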
4 MVG Mechanism: Differential Privacy with Matrix-Valued Query
Matrix-valued query functions are different from their scalar counterparts in terms of the vital information contained in how the elements are arranged in the matrix. To fully exploit these structural characteristics of matrix-valued query functions, we present a novel mechanism for matrix-valued query functions: the Matrix-Variate Gaussian (MVG) mechanism.
First, let us introduce the sensitivity of the matrix-valued query function used in the MVG mechanism.
Definition 3 (Sensitivity).
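Given a matrix-valued query function $f(\mathbf{X}) \in \mathbb{R}^{m \times n}$, define the $L_2$-sensitivity as,
$$s_2(f) = \sup_{d(\mathbf{X}_1, \mathbf{X}_2) = 1} \|f(\mathbf{X}_1) - f(\mathbf{X}_2)\|_F,$$
where the supremum is taken over all pairs of neighboring datasets and $\|\cdot\|_F$ is the Frobenius norm.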
Then, we present the MVG mechanism as follows.
Definition 4 (MVG mechanism).
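Given a matrix-valued query function $f(\mathbf{X}) \in \mathbb{R}^{m \times n}$, and a matrix-valued random variable $\mathcal{Z} \sim \mathcal{MVG}_{m,n}(\mathbf{0}, \boldsymbol{\Sigma}, \boldsymbol{\Psi})$, the MVG mechanism is defined as,
$$\mathcal{MVG}(f(\mathbf{X})) = f(\mathbf{X}) + \mathcal{Z},$$
where $\boldsymbol{\Sigma}$ is the row-wise covariance matrix, and $\boldsymbol{\Psi}$ is the column-wise covariance matrix.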
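A minimal sketch of the additive mechanism in this definition is given below. The names `covariance_query` and `mvg_mechanism` are hypothetical, the example query is only one possible matrix-valued query, and the covariances passed in are assumed to have been chosen elsewhere so that a sufficient condition for privacy holds; the code itself performs no privacy calibration.

```python
import numpy as np


def covariance_query(data):
    """Example matrix-valued query: X X^T / N, with records as columns."""
    return data @ data.T / data.shape[1]


def mvg_mechanism(query_answer, row_cov, col_cov, rng):
    """Return f(X) + Z with Z drawn from a zero-mean matrix-variate Gaussian.

    NOTE: row_cov and col_cov must already satisfy the privacy condition;
    this sketch only implements the noise addition.
    """
    m, n = query_answer.shape
    b_row = np.linalg.cholesky(row_cov)  # assumes positive definite
    b_col = np.linalg.cholesky(col_cov)
    noise = b_row @ rng.standard_normal((m, n)) @ b_col.T
    return query_answer + noise
```

Because the two covariances are free parameters, the same interface covers both isotropic noise (scaled identity matrices) and non-i.i.d. noise.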
Note that so far, we have not specified how to pick $\boldsymbol{\Sigma}$ and $\boldsymbol{\Psi}$ according to the sensitivity $s_2(f)$ in the MVG mechanism. We discuss the explicit form of $\boldsymbol{\Sigma}$ and $\boldsymbol{\Psi}$ next.
As the additive matrix-valued noise of the MVG mechanism is drawn from $\mathcal{MVG}_{m,n}(\mathbf{0}, \boldsymbol{\Sigma}, \boldsymbol{\Psi})$, the parameters to be designed for the mechanism are the covariance matrices $\boldsymbol{\Sigma}$ and $\boldsymbol{\Psi}$. In the following discussion, we derive the sufficient conditions on $\boldsymbol{\Sigma}$ and $\boldsymbol{\Psi}$ such that the MVG mechanism preserves $(\epsilon,\delta)$-differential privacy. Furthermore, since one of the motivations of the MVG mechanism is to facilitate the exploitation of the structural characteristics of the matrix-valued query, we demonstrate how structural knowledge about the matrix-valued query can improve the sufficient condition for the MVG mechanism. The term improve here requires further specification, and we provide such clarification later in this section when the context is appropriate.
Hence, the subsequent discussion proceeds as follows. First, we present a sufficient condition for the values of $\boldsymbol{\Sigma}$ and $\boldsymbol{\Psi}$ to ensure that the MVG mechanism preserves $(\epsilon,\delta)$-differential privacy without assuming structural knowledge about the matrix-valued query. Second, we present an alternative sufficient condition for $\boldsymbol{\Sigma}$ and $\boldsymbol{\Psi}$ to ensure that the MVG mechanism preserves $(\epsilon,\delta)$-differential privacy with the assumption that the matrix-valued query is symmetric positive semi-definite (PSD). Finally, we rigorously prove that, by incorporating the knowledge of positive semi-definiteness about the matrix-valued query, we improve the sufficient condition to guarantee $(\epsilon,\delta)$-differential privacy with the MVG mechanism.
4.2 Differential Privacy Analysis for General Matrix-Valued Query (No Structural Assumption)
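Table 1: Notation.
|$\mathbf{X}$|database/dataset whose columns are data records and rows are attributes/features.|
|$\mathcal{MVG}_{m,n}(\mathbf{0}, \boldsymbol{\Sigma}, \boldsymbol{\Psi})$|$m \times n$ matrix-variate Gaussian distribution with zero mean, the row-wise covariance $\boldsymbol{\Sigma}$, and the column-wise covariance $\boldsymbol{\Psi}$.|
|$f(\mathbf{X}) \in \mathbb{R}^{m \times n}$|matrix-valued query function|
|$H_r$|generalized harmonic numbers of order $r$|
|$H_{r,1/2}$|generalized harmonic numbers of order $r$ of $1/2$|
|$\boldsymbol{\sigma}(\boldsymbol{\Sigma}^{-1})$|vector of non-increasing singular values of $\boldsymbol{\Sigma}^{-1}$|
|$\boldsymbol{\sigma}(\boldsymbol{\Psi}^{-1})$|vector of non-increasing singular values of $\boldsymbol{\Psi}^{-1}$|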
First, we consider the most general differential privacy analysis of the MVG mechanism. More specifically, we do not make any explicit structural assumption about the matrix-valued query function in this analysis. The following theorem presents the key result under this general setting.
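Theorem 2. Let $\boldsymbol{\sigma}(\boldsymbol{\Sigma}^{-1}) = [\sigma_1(\boldsymbol{\Sigma}^{-1}), \ldots, \sigma_m(\boldsymbol{\Sigma}^{-1})]^{T}$ and $\boldsymbol{\sigma}(\boldsymbol{\Psi}^{-1}) = [\sigma_1(\boldsymbol{\Psi}^{-1}), \ldots, \sigma_n(\boldsymbol{\Psi}^{-1})]^{T}$ be the vectors of non-increasingly ordered singular values of $\boldsymbol{\Sigma}^{-1}$ and $\boldsymbol{\Psi}^{-1}$, respectively, and let the relevant variables be defined according to Table 1. Then, the MVG mechanism guarantees $(\epsilon,\delta)$-differential privacy if $\boldsymbol{\Sigma}$ and $\boldsymbol{\Psi}$ satisfy the following condition,
$$\|\boldsymbol{\sigma}(\boldsymbol{\Sigma}^{-1})\|_2 \, \|\boldsymbol{\sigma}(\boldsymbol{\Psi}^{-1})\|_2 \le \frac{\left(-\beta + \sqrt{\beta^2 + 8\alpha\epsilon}\right)^2}{4\alpha^2},$$
where $\alpha = [H_r + H_{r,1/2}]\gamma^2 + 2H_r\gamma s_2(f)$, and $\beta = 2(mn)^{1/4}\zeta(\delta)H_r s_2(f)$. Here $r = \min\{m, n\}$, $\gamma = \sup_{\mathbf{X}} \|f(\mathbf{X})\|_F$, $s_2(f)$ is the $L_2$-sensitivity, and $\zeta(\delta)$ is defined in Lemma 1.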
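The MVG mechanism guarantees $(\epsilon,\delta)$-differential privacy if for every pair of neighboring datasets $\{\mathbf{X}_1, \mathbf{X}_2\}$ and all possible measurable sets $\mathbf{S} \subseteq \mathbb{R}^{m \times n}$,
$$\Pr[f(\mathbf{X}_1) + \mathcal{Z} \in \mathbf{S}] \le e^{\epsilon} \Pr[f(\mathbf{X}_2) + \mathcal{Z} \in \mathbf{S}] + \delta.$$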
The proof now follows by observing that (cf. Section 6.3.2),
and defining the following events:
where $\zeta(\delta)$ is defined in Lemma 1. Next, observe that
where the last inequality follows from the union bound. By Lemma 1 and the definition of the set , we have,
In the rest of the proof, we find sufficient conditions for the following inequality to hold:
this would complete the proof of differential privacy guarantee.
Using the definition of (Definition 2), this is satisfied if we have,
By inserting inside the integral on the left side, it suffices to show that
for all . With some algebraic manipulations, the left hand side of this condition can be expressed as,
where . This quantity has to be bounded by , so we present the following characteristic equation, which has to be satisfied for all possible neighboring and all , for the MVG mechanism to guarantee -differential privacy:
Specifically, we want to show that this inequality holds with probability $\ge 1 - \delta$.
From the characteristic equation, the proof analyzes the four terms in the sum separately since the trace is additive.
The first term: . First, let us denote , where and are any possible instances of the query and the noise, respectively. Then, we can rewrite the first term as, . The earlier part can be bounded from Lemma 3:
Lemma 2 can then be used to bound each singular value. In more detail,
Applying the same steps to the other singular value, and using Definition 3, we can write,
Substituting the two singular value bounds, the earlier part of the first term can then be bounded by,
The latter part of the first term is more complicated since it involves , so we will derive the bound in more detail. First, let us define to be drawn from , so we can write in terms of using affine transformation : . To specify and , we solve the following linear equations, respectively,
This can be readily solved with SVD (cf. [62, p. 440]); hence, , and , where , and from SVD. Therefore, can be written as,
Substituting into the latter part of the first term yields,
This can be bounded by Lemma 3 as,
The two singular values can then be bounded by Lemma 2. For the first singular value,
By definition, , where is the 1-norm. By norm relation, . With a similar derivation for and with Lemma 1, the singular value can be bounded with probability $\ge 1 - \delta$ as,
Meanwhile, the other singular value can be readily bounded with Lemma 2 as . Hence, the latter part of the first term is bounded with probability as,
Since the parameter appears a lot in the derivation, let us define
The second term: . By following the same steps as in the first term, it can be shown that the second term has the exact same bound as the first term, i.e.
The fourth term: . Since this term has the negative sign, we consider the absolute value instead. Using Lemma 4,
Then, using the singular value bound in Lemma 2,
Hence, the fourth term can be bounded by,
Four terms combined: by combining the four terms and rearranging them, the characteristic equation becomes,
This is a quadratic equation, of which the solution is . Since we know , due to the axiom of the norm, we only have the one-sided solution,
which immediately implies the criterion in Theorem 2. ∎
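The last step of the proof, taking the one-sided solution of a quadratic inequality in a non-negative quantity, can be illustrated generically. The sketch below assumes an inequality of the form $a\phi^2 + b\phi \le c$ with $a > 0$, $b \ge 0$, $c \ge 0$ and $\phi \ge 0$; the coefficient names are placeholders and are not the specific constants derived in the proof.

```python
import math


def max_nonnegative_solution(a, b, c):
    """Largest phi >= 0 satisfying a * phi**2 + b * phi <= c.

    Assumes a > 0, b >= 0 and c >= 0, so the quadratic has exactly one
    non-negative root and the feasible set is the interval [0, root].
    """
    if a <= 0 or b < 0 or c < 0:
        raise ValueError("expected a > 0, b >= 0, c >= 0")
    return (-b + math.sqrt(b * b + 4.0 * a * c)) / (2.0 * a)
```

Only the root with the positive square root is kept because the other root is negative, which a norm cannot attain.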
In Theorem 2, we assume that the Frobenius norm of the query function is bounded for all possible datasets by a constant $\gamma$. This assumption is valid in practice because real-world data are rarely unbounded, and it is a common assumption in the analysis of differential privacy for multi-dimensional query functions (cf. [3, 27, 70, 25]).
The values of the generalized harmonic numbers – $H_r$ and $H_{r,1/2}$ – can be obtained from a table lookup for a given value of $r$, or can easily be computed recursively.
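The recursive computation mentioned here can be sketched as follows, using the standard definition of the generalized harmonic number, $H_{r,s} = \sum_{i=1}^{r} i^{-s}$, so that $H_{r,s} = H_{r-1,s} + r^{-s}$. The function name is hypothetical; the exponent is left as a parameter, with $s = 1$ giving the ordinary harmonic number.

```python
def generalized_harmonic(r, s=1.0):
    """Compute H_{r,s} = sum_{i=1}^{r} 1 / i**s by the running recursion."""
    if r < 0:
        raise ValueError("r must be non-negative")
    total = 0.0
    for i in range(1, r + 1):
        total += 1.0 / i ** s  # H_{i,s} = H_{i-1,s} + i**(-s)
    return total
```

For example, `generalized_harmonic(r)` and `generalized_harmonic(r, 0.5)` give the two quantities needed for a query whose smaller dimension is `r`.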
The sufficient condition in Theorem 2 yields an important observation: the privacy guarantee of the MVG mechanism depends only on the singular values of $\boldsymbol{\Sigma}$ and $\boldsymbol{\Psi}$ through their norm. In other words, we may have multiple instances of the noise distribution $\mathcal{MVG}_{m,n}(\mathbf{0}, \boldsymbol{\Sigma}, \boldsymbol{\Psi})$ that yield the exact same privacy guarantee (cf. Figure 2). This phenomenon gives rise to an interesting novel concept of directional noise, which will be discussed in Section 5.
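The observation that many different covariances share the same singular values can be illustrated with a short sketch. It assumes the relevant quantity is the 2-norm of the vector of singular values of the inverse covariance; the function names are hypothetical. Rotating a covariance by an orthogonal matrix, or permuting its variances, changes the direction of the noise but leaves this quantity unchanged.

```python
import numpy as np


def inverse_spectrum_norm(cov):
    """2-norm of the vector of singular values of the inverse covariance."""
    sigma = np.linalg.svd(np.linalg.inv(cov), compute_uv=False)
    return float(np.linalg.norm(sigma))


def rotate_covariance(cov, orthogonal):
    """Re-orient a covariance: Q cov Q^T has the same singular values."""
    return orthogonal @ cov @ orthogonal.T
```

This is the degree of freedom that a directional-noise design can use: the noise can be concentrated along some directions and reduced along others without changing the value entering the privacy condition.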
We emphasize again that, in Theorem 2, we derive the sufficient condition for the MVG mechanism to guarantee $(\epsilon,\delta)$-differential privacy without making any structural assumption about the matrix-valued query function. In the next two sections, we illustrate how incorporating knowledge about the intrinsic structural characteristics of the matrix-valued query function of interest can yield an alternative sufficient condition. Then, we prove that such an alternative sufficient condition can provide better utility than the analysis that does not use the structural knowledge.
4.3 Differential Privacy Analysis for Symmetric Positive Semi-Definite (PSD) Matrix-Valued Query
To provide a concrete example of how the structural characteristics of the matrix-valued query function can be exploited via the MVG mechanism, we consider a matrix-valued query function that is symmetric positive semi-definite (PSD). For brevity, we drop the explicit 'symmetric' in subsequent references, but the reader should keep in mind that we work with symmetric matrices here. First, let us define a positive semi-definite matrix in our context.
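A symmetric matrix $\mathbf{X} \in \mathbb{R}^{n \times n}$ is positive semi-definite (PSD) if $\mathbf{v}^{T}\mathbf{X}\mathbf{v} \ge 0$ for all non-zero $\mathbf{v} \in \mathbb{R}^{n}$.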
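As a small illustration of this definition, the sketch below tests positive semi-definiteness through the eigenvalues of a symmetric matrix, which is equivalent to the quadratic-form condition. The function name and the numerical tolerance are assumptions made for the example.

```python
import numpy as np


def is_psd(matrix, tol=1e-10):
    """Check that a matrix is symmetric with all eigenvalues >= -tol."""
    if matrix.shape[0] != matrix.shape[1]:
        return False
    if not np.allclose(matrix, matrix.T):
        return False
    # eigvalsh is intended for symmetric (Hermitian) matrices
    return bool(np.all(np.linalg.eigvalsh(matrix) >= -tol))
```

A matrix of the form $\mathbf{X}\mathbf{X}^{T}/N$ passes this check for any data matrix $\mathbf{X}$, which is what makes the property usable in a worst-case privacy analysis.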
Conceptually, we can think of a positive semi-definite matrix in matrix analysis as analogous to a non-negative number in scalar-valued analysis. More importantly, positive semi-definite matrices occur regularly in practical settings. Examples of positive semi-definite matrices in practice include the maximum likelihood estimate of the covariance matrix [62, chapter 7], the Hessian matrix [62, chapter 7], the kernel matrix in machine learning [28, 72], and the Laplacian matrix of a graph.
With respect to differential privacy analysis, the assumption of positive semi-definiteness on the query function is only applicable if it holds for every possible instance of the datasets. Fortunately, this is true for all of the aforementioned matrix-valued query functions because positive semi-definiteness is an intrinsic property of such functions. In other words, if a user queries the maximum likelihood estimate of the covariance matrix of the dataset, the (non-private) matrix-valued query answer would always be positive semi-definite regardless of the dataset from which it is computed. The same property applies to the other examples given. We refer to this type of property as intrinsic to the matrix-valued query function since it holds due only to the nature of the query function, regardless of the nature of the dataset. This is crucial in differential privacy analysis, as differential privacy considers the worst-case scenario, so any assumption made would only be valid if it applies even in such a scenario.
Before presenting the main result, we emphasize that the intrinsic nature phenomenon is not unique to positive semi-definite matrices. In other words, there are many other structural properties of the matrix-valued query function that are also intrinsic. For example, the adjacency matrix for an undirected graph is always symmetric, and the bi-stochastic matrix always has all non-negative entries with each row and column summing to one. Therefore, the idea of exploiting structural characteristics of the matrix-valued query function is widely applicable in practice under the setting of privacy-aware analysis.
Returning to the main result, we consider the MVG mechanism on a query function that is symmetric positive semi-definite. Due to the definitive symmetry of the query function output, it is reasonable to impose the design choice $\boldsymbol{\Sigma} = \boldsymbol{\Psi}$ on the MVG mechanism. The rationale is that, since the query function is symmetric, its row-wise covariance and column-wise covariance are necessarily equal if we view the query output as a random variable. Hence, it is reasonable to employ matrix-valued noise with the same symmetry. As a result, this restricts our design space to that with $\boldsymbol{\Sigma} = \boldsymbol{\Psi}$. With this setting, we present the following theorem, which states the sufficient condition for the MVG mechanism to preserve $(\epsilon,\delta)$-differential privacy when the query function is symmetric positive semi-definite.
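Given a symmetric positive semi-definite (PSD) matrix-valued query function $f(\mathbf{X}) \in \mathbb{R}^{r \times r}$, let $\boldsymbol{\sigma}(\boldsymbol{\Sigma}^{-1}) = [\sigma_1(\boldsymbol{\Sigma}^{-1}), \ldots, \sigma_r(\boldsymbol{\Sigma}^{-1})]^{T}$ be the vector of non-increasingly ordered singular values of $\boldsymbol{\Sigma}^{-1}$, let $\boldsymbol{\Psi} = \boldsymbol{\Sigma}$, and let the relevant variables be defined according to Table 1. Then, the MVG mechanism guarantees $(\epsilon,\delta)$-differential privacy if $\boldsymbol{\Sigma}$ satisfies the following condition,
$$\|\boldsymbol{\sigma}(\boldsymbol{\Sigma}^{-1})\|_2^2 \le \frac{\left(-\beta + \sqrt{\beta^2 + 8\omega\epsilon}\right)^2}{4\omega^2},$$
where $\omega = 4H_r\gamma s_2(f)$, and $\beta = 2r^{1/2}\zeta(\delta)H_r s_2(f)$.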
The proof starts from the same characteristic equation (Eq. (2)) as in Theorem 2. However, since the query function is symmetric, $f(\mathbf{X}) = f(\mathbf{X})^{T}$. Furthermore, since we impose $\boldsymbol{\Psi} = \boldsymbol{\Sigma}$, the characteristic equation can be simplified as,
Again, this condition needs to be met with probability $\ge 1 - \delta$ for the MVG mechanism to preserve $(\epsilon,\delta)$-differential privacy.
with probability $\ge 1 - \delta$. We note two minor differences between Eq. (5) and Eq. (10). First, the factor of $(mn)^{1/4}$ becomes $r^{1/2}$. This is simply due to the fact that $m = n = r$ in the current setup with a PSD query function. Second, the variable $\phi$ in Eq. (5) becomes simply $\|\boldsymbol{\sigma}(\boldsymbol{\Sigma}^{-1})\|_2$ in Eq. (10). This is due to the fact that $\boldsymbol{\Psi} = \boldsymbol{\Sigma}$ with the current PSD setting. Apart from these, Eq. (5) and Eq. (10) are equivalent.
Next, consider the third term and fourth term combined:
. Let us denote for a momentand . Then, this combined term can be re-written as, . Next, we show that
by starting from the right hand side and proceeding to equate it to the left hand side.
where the step from the first line to the second uses the additive property of the trace, and the step from the second line to the third uses the commutative property of the trace. Therefore, from this equation, we can write
where . Then, we can use Lemma 3 to write,
Next, we use Lemma 2 to bound the two sets of singular values as follows. For the first set of singular values,
whereas the last step follows from the fact that . For the second set of singular values,