The problem of secret key agreement via public discussion was first formulated for two terminals by Maurer [1] and by Ahlswede and Csiszár [2], and subsequently extended to multiple terminals by Csiszár and Narayan [3]. In the set-up of this problem, the terminals involved must agree upon a secret key based on correlated observations from a source, using interactive public discussion. The key must be kept information-theoretically secure from an eavesdropper having access to the public discussion. The conventional setting allows unlimited public discussion, and the aim is to agree upon a secret key of largest possible length. The problem formulation is in fact asymptotic in nature: the terminals observe an infinite sequence of i.i.d. realizations of the correlated source random variables, and the asymptotic secret key rate (number of symbols of secret key generated per source realization) must be as large as possible. The largest possible asymptotic key rate, termed the secrecy capacity, is by now quite well understood [3, 4].
A more difficult problem is to determine the secrecy capacity under a constraint on the amount or rate of public discussion allowed. Specifically, when the (asymptotic) rate of public discussion is bounded above by , the problem is to determine the maximum achievable secret key rate , which we term the rate-constrained secrecy capacity. This problem was considered in the case of two terminals by Tyagi [5] and Liu et al. [6]. The primary focus of Tyagi [5] was on the related problem of characterizing what we will call the communication complexity , which is the least discussion rate needed to achieve the (unconstrained) secrecy capacity; he left open the rate-constrained secrecy capacity problem. Liu et al. [6] gave a characterization of the achievable region of key and discussion rate pairs using a notion of -concave envelopes that they develop. They used their methods to give a precise description of the ratio in the regime of .
The multiterminal and problems were considered in our prior works [7, 8, 9, 10]. Among our contributions there were some general outer bounds on the achievable rate region, and upper and lower bounds on ; in the special case of the hypergraphical source model, we derived tighter upper bounds on and the ratio valid for all . In the important special case of the pairwise independent network (PIN) model (see e.g. ), our bounds were good enough to precisely characterize and .
In this paper, we make further progress on these problems by focusing on the (multiterminal) finite linear source model , which generalizes the hypergraphical source and PIN models. In the finite linear model, the observation of each terminal is a linear function of an underlying random vector composed of finitely many i.i.d. uniform random variables. Furthermore, we consider a non-asymptotic, single-shot version of the secret key agreement problem as opposed to the asymptotic version in . In this version, the terminals observe only one realization of the source, and after some public discussion, must agree (with probability) upon a secret key that is statistically independent of the public communication. Single-shot analogues of the and problems can be formulated in this setting; see Section II. We study these problems with a view towards extending the results obtained for the single-shot setting to the asymptotic model.
Courtade and Halford [13] formulated and analyzed the single-shot secret key generation problem for hypergraphical sources. They made a key assumption to facilitate their analysis, namely, that the communication from each terminal is a linear function of its observations. Under this restriction, they effectively resolved the single-shot and problems for hypergraphical sources. Note that linear discussion was also considered in [12, 14] for finite linear sources, but the objective there was to achieve the unconstrained secrecy capacity of the asymptotic model perfectly with a finite block length, so as to avoid excessive delay in generating the secret key.
Taking inspiration from , we too restrict the public discussion to be a linear function of the terminals’ observations. Under the linear discussion model, for finite linear sources, we obtain a characterization (Corollary 2 in Section IV) of the communication complexity of generating a secret key of maximum length. The minimum discussion is achieved by a non-interactive protocol in which each terminal first does a linear processing of its own private observations, following which the terminals all execute a (single-shot) discussion-optimal communication-for-omniscience protocol on their linearly processed observations. At the end of this, each terminal is able to recover the observations of all the other terminals (omniscience), and it then applies a linear function to the entire vector of observations to obtain a maximum-length secret key.
The rest of the paper is organized as follows. Section II contains the formal problem formulation, Section III presents an illustrative example, and Section IV contains statements of the main results, complete proofs of which can be found in the appendices. The paper ends in Section V with a discussion of the possible ways in which the results could be extended to settings beyond that of our problem formulation.
II Problem Formulation
We use the sans serif font to represent a random variable with distribution and taking values from a set . We use the boldface uppercase for matrices and boldface lowercase sans serif font for random row vectors where denotes the length of the vector. We assume all the entries take values from the same finite field of order . We take logarithms to base and so all the information quantities are in the units of bits. For a finite set , we use to denote a row vector obtained by concatenating the row vectors ’s for some enumeration of the set . We use the notation
to mean that there exists a deterministic matrix such that .
As in , the multiterminal secret key agreement problem consists of a finite set of users who want to share a secret key after some public discussion that can be eavesdropped by a wiretapper. The one-shot perfect linear secret key agreement (SKA) scheme consists of the following phases.
One-shot private observation: Each user observes the component of a given finite linear source defined in  with the requirement that
for some uniformly random vector over . is referred to as the base of . In the special case when is a subvector of , is called the hypergraphical source , which is the source model considered in . Unlike the model in  and  where each user observes i.i.d. samples of the source, we consider the one-shot model as in [13, 15] where each user only observes one sample.
Private randomization: Each user privately generates a random vector over independent of the source , i.e.,
Note that there is no restriction on the length nor the distribution of , and so the requirement that it must be a vector over does not lose generality. Note also that such a randomization was not explicitly considered in the formulations of [3, 12, 13].
Linear public discussion: Each user publicly reveals the message
Hence, everyone including the wiretapper observes . Unlike , the discussion above is non-interactive as interaction is unnecessary for linear discussion as explained in .
Footnote 1: Suppose the discussion is interactive, i.e., a message, say , revealed in public by some user is a linear function of the private observations of user as well as all the previously discussed messages denoted by . By linearity, we can rewrite as where denotes an all-zero vector of an appropriate length. Note that, given , there is a bijection between and , and so user can reveal instead of in public without loss of generality, since can be recovered from and other discussion messages . As does not depend on , we can convert any interactive discussion to a non-interactive discussion by replacing every discussion message by the corresponding .
Secret key agreement: After the public discussion, each user attempts to agree on a secret key satisfying
where (4) is the recoverability constraint that requires the secret key to be perfectly recoverable by every user and (5) is the secrecy constraint that requires the key to be uniformly random and perfectly independent of the entire public discussion. Note that we do not assume a priori that is a linear function of the private source, and so the key length is not required to be an integer.
Footnote 2: Nevertheless, it will follow from Theorem 1 that can be chosen to be a linear function of the private source without loss of optimality, and so the key length must be an integer.
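To make the recoverability and secrecy constraints concrete, the following is a minimal sketch that checks both by exhaustive enumeration. The three-user source over the binary field, the single discussion message and the key below are hypothetical choices made for illustration (they are not the example of Section III), and the names `SOURCE`, `DISCUSSION`, `KEY`, `evaluate`, `is_recoverable` and `is_perfectly_secret` are assumptions of this sketch.

```python
from collections import Counter
from itertools import product

# Hypothetical toy source over GF(2) with base x = (x0, x1) and three users.
# Every linear map is stored as a list of coefficient vectors over the base:
# component j of the resulting vector equals <x, coeffs[j]> mod 2.
SOURCE = {1: [(1, 0), (0, 1)],  # user 1 observes (x0, x1)
          2: [(1, 0)],          # user 2 observes x0
          3: [(0, 1)]}          # user 3 observes x1
DISCUSSION = [(1, 1)]           # user 1 publicly reveals x0 + x1
KEY = [(1, 0)]                  # candidate key: x0


def evaluate(coeffs, x):
    """Apply a linear map (list of coefficient vectors) to the base vector x."""
    return tuple(sum(a * b for a, b in zip(x, c)) % 2 for c in coeffs)


def is_recoverable(source, discussion, key, n):
    """True if the key is a function of (own observation, discussion) for every user."""
    for obs in source.values():
        seen = {}
        for x in product((0, 1), repeat=n):
            view = (evaluate(obs, x), evaluate(discussion, x))
            k = evaluate(key, x)
            if seen.setdefault(view, k) != k:
                return False
    return True


def is_perfectly_secret(discussion, key, n):
    """True if the key is uniform and independent of the public discussion."""
    joint = Counter()
    for x in product((0, 1), repeat=n):
        joint[(evaluate(key, x), evaluate(discussion, x))] += 1
    f_counts = Counter()
    for (_, f), count in joint.items():
        f_counts[f] += count
    num_keys = 2 ** len(key)
    # Uniform and independent iff P(k, f) = P(f) / num_keys for every pair (k, f).
    return all(joint[(k, f)] * num_keys == f_counts[f]
               for f in f_counts
               for k in product((0, 1), repeat=len(key)))
```

In this toy instance one bit of discussion yields one bit of key: user 3 adds its own observation to the public message to recover the key, while the message alone reveals nothing about it.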
The objective is to characterize the set of achievable key lengths and discussion lengths. In particular, a quantity of interest is the constrained secrecy capacity defined as
where the maximization is over all possible secret key agreement schemes subject to a constraint on the total public discussion length, . (Note that we omit the argument if there is no ambiguity.) Characterizing the entire curve of is difficult even in the case of linear discussion, but some points on the curve can be characterized, such as considered in . As in [3, 12, 13], we also consider the unconstrained secrecy capacity defined as
which is the secrecy capacity without the constraint on the discussion length. The smallest discussion length required to achieve is denoted by
and referred to as the communication complexity. As in [3, 13], we will characterize and using the closely related problem of communication for omniscience defined as follows. The problem under the one-shot model for hypergraphical and finite linear sources is proposed in [18, 15] and referred to as the cooperative data exchange.
Omniscience: We say that the public discussion achieves omniscience of if
The smallest length of communication for omniscience is defined as
where the minimization is over all public discussion schemes subject to (9) in place of (5) and (4). In , the secret key agreement scheme that achieves the capacity is by first achieving omniscience of and then extracting the secret key as a function of , implying that the rate of communication for omniscience is no smaller than the communication complexity. We say that can be achieved via omniscience of .
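For linear discussion over a finite linear source, the omniscience condition reduces to a rank condition: a user recovers the whole source exactly when its own observations together with the public messages span everything any user observes. The sketch below checks this over the binary field for a hypothetical three-user source; the names `gf2_rank`, `achieves_omniscience` and `SOURCE` are assumptions of this sketch, not notation from the paper.

```python
def gf2_rank(vectors):
    """Rank over GF(2) of a list of 0/1 tuples, using an XOR basis on bitmasks."""
    basis = []
    for v in vectors:
        m = sum(bit << i for i, bit in enumerate(v))
        for b in basis:
            # Clears the leading bit of b from m whenever it is set.
            m = min(m, m ^ b)
        if m:
            basis.append(m)
    return len(basis)


def achieves_omniscience(source, discussion):
    """True if every user can recover all observations from its own view.

    A user's view is its coefficient vectors plus those of the discussion;
    omniscience holds iff adding everyone's vectors does not raise the rank.
    """
    everything = [c for obs in source.values() for c in obs]
    for obs in source.values():
        view = list(obs) + list(discussion)
        if gf2_rank(view) != gf2_rank(view + everything):
            return False
    return True


# Hypothetical toy source: base x = (x0, x1); users observe (x0, x1), x0 and x1.
SOURCE = {1: [(1, 0), (0, 1)], 2: [(1, 0)], 3: [(0, 1)]}
```

With the single public message x0 + x1, `achieves_omniscience(SOURCE, [(1, 1)])` holds, whereas with no discussion it fails because users 2 and 3 each miss one base symbol.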
III Motivating Example
We will use the following example to illustrate the problem formulation and motivate our main results. Consider and a finite linear source (see (1)) over the binary field with a base of length as follows:
A feasible secret key agreement scheme is to choose
but without any private randomizations and discussions and by users and . The secret key is perfectly recoverable by every user, i.e., satisfying (4), since users and directly observe the key bit , which can also be computed by users and using their private sources and public discussion as follows
The secrecy constraint (5) also holds because , which follows from the definition of the base that is uniformly random and independent of , , and .
Note that the above scheme does not achieve the omniscience condition in (9) because users , and cannot recover after the discussion. However, it is easy to show that omniscience can be achieved if we further set , i.e., with an additional bit of discussion by user . Since bit of secret key can be achieved with bits of public discussion and omniscience can be further achieved with an additional bit of discussion, we have
IV Main Results
We start with some rather general admissible conditions that simplify the secret key agreement scheme significantly without loss of optimality.
remains unchanged even if we set
which mean, respectively, that private randomizations are not needed and that the secret key can be chosen to be a linear function of the private source.
must be an integer, non-decreasing and right-continuous in .
For the corollary, the fact that must be an integer follows from (14b) that the key can be linear and therefore a uniformly random vector by the secrecy constraint (5). Monotonicity and right-continuity follow directly from the definition (6). The proof of the theorem is more involved and is given in Appendix A-A.
For instance, the example in Section III considers such a secret key agreement scheme without private randomization. The secret key is also linear in the private source trivially because it is directly observed by users and . Note that our formulation allows the private randomizations to have arbitrary length and distribution, and the key to be an arbitrary random variable that need not be linear in the private source. The above admissible constraints (1) make the problem tractable, as they significantly reduce the space of secret key agreement schemes we need to consider to characterize . Indeed, since there is only a finite number of linear functions of , there is only a finite number of admissible secret key agreement schemes. It is worth noting that the constraints (1) were assumed in the formulation of  for the hypergraphical source model, and our result implies that such constraints are admissible since hypergraphical sources are a special case of finite linear sources.
For the general source model in , the admissible constraint (14a) that private randomization does not help improve remains a plausible conjecture. However, it is clear that the constraint (14b) is not admissible for some sources that are not finite linear. Nevertheless, this constraint is essential in bringing the existing characterizations of the capacity from the general source model to the one-shot finite linear source model as follows.
in the extreme cases with and respectively unbounded discussion lengths are
where the maximization is over the choices of random vector , and the minimization is over the collection of partitions of into at least two non-empty disjoint sets. Furthermore, can be achieved via communication for omniscience of at the smallest length
which implies the upper bound on .
See Appendix A-B.
For the running example given in Section III, it is straightforward to evaluate the above expressions (1), (2) and (3) to yield , and . In particular, an optimal solution to (2) can be shown to be . This implies the optimality of the omniscience scheme in Section III in achieving both and .
The above result follows quite directly from existing works for the asymptotic model. For instance, the r.h.s. of (1) is the multivariate Gács-Körner common information evaluated for the finite linear source model. is called the maximum common function of for . was shown in  but for the asymptotic model instead. It is straightforward to extend this result to the current one-shot model.
The duality (3) between secret key agreement and communication for omniscience also follows directly from the asymptotic model in , which is specialized to the asymptotic finite linear source model in . The characterization (2) of is the same as that of the asymptotic model [3, 4] except for the floor operation, since the minimization in (2) may not be integer but must be integer by Corollary 1. The characterization of for the one-shot finite linear source model is given in [15, 19], which focus primarily on the omniscience problem instead of the secret key agreement problem.
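As a numerical illustration of the characterization with the floor operation, the sketch below evaluates the partition formula for a finite linear source over the binary field, where the entropy of a group of users is the GF(2) rank of their combined coefficient vectors. It assumes the familiar form from the asymptotic model, namely the minimum over partitions of (sum of block entropies minus total entropy) divided by (number of blocks minus one), and the duality in the form "omniscience length = total entropy minus secrecy capacity". The two sources and all function names are hypothetical choices for this sketch.

```python
from fractions import Fraction
from math import floor


def gf2_rank(vectors):
    """Rank over GF(2) of a list of 0/1 tuples, using an XOR basis on bitmasks."""
    basis = []
    for v in vectors:
        m = sum(bit << i for i, bit in enumerate(v))
        for b in basis:
            m = min(m, m ^ b)
        if m:
            basis.append(m)
    return len(basis)


def set_partitions(items):
    """Yield every partition of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part


def entropy(source, users):
    """Entropy in bits of the observations of `users` (a GF(2) rank)."""
    return gf2_rank([c for u in users for c in source[u]])


def secrecy_capacity(source):
    """Floor of the minimum normalized partition divergence (assumed formula)."""
    users = sorted(source)
    total = entropy(source, users)
    values = [Fraction(sum(entropy(source, blk) for blk in part) - total,
                       len(part) - 1)
              for part in set_partitions(users) if len(part) >= 2]
    return floor(min(values))


def omniscience_length(source):
    """Smallest omniscience length via the assumed duality with the capacity."""
    return entropy(source, sorted(source)) - secrecy_capacity(source)


# Hypothetical sources: a two-symbol toy source and a triangle PIN.
TOY = {1: [(1, 0), (0, 1)], 2: [(1, 0)], 3: [(0, 1)]}
TRIANGLE = {1: [(1, 0, 0), (0, 0, 1)],
            2: [(1, 0, 0), (0, 1, 0)],
            3: [(0, 1, 0), (0, 0, 1)]}
```

For the triangle, the singleton partition gives the non-integer value 3/2, so the floor is what makes the one-shot key length an integer.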
Note that one can summarize the theorem by saying that , and for the one-shot model are the same as those of the asymptotic model for finite linear sources but with an additional integer constraint: is already an integer for the asymptotic model, while we can take the floor and the ceiling respectively for and to turn them into integer achievable lengths. It therefore appears reasonable to conjecture that for the one-shot model is the same as the for the asymptotic model for finite linear sources but with an additional floor operation as in (2) to satisfy the integer constraint in Corollary 1. The following result resolves this partially at the communication complexity .
If , then there exists with
is said to be reduced source of (by linear processing), since the above implies .
The communication complexity is
achieved via omniscience of the linearly reduced source .
The corollary follows immediately from the theorem by repeatedly linearly reducing the source until . This is possible since the theorem guarantees that a linear processing of the source exists that can reduce without changing whenever . For the proof of the theorem, see Appendix A-C.
For the running example in Section III, the omniscience scheme does not achieve as by the secret key agreement scheme without omniscience described in Section III. As , the theorem above guarantees a linear processing of the source that reduces without changing . Such a linearly reduced source can be obtained with
and for . It is straightforward to show that by (2) and by (3). Note the source is reduced in the sense that . By going through all possible independent linear processings of ’s, which is possible as there is only a finite number of possibilities, one can show that the above defined is optimal to (8) achieving the minimum , and so as desired by the above corollary.
Note that the characterization of remains open for the asymptotic model  but we believe that it can be resolved for the finite linear source model by extending the above result to the asymptotic case. This means in particular that for the one-shot model is the ceiling of the for the asymptotic model. In Section V, we outline the challenges involved in such an extension.
In this work, we considered the one-shot secret key agreement problem under a finite linear source model with linear public discussion, perfect secrecy and recoverability. However, we believe that all the results can be extended without assuming the discussion is linear. In particular, extending Theorem 2 is straightforward as the converse parts follow from those of the asymptotic model without requiring the discussion to be linear. Extending Theorem 1 and Theorem 3 appears challenging, as the current proofs rely on the linearity of the discussion.
Another possible extension of the current results is to the asymptotic model where users observe i.i.d. samples of the private source, and the constrained secrecy capacity and discussion rate are measured per sample of the observation, i.e.,
for a sequence in of secret key agreement schemes and some functions for that user uses to recover the secret key. As mentioned below Theorem 2, the characterizations of , and are already known for the asymptotic model and they are indeed used to derive the corresponding characterizations for the one-shot model. We believe that the other results in Theorem 1 and Theorem 3 can be extended. The current proofs can be directly extended if we impose perfect recoverability instead, i.e., with for sufficiently large . However, the proofs without assuming perfect recoverability remain elusive. What we desire is a proof that perfect recoverability is admissible and can therefore be assumed without loss of optimality. In a similar vein, we also desire a proof that can be achieved exactly, i.e., for sufficiently large , there exists a secret key agreement scheme with and , and that linear public discussion is admissible.
[1] U. M. Maurer, “Secret key agreement by public discussion from common information,” IEEE Transactions on Information Theory, vol. 39, no. 3, pp. 733–742, 1993.
[2] R. Ahlswede and I. Csiszár, “Common randomness in information theory and cryptography—Part I: Secret sharing,” IEEE Transactions on Information Theory, vol. 39, no. 4, pp. 1121–1132, Jul. 1993.
[3] I. Csiszár and P. Narayan, “Secrecy capacities for multiple terminals,” IEEE Transactions on Information Theory, vol. 50, no. 12, pp. 3047–3061, Dec. 2004.
[4] C. Chan and L. Zheng, “Mutual dependence for secret key agreement,” in Proceedings of 44th Annual Conference on Information Sciences and Systems, 2010.
[5] H. Tyagi, “Common information and secret key capacity,” IEEE Transactions on Information Theory, vol. 59, no. 9, pp. 5627–5640, Sep. 2013.
[6] J. Liu, P. Cuff, and S. Verdú, “Secret key generation with limited interaction,” IEEE Transactions on Information Theory, vol. 63, no. 11, pp. 7358–7381, Nov. 2017.
[7] M. Mukherjee, N. Kashyap, and Y. Sankarasubramaniam, “On the public communication needed to achieve SK capacity in the multiterminal source model,” IEEE Transactions on Information Theory, vol. 62, no. 7, pp. 3811–3830, July 2016.
[8] C. Chan, M. Mukherjee, N. Kashyap, and Q. Zhou, “Secret key agreement under discussion rate constraints,” in IEEE International Symposium on Information Theory Proceedings (ISIT), June 2017, pp. 1519–1523.
[9] ——, “On the optimality of secret key agreement via omniscience,” IEEE Transactions on Information Theory, vol. 64, no. 4, pp. 3811–3830, April 2018.
[10] ——, “Upper bounds via lamination on the constrained secrecy capacity of hypergraphical sources,” CoRR, 2018. [Online]. Available: http://arxiv.org/abs/1805.01115
[11] S. Nitinawarat and P. Narayan, “Perfect omniscience, perfect secrecy, and Steiner tree packing,” IEEE Transactions on Information Theory, vol. 56, no. 12, pp. 6490–6500, Dec. 2010.
[12] C. Chan, “Linear perfect secret key agreement,” in Information Theory Workshop (ITW), 2011 IEEE. IEEE, 2011, pp. 723–726.
[13] T. A. Courtade and T. R. Halford, “Coded cooperative data exchange for a secret key,” IEEE Transactions on Information Theory, vol. 62, no. 7, pp. 3785–3795, July 2016.
[14] C. Chan, “Delay of linear perfect secret key agreement,” in Forty-Ninth Annual Allerton Conference on Communication, Control, and Computing, Sep. 2011.
[15] N. Milosavljevic, S. Pawar, S. E. Rouayheb, M. Gastpar, and K. Ramchandran, “Efficient algorithms for the data exchange problem,” IEEE Transactions on Information Theory, vol. 62, no. 4, pp. 1878–1896, Apr. 2016.
[16] C. Chan, “Generating secret in a network,” Ph.D. dissertation, Massachusetts Institute of Technology, 2010.
[17] C. Chan, M. Mukherjee, N. Kashyap, and Q. Zhou, “Multiterminal secret key agreement at asymptotically zero discussion rate,” in 2018 IEEE International Symposium on Information Theory (ISIT), June 2018, pp. 2654–2658.
[18] N. Milosavljevic, S. Pawar, S. El Rouayheb, M. Gastpar, and K. Ramchandran, “Deterministic algorithm for the cooperative data exchange problem,” in IEEE International Symposium on Information Theory Proceedings (ISIT), Jul. 2011.
[19] N. Ding, C. Chan, Q. Zhou, R. A. Kennedy, and P. Sadeghi, “Determining optimal rates for communication for omniscience,” IEEE Transactions on Information Theory, vol. 64, no. 3, pp. 1919–1944, Mar. 2018.
[20] P. Gács and J. Körner, “Common information is far less than mutual information,” Problems of Control and Information Theory, vol. 2, no. 2, pp. 149–162, Feb. 1972.
Appendix A Proofs
A-A Proof of Theorem 1
First we will show the admissible constraint (14b), i.e., remains unchanged when the secret key is chosen to be a linear function of the private source. Consider any optimal SKA scheme with a fixed discussion length , i.e., with secret key having length and discussion having length . Define
By the recoverability constraint (4) of the secret key, is a common function of for . Trivially, is also a common function of . Let be a maximum common function, as defined for Theorem 2, of ’s instead of ’s. From , we know that any common function of is a function of . Hence is a function of i.e.,
It was shown in  that is a linear function for a finite linear source, i.e., . Therefore,
In fact, is a linear function of because of the linearity of the communication, , and the linearity of the maximal common function . We can therefore write, .
We will show that the secret key rate remains unchanged if we choose the secret key to be the linear function such that
Note that the above choice of , if it exists, is a feasible choice of secret key because (11) implies the secrecy constraint (5) while the recoverability constraint (4) follows from the fact that is a function of the maximum common function .
To show that exists, consider for some matrix and set where is a matrix whose column space is the left null space of . (10) then follows from the fact that is a bijection as has full column rank. To show (11), note that the columns of cannot be spanned by the columns of , and so is independent of , i.e.,
It remains to show that
is uniformly distributed, i.e.,
Notice we can choose to have full column rank, in which case is uniformly distributed if is. Since , we can write as for some matrix . We can choose to have full column rank, which then implies that is uniformly distributed as desired because is.
Finally, we argue as follows that the secret key rate is not diminished if is used as the secret key instead.
Next, we impose the linearity (14b) of the key and show that the other constraint (14a) is admissible, i.e., private randomization is not needed. Consider any user , and rewrite as by reordering the components such that is a component of and contains the rest of the components of and . We will argue that can be removed without affecting the secret key rate or the discussion rate. Consider the case where does not depend on . Then, also cannot depend on , or the recoverability constraint (4) fails. Hence can be removed as desired. Consider the non-trivial case where there is a component of such that
for some and linear function . We can also write
for some linear functions . Define for ,
Note that both and are independent of . Furthermore,
where (a) follows from the fact that is a component of and (b) is because for all . Hence is an optimal SKA scheme which does not depend on , i.e., can be removed.
A-B Proof of Theorem 2
With no public discussion, i.e., , by the recoverability constraint (4), the secret key must be a function of the Gács–Körner common information [20] of . For finite linear sources, [17, Theorem 4.2] showed that the latter quantity equals the r.h.s. of (1), thereby establishing for (1). To prove the reverse inequality, note that we can write the solution to (1) as for some matrix because . Furthermore, can be chosen to have full column rank without loss of optimality, and so can be uniformly distributed. Then, the desired secrecy capacity can be achieved with , where the uniformity of implies the secrecy constraint (5) and the recoverability constraint (4) also follows trivially from the fact that is a common function.
Converse Part. By Theorem 1, we can assume and without loss of optimality. Now, since and are functions of ,
the previous equality yields
Next, we show that , where
To that end,
where the last inequality is because
and by perfect recoverability. Now, since is an integer, for ,
Achievability Part. For any vector , it was shown by [15, Theorem 1] that there exists a corresponding linear noninteractive discussion scheme which renders omniscience. Further, it was shown by [19, Corollary 6] that
Consider any optimal solution to the l.h.s. of (13). Denote the corresponding linear noninteractive discussion that attains omniscience by .
Now, it remains to extract a perfect secret key from the omniscience obtained above. For each realization of , let be the set of all which generate . By the definition of the finite linear source, observe that each entry of is i.i.d. uniform from , i.e., . By the linearity of the discussion above, it is easy to see that is the same for all realizations of . Since has dimension , . Set . For each , label each with a unique element in . Then, upon observing , every user which knows by omniscience picks the label of as the secret key. Since is uniformly distributed and has the same size, it follows that this random label is uniformly distributed over and independent of , thereby constituting a perfect secret key. Therefore,
This completes the proof.
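The key-extraction step in the achievability part can be illustrated by a small enumeration. In the sketch below, the base vectors consistent with each realization of a linear discussion are grouped together, and every vector is labelled by its position inside its own group; the label then serves as the key. The binary field, the base length 3, the particular discussion and the names `evaluate`, `label_key`, `DISCUSSION`, `key_of`, `fibers` and `joint` are all hypothetical choices for this sketch.

```python
from collections import Counter, defaultdict
from itertools import product


def evaluate(coeffs, x):
    """Apply a linear map (list of coefficient vectors over GF(2)) to x."""
    return tuple(sum(a * b for a, b in zip(x, c)) % 2 for c in coeffs)


def label_key(discussion, n):
    """Label each base vector by its index among vectors with the same discussion."""
    fibers = defaultdict(list)
    for x in product((0, 1), repeat=n):
        fibers[evaluate(discussion, x)].append(x)
    key_of = {x: label
              for xs in fibers.values()
              for label, x in enumerate(xs)}
    return key_of, dict(fibers)


# Hypothetical linear discussion on a base of length 3: F = (x0 + x1, x1 + x2).
DISCUSSION = [(1, 1, 0), (0, 1, 1)]
key_of, fibers = label_key(DISCUSSION, 3)

# Joint counts of (key label, discussion) over the uniformly distributed base.
joint = Counter((key_of[x], evaluate(DISCUSSION, x))
                for x in product((0, 1), repeat=3))
```

Because the discussion is linear, every group has the same size, so each (label, discussion) pair occurs equally often under a uniform base; this is the uniformity and independence that the proof relies on.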
A-C Proof of Theorem 3
To prove this theorem, we will need the following lemma. We say that can be simulated by the source if .
If can be simulated by , then .
Any secret key generation protocol for can be simulated by .
Now consider the finite linear source on the set of users , defined by or equivalently, for each , where is a matrix over . Set , so that
We assume, without loss of generality, that has full row rank, so that . We will let (resp. ) denote the row-space of (resp. ). Since , we have , so that .
Let be a (non-interactive) linear communication protocol, i.e., , that generates a secret key using symbols from as public communication. Since , the communication is insufficient for omniscience. Assume that user cannot recover all of from . Then, via linearity of the source and the communication, there exists an observation
such that .
Fix a basis for , with as above. Let be the matrix having , in that order, as its rows. Thus, . There is then an invertible (change-of-basis) matrix such that . We can now write
where . Since are i.i.d. rvs, and is invertible,