SATE: Robust and Private Allegation Escrows

October 23, 2018 · Venkat Arun et al.

For fear of retribution, the victim of a crime may be willing to report the crime only if others victimized by the same perpetrator also step forward. Common examples include identifying oneself as the victim of sexual harassment by a person in a position of authority, or accusing an influential politician, an authoritarian government, or one's own employer of corruption. To handle such situations, legal literature has proposed the concept of an allegation escrow: a neutral third party that collects allegations anonymously, matches allegations against each other, and de-anonymizes allegers only after de-anonymity thresholds (in terms of the number of allegers), pre-specified by the allegers, are reached. An allegation escrow can be realized as a single trusted third party; however, such a party is exposed to attacks on the confidentiality of accusations and the anonymity of accusers. To address this problem, this paper introduces split, anonymizing, threshold escrows (SATEs). A SATE is a group of parties with independent interests and motives, acting jointly as an escrow for collecting allegations from individuals, matching the allegations, and revealing the allegations when designated thresholds are reached. By design, SATEs provide a very strong property: no set of parties smaller than a majority of those constituting a SATE can de-anonymize or disclose the content of an allegation before a sufficient number of matching allegations exist (even in collusion with any number of other allegers). Once a sufficient number of matching allegations exist, all parties can simultaneously disclose the allegations with a verifiable proof of the allegers' identities. We describe how SATEs can be constructed using a novel anonymous authentication protocol and an allegation thresholding and matching algorithm. We give formal proofs of security, and evaluate a prototype implementation, demonstrating feasibility in practice.

I Introduction

In many cases, the victim or a witness of a crime may be too afraid to accuse the perpetrator for fear of retribution. In other cases, particularly those involving sexual harassment, the survivor may not report the crime, anticipating negative social consequences or further harassment by the perpetrator. In such situations, the victim (or the witness) may find it easier to act against the perpetrator if others also accuse the perpetrator of similar crimes. Examples abound; a notable one is the recent Me Too movement [1], which led to many public allegations of sexual abuse in the US film industry and elsewhere, all triggered by the courage of an initial few.

An allegation escrow is a system that aids such collective allegations, by matching allegations against a common perpetrator confidentially. Technically, an allegation escrow allows a victim or witness of a crime to file a confidential allegation, which is to be released to a designated authority once a pre-defined number of matching allegations against the same party have been filed. The identities of the accusers and the accused, as well as the content of the allegation, remain confidential until the release condition holds.

Besides helping fearful or embarrassed victims to report crimes (safe in the knowledge that their accusation will be revealed only as part of a larger group), allegation escrows help improve reporting in cases where the victim is uncertain if the perpetrator’s actions constitute a crime. Escrowed allegations also enjoy higher credibility since, to all appearances, they are filed independently of each other (as opposed to public allegations, where the credibility of subsequent allegations may be questioned). In technical terms, allegation escrows have been shown to mitigate the first-mover disadvantage that perpetrators typically benefit from [2].

A number of allegation escrow services are available now. For example, Project Callisto [3] is an allegation escrow system that has been deployed in 13 universities with over 100k students, to help report sexual assault on college campuses. A victim can instruct the system to release the allegation only when another allegation against the same person exists. Sexual assault survivors who visit the Callisto website of their college are 5 times more likely to report the crime than those who do not, and Callisto has reduced the average time taken by a student to report an assault from 11 to 4 months [4]. This makes a very strong case for the usefulness of allegation escrows.

However, existing allegation escrows such as Project Callisto are implemented as a single trusted third party, similar to ombuds-offices in many organizations. Although technically simple and effective in many cases, the use of a single party may raise concerns about the escrow's trustworthiness, impartiality and vulnerability to influential perpetrators, thus driving away potential users. In the case of a university or corporate escrow, students or employees may be unsure that an allegation against a high-ranking official would be treated with integrity. A commercial escrow may raise concerns about its independence from funding sources and its long-term security, just as a government-run escrow may raise concerns about its independence from high-ups in law enforcement and the judiciary. In all these cases, users may not trust the escrow enough to file accusations against people they deem to have the power to coerce, compromise or influence the escrow. When they do file accusations, strong perpetrators may actually abuse their power to prematurely discover escrowed accusations, suppress or alter the accusations, or even seek retribution against the victims. Finally, even if a victim trusts an escrow, other victims of the same perpetrator may not, making it impossible for the escrow to match their accusations.

This suggests the need for allegation escrows based on several independent parties, none of which is, by itself, a single point of coercion or attack by strong adversaries. In this paper, we present the cryptographic design of such escrows. Our escrows, called SATEs (short for Split, Anonymizing, Threshold Escrows), distribute client secrets—confidential allegations and the identities of the accusers and accused—among several parties by threshold secret-sharing [5]. These parties, called -escrows, act together and perform multi-party computations (MPCs) to provide the same functionality as a single-party allegation escrow, but compromising one or even up to half of the -escrows provides no information about escrowed allegations, accusers or the accused. The -escrows can span diverse administrative, political and geographic domains, reducing the chance that the same adversary can simultaneously attack a majority of them.

The first key technical contribution of our work is an algorithm for matching allegations to each other, even when each -escrow only has shares of the allegation. For this, we rely on a novel construction of distributed pseudorandom functions over shared secrets, as well as a novel bucketing algorithm to connect matching allegations to each other.

A further novelty of our design is that it allows each filer to decide how many other allegations should match their allegation before it is revealed. In contrast to other work [6] that uses the same match threshold for all allegations, this allows each filer more flexibility according to their level of comfort, but complicates our matching protocol even further.

Additionally, we designed SATEs to provide a strong accountability property: every filed allegation can be linked to a real-world (strong) identity, which is revealed to the concerned authority once the allegation has found enough matches. This discourages the filing of fake allegations and deters the probing attacks that all allegation escrows are fundamentally susceptible to (see §II and §IV-D). Although providing strong accountability in a single-party escrow is trivial, doing so in a multi-party escrow like a SATE is difficult. Specifically, it requires a nontrivial authentication protocol for filing allegations, which ensures that the -escrows collectively learn the identity of the filing user, but no minority set learns that identity (else, the adversary could learn it by compromising the minority set). Our second key technical contribution is the design of such a protocol. The protocol again relies on our construction of distributed pseudorandom functions.

We formally prove the end-to-end security of our SATE cryptographic design in the universal composability (UC) framework [7]. Specifically, we present an ideal functionality which, by definition, captures the expected security and accountability properties of a SATE, and then show that our cryptographic design realizes this functionality. We also implement a prototype of the SATE design to understand the latency and throughput of user-facing operations. We find that our design is efficient enough for typical use-conditions of allegation escrows.

To summarize, the contributions of our work are:

  • The concept of a SATE, a distributed allegation escrow that is robust to compromise or coercion of minority subsets of its constituent parties.

  • A cryptographic realization of SATEs using secret-sharing and efficient multi-party computation protocols. In particular, new protocols for user authentication and matching allegations.

  • A formal security analysis of our cryptographic realization.

  • A prototype implementation and empirical evidence of reasonable performance in practice.

The rest of this paper is structured as follows. In §II, we describe the properties that SATEs provide, our threat model and an overview of the various protocols and algorithms that a SATE uses. §III recaps cryptographic preliminaries, and the distributed pseudorandom function construction that our other protocols rely on. §IV is the technical core of the paper that describes all SATE protocols, including the aforementioned authentication and matching protocols. §V presents the formal proof of security of SATEs, including the ideal functionality. §VI describes our empirical evaluation. §VII discusses related work and §VIII concludes the paper. An appendix contains details of our security proof.

II SATE Design

Basic design and properties

An allegation escrow like SATE allows users—also called allegers—to file allegations against accusees. It holds an allegation in escrow until a desired number of matching allegations against the same accusee have been filed. After a match is found, all the matching allegations, along with the identities of the matched allegers and the accusee, are passed on to a designated authority (e.g., an arbitrator or a counselor) for further action. This further action might involve informing the matched allegers and, possibly, coordinated action against the accusee.

An allegation escrow should, at the least, provide the following properties.

  • Allegation secrecy The escrow should hold each allegation secret until enough matches are found. An allegation should be released only as part of a group of matching allegations.

  • Alleger anonymity Similar to the previous point, the escrow should hold each alleger’s identity secret until enough matches are found.

Additionally, allegation escrows are most useful in asymmetric situations, where individual allegers are at a disadvantage compared to the accusee. Allegation escrows enable the allegers to build “strength in numbers” without fear of premature retaliation. However, the very information held by allegation escrows motivates powerful attacks against them, since the accusee can gain by learning about allegers before a large enough group has formed. Thus, allegation escrows should expect to be a target of coercion attacks. This leads to the following meta-property, which spans the previous properties.

  • Robustness The escrow should resist coercion and compromise attacks. It should continue to provide the properties above even if some constituent parts fail, are compromised or willingly cooperate with some accusees.

A single trusted third party can implement an allegation escrow and trivially provide the allegation secrecy and alleger anonymity properties, but such a design is fundamentally not robust, as any single trusted principal represents a single point of failure, coercion, and attack. If the accusees are powerful enough, they may be able to coerce, via legal or personal threats, any (however well-meaning) third party [8, 9]. A single party is also a point of corruption, in that the design inhibits allegations against allies of the party. For instance, allegers may be hesitant to use an organization’s ombudsperson to complain against somebody closely associated with the ombuds-office.

To attain robustness, SATEs function fundamentally differently from existing, single-party escrows. A SATE internally consists of independent parties called -escrows, which together form a single virtual entity acting as the allegation escrow. Information about allegations and the identities of allegers is cryptographically divided among the -escrows, so that coercing or compromising a minority of them does not reveal any information about existing allegations or allegers. The -escrows may span administrative jurisdictions and geographic boundaries to make simultaneous coercion very difficult even for a determined, very powerful accusee.

In addition to the above basic properties, our design also provides the following property.

  • Accountability Each allegation is bound to a strong, real-world identity. Once a match is found, the real identities of the matched allegers are revealed to the designated authority.

Although not as fundamental as the earlier properties, accountability discourages fake and bogus allegations, and acknowledges that the primary source of authenticity of an allegation, escrowed or otherwise, is the human backing it.


Fig. 1: An overview of the SATE protocol. The figure shows a) the user registration phase and b) the allegation filing phase. Numbers/letters indicate the order in which operations are performed. Thick-continuous and thick-dotted lines indicate one-to-many and many-to-one communication respectively.
Threat model and assumptions

A SATE adversary is interested in prematurely learning the identities of one or more allegers or discovering unrevealed allegations. For instance, the adversary may be a guilty perpetrator, interested in determining whether there is any allegation against them. To this end, an adversary may coerce or compromise some -escrows into revealing information they hold and/or not following the SATE protocol correctly. By design, SATEs are robust to such attacks on up to half the -escrows simultaneously: allegation secrecy, alleger anonymity and accountability hold even if the adversary learns all cryptographic and allegation-related material possessed by up to half the -escrows, and causes them to behave arbitrarily. Additionally, if the coerced -escrows are malicious but cautious, i.e., they continue to follow the SATE protocol (say, to avoid detection by other -escrows), then the SATE remains live—it continues to offer the expected functionality.

We make the standard assumption that adversaries cannot break cryptography. Technically, adversaries are probabilistic polynomial time (PPT) algorithms with respect to a chosen security parameter. We assume, as usual, that uncompromised parties (-escrows and allegers) keep their long-term secrets safe. Allegers can discard the private and symmetric keys used to file an allegation immediately after filing it, but they must store any unused keys safely until they are used (see §II-A and §IV).

For alleger anonymity, we assume that allegers do not reveal any information beyond that explicitly mentioned in our protocols (described later). For example, they should hide their IP addresses. For this, they can use standard network anonymity solutions like Tor [10].

All allegation escrows (not just SATEs) are fundamentally vulnerable to probing attacks where a guilty perpetrator files fake probe allegations against itself in the hope of revealing other genuine allegations before sufficiently many genuine matching allegations have been filed. While the ultimate defense against such attacks lies in preventing this kind of abuse by non-technical means (e.g., by criminalizing probe allegations), SATEs aid such defenses through the property of accountability, which ensures that the real-world identities of all allegers, including fake allegers, are revealed to the designated authority after a match. To provide accountability, a SATE allows a user to file an allegation only after they present evidence of their real-world identity. Further, the SATE protocol disincentivizes filing of probe allegations with very high thresholds (that would likely never be reached) by making a probe useful for discovering only those allegations that would be revealed at the same time as the probe (see §IV-D).

II-A Protocol Overview

Figure 1 shows an overview of the SATE protocol. From the perspective of the user, the protocol consists of two phases: (a) user registration and (b) allegation filing. In the backend, SATE’s -escrows use further protocols to match allegations to each other and to reveal matched allegations.

Registration

SATE uses real-world (strong) identities to ensure accountability. To incorporate real identities, SATE uses a registration phase. Prior to registering with SATE, a user proves their real identity to a certifying authority (CA) and gets a signature on their public key. The CA may be the user’s employer or university registering all its employees and students into the system, or even an independent entity verifying physical identities like passports.

To register with a SATE, the user authenticates to all -escrows using the CA certificate. The -escrows and the user then run a cryptographic protocol during which the user gets the SATE’s individual authentication tokens (in particular, MACs) on a fixed number of fresh public keys. Each of these keys can be used to file a single allegation later. Importantly, the -escrows only learn individual shares of these keys, but neither the full keys, nor the MACs on them. This prevents the -escrows from learning the identity of a user when the user files an allegation later, but allows a majority of -escrows to reconstruct the identity (by pooling their shares of the public key) when an allegation has to be revealed.

For their own benefit, users should register ahead of time, even when they see no need to file an allegation. This prevents timing correlation channels. For example, if an accusee is expecting an allegation due to a recent incident, and colludes with a -escrow, then the act of registration by the potential alleger may provide a strong hint of a pending allegation. Ahead-of-time registration removes this channel of inference and could be enforced, for instance, by a company asking its employees to register with an allegation escrow service as soon as they join the company. (In some settings, it may be possible to use recently proposed blind certificate authorities [11] to remove this inference channel.)

Allegation filing

When the user wants to file an allegation, they contact the -escrows, providing one of the public keys and the MAC on it, which the -escrows can verify. The verification tells each -escrow that this user has registered before, but doesn’t immediately reveal the identity of the user, since no -escrow has seen the full public key or the MAC on it in cleartext before. After this, the user provides the allegation’s text along with some meta-data in a specific cryptographic form, and a reveal threshold—the minimum number of allegations that must match before this one is revealed.

Matching, thresholding and revelation

The material provided with each allegation is fed into a matching and thresholding algorithm that the -escrows run continuously in the background. This algorithm matches allegations to each other and, as soon as it finds a set of matching allegations in which every allegation has a reveal threshold no larger than the size of the set, all these allegations are revealed to a designated authority for further action. The revelation contains the real identities of the allegers and the full texts of their allegations. The designated authority can then take appropriate action.

We describe the individual protocols for each of these stages in §IV. Before that, we describe cryptographic building-blocks that the individual protocols rely on.

III Distributed Cryptographic Tools

SATE employs distributed (or threshold) cryptography [12] for authentication and for privately matching allegations. The key idea is to distribute a secret among the parties (the -escrows in our system) such that only subsets larger than a fixed threshold (here, a majority) can jointly reveal the secret. Such subsets can also perform arbitrary computations on the secret (e.g., MAC generation, private matching) securely in the presence of a malicious adversary that controls the remaining minority of parties.

In this section, we present the distributed cryptographic protocols that we use in SATE. We first describe the necessary primitives: distributed key generation (DKG) and multi-party computation (MPC). We then design the distributed versions of the signing and private matching protocols that we use in SATE.

III-A Multi-Party Computation (MPC)

An MPC protocol enables a set of parties to jointly compute a function on their private inputs in a privacy-preserving manner [13, 14, 15, 16]. More formally, every party holds a secret input value, and the parties agree on some function of these inputs. Their goal is to compute the function and provide the result to a recipient while making sure that the following two conditions are satisfied. Correctness: the correct value of the function is computed. Secrecy: the output is the only new information that is released to the recipient.

A Shamir secret sharing scheme [5] allows a dealer to distribute shares of a secret among several parties such that any set of shares at or below a chosen threshold reveals no information about the secret itself, while any larger subset of shares allows full reconstruction of the shared secret. Since in some secret sharing applications the dealer may benefit from behaving maliciously, the parties also require a mechanism to confirm that every qualified subset of shares combines to the same value. To solve this problem, Chor et al. [17] introduced verifiability in secret sharing, which led to the concept of verifiable secret sharing (VSS) [18, 19, 20, 21].

In our construction we use the MPC protocol by Gennaro et al. [20]. It uses verifiable secret sharing, where Pedersen commitments [18] on the Shamir shares are provided to all parties. It works on secrets in a prime-order ring, together with a multiplicative group of the same order in which the discrete log problem is hard. We choose this protocol because threshold secret sharing allows us to provide availability even when a minority of nodes fail. Further, since it uses arithmetic circuits, exponentiation is inexpensive, as discussed below.

Notation

In the rest of the paper, we denote the shares of a secret value x by [x], where [x]_i represents the VSS share held by party i.

The secret sharing scheme is additively homomorphic: the operations [x] + [y], [x] + c, and c·[x] can be computed by each party locally using her shares and any public constant c. The computation of [x·y] from [x] and [y], in contrast, is an interactive process and requires cooperation among the parties [20].
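To make the sharing and its additive homomorphism concrete, the following is a minimal, self-contained sketch in Java (the language of our prototype, §VI). It uses plain Shamir sharing without the commitments of VSS; all parameters and names are illustrative, not the prototype's.

import java.math.BigInteger;
import java.security.SecureRandom;

public class ShamirDemo {
    static final SecureRandom RNG = new SecureRandom();
    static final BigInteger P = BigInteger.probablePrime(256, RNG); // prime field modulus

    // Share a secret with a random degree-t polynomial f with f(0) = secret; share_i = f(i).
    static BigInteger[] share(BigInteger secret, int t, int n) {
        BigInteger[] coeff = new BigInteger[t + 1];
        coeff[0] = secret;
        for (int j = 1; j <= t; j++) coeff[j] = new BigInteger(255, RNG).mod(P);
        BigInteger[] shares = new BigInteger[n];
        for (int i = 1; i <= n; i++) {
            BigInteger x = BigInteger.valueOf(i), y = BigInteger.ZERO;
            for (int j = t; j >= 0; j--) y = y.multiply(x).add(coeff[j]).mod(P); // Horner's rule
            shares[i - 1] = y;
        }
        return shares;
    }

    // Lagrange interpolation at 0 using the shares of parties 1..t+1.
    static BigInteger reconstruct(BigInteger[] shares, int t) {
        BigInteger secret = BigInteger.ZERO;
        for (int i = 1; i <= t + 1; i++) {
            BigInteger num = BigInteger.ONE, den = BigInteger.ONE;
            for (int j = 1; j <= t + 1; j++) {
                if (i == j) continue;
                num = num.multiply(BigInteger.valueOf(-j)).mod(P);
                den = den.multiply(BigInteger.valueOf(i - j)).mod(P);
            }
            BigInteger lambda = num.multiply(den.modInverse(P)).mod(P);
            secret = secret.add(shares[i - 1].multiply(lambda)).mod(P);
        }
        return secret;
    }

    public static void main(String[] args) {
        int n = 5, t = 2;                                   // majority reconstruction for n = 5
        BigInteger a = BigInteger.valueOf(41), b = BigInteger.valueOf(1);
        BigInteger[] sa = share(a, t, n), sb = share(b, t, n);
        BigInteger[] sum = new BigInteger[n];
        for (int i = 0; i < n; i++) sum[i] = sa[i].add(sb[i]).mod(P); // local addition of shares
        System.out.println(reconstruct(sum, t));            // prints 42 = a + b
    }
}

Each party only ever touches its own shares; the sum of two secrets is reconstructed without either secret being revealed individually.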

Note: For systems where the reconstruction threshold is a majority of the parties, such as the SATEs in this work, multiplication requires the cooperation of all parties. (A minority of) parties that have been compromised by the adversary may refuse to do so. In this case, to maintain availability, the remaining majority of parties can expel the offending parties from the SATE and reshare secrets with a smaller threshold. This resharing can be done lazily. It does not affect security, since only corrupted parties are removed from the SATE.

Given MPC addition and multiplication, we can efficiently perform some complex operations. In addition, we can use the nature of commitments in the verifiable secret sharing scheme to efficiently perform ‘public exponentiations’. For SATE, we use the following MPC operations:

  • CombineShares([x]) Combine the shares of enough parties to reveal/reconstruct the secret x.

  • RandomCoinToss() Return to each calling party a share of the result of a fair coin toss. The result is chosen uniformly at random from the field of operation. We use this for distributed key generation [22, 23].

  • PublicExponentiate(g, [x]) Exponentiate a public value g to a shared value [x]. This can be done efficiently with interaction, but the result g^x is revealed in clear-text to all parties, not in a secret-shared form.

  • SendPublicExponentiate(g, [x]) Same as PublicExponentiate, except that the parties don’t receive the result; instead, it is sent to a designated receiver (in SATE, the user of the escrow). A sketch of the arithmetic behind public exponentiation follows this list.
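The following sketch (ours, with toy and insecure parameters chosen only for illustration) shows the identity that makes PublicExponentiate possible: each party publishes only its partial exponentiation g^{x_i}, and anyone can combine the partials using Lagrange coefficients in the exponent to obtain g^x without ever reconstructing x.

import java.math.BigInteger;

public class PublicExponentiateDemo {
    // Toy parameters: p = 2q + 1 with q prime; g generates the order-q subgroup of Z_p^*.
    static final BigInteger P = new BigInteger("1019");
    static final BigInteger Q = new BigInteger("509");
    static final BigInteger G = new BigInteger("4");

    static BigInteger lagrangeAtZero(int i, int[] xs) {
        BigInteger num = BigInteger.ONE, den = BigInteger.ONE;
        for (int j : xs) {
            if (j == i) continue;
            num = num.multiply(BigInteger.valueOf(-j)).mod(Q);
            den = den.multiply(BigInteger.valueOf(i - j)).mod(Q);
        }
        return num.multiply(den.modInverse(Q)).mod(Q);
    }

    public static void main(String[] args) {
        // Degree-1 sharing of x = 123 over Z_q: f(z) = 123 + 7z, so parties 1 and 2 hold 130 and 137.
        BigInteger x = BigInteger.valueOf(123);
        BigInteger[] shares = { BigInteger.valueOf(130), BigInteger.valueOf(137) };
        int[] parties = { 1, 2 };

        // Each party publishes only g^{x_i}; anyone combines the partials in the exponent.
        BigInteger combined = BigInteger.ONE;
        for (int k = 0; k < parties.length; k++) {
            BigInteger partial = G.modPow(shares[k], P);
            combined = combined.multiply(partial.modPow(lagrangeAtZero(parties[k], parties), P)).mod(P);
        }
        System.out.println(combined.equals(G.modPow(x, P))); // true: g^x reconstructed, x stays hidden
    }
}

No party learns x; only g^x becomes public, which is exactly the interface the MPC operations above provide.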

As described in §III-C, we use the above operations to construct MPC protocols for computing PRFs and verifiable PRFs in a distributed fashion.

III-B Bilinear Pairings

Let G_1, G_2 and G_T be multiplicative, cyclic groups of prime order p. Let g_1 and g_2 be generators of G_1 and G_2, respectively. A map e : G_1 × G_2 → G_T is called bilinear if it has the following properties. (1) Non-degenerate: e(g_1, g_2) ≠ 1. (2) Bilinear: for all a, b ∈ Z_p, e(g_1^a, g_2^b) = e(g_1, g_2)^{ab}. (3) Computable: there is an efficient algorithm to compute e(u, v) for all u ∈ G_1, v ∈ G_2. For ease of exposition, we assume that the pairing employed is symmetric, i.e., G_1 = G_2 = G with generator g [24, 25].
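As a quick sanity check of the bilinearity property, the following sketch uses the jPBC library (which our prototype also uses, §VI). The path to the Type-A (symmetric) curve parameters is an assumption taken from the jPBC distribution and may differ in a given setup.

import it.unisa.dia.gas.jpbc.Element;
import it.unisa.dia.gas.jpbc.Pairing;
import it.unisa.dia.gas.plaf.jpbc.pairing.PairingFactory;

public class PairingDemo {
    public static void main(String[] args) {
        Pairing pairing = PairingFactory.getPairing("params/curves/a.properties"); // assumed path
        Element g = pairing.getG1().newRandomElement().getImmutable();
        Element a = pairing.getZr().newRandomElement().getImmutable();
        Element b = pairing.getZr().newRandomElement().getImmutable();

        Element lhs = pairing.pairing(g.powZn(a), g.powZn(b)); // e(g^a, g^b)
        Element rhs = pairing.pairing(g, g).powZn(a.mul(b));   // e(g, g)^{ab}
        System.out.println(lhs.isEqual(rhs));                  // true, by bilinearity
    }
}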

III-C Distributed Cryptographic Protocols

We need a distributed protocol for computing a verifiable pseudo-random function (VRF). However, we could not use the distributed VRF (DVRF) schemes in [26, 27, 28] because, in SATE, the VRF-computing parties (the -escrows) know the input only in a secret-shared form (this will become clear in §IV). So, we design a DVRF with secret-shared (or distributed) input messages. Our constructions may be of independent interest to other distributed security systems.

Distributed-Input DVRF

We use distributed pseudorandom functions (DPRF) (with distributed input messages) for matching accusations (§IV-D) as well as discovering alleger identities during allegation revelation (§IV-C). We use distributed verifiable pseudorandom functions (DVRF) for identity verification. A verifiable pseudorandom function (VRF) is like a pseudorandom function (PRF) except that it also provides a proof of correctness.

VRFs cannot be distinguished from a random function by a computationally bounded adversary that does not have access to the proofs. For our purposes, we adopt the following formal definition of a VRF from [29]. Let a(k) and b(k) be functions computable in poly(k) time (except when a(k) takes a special value indicating that the VRF is defined for inputs of all lengths). A function family F = {f_sk : {0,1}^{a(k)} → {0,1}^{b(k)}} is a family of VRFs if there exist a PPT (probabilistic polynomial time) algorithm Gen and deterministic algorithms Prove and Verify such that Gen(1^k) outputs a pair of keys (pk, sk); Prove_sk(x) computes (f_sk(x), π_sk(x)), where π_sk(x) is a proof of correctness; and Verify_pk(x, y, π) verifies that y = f_sk(x) using the proof π. They satisfy the following properties: 1) Uniqueness: no values (pk, x, y_1, y_2, π_1, π_2) with y_1 ≠ y_2 satisfy both Verify_pk(x, y_1, π_1) = 1 and Verify_pk(x, y_2, π_2) = 1. 2) Provability: if (y, π) = Prove_sk(x), then Verify_pk(x, y, π) = 1. 3) Pseudorandomness: for any PPT algorithm that queries an oracle for Prove_sk but never on the challenge input x, the value f_sk(x) cannot be distinguished from a uniformly random value with advantage better than negl(k), where negl is a negligible function.

In SATE, we need to compute VRFs in a multi-party computation where both the key and the input values (tags) are available only in a secret-shared form. Any VRF scheme can be transformed using general-purpose MPC to work with a shared key and shared input tags. However, keeping efficiency and practicality in mind, we choose the VRF construction by Dodis and Yampolskiy [29].

In this construction, if the q-Decisional Bilinear Diffie-Hellman Inversion (q-DBDHI) assumption holds in a bilinear group G with generator g, then

f_sk(x) = e(g, g)^{1/(x + sk)}    (1)

is a PRF. When coupled with a proof π_sk(x) = g^{1/(x + sk)}, it is a VRF. Here, sk is a private key chosen uniformly at random from Z_p, and the public key is pk = g^{sk}. To verify whether a value y equals f_sk(x), we can test whether e(g^x · pk, π_sk(x)) = e(g, g) and whether y = e(g, π_sk(x)).
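For completeness, a short derivation (in standard notation, matching the construction above) of why the two verification checks accept exactly the honestly computed output and proof:

\begin{align*}
  e\!\left(g^{x}\cdot pk,\ \pi_{sk}(x)\right)
    &= e\!\left(g^{x+sk},\ g^{1/(x+sk)}\right)
     = e(g,g)^{\frac{x+sk}{x+sk}} = e(g,g),\\
  e\!\left(g,\ \pi_{sk}(x)\right)
    &= e\!\left(g,\ g^{1/(x+sk)}\right)
     = e(g,g)^{1/(x+sk)} = f_{sk}(x).
\end{align*}

Uniqueness follows because, for a fixed x and pk with x + sk ≠ 0, only one group element π can satisfy the first equation.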

Distributed PRF and VRF

A set of escrows can efficiently compute g^{1/(x+sk)} if each has a share of x and of sk, as shown in Algorithm 1. Here g can be a generator of G or of G_T. To compute a VRF in a distributed setting, we first compute the proof π_sk(x) = g^{1/(x+sk)} using Algorithm 1; each individual node can then compute f_sk(x) = e(g, π_sk(x)) locally. If verifiability is not required, we simply run Algorithm 1 with a generator of G_T to obtain the PRF value directly, which is also often more efficient.

function DistVRF([x], [sk])
     [a] ← [x] + [sk]
     [r] ← RandomCoinToss()
     [b] ← [a] · [r]
     z ← CombineShares([b])
     [c] ← z^{-1} · [r]
     return PublicExponentiate(g, [c])
end function
Algorithm 1 Computing g^{1/(x+sk)} in a distributed setting given [x] and [sk], where g is a group generator. If g generates G_T, the output is the PRF value f_sk(x); if g generates G, it is the VRF proof π_sk(x).

Algorithm 1 first inverts x + sk, which takes two multiplications, and then exponentiates the public generator to the result. The only values available in clear-text (i.e., not information-theoretically hidden by the secret sharing) are z and the final output.

The revealed value z is uniformly distributed and independent of the input, since it is blinded by r. Hence this algorithm does not reveal any information about the inputs beyond what is revealed by the output.
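The blinding step is easiest to see on a single machine; the following Java sketch (ours, with the MPC stripped away) checks the algebra that Algorithm 1 performs on shares.

import java.math.BigInteger;
import java.security.SecureRandom;

public class BlindedInverseDemo {
    public static void main(String[] args) {
        SecureRandom rng = new SecureRandom();
        BigInteger q = BigInteger.probablePrime(256, rng);   // prime group order

        BigInteger x = new BigInteger(255, rng).mod(q);      // secret-shared in the real protocol
        BigInteger sk = new BigInteger(255, rng).mod(q);     // secret-shared in the real protocol
        BigInteger r = new BigInteger(255, rng).mod(q);      // fresh random blinding value

        BigInteger a = x.add(sk).mod(q);                     // [a] = [x] + [sk]  (local on shares)
        BigInteger z = a.multiply(r).mod(q);                 // revealed via CombineShares; uniform
        BigInteger c = z.modInverse(q).multiply(r).mod(q);   // [c] = z^{-1} * [r] (local on shares)

        System.out.println(c.equals(a.modInverse(q)));       // true: c = (x + sk)^{-1}
        // The escrows would then run PublicExponentiate(g, [c]) to obtain g^{1/(x+sk)}.
    }
}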

During user registration (§IV-C), we shall need to ensure that only the user, and no individual -escrow, learns the final VRF value. In this case, we replace the call to PublicExponentiate in Algorithm 1 with a call to SendPublicExponentiate.

In our setting, distributed pseudo-random functions (DPRFs) without the verifiability property also suffice. This is because verification is done by the escrows, who have shares of the secret key; they can verify the PRF by simply recomputing it and comparing the result with the claimed value. Candidate PRFs that are readily computable by MPC protocols include the Naor-Reingold PRF [30], which is based on the decisional Diffie-Hellman assumption, and a PRF based on the Legendre symbol [31], as used in [32].

Nevertheless, we use VRFs to avoid the MPC operation of re-computing the PRF. In addition to being more efficient, this protects against a fatal DoS attack in which the attacker triggers an unbounded number of expensive MPC operations. We ensure in our protocol that each registered real identity can trigger only a bounded number of MPC operations (§IV-E). Since real identities are limited in number, such simple, but fatal, DoS attacks are not possible on our system.

IV Construction Details

In this section, we describe the cryptographic protocols we use to implement a secure allegation escrow. Figure 1 provides an overview of our protocol. Figure 2 summarizes the technical details of the protocol.

Initialization

  1. The -escrows execute RandomCoinToss() to generate two secret-shared private keys, one for computing MACs and one for revealing identities. The public component of the MAC key is computed using PublicExponentiate with a public generator and revealed to all -escrows.

  2. Shared secret keys, one per bucket, are generated using RandomCoinToss(). No public component is needed for these keys. The keys are generated lazily, as and when required by the bucketing algorithm.

Registration

  1. The alleger connects to each individual -escrow using a secure, authenticated channel. They prove their identity to all -escrows using a certificate of the alleger’s real identity issued by a CA. The -escrows’ identity is established using a standard PKI.

  2. The registration process starts when all -escrows confirm receipt of the proof of identity.

  3. The alleger generates fresh public-private key pairs and secret-shares the public parts among the -escrows.

  4. The -escrows use Algorithm 1 to return to the alleger a MAC on each shared public key, while only retaining shares of the values themselves.

  5. They also compute a PRF of each public key under a separate identity key, for use when revealing allegations.

Allegation filing

  1. The alleger connects to each individual escrow using a private and anonymous channel, where the -escrows’ identity is known (and verified), but the alleger’s identity is not.

  2. The alleger randomly picks one of its unused key pairs generated during registration and broadcasts the public key, its MAC and the reveal threshold to all -escrows. It encrypts and broadcasts the allegation text, a free-form field, and secret-shares the symmetric key among the escrows. It also secret-shares a collision-resistant hash of the meta-data of the allegation. Each of these is signed with the corresponding private key, and each -escrow ensures that the public key has never been used before.

  3. Each -escrow locally verifies the MAC on the public key. If it passes, the filer is known to be registered and the -escrows take instructions from the bucketing algorithm described in Algorithm 2 for the next step. Else, they send FAIL to the alleger.

  4. Every time the bucketing algorithm adds a set of allegations to a new bucket, all -escrows use Algorithm 1 to compute a PRF of the allegations’ meta-data under that bucket’s secret key. The PRF value is revealed in clear-text to all -escrows (§III-C). Since all allegations in a set have the same meta-data, it needs to be computed only once per set. The -escrows detect equalities between allegations in a bucket by matching their PRF values. The bucketing algorithm takes these equality relationships to decide to which bucket the allegations move next.

Allegation reveal

  1. When a set of allegations reaches the bottom bucket, it needs to be revealed.

  2. When an allegation is to be revealed, all -escrows cooperate to reveal the allegation text and to compute the identity PRF on the public key used while filing that allegation.

  3. They compare this value to the values they computed during step 5 of registration to obtain the alleger’s real identity.

Fig. 2: The SATE protocol. The -escrows employ Algorithm 1 to compute the MACs and the identity PRFs.

IV-A Format of an Allegation

An allegation escrow must have some mechanism to determine whether or not two allegations match. To allow this, along with free-form text describing their allegation, allegers provide structured meta-data describing the allegation. -escrows deem that two allegations match if their meta-data are identical. Although simple, this mechanism is quite effective—it is also used in other escrows like Callisto [33].

Allegation meta-data is a formatted string containing specific fields. For instance, it could contain: 1) the identity of the accusee and 2) the type and intensity of the crime. The identity can be specified either as a name or as a unique identifier, if available. In an institutional setting, for instance, the user could select from a drop-down list of other employees/students in that institute. The ‘type and intensity’ of the crime is selected from a drop-down list containing entries like ‘sexual harassment’, ‘sexual assault’, ‘petty theft’, several categories of ‘fraud’ graded by monetary value, and ‘racial discrimination by a person in power’. When multiple descriptions fit the same allegation, e.g., when it fits more than one category of ‘type of crime’, allegers can provide more than one meta-data string. (Our protocols are parametric in the format of the meta-data and in the pairwise test used to match allegations’ meta-data; consequently, they are compatible with more sophisticated matching algorithms.)
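As a hypothetical illustration (the field names, separator and hash function below are ours, not the paper's exact encoding), meta-data matching reduces to comparing hashes of a canonical string:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Arrays;

public class MetaDataDemo {
    static byte[] metaDataHash(String accuseeId, String crimeCategory) throws Exception {
        // Canonical form: fixed field order, fixed separator, lower-cased identifiers, so that
        // two allegers describing the same incident category produce byte-identical strings.
        String canonical = "accusee=" + accuseeId.trim().toLowerCase()
                         + "|category=" + crimeCategory.trim().toLowerCase();
        return MessageDigest.getInstance("SHA-256")
                            .digest(canonical.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        byte[] h1 = metaDataHash("employee-4711", "Sexual Harassment");
        byte[] h2 = metaDataHash("employee-4711", "sexual harassment");
        System.out.println(Arrays.equals(h1, h2)); // true: identical canonical meta-data, identical hash
    }
}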

Along with the meta-data and free-form text, the user also submits a reveal threshold—the lowest number of matching allegations that must exist before this one can be revealed. Unlike other work [33, 34], which only supports a single matching threshold throughout the system, we allow the user to pick a threshold to their own satisfaction with each allegation.

IV-B Initialization

A SATE consists of a small, odd number of independently run -escrows (e.g., between 3 and 11). During initialization, these -escrows are given individual shares of several keys, which are described in Figure 2. These keys are later used to register users, file allegations, match allegations to each other and reveal allegations. All shares use a fixed recombination threshold of a majority, so a majority of the escrows must cooperate to perform operations with these keys, and any minority can be compromised by an adversary without violating any of SATE’s properties from §II.

IV-C User Registration, Allegation Filing and Revelation

Registration

During registration, the user provides a certificate of real identity from an appropriate certificate authority (e.g., their employer). This authority is trusted to verify the identity of the user in the real world. The user also generates random one-time public-private key pairs and secret-shares the public parts among the -escrows. Each of these public keys can be used to file one allegation later.

The -escrows compute a MAC on each of these public keys using a SATE private key that is secret-shared among the -escrows during initialization; the corresponding public component is publicly known. The MAC is simply the VRF of the public key under this private key. It is computed in the distributed manner described in §III-C, so that each -escrow learns only its share of the public key and its share of the computed MAC, while the registering individual learns the full MAC.

The -escrows also compute a PRF of each public key using a different, previously secret-shared private key. Individual -escrows learn the PRF value, but nothing else. Each -escrow stores the association between the user’s real-world identity and this PRF value in a local map. This association is used when revealing allegations later.

At the end of the registration, every -escrow knows the real user, but knows only one share of each of the public keys the user provided and one share of the MAC computed on it. Consequently, when presented with one of these public keys and its MAC later, no minority of -escrows can link the key back to a specific registered user.

Allegation filing

A registered alleger files an allegation by connecting to the -escrows over an anonymous channel such as Tor [10]. During the filing, the alleger submits 1) a previously registered public key, 2) the -escrows’ MAC on it, 3) the allegation’s full text encrypted with a one-time symmetric key that is immediately secret-shared among the -escrows, 4) shares of a (collision-resistant) hash of the allegation’s meta-data (§IV-A), 5) a reveal threshold for the allegation, picked by the alleger, and 6) signatures on all the above fields’ shares using the private key corresponding to the submitted public key.

Since no -escrow has seen the whole public key or the entire MAC on it before, no -escrow can link it back to any specific user. However, all -escrows can verify that the MAC on the public key is legitimate and, hence, that the public key comes from a user who has previously registered. This verification only requires local computation by each -escrow and no MPC, which improves efficiency.

Note that no -escrow has enough information to reconstruct the allegation, its meta-data or the identity of the alleger. In fact, a majority must cooperate to reconstruct any of these. This ensures the properties of allegation secrecy and alleger anonymity (§II), even if up to half of the -escrows cooperate with the adversary.
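The client-side preparation of the items listed above could look roughly as follows. The concrete primitives (AES-GCM, SHA-256, Ed25519, requiring Java 15+) are stand-ins we chose for illustration; the construction itself only fixes their roles: a one-time symmetric key that is then secret-shared, a collision-resistant hash of the meta-data, and signatures under the registered one-time key pair.

import java.nio.charset.StandardCharsets;
import java.security.*;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class FilingDemo {
    public static void main(String[] args) throws Exception {
        String allegationText = "free-form description of the incident";
        byte[] metaDataHash = MessageDigest.getInstance("SHA-256")
                .digest("accusee=employee-4711|category=sexual harassment".getBytes(StandardCharsets.UTF_8));
        int revealThreshold = 2;

        // One-time symmetric key; only its Shamir shares would be sent, one per escrow.
        SecretKey k = KeyGenerator.getInstance("AES").generateKey();
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher gcm = Cipher.getInstance("AES/GCM/NoPadding");
        gcm.init(Cipher.ENCRYPT_MODE, k, new GCMParameterSpec(128, iv));
        byte[] ciphertext = gcm.doFinal(allegationText.getBytes(StandardCharsets.UTF_8));

        // Sign the filing with a one-time key pair standing in for the key registered earlier.
        KeyPair oneTime = KeyPairGenerator.getInstance("Ed25519").generateKeyPair();
        Signature sig = Signature.getInstance("Ed25519");
        sig.initSign(oneTime.getPrivate());
        sig.update(ciphertext);
        sig.update(metaDataHash);
        sig.update(Integer.toString(revealThreshold).getBytes(StandardCharsets.UTF_8));
        byte[] signature = sig.sign();

        System.out.println("ciphertext bytes: " + ciphertext.length + ", signature bytes: " + signature.length);
    }
}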

Allegation revelation

Allegations are matched by the -escrows using a dedicated algorithm, described in §IV-D. Once a majority of -escrows determine that a set of matching allegations can be revealed, i.e., every allegation in the set has a threshold no larger than the set’s size, the -escrows combine their shares to reconstruct the keys used to encrypt the texts of the allegations in the set. These texts are provided to a designated authority for further action.

Along with the allegation texts, the -escrows also reveal the real-world identities of the allegers who filed them. To obtain the identity of an alleger, the -escrows compute the PRF (using the algorithm described in §III-C) on the public key the alleger used to file the allegation. Recall that the -escrows also computed this PRF when the alleger registered and mapped it to the alleger’s identity in a local store. Hence, to discover the user’s identity, they merely need to look up the PRF value in the store. This search is done in clear-text, locally, by each individual -escrow and is efficient. (Note that we don’t use the verifiability property of our VRF here.)

Providing the real-world identities of the matched allegers to the designated authority allows the authority to reach out to the allegers and also provides the accountability property from §II.
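The per-escrow identity map is simple local state; a sketch (our own names and types):

import java.util.HashMap;
import java.util.Map;

public class IdentityMapDemo {
    private final Map<String, String> identityByPrf = new HashMap<>();

    // Called once per registered one-time public key, after the MPC PRF computation.
    void recordRegistration(String prfOfPublicKey, String realIdentity) {
        identityByPrf.put(prfOfPublicKey, realIdentity);
    }

    // Called during revelation, after the escrows jointly recompute the PRF on the
    // public key that was used to file the allegation.
    String lookupAlleger(String prfOfPublicKey) {
        return identityByPrf.getOrDefault(prfOfPublicKey, "<unknown key>");
    }

    public static void main(String[] args) {
        IdentityMapDemo escrowState = new IdentityMapDemo();
        escrowState.recordRegistration("9f2a...c41", "alice@example.org"); // at registration
        System.out.println(escrowState.lookupAlleger("9f2a...c41"));       // at reveal: alice@example.org
    }
}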

Registered public keys must not be used twice

As just described, after an allegation filed with a given public key has been matched and revealed, the -escrows map that key to the strong identity of the individual. Consequently, the key should not be used to file a second allegation unless the alleger wishes to de-anonymize themselves to the -escrows. To allow users to file multiple allegations anonymously, a user registers several different keys during a single registration. Registration can be repeated periodically, allowing a fixed number of allegation filings for every user within each period. For instance, every participating individual may register a fresh batch of public keys every year.

IV-D Matching and Thresholding

The -escrows match allegations to each other and reveal sets of matching allegations when thresholds are met. The remaining protocol (described above) is agnostic to the definition of a “match”. Here, we describe one efficient protocol for matching based on syntactic equality of the meta-data hash.

Matching protocol

We describe a simple MPC protocol that matches two allegations when their meta-data hashes are equal. We start by noting that, by design, our matching protocol does not allow any minority set of -escrows to match two allegations on their own. Recall that each -escrow receives only a share of the hash of each allegation’s meta-data. The shares are randomized, so a minority of -escrows cannot check the equality of two hashes using the shares alone. This property is important: otherwise, an adversary who corrupts a minority of -escrows could probe existing allegations to discover whether an allegation against a specific individual exists. They could do this without any honest party being aware of such probing.

To compare a set of allegations for equality, the -escrows (at least a majority is needed) participate in a multi-party computation protocol (§III-C) to compute a pseudo-random function of each allegation’s meta-data hash. The resulting PRF values are revealed in the clear to all -escrows, but the PRF key and the hashes aren’t. The key is a shared secret specially generated for each set of allegations being compared. The sets are determined by the thresholding protocol described below.

Since the PRF is injective over the domain in which the hashes lie, two meta-data hashes are equal if and only if their PRF values are equal. Hence, each -escrow can locally determine which allegations match which others. Further, the PRF’s secret key is not used for any other purpose, so no additional information about the hashes is revealed. Thus all matching pairs in a set of allegations can be computed efficiently, in linear time.
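Once the PRF values are public, the local matching step is ordinary hashing; a sketch with made-up identifiers:

import java.util.*;

public class LocalMatchDemo {
    public static void main(String[] args) {
        // allegation id -> revealed PRF value of its meta-data under this bucket's key
        Map<String, String> revealedPrf = Map.of(
                "allegation-1", "c0ffee",
                "allegation-2", "c0ffee",
                "allegation-3", "decade");

        // Group by PRF value: equal meta-data implies equal PRF value, so groups are matches.
        Map<String, List<String>> groups = new HashMap<>();
        revealedPrf.forEach((id, prf) ->
                groups.computeIfAbsent(prf, p -> new ArrayList<>()).add(id));

        groups.values().stream()
              .filter(g -> g.size() > 1)
              .forEach(g -> System.out.println("matching allegations: " + g));
    }
}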

Thresholding

A collection of matching allegations should be revealed when every allegation in it has a reveal threshold no larger than the size of the collection. One way to find such collections would be to run the above matching protocol on the set of all allegations, irrespective of their thresholds. However, this design is susceptible to a probing attack in which an adversary interested in probing for the existence of a specific allegation files the same allegation with a very high threshold. By corrupting just one of the -escrows, the adversary could then compare this allegation to all other allegations in the system, without any risk that its own false allegation would ever be revealed (since the allegation has a very high threshold). To deter such attacks, we control cryptographically which allegations can be compared to each other. We ensure that if two allegations can ever be compared by a minority of escrows, then they will be revealed at the same time, if at all. That is, two allegations can be compared by a minority only if they are waiting for the same number of matching allegations. Now, if the adversary tries to probe with a fake allegation, the fake allegation and the adversary’s real-world identity are exactly as likely to be revealed as the genuine allegation being probed for.

To keep track of how many matches each allegation needs, each -escrow independently maintains buckets numbered 0, 1, 2, 3, …. The i-th bucket contains all allegations that can be revealed once i more allegations match them. Note that one allegation may be present in more than one bucket. Bucket 0 contains a list of allegations that have been revealed previously. Algorithm 2 controls which allegation occupies which buckets.

To deter the probing attack explained above, only allegations within the same bucket can be matched to each other. To ensure this, each bucket is associated with a secret key that is shared among the -escrows (a bucket’s key is generated lazily when the bucket is first used). When an allegation is added to a bucket, the -escrows compute the PRF of its meta-data under that bucket’s key, using the MPC protocol described above. Since this computed value is available for all allegations in a bucket, any two allegations in a bucket can be matched locally by any -escrow. Since, by design, different buckets have different keys, PRF values computed under different buckets’ keys cannot be compared to each other. Allegations that are known to match each other, either directly because they are in the same bucket or indirectly by transitivity, are said to belong to the same ‘collection’. When allegations from two different collections are found to match, the collections coalesce into one. The resulting collection spans the union of the buckets spanned by the parent collections and contains the union of their allegations. Every allegation belongs to exactly one collection at any given time. To copy all allegations in a collection into a new bucket, the PRF of only one allegation’s meta-data needs to be computed, since all allegations in a collection have identical meta-data.

Apply the following rules repeatedly (in any order) till no further rules apply. Rules 2,3 and 4 only apply to collections that haven’t been revealed.

  1. When an allegation with threshold t is filed, it forms a singleton collection and is added to bucket t−1 (since t−1 other allegations must match the allegation before it is revealed).

  2. If b is the smallest bucket occupied by a collection A, and every allegation in A has a threshold of at most |A| + b − 1, then A is copied to bucket b−1. Note that A still occupies the buckets it used to occupy; copying merely adds the collection to a new bucket.

  3. When two collections overlap, i.e., occupy a common bucket, and their allegations are found to match (§III-C), they coalesce into one collection.

  4. When a collection reaches bucket 0, all of its allegations are revealed as described in §IV-C.

  5. If a collection has been revealed, we make sure it occupies every bucket from 0 up to the size of the collection, even as the collection grows. This enables future matching allegations to be revealed.

Algorithm 2 Secure thresholding algorithm. It reveals a set of allegations if and only if all of their thresholds are satisfied by that set.
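The bookkeeping behind Algorithm 2 can be organized as in the following simplified sketch (our own data structures, not the prototype's): each bucket lazily receives its own key identifier, and copying a collection into a lower bucket triggers exactly one distributed PRF evaluation, since all of its allegations share the same meta-data.

import java.util.*;

public class BucketingSketch {
    interface DistributedPrf { String eval(String bucketKeyId, String metaDataShareHandle); }

    static class AllegationCollection {
        final List<String> allegationIds = new ArrayList<>();
        final List<Integer> thresholds = new ArrayList<>();
        final TreeSet<Integer> occupiedBuckets = new TreeSet<>();
        final Map<Integer, String> prfByBucket = new HashMap<>(); // revealed PRF value per bucket
        String metaDataShareHandle;                               // handle to the (shared) meta-data hash
    }

    private final Map<Integer, String> bucketKeyIds = new HashMap<>();
    private final DistributedPrf prf;

    BucketingSketch(DistributedPrf prf) { this.prf = prf; }

    private String keyFor(int bucket) {
        // Lazily assign a key identifier; in the real protocol this corresponds to a
        // RandomCoinToss() producing shares of a fresh secret key for the bucket.
        return bucketKeyIds.computeIfAbsent(bucket, b -> "bucket-key-" + b);
    }

    // Copying a collection one bucket down costs one MPC PRF call, independent of
    // how many allegations the collection contains.
    void copyDown(AllegationCollection c) {
        int target = c.occupiedBuckets.first() - 1;
        c.prfByBucket.put(target, prf.eval(keyFor(target), c.metaDataShareHandle));
        c.occupiedBuckets.add(target);
    }

    public static void main(String[] args) {
        BucketingSketch sketch = new BucketingSketch((key, meta) -> key + ":" + meta); // toy PRF
        AllegationCollection c = new AllegationCollection();
        c.allegationIds.add("allegation-1");
        c.thresholds.add(3);
        c.metaDataShareHandle = "meta-1";
        c.occupiedBuckets.add(2);                         // threshold 3 starts in bucket 2 (rule 1)
        sketch.copyDown(c);
        System.out.println(c.occupiedBuckets + " " + c.prfByBucket);
    }
}

The amortized cost argument of §IV-E rests on exactly this observation: each move of a collection requires a single PRF evaluation.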

This algorithm trivially satisfies the property that, once two allegations are known to match each other, they belong to the same collection and are revealed together (if at all). This deters the probing attacks described above, which motivated this elaborate mechanism. We also prove that the thresholding algorithm is ‘correct’:

Theorem 1 (Correctness).

Algorithm 2 reveals a collection if and only if the thresholds of all allegations in it are satisfied.

Proof.

Let max(A) and min(A) be the maximum and minimum buckets occupied by a collection A. We begin by proving that the following three properties hold for unrevealed collections whenever all five rules of Algorithm 2 have been applied to saturation (meaning no further rule applies): (1) every collection spans a contiguous range of buckets; (2) every collection spans as many buckets as it has allegations, i.e., max(A) − min(A) + 1 = |A|; (3) every allegation in a collection A has a threshold of at most |A| + min(A), and hence can be revealed if min(A) more matches are available.

The first property can be proved as an invariant that is trivially maintained by rules 2, 4 and 5 with rule 1 as the base case. Now, two collections coalesce only if they share a bucket (and hence their allegations may be compared). Since the union of contiguous, overlapping segments is contiguous, rule 3 also maintains the invariant.

To prove the second property, first note that for an unrevealed collection A, the largest occupied bucket satisfies max(A) = t_max − 1, where t_max is the largest threshold in A, since unrevealed allegations enter buckets at the top only through rule 1. If max(A) − min(A) + 1 < |A|, then every allegation in A has a threshold of at most max(A) + 1 ≤ |A| + min(A) − 1, so rule 2 applies. Hence rule 2 can be applied repeatedly until the span increases to equal |A|, so at saturation max(A) − min(A) + 1 ≥ |A|. We now argue that max(A) − min(A) + 1 ≤ |A| is an invariant, with rule 1 as the base case. When rule 3 creates C out of A and B, C spans a union of the parents’ buckets, which overlap, so its span is at most span(A) + span(B) ≤ |A| + |B| = |C|, because the allegations of A and B are disjoint; hence the invariant is maintained. Rule 2 would not apply if it would break the invariant: if A is not yet revealed (which is when rule 2 applies), it contains at least one allegation with threshold max(A) + 1, and if the span already equals |A|, this allegation’s threshold exceeds |A| + min(A) − 1, so the condition of rule 2 is not met. Rules 4 and 5 trivially maintain the invariant.

The third property is explicitly maintained as an invariant by rule 2 and is trivially satisfied by rules 1 and 5. Rule 3 is applicable in two ways. First, when a new allegation arrives in a bucket inside the range of an older collection, the property is not broken. Second, if two existing collections A and B coalesce into C by rule 3, one is ‘below’ the other; let min(A) ≤ min(B), without loss of generality. Then min(C) = min(A) and |C| ≥ |A|, hence the allegations of A satisfy the property. For the allegations of B, the drop in the minimum bucket is min(B) − min(A), which is at most |A| (since the two collections overlap and A spans at most |A| buckets) and is compensated by the corresponding increase of the collection’s size by |A|.

We now use these properties to prove correctness. The third property implies that when a collection is revealed, the threshold condition is satisfied for all revealed allegations, since min(A) = 0 at that point. To prove the other direction, let there be m matching allegations all of whose thresholds are at most m. Assume for contradiction that they are not revealed. Then each of them occupies only buckets in the range 1, …, m−1 (an allegation with threshold t starts in bucket t−1 ≤ m−1 and only moves to lower buckets, and none may reach bucket 0). By the pigeonhole principle, some bucket contains allegations from more than one collection, and these start coalescing under rules 2 and 3. If the process stopped with a largest collection of size s < m, the remaining m − s allegations would be left with only m − 1 − s buckets, because property 2 ensures that the size of a collection equals its span; again by the pigeonhole principle, coalescing would have to continue. This continues until there is only one collection containing all m allegations, which spans m buckets, reaches bucket 0, and is revealed, contradicting the assumption. Hence, a set of matching allegations is revealed if and only if all of their thresholds are at most the size of the set. ∎

IV-E PRF Computation Cost

The computationally expensive steps in the SATE protocol are the VRF/PRF computations that require interaction between the -escrows as well as multiplication and exponentiation over shared secrets. However, the number of PRF computations scales very well (linearly) with the number of users as well as the number of allegations filed.

Registering a new user requires two VRF/PRF computations for every public key that the user provides: one to compute the MAC on the key and one to compute the identity PRF used when revealing allegations.

Filing an allegation does not, of itself, require any PRF computation. The thresholding protocol, if implemented naively, requires computing the PRF up to t times for an allegation with threshold t (once for each bucket it passes through). However, this cost can be reduced by observing that, to move a collection into a new bucket, we only need to compute the PRF for one of its allegations, since they all have identical meta-data. Using this idea, we can prove that, in an amortized sense, we need to compute the PRF at most twice for each allegation, independent of its threshold. Finally, revealing an allegation requires one more PRF computation (to discover the identity of the alleger). To summarize, on average, every filed allegation requires 2 PRF computations if it is never revealed, and 3 PRF computations if it is revealed.

DoS attacks on PRF computation

In general, systems that use expensive MPC are susceptible to crippling denial-of-service (DoS) attacks that trigger repeated MPC operations. However, the previous discussion implies that SATEs are resistant to such DoS attacks against their PRF computation. Recall that only real users can register in a SATE. Further, each user is only allowed to register a fixed number of public keys in any issue period, and each key can only be used to file one allegation. So, a registered user can cause at most a fixed number of PRF computations in any issue period. For a typical number of registered keys per user and an issue period of one year, this amounts to at most 50 PRF computations per year per real user, which is an extremely low rate for an effective DoS attack.

V Security Analysis

In this section, we formally show that our scheme is secure in the UC framework [7]. We start by presenting an ideal functionality which, by definition, captures the security and privacy properties we expect of our protocol (§II). Then we prove that our protocol (§IV) realizes this ideal functionality.

V-A Ideal World

Figure 3 describes an ideal functionality which models the intended behavior of a SATE, in terms of both functionality and security properties. Agents (allegers and -escrows) are modeled as interactive Turing machines communicating with the ideal functionality via secure and authenticated channels, due to which the functionality knows the identities of the agents. The adversary is a probabilistic polynomial-time Turing machine that has additional interfaces to corrupt a minority of -escrows and to add and corrupt allegers. The adversary has access to the internal state of corrupted agents, and all their communication is routed through it.

All allegers have certificates of real identity from a trusted offline authority. For the real protocol, we model anonymous communication between an alleger and a -escrow as an ideal anonymous-channel functionality, as proposed in [35]. Moreover, we assume the existence of a broadcast channel for allegers to reliably communicate with all -escrows, and we model this as a bulletin board (such as [36]) with its own ideal functionality. Our idealized process uses these two functionalities as subroutines, i.e., our protocol is specified in the corresponding hybrid model. We omit the handling of session IDs (SIDs) to reduce clutter; messages are assumed to be implicitly associated with SIDs.

Initialization

  1. Initialize empty lists of registered users, allegations and thresholding buckets.

Registration

  1. A user sends Register, along with their identity, to the ideal functionality.

  2. The ideal functionality forwards the registration request, including the alleger’s real (strong) identity, to all the -escrows.

  3. If all -escrows send OK back, registration succeeds: the ideal functionality sends OK to the alleger and all -escrows, and adds the user to its list of registered users.

  4. If any -escrow sends FAIL instead, registration fails and the ideal functionality sends FAIL to the -escrows and the user.

Allegation filing

  1. An alleger sends a filing request to the ideal functionality, consisting of the allegation’s meta-data, its text and its reveal threshold.

  2. If the alleger didn’t register previously, or has already filed the maximum number of allegations, the ideal functionality returns FAIL to everybody and aborts.

  3. Else, the ideal functionality sends the reveal threshold, together with a unique identifier that it assigns for this allegation, to all -escrows.

  4. The ideal functionality adds the allegation to its list of allegations.

  5. The ideal functionality now runs the bucketing protocol of Algorithm 2. As this allegation moves across buckets and matches other allegations, the functionality sends information about which allegations match which others to all -escrows. Before a set of allegations moves to a new bucket, the functionality notifies all -escrows, who are expected to respond with OK. If anybody responds with FAIL, the matching protocol is aborted.

  6. If, in the matching protocol, a set of allegations reaches the bottom bucket, the ideal functionality sends the allegation text, the alleger’s strong identity and the allegation’s unique identifier to all -escrows, for every allegation in the set.

Fig. 3: The ideal functionality for SATEs.
Discussion

The ideal functionality satisfies the allegation secrecy, alleger anonymity and accountability properties described in §II, relative to our threat model. We briefly describe why.

Allegation secrecy and alleger anonymity are ensured because the ideal functionality reveals information about an allegation only in the following scenarios. (1) It reveals a user’s identity when they register with the system. This is harmless, since users register irrespective of whether or not they currently intend to file an allegation. (2) As the bucketing protocol progresses, it reveals which allegations match which others: if all of a set of matching allegations are filed by honest users, the adversary learns nothing but statistics about how many allegations match each other. (3) It reveals the threshold of an allegation when it is filed. (4) Finally, it reveals the entire allegation when its threshold is met and it is ready to be revealed.

Because of (2) above, this ideal functionality admits a somewhat surprising attack: If an adversary files an allegation, it immediately learns whether other matching allegations exist. Our actual protocol (§IV) allows an adversary to realize this attack by compromising any -escrow and observing its thresholding protocol. These attacks are consistent with our threat model, which allows for probing attacks by adversaries. Also, as explained in §IV-D, our thresholding protocol is carefully designed to disincentivize these attacks.

Accountability is ensured since, if a user files an allegation, the ideal functionality reveals their real identity as soon as their threshold is met. Note that we already proved the thresholding protocol correct in §IV-D (Theorem 1).

V-B UC-Security Analysis

Let $\mathrm{EXEC}_{\pi,\mathcal{A},\mathcal{Z}}$ denote the ensemble of the outputs of the environment $\mathcal{Z}$ when interacting with the adversary $\mathcal{A}$ and parties running the protocol $\pi$ (over the random coins of all the involved machines).

Definition 2 (UC-Security).

A protocol $\pi$ UC-realizes an ideal functionality $\mathcal{F}$ if for any adversary $\mathcal{A}$ there exists a simulator $\mathcal{S}$ such that, for any environment $\mathcal{Z}$, the ensembles $\mathrm{EXEC}_{\pi,\mathcal{A},\mathcal{Z}}$ and $\mathrm{EXEC}_{\mathcal{F},\mathcal{S},\mathcal{Z}}$ are computationally indistinguishable.
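Written out in the standard notation of the UC framework (the symbols here follow the common convention of [7] and are our choice of notation), the requirement is:

```latex
\[
\bigl\{\, \mathrm{EXEC}_{\pi,\mathcal{A},\mathcal{Z}}(\lambda, z) \,\bigr\}_{\lambda \in \mathbb{N},\ z \in \{0,1\}^{*}}
\;\approx_{c}\;
\bigl\{\, \mathrm{EXEC}_{\mathcal{F},\mathcal{S},\mathcal{Z}}(\lambda, z) \,\bigr\}_{\lambda \in \mathbb{N},\ z \in \{0,1\}^{*}},
\]
% where $\lambda$ is the security parameter and $z$ is the auxiliary input of the environment $\mathcal{Z}$.
```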

We prove UC-security in the hybrid model with the anonymous-communication and bulletin-board functionalities described above; Theorem 3 holds for any UC-secure realization of these two functionalities.

Theorem 3 (UC-Security).

Assume the use of a secure DKG protocol, a secure distributed-input DVRF protocol, and a non-committing symmetric encryption scheme. Then the SATE protocol UC-realizes the ideal functionality defined in Figure 3 in the hybrid model described above.

We provide a proof sketch of Theorem 3 in Appendix A.

VI Evaluation

We evaluate the performance of the SATE protocol in two ways. First, we count the number of cryptographic operations and the number of rounds of communication used in our implementation. This is shown in Figure 4 and provides an abstract overview of the complexity of the implementation.

Second, we implement the protocol and evaluate its performance empirically. We build our prototype in Java using SCAPI [37] version 2.3 bindings for OpenSSL [38] version 1.1 and a pairing-based cryptography library, jPBC [39], version 2.0. We use a MySQL database to store all of a -escrow’s persistent state. We pre-populate the databases with 1 million allegations from 1 million distinct users; these numbers are chosen to represent an extreme worst case for a SATE deployment.

Operation                          | Compute VRF | Verify VRF
Round complexity                   |             | 0
Non-precomputable exponentiations  |             | 0
Precomputable exponentiations      |             | 1
Multiplications                    |             | 1
Bilinear pairings                  |             | 1
Randomness bits                    |             | 0
Fig. 4: Number of network rounds and breakdown of cryptographic operations required per -escrow for VRF computation and verification in our implementation, as a function of the number of -escrows and the number of shares required to reconstruct shared secrets (typically, a majority of the -escrows). Multiplications and exponentiations are group operations. Precomputable exponentiations are ones where the base is known beforehand, so precomputation reduces the online running time. One ‘round’ of communication may involve each pair of servers exchanging information. PRF computations have the same complexity as VRF computations, except that the operations are performed in a different group. Registering a key and filing an allegation, the two expensive parts of the SATE protocol, require two VRF/PRF computations each in an amortized sense.
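To make the verification column concrete, here is a sketch of a Dodis-Yampolskiy-style VRF check [29] using the jPBC interfaces mentioned in §VI. It is our illustration of where a single precomputable exponentiation, a single multiplication, and a single pairing arise, not the exact verification equation of the SATE implementation, and it assumes the standard jPBC Element/Pairing API.

```java
import java.math.BigInteger;
import it.unisa.dia.gas.jpbc.Element;
import it.unisa.dia.gas.jpbc.Pairing;

// Sketch only: with pk = g^x and proof = g^{1/(x+m)}, a verifier checks
// e(pk * g^m, proof) == e(g, g). The caller supplies the generator g and
// the precomputed pairing value eGG = e(g, g).
public final class VrfVerifySketch {
    public static boolean verify(Pairing pairing, Element g, Element eGG,
                                 Element pk, BigInteger m, Element proof) {
        Element gToM = g.duplicate().pow(m);         // precomputable: base g is fixed
        Element base = pk.duplicate().mul(gToM);     // one group multiplication
        Element lhs  = pairing.pairing(base, proof); // the single bilinear pairing
        return lhs.isEqual(eGG);                     // compare with precomputed e(g, g)
    }
}
```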
Latency and throughput

We first measure the latency and throughput of user-SATE interaction in a realistic setting, where the -escrows are geographically distributed. We set up as many as 9 -escrows on Amazon AWS cloud servers, chosen to maximize geographical spread. In an experiment involving a given number of -escrows, they run on servers in that many of the following locations, taken in order: Virginia, Frankfurt, Sydney, N. California, Singapore, Sao Paulo, London, Seoul, and Mumbai. Each -escrow runs on an m4.large AWS instance, which at the time of the experiments provided 2 vCPUs, 8 GB of RAM, and ‘moderate’ network performance. Each server runs up to 60 threads, the maximum supported on these machines; each thread handles one concurrent client request. Note that the SATE protocol is embarrassingly parallel with respect to client requests: the cost is dominated by network latencies and MPC computation, which require no synchronization across client requests; such synchronization is needed only for database operations. We use up to 60 client replicas, all hosted on a single c4.4xlarge AWS instance in … (we have anonymized the precise location for review). At the time of our experiments, this instance provided 16 vCPUs, 30 GB of RAM, and ‘High’ network performance.

Latency: Figure 5 (top) shows the average latency for registering a new key as the number of -escrows varies, in two configurations: when the -escrows are lightly loaded (no concurrent requests) and when they are heavily loaded (60 concurrent clients). There are three notable aspects here. First, as expected, the latency increases with the number of -escrows (since the MPC becomes more complex). Second, increasing the number of concurrent clients does not increase the latency significantly; this suggests that the cost is dominated by the number of -escrows and inter-escrow network latencies. Finally, even though the absolute latency numbers might look high (on the order of tens of seconds), they are acceptable since user interaction with SATEs is relatively infrequent. In particular, users register new keys once every few months, so such latencies seem quite practical. The other interactive operation, filing an allegation, does not require any MPC and has an even lower latency.

Throughput: Next, we measure the throughput of SATE in terms of the number of key registrations and allegation filings it can handle per second. Here, we use 60 concurrent clients. In the first experiment, each client registers new keys sequentially. The pink line in Figure 5 (bottom) shows the average number of key registrations the -escrows can handle per second as a function of the number of -escrows. As expected, this number decreases with the number of -escrows, from 2.5 ops/s with the fewest -escrows to 1 op/s with the most.

In the second experiment, each client repeatedly files allegations with thresholds varying between 2 and 20, chosen from a truncated exponential distribution with mean 5. When a threshold of T is chosen, T matching allegations are created with 50% probability, and T-1 matching allegations are created the rest of the time. These, respectively, represent the case where the allegation is eventually revealed and the worst case (for performance) where the allegation is never revealed. The green line in Figure 5 (bottom) shows the average number of allegation filings the -escrows can handle per second. Again, this number decreases with the number of -escrows, varying from 3 ops/s with the fewest -escrows to 2 ops/s with the most.
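A sketch of how such a workload generator might look (our code with hypothetical names; the actual client driver is not shown in the paper). It samples an exponential threshold with mean parameter 5 by inverse-transform sampling, truncates it to [2, 20] by rejection, and then decides whether to create T or T-1 matching allegations.

```java
import java.util.Random;

// Illustrative workload sketch, not from the SATE codebase.
public class WorkloadSketch {
    private static final Random rng = new Random();

    // Exponential with mean parameter 5, restricted to [2, 20] by rejection.
    static int sampleThreshold() {
        while (true) {
            double x = -5.0 * Math.log(1.0 - rng.nextDouble());
            int t = (int) Math.round(x);
            if (t >= 2 && t <= 20) return t;
        }
    }

    // 50%: enough matches to reveal; 50%: one short of the threshold.
    static int sampleMatchingCount(int threshold) {
        return rng.nextBoolean() ? threshold : threshold - 1;
    }

    public static void main(String[] args) {
        int t = sampleThreshold();
        System.out.println("threshold=" + t + ", matches=" + sampleMatchingCount(t));
    }
}
```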

We believe that these throughputs are acceptable for SATE, since user operations are expected to be very infrequent. Moreover, each -escrow can be separately replicated on several servers to get proportionally higher throughput.

Fig. 5: Performance when the SATE protocol is deployed on -escrow servers running on AWS instances in different locations. The top figure shows the average latency and variance for registering a key when the servers are unloaded and when they are loaded with requests saturating 60 parallel threads. The bottom figure shows the number of key registrations and allegations that 60 threads running in parallel can support per second. Note that this does not increase the latency for registering keys too much, as shown in the top graph. Hence, if a user wishes to register multiple keys, they can do so in parallel, with almost the same latency.

Fig. 6: End-to-end latency for registering a key and filing an allegation as the communication delay between every pair of -escrows on an emulated network is varied. Values are shown for the number of -escrows varying from 3 to 9.
Impact of network latency

We conducted a further experiment to measure the impact of inter-escrow network latency on user-perceived latency. To get predictable latencies, we ran this experiment on an emulated network using Linux qdiscs, on a single Amazon AWS c4.4xlarge instance. The -escrow servers and our client occupy one core each. Every pair of -escrow servers is given an emulated 100 Mbps link with one bandwidth-delay product worth of buffer (one bandwidth-delay product is the recommended buffer size for full link utilization by TCP with minimal delay). We vary the latency of the emulated network links and plot the latency of a) registering a key, and b) processing one allegation completely with no matches. These require two and one PRF computations, respectively. (Note that the user-perceived latency of allegation filing is different from that of processing the allegation; filing does not require any PRF computation.)

Figure 6 shows the results. As expected, the client latency increases linearly with the network latencies, and the rate of increase also increases with the number of -escrows.

VII Related Work

We described prior work on relevant cryptographic primitives in §III. Here, we discuss related work on allegation escrows and other systems that are similar to SATE.

In a Michigan Law Review article, Ayres and Unkovic [2] discuss the utility of allegation escrows in encouraging reporting of sexual misconduct. Project Callisto [3] is a non-profit partnership, involving several US universities, that is based on this idea. The project maintains a website where sexual assault victims can file timestamped reports, with the option of automatically forwarding the report to campus authorities as soon as a second victim files a report accusing the same person. The anonymity of allegers and the accused, as well as the confidentiality of the allegations, however, rests on the integrity of the website and its administrators.

Concurrently with our work, Project Callisto has developed a cryptographic solution to distribute its trust assumptions [33, 6]. In contrast to our provably secure solution, Callisto’s security analysis is informal. Their solution uses two servers, a key server and a database server, and uses OPRFs to match allegations, though in a way different from ours. Callisto’s threat model is weaker than ours. First, in Callisto’s protocol, the alleger’s identity is revealed to the key server while filing an allegation. This makes the key server a weak point of attack: coercing or compromising just this server can reveal all alleger identities to the adversary. Further, if a perpetrator learns that one of their victims filed an escrowed allegation soon after the crime, it is easy to deduce the probable content of the allegation. In contrast, in SATE, no -escrow learns the identity of any alleger until enough matches are found, and coercing up to a minority of -escrows affords the adversary no information in this regard. Second, Callisto’s solution provides no defense against probing attacks, while SATE’s accountability mechanism provides a deterrent to these attacks. Additionally, Callisto’s cryptographic solution forces the same match threshold on all allegations (Callisto’s current implementation supports only a threshold of 2). In contrast, SATE allows per-allegation thresholds, which allegers can pick to their own satisfaction.

In recent work, Harnik et al. [34] use a hardware-backed secure enclave (built on Intel SGX) to isolate a fully autonomous, single-party allegation escrow. Although such an enclave resists coercion attacks on its administrator, the code within the enclave may still be vulnerable to hacks and exploits on its interface. This solution can be combined with ideas from SATE to obtain a threat model stronger than that of either: SATE’s -escrows can be hosted in SGX enclaves to provide a second line of defense even when the administrators of a majority of -escrows are coerced by the adversary.

SATE’s functionality shares some attributes with covert computation [40, 41]. In SATE, allegers wish to perform computations like matching and thresholding, but only want the other parties to learn the result under certain conditions (i.e., when the thresholds of all allegations in a matching set are satisfied); otherwise, even the intent to compute remains hidden. Covert multi-party computation protocols perform a computation and reveal the result only if all parties participated and the result is favorable. Our system differs in significant ways. First, SATE is not meant to be a general-purpose solution for covert computation; on the other hand, it is efficient for its specific application and scales well to a very large number of participants. Second, in SATE, if the result is revealed, each participant gets a third-party auditable signature from the other participants, on their inputs to the protocol and on the fact that they participated. On the flip side, participants in our system need to trust that a majority of the escrows are honest.

At a very high level, DC-nets [42] have a system structure similar to SATEs in that a small set of cooperating authorities serves the privacy needs of a large number of users. However, the specific goals of DC-nets and SATE are very different. DC-nets are used to provide user anonymity in communication, while SATEs provide allegation escrows. Technically, DC-nets solve the dining cryptographers problem to securely compute the XOR of all participants’ inputs. Many other schemes build on the basic idea of DC-nets to efficiently implement a scalable anonymizing network, even in the presence of corrupted parties who try to jam the communication [43, 44].

VIII Conclusion

We have presented SATE, a robust system that implements an allegation escrow with strong cryptographic security guarantees, and have shown that it is practical. SATE keeps accusations and the identities of allegers and the accused confidential until alleger-specified match thresholds are reached. The system’s security and privacy guarantees provably hold as long as a majority of the escrow parties are uncorrupted. Our empirical evaluation suggests that SATEs are efficient enough to be used in practice.

IX Acknowledgements

We are grateful to Krishna Gummadi, Prabhanjan Ananth, Hari Balakrishnan, Derek Leung, Omer Paneth, Malte Schwarzkopf, Frank Wang and Nickolai Zeldovich for many interesting discussions and valuable feedback on the paper. We would also like to thank the anonymous reviewers of past submissions for their inputs.

References

  • [1] “Me Too movement,” https://metoomvmt.org/, accessed: 2018-10-01.
  • [2] I. Ayres and C. Unkovic, “Information escrows,” Michigan Law Review, pp. 145–196, 2012.
  • [3] “Callisto: Tech to combat sexual assault,” http://projectcallisto.org.
  • [4] “Project callisto 2016-2017 school year report,” https://www.projectcallisto.org/Callisto_Year_2_highres.pdf, accessed: 2018-10-01.
  • [5] A. Shamir, “How to share a secret,” Communications of the ACM, vol. 22, no. 11, pp. 612–613, 1979.
  • [6] A. Rajan, L. Qin, D. W. Archer, D. Boneh, T. Lepoint, and M. Varia, “Callisto: A cryptographic approach to detect serial predators of sexual misconduct,” 2018, online at: https://www.projectcallisto.org/callisto-cryptographic-approach.pdf.
  • [7] R. Canetti, “Universally composable security: A new paradigm for cryptographic protocols,” Cryptology ePrint Archive, Report 2000/067 (Revision July 2013), 2013, https://eprint.iacr.org/2000/067/20130717:020004.
  • [8] J. A. King, “‘Say nothing’: Silenced records and the Boston College subpoenas,” Archives and Records, vol. 35, no. 1, pp. 28–42, 2014.
  • [9] K. Moore, V. Zhong, and A. Gandhi, “Healthy minds study survey data informed 2016 senior house decisions,” https://www.thetech.com/2017/07/26/healthy-minds-data-used-in-2016-senior-house-decisions, accessed: 2017-11-27.
  • [10] R. Dingledine, N. Mathewson, and P. Syverson, “Tor: The second-generation onion router,” DTIC Document, Tech. Rep., 2004.
  • [11] L. Wang, G. Asharov, R. Pass, T. Ristenpart, and abhi shelat, “Blind certificate authorities,” in IEEE Symposium on Security and Privacy (Oakland S&P), 2019.
  • [12] Y. Desmedt and Y. Frankel, “Threshold Cryptosystems,” in Advances in Cryptology—CRYPTO’89, 1989, pp. 307–315.
  • [13] A. C.-C. Yao, “Protocols for Secure Computations (Extended Abstract),” in Proc. 23rd IEEE Symposium on Foundations of Computer Science (FOCS).   IEEE Computer Society Press, 1982, pp. 160–164.
  • [14] D. Chaum, C. Crépeau, and I. Damgård, “Multiparty Unconditionally Secure Protocols,” in Proc. 20th Annual ACM Symposium on Theory of Computing (STOC).   ACM Press, 1988, pp. 11–19.
  • [15] M. Ben-Or, S. Goldwasser, and A. Wigderson, “Completeness Theorems for Non-Cryptographic Fault-Tolerant Distributed Computation,” in Proc. 20th Annual ACM Symposium on Theory of Computing (STOC), 1988, pp. 1–10.
  • [16] O. Goldreich, S. Micali, and A. Wigderson, “How to Play any Mental Game or A Completeness Theorem for Protocols with Honest Majority,” in Proc. 19th Annual ACM Symposium on Theory of Computing (STOC), 1987, pp. 218–229.
  • [17] B. Chor, S. Goldwasser, S. Micali, and B. Awerbuch, “Verifiable secret sharing and achieving simultaneity in the presence of faults,” in Foundations of Computer Science, 1985., 26th Annual Symposium on.   IEEE, 1985, pp. 383–395.
  • [18] T. P. Pedersen, “Non-interactive and information-theoretic secure verifiable secret sharing,” in Annual International Cryptology Conference.   Springer, 1991, pp. 129–140.
  • [19] P. Feldman, “A practical scheme for non-interactive verifiable secret sharing,” in 28th Annual Symposium on Foundations of Computer Science, Los Angeles, California, USA, 27-29 October 1987, 1987, pp. 427–437.
  • [20] R. Gennaro, M. O. Rabin, and T. Rabin, “Simplified vss and fast-track multiparty computations with applications to threshold cryptography,” in Proceedings of the seventeenth annual ACM symposium on Principles of distributed computing.   ACM, 1998, pp. 101–111.
  • [21] M. Backes, A. Kate, and A. Patra, “Computational verifiable secret sharing revisited,” in Advances in Cryptology - ASIACRYPT 2011 - 17th International Conference on the Theory and Application of Cryptology and Information Security, Seoul, South Korea, December 4-8, 2011. Proceedings, 2011, pp. 590–609.
  • [22] R. Gennaro, S. Jarecki, H. Krawczyk, and T. Rabin, “Secure distributed key generation for discrete-log based cryptosystems,” J. Cryptology, vol. 20, no. 1, pp. 51–83, 2007.
  • [23] A. Kate and I. Goldberg, “Distributed key generation for the internet,” in 2009 29th IEEE International Conference on Distributed Computing Systems, 2009, pp. 119–128.
  • [24] S. D. Galbraith, K. G. Paterson, and N. P. Smart, “Pairings for cryptographers,” Discrete Applied Mathematics, vol. 156, no. 16, pp. 3113–3121, 2008.
  • [25] A. Joux, “The Weil and Tate Pairings as Building Blocks for Public Key Cryptosystems,” in ANTS-V, 2002, pp. 20–32.
  • [26] C. Cachin, K. Kursawe, and V. Shoup, “Random oracles in constantinople: Practical asynchronous byzantine agreement using cryptography,” Journal of Cryptology, vol. 18, no. 3, pp. 219–246, 2005.
  • [27] M. Naor, B. Pinkas, and O. Reingold, “Distributed pseudo-random functions and kdcs,” in International Conference on the Theory and Applications of Cryptographic Techniques.   Springer, 1999, pp. 327–346.
  • [28] D. Boneh, K. Lewi, H. Montgomery, and A. Raghunathan, “Key homomorphic prfs and their applications,” in Advances in Cryptology–CRYPTO 2013.   Springer, 2013, pp. 410–428.
  • [29] Y. Dodis and A. Yampolskiy, “A verifiable random function with short proofs and keys,” in International Workshop on Public Key Cryptography.   Springer, 2005, pp. 416–431.
  • [30] M. Naor and O. Reingold, “Number-theoretic constructions of efficient pseudo-random functions,” Journal of the ACM (JACM), vol. 51, no. 2, pp. 231–262, 2004.
  • [31] I. B. Damgård, “On the randomness of legendre and jacobi sequences,” in Conference on the Theory and Application of Cryptography.   Springer, 1988, pp. 163–172.
  • [32] L. Grassi, C. Rechberger, D. Rotaru, P. Scholl, and N. P. Smart, “Mpc-friendly symmetric key primitives,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, October 24-28, 2016, 2016, pp. 430–443.
  • [33] A. Rajan, L. Qin, D. W. Archer, D. Boneh, T. Lepoint, and M. Varia, “Callisto: A cryptographic approach to detecting serial perpetrators of sexual misconduct,” in ACM COMPASS, 2018, pp. 49:1–49:4.
  • [34] D. Harnik, P. Ta-Shma, and E. Tsfadia, “It takes two to #MeToo: Using enclaves to build autonomous trusted systems,” arXiv preprint arXiv:1808.02708, 2018.
  • [35] J. Camenisch and A. Lysyanskaya, “A formal treatment of onion routing,” in Advances in Cryptology—CRYPTO, 2005.
  • [36] D. Wikström, “A universally composable mix-net,” in Theory of Cryptography Conference, M. Naor, Ed., 2004.
  • [37] Bar-Ilan University, “SCAPI: The Secure Computation API,” https://crypto.biu.ac.il/scapi/, version 2.3.
  • [38] OpenSSL Software Foundation, “Openssl,” https://openssl.org, 1.0.2.
  • [39] A. De Caro and V. Iovino, “jpbc: Java pairing based cryptography,” in Proceedings of the 16th IEEE Symposium on Computers and Communications, ISCC 2011.   Kerkyra, Corfu, Greece, June 28 - July 1: IEEE, 2011, pp. 850–855. [Online]. Available: http://gas.dia.unisa.it/projects/jpbc/
  • [40] L. Von Ahn, N. Hopper, and J. Langford, “Covert two-party computation,” in Proceedings of the thirty-seventh annual ACM symposium on Theory of computing.   ACM, 2005, pp. 513–522.
  • [41] N. Chandran, V. Goyal, R. Ostrovsky, and A. Sahai, “Covert multi-party computation,” in Foundations of Computer Science, 2007. FOCS’07. 48th Annual IEEE Symposium on.   IEEE, 2007, pp. 238–248.
  • [42] D. Chaum, “The dining cryptographers problem: Unconditional sender and recipient untraceability,” Journal of cryptology, vol. 1, no. 1, pp. 65–75, 1988.
  • [43] P. Golle and A. Juels, “Dining cryptographers revisited,” in International Conference on the Theory and Applications of Cryptographic Techniques.   Springer, 2004, pp. 456–473.
  • [44] D. I. Wolinsky, H. Corrigan-Gibbs, B. Ford, and A. Johnson, “Dissent in numbers: Making strong anonymity scale,” in Presented as part of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI 12), 2012, pp. 179–182.
  • [45] T. Chou and C. Orlandi, “The simplest protocol for oblivious transfer,” in LATINCRYPT 2015, 2015, pp. 40–58.

Appendix A Postponed Security Analysis

Definition 4 (Non-committing symmetric encryption [45]).

A symmetric encryption scheme $(E, D)$ with key space $\mathcal{K}$, message space $\mathcal{M}$, and ciphertext space $\mathcal{C}$ is non-committing if there exist two PPT algorithms $S_1$ and $S_2$ such that, for every message $m \in \mathcal{M}$, the pairs $(e', k')$ and $(e, k)$ are computationally indistinguishable, where $e' \leftarrow S_1(1^\lambda)$, $k' \leftarrow S_2(e', m)$, $k \leftarrow \mathcal{K}$, and $e \leftarrow E(k, m)$.

We refer the reader to [45] for a simple construction.
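As an illustration of why such schemes are easy to build, the following sketch (ours, not code from [45] or from the SATE implementation) shows the folklore one-time-pad instantiation: a uniformly random ciphertext can later be “opened” to any message by programming the key.

```java
import java.security.SecureRandom;

// Illustrative one-time-pad non-committing encryption (ours).
// enc/dec form the real scheme; simFakeCiphertext/simProgramKey play the
// roles of the two simulator algorithms: a random ciphertext c can be
// opened to any message m by handing out the key c XOR m.
public class OneTimePadNCE {
    private static final SecureRandom rng = new SecureRandom();

    static byte[] xor(byte[] a, byte[] b) {
        byte[] out = new byte[a.length];
        for (int i = 0; i < a.length; i++) out[i] = (byte) (a[i] ^ b[i]);
        return out;
    }

    static byte[] enc(byte[] key, byte[] msg) { return xor(key, msg); }
    static byte[] dec(byte[] key, byte[] ct)  { return xor(key, ct); }

    // Simulator algorithm 1: a fake ciphertext, produced without the message.
    static byte[] simFakeCiphertext(int len) {
        byte[] c = new byte[len];
        rng.nextBytes(c);
        return c;
    }

    // Simulator algorithm 2: a key under which the fake ciphertext decrypts
    // to the given message.
    static byte[] simProgramKey(byte[] fakeCt, byte[] msg) {
        return xor(fakeCt, msg);
    }

    public static void main(String[] args) {
        byte[] m = "allegation".getBytes();
        byte[] fake = simFakeCiphertext(m.length);
        byte[] k = simProgramKey(fake, m);
        System.out.println(new String(dec(k, fake))); // prints "allegation"
    }
}
```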

Proof Sketch of Theorem 3.

Our proof strategy consists of describing a simulator that handles users corrupted by the attacker and simulates the real-world protocol execution while interacting with the ideal functionality.

The simulator spawns honest users at adversarial will and impersonates them until the environment makes a corruption query on one of the users: at this point, the simulator hands the internal state of the target user over to the adversary and routes all subsequent communication to it, and the adversary can reply arbitrarily. For operations exclusively among corrupted users, the environment does not expect any interaction with the simulator. Similarly, interactions exclusively among honest nodes happen through secure channels, so the attacker gathers no information beyond the fact that the interactions took place. The simulator simulates the following honest nodes: 1) the honest -escrows, 2) the honest users, and 3) the CA for users’ real identities. For simplicity, we omit these operations in the description of our simulator. Next, we describe how the simulator behaves at various points of the protocol.

At several points in the SATE protocol, DKG is required: for the key used to compute MACs on identities, for the key used to reveal alleger identities, and for the per-bucket keys used for thresholding. To simulate this with a minority of statically corrupted -escrows, the simulator chooses a random key pair, performs the DKG simulation of [22, Theorem 1], and sends the public key to the corrupted -escrows. This simulation yields exactly the distribution of the real protocol [22, Theorem 1] and is hence indistinguishable from it. Notice that the simulator knows the DKG secret keys here. The simulator also generates the public-private key pairs for all the honest users and obtains certificates for them from the (simulated) CA.

For allegation filing and registration, we consider two cases depending on whether or not the alleger is honest.

Case 1: Honest alleger, corrupted minority of -escrows

When an honest alleger registers, the ideal functionality notifies the simulator. The simulator proves the honest alleger’s identity to the corrupted -escrows; this is possible because it simulates the CA and can generate arbitrary certificates. Then it generates new public keys, secret-shares them among the -escrows, and participates in the distributed computation of the two values described in §III-C (note that the simulator knows the corresponding secrets). If the adversary refuses to participate in this computation, the simulator sends FAIL to the ideal functionality; else it sends OK. As in the real protocol, the adversary obtains the first of these values but not the second. So far, this is exactly what happens in the real protocol, except that the DKG and the honest parties’ private keys are chosen by the simulator, but from the same distribution. Hence it is indistinguishable from the real execution.

When an honest alleger files an allegation, the ideal functionality sends the corresponding notification to the simulator. The simulator chooses a random public key whose private component it knows, generates a MAC on it, and sends to the corrupted escrows an allegation signed with the private part of that key, in which the allegation ciphertext is a random non-committing-encryption ciphertext. The simulator generates random meta-data and distributes a minority of its shares among the corrupted escrows; the distribution of the meta-data does not matter since it is information-theoretically hidden. Since the adversary has not seen the honest alleger’s public key before, the simulator can choose a random one. The ideal functionality now runs the bucketing protocol and returns the resulting equality relations: specifically, for every new bucket this allegation moves to, it returns which other allegations match this one. The bucketing protocol in the real execution goes through similar motions and computes PRFs along the way. Say it asks for the PRF output in some bucket. The ideal functionality would also have put this allegation in that bucket and hence would have sent the matching allegations to the simulator. If no matching allegations existed in the bucket prior to this one, the simulator returns a new random number to the adversary; else, it returns the same number it returned for a pre-existing matching allegation. Since the real function is a PRF, the adversary cannot distinguish its output from truly random numbers. Note that all matching allegations have the same meta-data by definition. Note also that if, at any point, the adversary refuses to cooperate in the distributed-input DPRF computation, the protocol is aborted and the simulator sends FAIL to the ideal functionality, which also halts execution; else it sends OK each time to move the protocol forward.
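The consistency argument above boils down to simple bookkeeping: the simulator lazily samples one random value per bucket and per matching set, and replays it for later members of the set. A minimal sketch of that table follows (our names, not the SATE codebase).

```java
import java.math.BigInteger;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

// Lazy table of simulated "PRF outputs" (illustrative sketch, ours).
// Within a bucket, allegations that the ideal functionality reports as
// matching must receive the same value; non-matching allegations receive
// fresh random values. Indistinguishability from the real protocol then
// follows from the pseudorandomness of the PRF.
public class PrfSimulationTable {
    private final SecureRandom rng = new SecureRandom();
    // (bucketId, matchClassId) -> simulated PRF output
    private final Map<String, BigInteger> table = new HashMap<>();

    // matchClassId identifies the set of matching allegations reported by
    // the ideal functionality for this bucket (e.g., the id of the first
    // allegation in the set); bits is the output length of the PRF.
    public BigInteger simulatedPrf(long bucketId, long matchClassId, int bits) {
        String key = bucketId + ":" + matchClassId;
        return table.computeIfAbsent(key, k -> new BigInteger(bits, rng));
    }
}
```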

When the allegation of an honest alleger is to be revealed, the ideal functionality sends the allegation text and the alleger’s identity to the simulator. The simulator sends, from the honest -escrows to the corrupted ones, shares of the (non-committing) symmetric encryption key chosen so that the ciphertext opens to the revealed allegation text. To reveal the alleger’s identity in the real protocol, the -escrows evaluate the identity-revealing computation on the public key that was used to file the allegation. To simulate this, the simulator picks a random unused key that it chose when that alleger registered and simulates the other -escrows’ behavior such that, if the adversary cooperates, it obtains the revealed identity; note that the simulator knows the required secrets. The allegation reveal now succeeds.

Case 2: Corrupted alleger, corrupted minority of -escrows

During registration, the adversary provides a proof of identity from a CA to the simulator. It also sends the honest -escrows’ shares of the public keys to the simulator. If the proof of identity is invalid, or the shares are inappropriate, the simulator sends FAIL to the adversary. Else, it sends a Register request to the ideal functionality on behalf of the corrupted alleger’s identity. Note that the simulator holds a majority of the shares of the public keys and can hence reconstruct them; it also knows the MAC secret key, so it can participate in the MAC computation on the public keys and produce the correct result.

When filing an allegation, the corrupted alleger sends its MACed public key to the simulator for broadcasting. It also encrypts and broadcasts the allegation text and secret-shares the key. Finally, it secret-shares a collision-resistant hash of the meta-data. The simulator verifies that the public key has not been used before and verifies the MAC on it. If the check fails, the simulator sends FAIL from the honest escrows to the corrupted alleger. If verification succeeds, the simulator files the allegation with the ideal functionality, which responds with the allegation identifier. (The simulator knows the filed allegation’s contents since it holds a majority of the necessary shares; again, if the shares are invalid, it sends FAIL to the adversary, as verifiable secret sharing is used.) Now the bucketing algorithm takes place; the simulation process is identical to the honest-alleger case. The ideal functionality returns matching allegations for the various buckets, and we simulate, for the corrupted escrows, a pseudo-random function on the meta-data. This is possible since we know, for the relevant buckets, the meta-data of which allegations match.

When an allegation filed by a corrupted party is to be revealed, the ideal functionality sends the allegation and its identifier to the simulator. The simulator then sends to the corrupted escrows the values needed to reveal the allegation, derived from the corresponding key that was used to file the allegation with that identifier. ∎