Security and Privacy Enhancement for Outsourced Biometric Identification

09/12/2018 ∙ by Kai Zhou, et al. ∙ Michigan State University

Much research has focused on the secure outsourcing of biometric identification in the context of cloud computing. In such schemes, both the encrypted biometric database and the identification process are outsourced to the cloud. The ultimate goal is to protect the security and privacy of the biometric database and the query templates. Security analysis shows that previous schemes suffer from the enrolment attack and expose more information than necessary. In this paper, we propose a new secure outsourcing scheme that aims to enhance security in these two respects. First, besides all the attacks discussed in previous schemes, our proposed scheme is also secure against the enrolment attack. Second, we model the identification process as a fixed-radius similarity query problem instead of a kNN search problem. This modelling reduces the exposed information, thus enhancing the privacy of the biometric database. Our comprehensive security and complexity analysis shows that our scheme enhances the security and privacy of the biometric database and query templates while maintaining the same computational savings from outsourcing.

I Introduction

Remote storage and computation outsourcing are two integral services provided by cloud computing. Data owners such as individuals or organizational administrators are able to outsource their private data to the cloud for storage, as well as computationally intensive tasks for computation. Various works [10, 9, 8] have been devoted to securing the outsourcing process, i.e., ensuring the security of the private data while still enjoying the convenience provided by cloud computing.

Among various applications, outsourced biometric identification is of special interest. This is because, on one hand, the biometric database itself is of huge size thus making cloud storage an appealing solution. On the other hand, the identification process is computationally expensive due to the large database size. As a result, several schemes have been proposed to outsource biometric identification to the cloud. The ultimate goal of these schemes is to protect the security and privacy of the biometric templates in the database as well as the query templates under different attacks.

Secure outsourcing of biometric identification has attracted much research effort. In [1], the authors proposed two different schemes, one for the single-server model and one for the multiple-server model. However, the scheme for the single-server model still has prohibitive computational overhead for large databases, making it less practical. The multiple-server model requires that the servers do not collude; otherwise the security of the biometric templates is compromised. Subsequent works [2, 3] considered two non-colluding servers and thus suffer from the same drawback. In [7], a privacy-preserving biometric identification scheme was proposed for the single-server model. However, this scheme is not secure under an active attack in which the cloud is able to collude with query users. Most recently, the authors in [6] proposed a secure outsourcing scheme that can defend against three different attacks as defined in [6], including the attack where the cloud and users are allowed to collude. The basic idea is to model the identification process as a Nearest Neighbor (NN) search problem. That is, given an encrypted query template, the nearest neighbor in the database is identified and returned to the data owner, who can later decide whether these two templates belong to the same individual.

Although the scheme in [6] is efficient and can defend against the relatively severe attacks known to date, recent analysis [4] shows that it still suffers from the enrollment attack. That is, if the cloud is allowed to inject selected templates into the database, then the cloud is able to recover the query templates based on the intermediate computation results. We also point out that modelling the identification process as a kNN search problem has inherent limitations in terms of template privacy. To be more specific, in [6], given two encrypted templates and an encrypted query template, the cloud can determine which of the two templates is closer to the query. Repeating this process, the cloud is able to identify the template closest to the query. As a result, the relative distance information among the templates in the database is inevitably exposed.
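
To make this leakage concrete, the following minimal sketch assumes a hypothetical oracle `closer(i, j)` that tells the cloud only which of two stored templates is nearer to the query. The plaintext vectors are used solely to simulate that oracle; the function names are illustrative and are not taken from [6]. Under these assumptions, pairwise comparisons alone are enough to rank the entire database by distance to the query.

```python
from functools import cmp_to_key

def make_oracle(templates, query):
    """Simulate the only capability assumed here: comparing two templates' distances to the query."""
    def sq_dist(v):
        return sum((a - b) ** 2 for a, b in zip(v, query))

    def closer(i, j):
        # Negative if template i is closer than template j, positive if farther, 0 on a tie.
        di, dj = sq_dist(templates[i]), sq_dist(templates[j])
        return (di > dj) - (di < dj)

    return closer

def rank_with_oracle(num_templates, closer):
    """Order template indices from nearest to farthest using only pairwise comparisons."""
    return sorted(range(num_templates), key=cmp_to_key(closer))

templates = [[5, 5], [1, 0], [9, 9], [2, 2]]
closer = make_oracle(templates, query=[0, 0])
ranking = rank_with_oracle(len(templates), closer)
print(ranking)  # prints [1, 3, 0, 2]
```

The full ordering is what is meant by relative distance information: it goes well beyond the single answer that identification actually needs.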

To deal with the above two security issues, we propose a new secure outsourcing scheme for biometric identification. In particular, our proposed scheme can defend against all the attacks defined in [6] as well as the enrollment attack. Moreover, we model the identification process as a fixed-radius similarity query problem rather than a kNN search problem. That is, our scheme only enables the cloud to identify the templates within a fixed radius of the query template. No more information about the distance is revealed. In this way, less information is exposed compared to the scheme in [6], thus enhancing the privacy of the biometric database.

The rest of the paper is organized as follows. We introduce the system model and threat model in Section II. The secure outsourcing scheme is then proposed in Section III. The security and complexity analysis is given in Section IV. In Section V, we present some numeric results showing the efficiency of our proposed scheme. Finally, we conclude in Section VI.

II Problem Formulation

II-A System model

We consider a system consisting of three parties: a data owner, a set of end-users and a cloud service provider. The data owner has a biometric database composed of a collection of users’ biometric templates, where each template can be represented by a vector of fixed dimension. Each template is registered by an end-user during an enrolment stage. During a preparation stage, the data owner pre-processes the templates and outsources them to the cloud. Later, in an identification stage, an end-user submits her identification request, composed of a query template, to the data owner. The data owner generates a token for each specific query template and submits the token to the cloud. The cloud is then responsible for identifying and returning the template(s) in the database whose distance from the query template, under a given distance measurement function, is within a pre-defined threshold.
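
As a point of reference, the plaintext version of this identification task can be sketched as a linear scan. The sketch assumes Euclidean distance and treats "within a threshold" as less than or equal; the names `fixed_radius_query`, `database` and `threshold` are illustrative only. The outsourced scheme aims to let the cloud produce the same answer without seeing the plaintext vectors.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fixed_radius_query(database, query, threshold):
    """Return the indices of all templates whose distance to the query is within the threshold."""
    return [i for i, template in enumerate(database)
            if euclidean(template, query) <= threshold]

database = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [0.5, 0.2]]
matches = fixed_radius_query(database, query=[0.0, 0.0], threshold=1.5)
print(matches)  # prints [0, 1, 3]
```

Note that the result is a set of matching indices with no ordering or distance values attached.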

II-B Threat model

In this paper, we consider a semi-malicious cloud model where the cloud will follow the protocols but is allowed to collude with some malicious end-users. To be specific, the collusion happens when some malicious end-users will submit query templates to the data owner and choose to share the templates with the cloud. As a result, the cloud is able to learn pairs of template and its encrypted form.

In summary, depending on the different capabilities of the adversaries, we propose two attack models as follows.

  1. Passive Attack: the cloud knows the encrypted form of every template in the database. The cloud is also able to observe a series of encrypted queries. However, the cloud does not know the underlying templates in plaintext.

  2. Active Attack: besides the encrypted templates, the cloud is able to observe a series of encrypted queries as well as the corresponding plaintext queries. As mentioned earlier, such an attack can happen when the cloud colludes with malicious end-users.

Besides the above two attacks, we also allow the enrollment attack as considered in [4].

  3. Enrollment Attack: the cloud is able to inject templates in the enrollment stage. That is, the cloud is able to obtain a series of encrypted templates as well as the corresponding plaintext templates.

Informally, the security requirement of the outsourced biometric identification against the above three attacks is that the cloud is not able to learn more information about the templates than what is allowed through the identification process. That is, the cloud is only able to decide whether the distance between two templates is within a threshold or not. It is not feasible for the service provider to derive any key information about the enrolled templates and the query templates.

III Secure Outsourcing of Biometric Identification

III-A Basic Framework

Intuitively, our proposed secure outsourcing scheme consists of four phases. In the first phase, the data owner generates the system parameters and a transformation key. Then, each biometric template in the database is transformed to an encrypted form using the transformation key, and the transformed database is outsourced to the cloud. In the query phase, for every submitted query template, the data owner generates a token using the same transformation key. Finally, in the identification phase, the cloud identifies the template(s) whose distance from the query template is within a pre-defined threshold.

Our proposed scheme is composed of the following five algorithms.

  • Set up: the set-up algorithm generates the system parameters.

  • Key generation: the key generation algorithm generates the transformation key.

  • Transformation: given a template vector and the transformation key, the transformation algorithm transforms the vector into a disguised form.

  • Token generation: given a query vector and the transformation key, the token generation algorithm generates a token for the query vector.

  • Evaluation: given a transformed vector and a token, the evaluation algorithm outputs a result indicating whether the distance between the underlying template and query is within a pre-defined threshold.

III-B Secure Transformation and Evaluation

The essential part of our proposed scheme is the secure transformation and evaluation process. At a high level, the enrolled templates are transformed by the transformation function, and the token generation function generates a token by transforming the query template. It is critical that, given the transformed templates, it is computationally infeasible to recover the original vectors. However, the evaluation function is able to reveal some information about two templates, namely whether the distance between the two templates is within a threshold or not.

Our transformation process is similar to the techniques utilized in [6]. However, the computational models as well as the security requirements are fundamentally different. We now give some intuition about our transformation and evaluation process. The detailed construction is presented in Protocol 1. Given a template vector, we first extend it by inserting the threshold and some random numbers. The extended vector is then converted to matrix form and disguised by multiplying it with random matrices. Given a query vector, the token generation function transforms it into a disguised form in a similar way. The evaluation function takes the disguised template and the disguised query as input and outputs a value that is scaled by two one-time random positive numbers, one associated with the template and one with the query. From this output, the cloud is able to determine whether the underlying inner product is within the threshold or not. We note that since the two random numbers are one-time and differ for each template, the exact value of the inner product is concealed from the final result. We also note that the inner product is flexible enough to express different distance metrics between the template and the query.
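
One way to see how an inner product can encode a threshold test is the sketch below. The layout of the extended vectors is an assumption made for illustration (the exact layout used in Protocol 1 may differ), and the names `extend_template`, `extend_query`, `alpha`, `beta` and `pad` are hypothetical. With this layout the inner product equals `alpha * beta * (theta**2 - squared_distance)`, so its sign tells whether the template lies within the threshold while the positive random factors hide the magnitude.

```python
import random

def extend_template(x, theta, beta, pad):
    """Assumed layout: beta * [x_1..x_n, theta^2 - |x|^2, 1], followed by a random pad entry."""
    norm_sq = sum(v * v for v in x)
    return [beta * v for v in x] + [beta * (theta ** 2 - norm_sq), beta, pad]

def extend_query(y, alpha):
    """Assumed layout: alpha * [2*y_1..2*y_n, 1, -|y|^2], followed by 0 to cancel the pad."""
    norm_sq = sum(v * v for v in y)
    return [2 * alpha * v for v in y] + [alpha, -alpha * norm_sq, 0]

def inner(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

theta, x = 3, [1, 2]
beta, alpha, pad = (random.randint(1, 1000) for _ in range(3))
x_ext = extend_template(x, theta, beta, pad)
for y in ([2, 4], [5, 5]):
    value = inner(x_ext, extend_query(y, alpha))
    print(y, "within threshold" if value >= 0 else "outside threshold")
```

Here the first query is at squared distance 5 from the template and the second at squared distance 25, against a squared threshold of 9.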

III-C The Proposed Scheme

In this section, we give the detailed implementation of our secure outsourcing scheme in Protocol 1. In the protocol, the value computed by the evaluation function is determined by the Euclidean distance between the enrolled template and the query template, the threshold, and the two one-time random numbers. We use the Euclidean distance as an example to measure the similarity between two templates. However, it is easy to design the extended vectors such that the evaluation function gives other distances such as the Hamming distance. The correctness of our proposed scheme is shown in Theorem 1.

Input: .
Output: .

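Set up:

1:  The data owner sets the system parameters, which consist of the dimension of the templates and a pre-defined threshold.

Key generation:

1:  Randomly generate two matrices of the required dimension and a permutation.
2:  Set the transformation key from the generated matrices and the permutation.

Transformation:

1:  Generate two random numbers.
2:  (Extend) Extend the template to a higher-dimensional vector.
3:  (Permute) Permute the extended vector.
4:  Transform the permuted vector into a diagonal matrix with the vector on the diagonal.
5:  Generate a random lower triangular matrix with the diagonal entries fixed as 1. Compute the transformed template from the diagonal matrix, the random lower triangular matrix and the transformation key.

Token generation:

1:  On receiving a query template, the data owner generates two random numbers.
2:  (Extend) Extend the query template to a higher-dimensional vector.
3:  (Permute) Permute the extended vector.
4:  Transform the permuted vector into a diagonal matrix with the vector on the diagonal.
5:  Generate a random lower triangular matrix with the diagonal entries fixed as 1. Compute the token from the diagonal matrix, the random lower triangular matrix and the transformation key.
6:  Send the token to the cloud.

Evaluation:

1:  For every transformed template in the database, the cloud computes the evaluation result as the trace of a matrix formed from the transformed template and the token.
2:  The cloud marks the template as identified if the result satisfies the threshold condition; otherwise the template is not identified.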
Protocol 1 Secure Outsourcing of Biometric Identification
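
The overall flow can be illustrated end to end with the sketch below. It is not the authors' construction: the extended-vector layout, the matrix dimension (template length plus 3), the order in which the key matrices are applied, and all function names are assumptions chosen so that the trace computed by the cloud equals a randomly scaled threshold test. Exact rational arithmetic is used to avoid rounding issues in the sign check.

```python
import random
from fractions import Fraction

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def inverse(m):
    """Gauss-Jordan inverse over exact fractions; returns None if m is singular."""
    n = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def random_invertible(n, rng):
    """Draw random integer matrices until one is invertible; return it with its inverse."""
    while True:
        m = [[rng.randint(-9, 9) for _ in range(n)] for _ in range(n)]
        inv = inverse(m)
        if inv is not None:
            return m, inv

def unit_lower(n, rng):
    """Random lower triangular matrix with every diagonal entry fixed to 1."""
    return [[1 if i == j else (rng.randint(-9, 9) if j < i else 0) for j in range(n)]
            for i in range(n)]

def diag(v):
    return [[v[i] if i == j else 0 for j in range(len(v))] for i in range(len(v))]

def keygen(n, rng):
    m1, m1_inv = random_invertible(n + 3, rng)
    m2, m2_inv = random_invertible(n + 3, rng)
    perm = list(range(n + 3))
    rng.shuffle(perm)
    return {"m1": m1, "m1_inv": m1_inv, "m2": m2, "m2_inv": m2_inv, "perm": perm}

def transform(x, theta, key, rng):
    """Assumed form: C_x = M1 * S_x * diag(x') * M2, with x' scaled by a one-time beta."""
    beta, pad = rng.randint(1, 99), rng.randint(1, 99)
    ext = [beta * v for v in x] + [beta * (theta ** 2 - sum(v * v for v in x)), beta, pad]
    ext = [ext[p] for p in key["perm"]]
    s_x = unit_lower(len(ext), rng)
    return matmul(matmul(key["m1"], s_x), matmul(diag(ext), key["m2"]))

def token_gen(y, key, rng):
    """Assumed form: C_y = M2^-1 * diag(y') * S_y * M1^-1, with y' scaled by a one-time alpha."""
    alpha = rng.randint(1, 99)
    ext = [2 * alpha * v for v in y] + [alpha, -alpha * sum(v * v for v in y), 0]
    ext = [ext[p] for p in key["perm"]]
    s_y = unit_lower(len(ext), rng)
    return matmul(matmul(key["m2_inv"], diag(ext)), matmul(s_y, key["m1_inv"]))

def evaluate(c_x, c_y):
    """Cloud side: the sign of tr(C_x C_y) indicates whether the template is within the threshold."""
    n = len(c_x)
    trace = sum(c_x[i][k] * c_y[k][i] for i in range(n) for k in range(n))
    return trace >= 0

rng = random.Random(7)
key = keygen(2, rng)
enrolled = transform([1, 2], 3, key, rng)
print(evaluate(enrolled, token_gen([2, 4], key, rng)))  # squared distance 5, threshold squared 9
print(evaluate(enrolled, token_gen([5, 5], key, rng)))  # squared distance 25, threshold squared 9
```

Under these assumptions the first call reports a match and the second does not, and the cloud never handles the plaintext vectors or the key matrices.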
Theorem 1

The proposed outsourcing scheme in Protocol 1 is correct. That is, the evaluation result identifies a template exactly when the Euclidean distance between the template and the query is within the threshold.

Proof:

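For a square matrix, the trace is defined as the sum of its diagonal entries. Given an invertible matrix of the same size, multiplying the square matrix by it on one side and by its inverse on the other side is called a similarity transformation. From linear algebra, we know that the trace of a square matrix remains unchanged under similarity transformation. That is, $\mathrm{tr}(P A P^{-1}) = \mathrm{tr}(A)$ for a square matrix $A$ and an invertible matrix $P$. Applying this property to the value computed by the cloud, the key matrices cancel inside the trace.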

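Since the random matrices are selected as lower triangular matrices whose diagonal entries are all set to 1, multiplying a diagonal matrix by them leaves the diagonal entries unchanged. Thus the trace depends only on the diagonal matrices built from the extended vectors.

Since these matrices are diagonal, the trace of their product equals the inner product of the extended template vector and the extended query vector, which gives the claimed result.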
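
The chain of equalities in this proof can be written out under an assumed notation, which may differ from the symbols used in Protocol 1: let $C_x = M_1 S_x X M_2$ be the transformed template and $C_y = M_2^{-1} Y S_y M_1^{-1}$ the token, where $M_1, M_2$ are the key matrices, $S_x, S_y$ are unit lower triangular, and $X, Y$ are the diagonal matrices built from the extended vectors $x'$ and $y'$.

```latex
\begin{align*}
\mathrm{tr}(C_x C_y)
  &= \mathrm{tr}\!\left(M_1 S_x X M_2 \, M_2^{-1} Y S_y M_1^{-1}\right) \\
  &= \mathrm{tr}\!\left(M_1 \,(S_x X Y S_y)\, M_1^{-1}\right) \\
  &= \mathrm{tr}(S_x X Y S_y)
     && \text{(trace is invariant under similarity)} \\
  &= \sum_i (XY)_{ii}
     && \text{($S_x, S_y$ unit lower triangular, $XY$ diagonal)} \\
  &= \sum_i x'_i \, y'_i .
\end{align*}
```

The fourth step uses the fact that a product of lower triangular matrices has as its diagonal the entrywise product of the factors' diagonals.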

IV Security and Complexity Analysis

IV-A Security against Active Attack

We focus on the security of our proposed scheme under active attack since it implies the security under passive attack. We also use the transformation function as the representative, since the transformation process in the token generation function is similar. The basic idea is to show that an adversary cannot differentiate two transformed templates obtained from the transformation function. Thus, the adversary cannot learn key information from the disguised form of the templates. We have the following theorem.

Theorem 2

The proposed outsourcing scheme is secure against active attack; that is, the cloud cannot derive key information from the transformed templates.

Proof:

Consider the transformation of a template vector. The vector is first extended, and the extended vector is then expanded into a diagonal matrix. The diagonal matrix is then multiplied by a random lower triangular matrix and by the key matrices. We note that the product of the random lower triangular matrix and the diagonal matrix is itself a lower triangular matrix. We now focus on the product of this lower triangular matrix with the key matrices.

Denote the entries in and as and , respectively, where . For matrix , denote its non-zero entries in the lower triangular part as , where and . Then, by law of matrix multiplication, each entry in can be written in the form of

(1)

where , are polynomials. Equation (1) is obtained by summing up each terms of , and , respectively.

In the transformation process, the key matrices are fixed, while the remaining random values are one-time random numbers. The entries of the template are chosen and can be controlled by the adversary. However, since the random values are one-time random numbers, the polynomials in Equation (1) all look random to the adversary. As a result, the summation is random. This means that, for any two templates chosen by the adversary and one transformed template, the adversary cannot distinguish which template was actually transformed. As a result, the adversary cannot derive key information from the transformed templates.

The other important aspect of security is the extent to which the evaluation function reveals information about the templates. It is clear that the evaluation function will reveal the distance information that is necessary for identification. However, we note that every template vector is associated with a one-time independent random number and every query vector is associated with a one-time random number. As a result, in the active attack, what an adversary can observe through the evaluation function is a series of results, each scaled by these random numbers. Since the random numbers are selected independently, the final results only reveal whether the underlying value is positive or not. No more key information can be derived from the results.

IV-B Security against Enrolment Attack

As mentioned earlier, an enrolment attack was proposed in [4] that makes the secure outsourcing scheme in [6] vulnerable. In an enrolment attack, an adversary (i.e., the cloud) is able to inject known templates into the database. During evaluation, the cloud is able to derive an equation (Equation (3) in [4]) that relates quantities it knows to a single entry of a submitted query template. Since the remaining terms are either computable or selected by the cloud, the cloud is able to recover that entry. Repeating such an attack will finally recover the whole query template, as demonstrated in [4].

We now show that our proposed scheme is secure under the above enrolment attack. The underlying reason that the scheme in [6] cannot defend against such an attack is that the evaluation function cancels all the randomness (i.e., the random lower-triangular matrix) introduced in the encryption process. In comparison, the evaluation function in our scheme gives a result that still carries two one-time random numbers, one associated with the template and one associated with the query. As a result, the relation corresponding to Equation (3) in [4] additionally contains these two random factors.

Note that one factor is a one-time random number associated with the query and the other is a one-time random number associated with the injected template. Thus, although the adversary is able to insert known templates into the database, it cannot derive the entries of the query template due to the one-time randomness. In other words, our proposed outsourcing scheme is able to defend against the enrolment attack.
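
The effect of the one-time factors can be illustrated with a deliberately simplified model. This is not the exact attack equation of [4]; it assumes the cloud observes, for each injected template, the threshold squared minus the squared distance to the query, multiplied by a positive scale. The scales are fixed numbers here so that the outcome is reproducible; in the scheme they would be fresh random values. The names `observe` and `recover_query` are hypothetical.

```python
def observe(injected, query, theta_sq, scales):
    """Values seen by the cloud: scale_i * (theta^2 - |x_i - y|^2) for each injected template x_i."""
    return [s * (theta_sq - sum((a - b) ** 2 for a, b in zip(x, query)))
            for x, s in zip(injected, scales)]

def recover_query(observed):
    """Attack that assumes unscaled results for the zero vector followed by each unit vector."""
    base = observed[0]
    # For the unit vector e_i, the unscaled difference to the zero vector's result is 2*y_i - 1.
    return [(v - base + 1) / 2 for v in observed[1:]]

n = 3
injected = [[0] * n] + [[int(i == j) for j in range(n)] for i in range(n)]
secret_query = [3, 1, 4]

no_randomness = observe(injected, secret_query, 100, scales=[1, 1, 1, 1])
with_randomness = observe(injected, secret_query, 100, scales=[2, 3, 5, 7])

print(recover_query(no_randomness))    # prints [3.0, 1.0, 4.0]
print(recover_query(with_randomness))  # prints [45.0, 114.0, 210.0]
```

Without the scaling the injected templates reveal the query exactly; with per-template scaling the same computation yields values unrelated to the query.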

IV-C The Effect of Randomness on Security

It is important to understand the effect of different randomness on security. We briefly categorize the one-time randomness utilized by our scheme into three types.

  • Type I: result-disguising randomness. In the Extend step of both the transformation and the token generation algorithms, we multiply each entry of the template vector and of the query vector by a random number. Since these two random numbers remain in the decryption result, we call this result-disguising randomness.

  • Type II: vector-extension randomness. In the Extend step, we extend the vector and pad it with a random number.

  • Type III: matrix-multiplication randomness. In both the transformation and the token generation algorithms, we multiply the extended matrices with random matrices.

The evaluation function calculates the trace of a matrix. We note that the evaluation function cancels Type II and Type III randomness. However, Type I randomness remains in the evaluation result. This is important since the result then reveals only partial information about the plaintext, which is just sufficient for the purpose of biometric authentication. Also, the underlying reason that the scheme in [6] is vulnerable to the enrollment attack is that it lacks Type I randomness.

IV-D Complexity Analysis

We focus on the complexity analysis of token generation and evaluation, since they are executed repeatedly in the identification process while set up, key generation and transformation are one-time processes. As shown in Protocol 1, the computational bottleneck of token generation and evaluation lies in matrix multiplication. Without loss of generality, we assume that the matrices involved in the computation are all square and have the same dimension.

The token generation function takes a fixed number of matrix multiplications. Since matrix multiplication generally has cubic complexity in the matrix dimension without optimization, the complexity of token generation is also cubic. In the evaluation function, a trace needs to be computed. We note that there is no need to calculate the full matrix multiplication first; only the main diagonal entries need to be computed. Thus, evaluation has quadratic complexity in the matrix dimension.
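
The saving in the evaluation step can be sketched as follows: the trace of a product of two square matrices needs only the diagonal entries of the product, which costs a quadratic number of multiplications rather than the cubic cost of forming the whole product. The function names are illustrative.

```python
def trace_of_product(a, b):
    """tr(A B) = sum over i, k of A[i][k] * B[k][i], computed without forming A B."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

def trace_via_full_product(a, b):
    """Reference version: form the full product first, then sum its diagonal."""
    n = len(a)
    product = [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
               for i in range(n)]
    return sum(product[i][i] for i in range(n))

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(trace_of_product(a, b), trace_via_full_product(a, b))  # prints 69 69
```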

In terms of communication overhead, we assume that each entry in a matrix or vector has the same size. For each template in the database, the data owner needs to outsource the encrypted template, which is a matrix, to the cloud. Thus the communication overhead per template is proportional to the square of the matrix dimension times the entry size. Similarly, the communication overhead for each query template is of the same order.

V Numeric Results

The most important parameter that affects the performance of the identification process is the length of the template vectors. It determines the execution time of both token generation and evaluation, which are executed frequently in the querying process. Another parameter is the size of the database. However, since our identification algorithm is basically a linear scan of the database, it can be predicted that the time for identification is linear in the database size. Thus, it is of more interest to measure the performance of token generation and evaluation in terms of the vector length.

Fig. 1: Template transformation and evaluation time for each template

The simulation is conducted on a personal computer with a 1.6 GHz Intel Core i5 CPU, 4 GB RAM and macOS Version 10.12.6. The algorithm is implemented using the Armadillo C++ linear algebra library. In the simulation, we let the length of the vector vary from 100 to 2000, which covers the length of some typical biometric templates such as FingerCodes [5]. The execution time of token generation and evaluation for each template is presented in Fig. 1. We can see that both are quite efficient. For example, it takes around 1 second to generate a token for a template whose length is already quite long for real applications. The numeric results also agree with the complexity analysis, in which token generation has cubic complexity while evaluation has quadratic complexity.

VI Conclusion

In this paper, we proposed a new secure outsourcing scheme for biometric identification that aims to enhance the security and privacy of the outsourced biometric database and the query templates. Specifically, our scheme is able to defend against the enrollment attack that makes previous schemes vulnerable. By modelling identification as a fixed-radius similarity search problem, our scheme exposes less information than previous schemes that are based on the kNN search problem. In summary, our comprehensive security and complexity analysis shows that our scheme is able to enhance the security and privacy of the biometric database and query templates while maintaining the same computational savings from outsourcing.

References

  • [1] Marina Blanton and Mehrdad Aliasgari. Secure outsourced computation of iris matching. Journal of Computer Security, 20(2-3):259–305, 2012.
  • [2] Hu Chun, Yousef Elmehdwi, Feng Li, Prabir Bhattacharya, and Wei Jiang. Outsourceable two-party privacy-preserving biometric authentication. In Proceedings of the 9th ACM symposium on Information, computer and communications security, pages 401–412. ACM, 2014.
  • [3] Yousef Elmehdwi, Bharath K Samanthula, and Wei Jiang. Secure k-nearest neighbor query over encrypted data in outsourced environments. In Data Engineering (ICDE), 2014 IEEE 30th International Conference on, pages 664–675. IEEE, 2014.
  • [4] Changhee Hahn and Junbeom Hur. Poster: Towards privacy-preserving biometric identification in cloud computing. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pages 1826–1828. ACM, 2016.
  • [5] Anil K Jain, Salil Prabhakar, Lin Hong, and Sharath Pankanti. Filterbank-based fingerprint matching. IEEE Transactions on Image Processing, 9(5):846–859, 2000.
  • [6] Qian Wang, Shengshan Hu, Kui Ren, Meiqi He, Minxin Du, and Zhibo Wang. CloudBI: Practical privacy-preserving outsourcing of biometric identification in the cloud. In European Symposium on Research in Computer Security, pages 186–205. Springer, 2015.
  • [7] Wai Kit Wong, David Wai-lok Cheung, Ben Kao, and Nikos Mamoulis. Secure kNN computation on encrypted databases. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of data, pages 139–152. ACM, 2009.
  • [8] Kai Zhou, MH Afifi, and Jian Ren. ExpSOS: Secure and verifiable outsourcing of exponentiation operations for mobile cloud computing. IEEE Transactions on Information Forensics and Security, 12(11):2518–2531, 2017.
  • [9] Kai Zhou and Jian Ren. LinSOS: Secure outsourcing of linear computations based on affine mapping. In Communications (ICC), 2016 IEEE International Conference on, pages 1–5. IEEE, 2016.
  • [10] Kai Zhou and Jian Ren. Secure fine-grained access control of mobile user data through untrusted cloud. In Computer Communication and Networks (ICCCN), 2016 25th International Conference on, pages 1–9. IEEE, 2016.