Federated Learning of User Authentication Models

07/09/2020
by   Hossein Hosseini, et al.
Qualcomm

Machine learning-based User Authentication (UA) models have been widely deployed in smart devices. UA models are trained to map input data of different users to highly separable embedding vectors, which are then used to accept or reject new inputs at test time. Training UA models requires direct access to the raw inputs and embedding vectors of users, both of which are privacy-sensitive. In this paper, we propose Federated User Authentication (FedUA), a framework for privacy-preserving training of UA models. FedUA adopts the federated learning framework to enable a group of users to jointly train a model without sharing the raw inputs. It also allows users to generate their embeddings as random binary vectors, so that, unlike the existing approach in which the server constructs spread-out embeddings, the embedding vectors are kept private as well. We show our method is privacy-preserving, scalable with the number of users, and allows new users to be added to training without changing the output layer. Our experimental results on the VoxCeleb dataset for speaker verification show that our method reliably rejects data of unseen users at very high true positive rates.


1 Introduction

There has been a recent increase in research and development of User Authentication (UA) models with various modalities such as voice (Snyder et al., 2017; Yun et al., 2019), face (Wang et al., 2018), fingerprint (Cao and Jain, 2018), or iris (Nguyen et al., 2017). Many commercial smart devices such as mobile phones, AI speakers and automotive infotainment systems have adopted machine learning-based UA features for unlocking the system or providing a user-specific service, e.g., music recommendation, schedule notification, or other configuration adjustments.

User authentication is a decision problem where a test input is accepted or rejected based on its similarity to the user’s training inputs. The similarity is often computed in an embedding space, i.e., if the predicted embedding of the test input is close to the reference embedding, the input is accepted, and otherwise rejected. Authentication models need to be trained with a large variety of users’ data so that the model learns different data characteristics and can reliably reject imposters. However, due to the privacy-sensitivity of both the raw inputs and the user embeddings, it is not possible to centrally collect users’ data to train the model. Protecting data privacy is particularly important in UA applications, since the model is likely to be trained and tested in adversarial settings. Specifically, leakage of the embedding vector makes the authentication model vulnerable to both training- and inference-time attacks, e.g., poisoning (Biggio et al., 2012) and evasion attacks (Biggio et al., 2013; Szegedy et al., 2013).

Federated learning (FL) is a framework for training machine learning models with the local data of users by repeatedly communicating the model weights and gradients between a server and a group of users (McMahan et al., 2017a). FL enables training models without users having to share their data with the server or other users and, hence, is a natural solution for training UA models. Training UA models in the federated setting, however, poses unique challenges described in the following.

In federated learning of supervised models, it is typically assumed that users have access to pairs of inputs and outputs. In most cases, for any given input, the output is naturally derived from user interactions or can be easily obtained. For example, in the next-word prediction task, the output is simply the next word typed by the user (Hard et al., 2018). In distributed training of UA models, however, the embeddings are not pre-defined. Moreover, even when users know their own embeddings, they need access to the embeddings of other users, so that the model can be trained to produce predicted embeddings that are not only close to the reference embedding but also far from the embeddings of other users.

In this paper, we propose Federated User Authentication (FedUA), a scalable and privacy-preserving framework for training UA models. Our contributions are summarized in the following.


  • We develop a new approach for UA where, instead of learning spread-out embeddings, users construct embeddings with high expected minimum separability. We propose to use random binary vectors, with the length of the vectors determined by the server such that the minimum distance between embeddings is more than a pre-determined value with high probability. Each user then trains the model to maximize the correlation of the model outputs with their embedding vector. After training, a test input is accepted if the distance of the predicted embedding to the reference one is less than a threshold, and otherwise rejected. We develop a “warm-up phase” to determine the threshold independently for each user, in which a set of inputs is collected and the threshold is then computed so as to obtain a desired True Positive Rate (TPR).

  • We show our framework is privacy-preserving and addresses the security problems of existing approaches in which embeddings are shared with other users or the server (Yu et al., 2020). Moreover, we show that using random binary embeddings enables training UA models with an output size significantly smaller than the number of users, and also allows new users to be added to training after training has started, without the need to change the output layer. Finally, our method has the advantage that no extra coordination is needed among the users or between the users and the server, beyond the communication usually done in the FL setting.

  • We present experimental results of our method on the VoxCeleb dataset (Nagrani et al., 2017) for speaker verification. We train the models with the speech data of a subset of users ( out of users) and evaluate the authentication performance on the data of the remaining users. We show the models trained in the federated setting achieve high TPR at very low False Positive Rates (FPR) for different lengths of embedding vectors. For example, with a TPR of , we obtained FPRs of and on data of new users for and , respectively.

2 Background

In this section, we provide a background on training classifiers with Federated Learning (FL) and also machine learning-based User Authentication (UA) models.

2.1 Federated Supervised Learning

Consider a setting where a set of users want to train a supervised model on their data. In FL, a server coordinates with the users to train a model in a privacy-preserving way, i.e., the data of each user will not be shared with the server or other users. Several methods have been proposed for training classifiers in the federated setting (Kairouz et al., 2019). The widely-used Federated Averaging framework, also called FedAvg, is described in Algorithm 1 (McMahan et al., 2017a).

FedAvg:
  Server: initialize model weights w_0
  for each global round t = 1, 2, ... do
     Server: S_t ← (random set of max(C·K, 1) users)
     Server: send w_{t-1} to the users in S_t
     Users u ∈ S_t: w_t^u, n_u ← UserUpdate(u, w_{t-1})
     Server: w_t ← Σ_{u∈S_t} (n_u / Σ_{v∈S_t} n_v) · w_t^u

  UserUpdate(u, w):   // Done by users
     B ← (split D_u into batches of size B)
     for each local epoch e from 1 to E do
        for each batch b ∈ B do
           w ← w − η ∇ℓ(w; b)
     return w and n_u to the server
Algorithm 1 (McMahan et al., 2017a) FedAvg. K is the number of users, C is the fraction of users selected for each round, D_u is the dataset of user u with n_u samples, B is the local batch size, E is the number of local epochs, and η is the learning rate.
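To make the aggregation step concrete, below is a minimal numpy sketch of one FedAvg round. The flat-vector weight representation and the user_update callback (assumed to run local SGD and return updated weights) are simplifying assumptions, not the paper's implementation.

```python
import numpy as np

def fedavg_round(global_weights, user_datasets, user_update, frac=0.1, rng=None):
    """One FedAvg aggregation round (sketch): sample a fraction of users,
    collect their locally updated weights, and average them weighted by
    dataset size, as in Algorithm 1."""
    rng = np.random.default_rng() if rng is None else rng
    num_users = len(user_datasets)
    m = max(int(frac * num_users), 1)
    selected = rng.choice(num_users, size=m, replace=False)

    updates, sizes = [], []
    for u in selected:
        updates.append(user_update(global_weights.copy(), user_datasets[u]))
        sizes.append(len(user_datasets[u]))

    coeffs = np.asarray(sizes, dtype=float)
    coeffs /= coeffs.sum()                              # n_u / sum_v n_v
    return sum(c * w for c, w in zip(coeffs, updates))  # weighted average
```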

2.2 User Authentication with Machine Learning

User authentication is a decision problem where a test input is accepted (reference user) or rejected (imposter user) based on the characteristics of the input data. The authentication is done by comparing an error value with a threshold $\tau$ as:

$d(x, X_u) \le \tau \;\Rightarrow\; \text{accept}, \quad \text{otherwise reject},$   (1)

where $d$ is a distance function, $X_u$ is the set of training inputs of user $u$, and $x$ is the test sample.

The distance is usually computed in an embedding space. Let $X_u = \{x_u^i\}_{i=1}^{n_u}$ and $e_u$ be the set of inputs and the embedding vector of user $u$, respectively, where $n_u$ is the number of training samples of user $u$. The UA model $f_w$ with parameters $w$ is trained to minimize the distance of the model output on $x_u^i$ to the embedding vector $e_u$, and to maximize its distance to the other embeddings $e_v$, $v \ne u$. The loss function is defined as follows:

$\mathcal{L}(w) = \sum_{u} \sum_{i=1}^{n_u} \Big[ d\big(f_w(x_u^i), e_u\big) - \lambda \sum_{v \ne u} d\big(f_w(x_u^i), e_v\big) \Big].$   (2)

At test time and for user $u$, a sample $x$ is accepted if $d(f_w(x), e_u) \le \tau_u$.

3 User Authentication with Federated Learning

In this section, we first outline the privacy requirements of UA applications and then review the challenges of training UA models in the federated setting.

3.1 Problem Statement

Authentication models need to be trained with a large variety of users’ data so that the model learns different data characteristics and can reliably authenticate users. For example, speaker recognition models need to be trained with the speech data of users with different ages, genders, accents, etc., to be able to reject imposters with high accuracy. One approach for training UA models is that a server collects the data of the users and trains the model centrally. This approach, however, is not privacy-preserving due to the need of having direct access to the personal data of the users. Protecting data privacy is particularly important in UA applications, where the model is likely to be trained and tested in adversarial settings.

In UA models, both the raw inputs and the embedding vectors are considered sensitive information. Specifically, sharing the raw inputs with the server, aside from exposing the user’s identity, e.g., voice or face attributes, makes the model vulnerable to test-time attacks, e.g., by authenticating copies of the original inputs. The embedding vector also needs to be kept private since it is used to authenticate a user. Leakage of the embedding vector makes the authentication model vulnerable to both training- and test-time attacks as explained in the following.


  • Poisoning attack (Biggio et al., 2012): The server, in addition to the users’ data, trains the model with pairs $(x', e_u)$ of its own choosing for a target user $u$. At test time, the model outputs an embedding close to $e_u$ when queried with $x'$ and thus wrongly authenticates $x'$ as a true sample from user $u$.

  • Evasion attack (Biggio et al., 2013; Szegedy et al., 2013): Attacks based on adversarial examples are known to be highly effective against deep neural networks (Carlini and Wagner, 2017). In the context of UA models, when a target embedding vector is known, an evasion attack can be performed to slightly perturb any given input such that the predicted embedding matches the target embedding and thus is accepted by the model.

3.2 Challenges

An alternative approach is using the FL framework, which enables training with data of a large number of users while keeping their data private by design. Training UA models in the FL setting, however, poses its own challenges described in the following.

Problem (1). In distributed training of UA models, the embedding vectors of users are not pre-defined. One approach to define embeddings is that the server assigns a unique ID to each user. Thus, user $u$ trains the model with pairs $(x_u^i, y_u)$, where $y_u$ is the one-hot representation of the user ID. This approach, however, has the following drawbacks:


  • It is not privacy-preserving as the server knows the embedding vectors of users, which makes the model vulnerable against both training- and test-time attacks.

  • It is not scalable because the size of the network output will be equal to the number of users. This is a major drawback especially in the FL setting because 1) model weights and gradients must be communicated many times between the server and the users, and 2) training and inference are usually done on resource-constrained local devices.

  • The number of participants needs to be known beforehand. In typical FL settings, new users might join after training starts, hence the model design must allow for various numbers of users. However, with one-hot output encoding, the output length must be set before training and cannot be increased after training starts.

Problem (2). Even when users know their own embeddings, they need access to the embedding vectors of other users in order to train the model with the loss function defined in Eq. (2). Due to privacy constraints, however, embeddings cannot be shared with other users or the server.

3.3 Related work: Federated Averaging with Spreadout (FedAwS)

The loss function defined in Eq. (2) causes the UA model to cluster training data such that the data of each user is placed near its corresponding embedding and far away from other embeddings. A recent paper (Yu et al., 2020) observed that, alternatively, the model could be trained to maximize the pairwise distances between different embeddings. They then proposed the Federated Averaging with Spreadout (FedAwS) framework, where the server, in addition to federated averaging, performs an optimization step on the embedding vectors to ensure that different embeddings are separated from each other by at least a margin of $\nu$. In particular, in each round of training, the server applies the following geometric regularization:

$\text{reg}_{sp}(\{e_u\}) = \sum_{u} \sum_{v \ne u} \max\big(0,\ \nu - d(e_u, e_v)\big)^2.$   (3)

FedAwS solves the problem of sharing embedding vectors with other users, but still requires sharing embeddings with the server, which seriously undermines the performance of UA models in adversarial settings, specifically against the poisoning and evasion attacks explained in Section 3.1. Hence, the question is: how can we maximize the pairwise distances between embeddings in a privacy-preserving way? In the next section, we present our framework for addressing the challenges of training UA models in the federated setting.
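As a concrete reference for the regularization step above, here is a minimal numpy sketch of a hinge-style spread-out penalty on a matrix of embeddings; the Euclidean distance and the function name are illustrative assumptions rather than the exact formulation used by FedAwS.

```python
import numpy as np

def spreadout_penalty(E, nu):
    """Penalize every pair of embeddings whose distance falls below the
    margin nu. E has shape (num_users, emb_len); returns a scalar."""
    diff = E[:, None, :] - E[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)          # pairwise distances
    penalty = np.maximum(0.0, nu - dist) ** 2     # hinge on the margin
    np.fill_diagonal(penalty, 0.0)                # ignore self-distances
    return penalty.sum()
```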

4 Proposed Method

There are two main requirements for training UA models: 1) from the performance perspective, the embedding vectors must be highly separable (Yu et al., 2020), and 2) the training method must be privacy-preserving, i.e., neither the raw inputs nor the embedding vectors may be shared with other entities participating in training. In the following, we present the Federated User Authentication (FedUA) framework and describe its properties.

4.1 Federated User Authentication (FedUA)

We adopt the FL framework for training UA models since it is a natural choice for training machine learning models on the data of a large number of users without having direct access to the raw inputs. Moreover, in our proposal, users train the model with random embedding vectors generated prior to the training. We show the proposed method is privacy-preserving and also provides a high degree of separability between the embedding vectors.

Training. Let $\ell$ be the length of the embedding vector. We propose to use random binary vectors as embeddings, i.e., $e_u[j] \sim \text{Bern}(0.5)$, where $e_u[j]$ is the $j$-th element of the embedding vector of user $u$ and $\text{Bern}(0.5)$ is a Bernoulli distribution with probability $0.5$. In experiments, we observed that a Bernoulli distribution performs better than other choices of generating random vectors. A model with binary outputs can be interpreted as an ensemble of binary classifiers, where each classifier independently splits users into roughly two equal groups. The length of the embedding vector is determined by the server such that the generated random vectors are sufficiently separable. In Section 4.2, we provide a probabilistic lower bound on the minimum distance of embedding vectors of length $\ell$ as a function of the number of users $K$.
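Each user can generate such an embedding locally; a minimal sketch follows (the function name and the use of numpy are illustrative):

```python
import numpy as np

def generate_embedding(length, rng=None):
    """Draw a random binary embedding: each bit is an independent
    Bernoulli(0.5) sample. Run locally by each user; no coordination
    beyond agreeing on the embedding length is needed."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.integers(0, 2, size=length)  # vector of 0/1 entries
```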

For training, the output vector of the model is passed through a sigmoid layer to generate a predicted embedding $\hat{e} = \sigma(f_w(x)) \in [0, 1]^{\ell}$. The user trains the model to maximize the correlation of the predicted and true embeddings using the following loss function:

$\ell_u(w) = -\frac{1}{\ell} \sum_{j=1}^{\ell} \big(2\,e_u[j] - 1\big)\,\hat{e}[j].$   (4)

The loss function is designed so as to encourage the predicted embedding $\hat{e}[j]$ to be high where $e_u[j] = 1$ and, similarly, to be low where $e_u[j] = 0$.
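A minimal numpy sketch of a correlation-style loss of this kind follows; the exact form and scaling used in the paper may differ, and the function name is illustrative.

```python
import numpy as np

def correlation_loss(pred, emb):
    """Loss between a predicted embedding (sigmoid outputs in [0, 1]) and
    a binary reference embedding. The sign term (2*emb - 1) pushes
    pred[j] toward 1 where emb[j] == 1 and toward 0 where emb[j] == 0."""
    signs = 2.0 * np.asarray(emb, dtype=float) - 1.0
    return -np.mean(signs * pred)  # lower is better
```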

Authentication. After training, each user deploys the model as a binary classifier to accept or reject a test sample. For an input $x$, the authentication is done by comparing the distance of the predicted embedding to the reference embedding with a threshold $\tau_u$ as follows:

$d\big(\hat{e}, e_u\big) \le \tau_u \;\Rightarrow\; \text{accept}, \quad \text{otherwise reject},$   (5)

where $\hat{e} = \sigma(f_w(x))$ is the predicted embedding. The threshold $\tau_u$ is determined by each user separately in a “warm-up phase,” such that the True Positive Rate (TPR) is more than a target value, say $p$. The TPR is defined as the rate at which the reference user is correctly authenticated. In the warm-up phase, user inputs $x^1, \dots, x^m$ are collected and the corresponding distances to the reference embedding, $d\big(\sigma(f_w(x^i)), e_u\big)$, are computed. The threshold is then set such that the desired fraction of inputs is authenticated. Our proposed FedUA framework is described in Algorithm 2.

Training:
  Server: determine the length of the embedding vectors, ℓ
  Server: send ℓ to all users
  Each user u: generate a random binary vector e_u of length ℓ as embedding vector
  Server and users: w ← FedAvg on the datasets D_u with embeddings e_u
  Return w

  WarmUpPhase(u, w):   // Done by users
  Collect inputs x^1, ..., x^m
  Compute d_i = d(σ(f_w(x^i)), e_u) for i = 1, ..., m
  Set τ_u equal to the k-th smallest value in {d_i}, where k = ⌈p · m⌉
  Return τ_u

  Authentication(x, w, e_u, τ_u):   // Done by users
  if d(σ(f_w(x)), e_u) ≤ τ_u then
     Return Accept
  else
     Return Reject
Algorithm 2 Federated User Authentication (FedUA). K is the number of users, D_u is the dataset of user u, e_u is the reference embedding, f_w is the trained model, p is the target TPR, x is a test sample, and FedAvg is described in Algorithm 1.
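A minimal numpy sketch of the WarmUpPhase and Authentication steps above; the mean absolute distance between predicted and reference embeddings and the default TPR value are illustrative assumptions.

```python
import numpy as np

def warmup_threshold(distances, target_tpr=0.95):
    """Pick a per-user threshold so that at least a target_tpr fraction of
    the warm-up inputs would be accepted (the k-th smallest distance)."""
    d = np.sort(np.asarray(distances, dtype=float))
    k = max(1, int(np.ceil(target_tpr * len(d))))
    return d[k - 1]

def authenticate(pred_emb, ref_emb, threshold):
    """Accept if the predicted embedding is close enough to the user's
    reference embedding."""
    dist = np.mean(np.abs(np.asarray(pred_emb) - np.asarray(ref_emb)))
    return "Accept" if dist <= threshold else "Reject"
```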

4.2 Analysis of FedUA

Minimum distance between embeddings. The following lemma provides a probabilistic lower bound on the minimum distance $d_{\min}$ between the embedding vectors.

Lemma 1.

Let $K$ be the number of users and $\ell$ be the length of the embeddings. Let also $d_{\min}$ be the minimum Hamming distance between all embedding vectors. We have:

$\Pr\big(d_{\min} \ge d\big) \;\ge\; \prod_{k=1}^{K-1} \Big(1 - \frac{kV}{2^{\ell}}\Big),$   (6)

where $V = \sum_{i=0}^{d-1} \binom{\ell}{i}$.

Proof.

Note that $V$ is the number of vectors with Hamming distance less than $d$ to a given vector. We prove the lemma by induction. For $K = 1$, Eq. (6) trivially holds. Assume it also holds for $K = k$. The probability that a new vector can be added such that $d_{\min}$ will not decrease is $1 - S_k / 2^{\ell}$, where $S_k$ is the space occupied by the Hamming spheres of radius $d - 1$ around the previous $k$ vectors. We have $S_k \le kV$, where $kV$ is the overall occupied space assuming that the previous Hamming spheres are disjoint. This completes the proof. ∎

Using Eq. (6), for a given number of users $K$ and desired minimum distance $d$, the server can obtain $\ell$ such that the minimum distance of the random embedding vectors is at least $d$ with at least a desired probability.
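As an illustration of how the server could pick the embedding length from this bound, a small Python sketch follows; the helper names and the search cap are assumptions.

```python
from math import comb

def min_distance_bound(num_users, length, d):
    """Lower bound on P(d_min >= d) for random binary embeddings,
    following the product bound in Eq. (6)."""
    V = sum(comb(length, i) for i in range(d))  # vectors within distance < d
    prob = 1.0
    for k in range(1, num_users):
        term = 1.0 - k * V / (2 ** length)
        if term <= 0.0:
            return 0.0
        prob *= term
    return prob

def choose_length(num_users, d, target_prob, max_length=4096):
    """Smallest embedding length whose bound meets the target probability."""
    for length in range(d, max_length + 1):
        if min_distance_bound(num_users, length, d) >= target_prob:
            return length
    raise ValueError("no length up to max_length meets the target")
```

For instance, choose_length(num_users, d, 0.99) would return the smallest length for which the bound exceeds 0.99.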

Practical advantages.

Using random binary embeddings enables training UA models with a significantly smaller output size compared to the one-hot encoding and thus scales to a larger number of users. Moreover, our framework allows new users to be added to training after training has started, without the need to change the output layer. Although adding new users causes the effective pairwise minimum distance between embedding vectors to decrease, it helps the model performance by training with the data of more users. Furthermore, the drop in minimum distance is not significant. As an example, assume , , and . The median values of over experiments are and , respectively, implying that even doubling the number of users only slightly reduces the minimum distance. Finally, generating embeddings randomly does not need any coordination among the users or between the users and the server, beyond the communication usually done in the FL setting.

Security analysis. In our proposed framework, neither the raw inputs nor the embeddings are shared with the server or other users, which makes the model robust against poisoning and evasion attacks. Our method, however, inherits the potential privacy leakage of FL methods, where users’ data might be recovered from a trained model or the gradients (Melis et al., 2019). It has been suggested that adding noise to gradients or using secure aggregation methods improves the privacy of FL (McMahan et al., 2017b; Bonawitz et al., 2017). Such approaches can be applied to our framework as well.
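As an illustration of the first mitigation mentioned above, here is a minimal sketch of clipping a model update and adding Gaussian noise before upload; the clipping norm and noise scale are illustrative assumptions, not values from the paper.

```python
import numpy as np

def noisy_update(update, clip_norm=1.0, noise_std=0.01, rng=None):
    """Clip a flattened model update and add Gaussian noise before it is
    sent to the server. In practice, clip_norm and noise_std would be
    chosen to meet a target privacy budget."""
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(update)
    if norm > clip_norm:
        update = update * (clip_norm / norm)
    return update + rng.normal(0.0, noise_std, size=update.shape)
```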

5 Related Work

FL has been used in a variety of applications, such as mobile keyboard prediction (Hard et al., 2018; Yang et al., 2018), keyword detection (Leroy et al., 2019), medical applications (Brisimi et al., 2018) and wireless communications (Niknam et al., 2019). Apple is also reported to use FL for the vocal classifier for “Hey Siri” (Apple, 2019), the details of which, however, have not been published. To the best of our knowledge, our work is the first to explore using FL for privacy-preserving training of UA models.

Our approach of assigning a random binary vector to each user is related to distributed output representation (Sejnowski and Rosenberg, 1987), where a binary function is learned for each bit position. It follows (Hinton and others, 1986) in that the functions are chosen to be meaningful and independent, so that each combination of concepts can be represented by a unique representation. Another related method is distributed output coding (Dietterich and Bakiri, 1991, 1994), which uses error-correcting codes (ECCs) to improve the generalization performance of classifiers, with the codes constructed such that the length of the codewords is greater than or equal to the number of classes. We, however, use random binary vectors as user embeddings to enable privacy-preserving training of UA models in the federated setting. Moreover, we propose to generate vectors of length much smaller than the number of users to improve the scalability of the method to a large number of users.

6 Experimental Results

In this section, we first describe the dataset, network and training setup, and then provide the authentication results of UA models trained in the federated setting.

6.1 Experimental Setup

Dataset. We evaluate the proposed FedUA framework on the VoxCeleb dataset (Nagrani et al., 2017), which was created for large-scale text-independent speaker identification in real environments. The dataset contains speakers’ data with to utterances per speaker, generated from YouTube videos recorded in various acoustic environments.

For training UA models, usually only a few samples are collected in one setting and the same environment. Hence, we selected speakers that had at least samples from a single video, which resulted in speakers. We used the first seconds of each audio file for training and validation, and the next seconds for testing the authentication performance of the model on users who participated in training. For each user, we train the model with utterances and use the remaining utterances for validation. We also generated a dataset of users who were not selected for training by choosing utterances from the remaining speakers and cropping their first seconds. All -second audio files are downsampled by a factor of to obtain vectors of length for the model input.
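A small sketch of the cropping and downsampling step described above; the clip length, sample rate, and downsampling factor are placeholders rather than the paper's values.

```python
import numpy as np

def prepare_utterance(waveform, clip_seconds=5.0, sample_rate=16000, factor=4):
    """Crop an utterance to a fixed duration and decimate it to produce a
    fixed-length input vector. Simple decimation is used here for brevity;
    a proper pipeline would low-pass filter before downsampling."""
    clip = np.asarray(waveform, dtype=np.float32)[: int(clip_seconds * sample_rate)]
    return clip[::factor]
```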

Network architecture.

The network consists of three convolutional blocks, each composed of convolution, relu, average pooling and Group Normalization (GN) layers, followed by fully-connected and sigmoid layers. GN is used instead of batch normalization (BN) following the observation that BN does not work well in non-iid data settings similar to ours (Hsieh et al., 2019). Table 1 provides the details of the network architecture.

Layer Output Size
Input
conv1d , relu, avg_pool1d , GN
conv1d , relu, avg_pool1d , GN
conv1d , relu, avg_pool1d , GN
Flatten
FC
sigmoid
Table 1: Network architecture for the UA model trained with speech data. conv1d is a one-dimensional convolutional layer, avg_pool1d is one-dimensional average pooling with a fixed downsampling rate, GN is a group normalization layer, FC is a fully-connected layer, and the output size of the final layer equals the length of the embedding vector.
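A hedged PyTorch sketch of an architecture in this spirit follows; all channel counts, kernel sizes, pooling rates, and group counts are illustrative placeholders, since the exact values are not reproduced here.

```python
import torch
import torch.nn as nn

class UAModel(nn.Module):
    """Three conv blocks (conv1d -> relu -> avg_pool1d -> GroupNorm)
    followed by a fully-connected layer and a sigmoid, as in Table 1."""

    def __init__(self, emb_len, channels=(16, 32, 64), kernel=5, pool=4, groups=8):
        super().__init__()
        layers, in_ch = [], 1
        for out_ch in channels:
            layers += [
                nn.Conv1d(in_ch, out_ch, kernel_size=kernel, padding=kernel // 2),
                nn.ReLU(),
                nn.AvgPool1d(pool),
                nn.GroupNorm(groups, out_ch),
            ]
            in_ch = out_ch
        self.features = nn.Sequential(*layers)
        self.fc = nn.LazyLinear(emb_len)  # infers the flattened size

    def forward(self, x):                 # x: (batch, 1, num_samples)
        h = self.features(x).flatten(start_dim=1)
        return torch.sigmoid(self.fc(h))  # predicted embedding in [0, 1]
```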

Training setup. We train federated models with the FedAvg method with local epochs and fraction . The models are trained with the SGD optimizer with a learning rate of . We provide experimental results with random embeddings of lengths and , for which the corresponding minimum distances between embedding vectors are , and , respectively.

6.2 Authentication Results

We provide the experimental results for models trained with random binary embeddings in the federated setting. The authentication performance is evaluated on different data, namely 1) training data, 2) validation data of users who participated in training, and 3) data of users who did not participate in training. Figure 1 shows the ROC curves. As expected, the authentication performance is best on training data. The performance slightly degrades when the model is evaluated on validation data of users who participated in training, and degrades further on data of new users. The models, however, achieve notably high TPR at very low FPRs in all cases. For example, with a TPR of , we obtained FPRs of and on data of new users for and , respectively, implying that the model can reliably reject the data of unseen users. Also, as expected, increasing the length of the embedding vector improves the performance.

Figure 1: ROC curves for models trained with random binary embeddings in the federated setting. The models are trained with the data of users (out of users) of the VoxCeleb dataset (Nagrani et al., 2017). The panels show TPR vs. FPR for embedding vectors of different lengths. The authentication performance of the models is evaluated on different data, namely 1) training data, 2) validation data of users who participated in training, and 3) data of users who did not participate in training. As can be seen, the models achieve high TPR at very low FPRs in all cases. For instance, with TPR=, we obtained FPR= and on data of new users for and , respectively. Also, as expected, increasing the length of the embedding vector improves the performance.
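To illustrate how such ROC curves can be computed from per-sample authentication distances, a small scikit-learn sketch follows; the helper name and the use of negated distances as scores are assumptions.

```python
import numpy as np
from sklearn.metrics import roc_curve

def evaluate_roc(genuine_dists, imposter_dists):
    """ROC curve from authentication distances: genuine distances come
    from the reference user's held-out data, imposter distances from
    other users. Smaller distance means a more confident accept, so the
    scores passed to roc_curve are negated distances."""
    scores = -np.concatenate([genuine_dists, imposter_dists])
    labels = np.concatenate([np.ones(len(genuine_dists)),
                             np.zeros(len(imposter_dists))])
    fpr, tpr, thresholds = roc_curve(labels, scores)
    return fpr, tpr, thresholds
```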

7 Conclusion

In this paper, we presented FedUA, a framework for training user authentication models. The proposed framework adopts federated learning and random binary embeddings to protect the privacy of the raw inputs and the embedding vectors, respectively. We showed our method is scalable with the number of users and does not need any coordination among the users or between the users and the server, beyond the communication usually done in the FL setting. Our experimental results on a speaker verification dataset show that the proposed method reliably rejects data of unseen users at very high true positive rates.

The proposed approach of choosing fixed random binary vectors enables training the model with highly separable embeddings, but does not take into account the characteristics of users’ data, e.g., age, gender or accent in speech inputs. In future work, we plan to extend the proposed method to adaptively update the embedding vectors during the training in a privacy-preserving way.

References

  • Apple (2019) Designing for privacy (video and slide deck). Apple WWDC. Note: https://developer.apple.com/videos/play/wwdc2019/708 Cited by: §5.
  • B. Biggio, I. Corona, D. Maiorca, B. Nelson, N. Šrndić, P. Laskov, G. Giacinto, and F. Roli (2013) Evasion attacks against machine learning at test time. In Joint European conference on machine learning and knowledge discovery in databases, Cited by: §1, 2nd item.
  • B. Biggio, B. Nelson, and P. Laskov (2012) Poisoning attacks against support vector machines. arXiv preprint arXiv:1206.6389. Cited by: §1, 1st item.
  • K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal, and K. Seth (2017) Practical secure aggregation for privacy-preserving machine learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, Cited by: §4.2.
  • T. S. Brisimi, R. Chen, T. Mela, A. Olshevsky, I. C. Paschalidis, and W. Shi (2018) Federated learning of predictive models from federated electronic health records. International journal of medical informatics. Cited by: §5.
  • K. Cao and A. K. Jain (2018) Automated latent fingerprint recognition. IEEE transactions on pattern analysis and machine intelligence. Cited by: §1.
  • N. Carlini and D. Wagner (2017) Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), Cited by: 2nd item.
  • T. G. Dietterich and G. Bakiri (1991) Error-correcting output codes: a general method for improving multiclass inductive learning programs. In AAAI, Cited by: §5.
  • T. G. Dietterich and G. Bakiri (1994) Solving multiclass learning problems via error-correcting output codes. Journal of artificial intelligence research. Cited by: §5.
  • A. Hard, K. Rao, R. Mathews, S. Ramaswamy, F. Beaufays, S. Augenstein, H. Eichner, C. Kiddon, and D. Ramage (2018) Federated learning for mobile keyboard prediction. arXiv preprint arXiv:1811.03604. Cited by: §1, §5.
  • G. E. Hinton et al. (1986) Learning distributed representations of concepts. In Proceedings of the eighth annual conference of the cognitive science society, Cited by: §5.
  • K. Hsieh, A. Phanishayee, O. Mutlu, and P. B. Gibbons (2019) The non-iid data quagmire of decentralized machine learning. arXiv preprint arXiv:1910.00189. Cited by: §6.1.
  • P. Kairouz, H. B. McMahan, B. Avent, A. Bellet, M. Bennis, A. N. Bhagoji, K. Bonawitz, Z. Charles, G. Cormode, R. Cummings, et al. (2019) Advances and open problems in federated learning. arXiv preprint arXiv:1912.04977. Cited by: §2.1.
  • D. Leroy, A. Coucke, T. Lavril, T. Gisselbrecht, and J. Dureau (2019) Federated learning for keyword spotting. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Cited by: §5.
  • H. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas (2017a) Communication-efficient learning of deep networks from decentralized data. In Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), Cited by: §1, §2.1, Algorithm 1.
  • H. B. McMahan, D. Ramage, K. Talwar, and L. Zhang (2017b) Learning differentially private recurrent language models. arXiv preprint arXiv:1710.06963. Cited by: §4.2.
  • L. Melis, C. Song, E. De Cristofaro, and V. Shmatikov (2019) Exploiting unintended feature leakage in collaborative learning. In 2019 IEEE Symposium on Security and Privacy (SP), Cited by: §4.2.
  • A. Nagrani, J. S. Chung, and A. Zisserman (2017) VoxCeleb: a large-scale speaker identification dataset. In Proceedings of the INTERSPEECH, Cited by: 3rd item, Figure 1, §6.1.
  • K. Nguyen, C. Fookes, A. Ross, and S. Sridharan (2017) Iris recognition with off-the-shelf cnn features: a deep learning perspective. IEEE Access. Cited by: §1.
  • S. Niknam, H. S. Dhillon, and J. H. Reed (2019) Federated learning for wireless communications: motivation, opportunities and challenges. arXiv preprint arXiv:1908.06847. Cited by: §5.
  • T. J. Sejnowski and C. R. Rosenberg (1987) Parallel networks that learn to pronounce english text. Complex systems. Cited by: §5.
  • D. Snyder, D. Garcia-Romero, D. Povey, and S. Khudanpur (2017) Deep neural network embeddings for text-independent speaker verification. In Proceedings of the INTERSPEECH, Cited by: §1.
  • C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus (2013) Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199. Cited by: §1, 2nd item.
  • F. Wang, J. Cheng, W. Liu, and H. Liu (2018) Additive margin softmax for face verification. IEEE Signal Processing Letters. Cited by: §1.
  • T. Yang, G. Andrew, H. Eichner, H. Sun, W. Li, N. Kong, D. Ramage, and F. Beaufays (2018) Applied federated learning: improving google keyboard query suggestions. arXiv preprint arXiv:1812.02903. Cited by: §5.
  • F. X. Yu, A. S. Rawat, A. K. Menon, and S. Kumar (2020) Federated learning with only positive labels. arXiv preprint arXiv:2004.10342. Cited by: 2nd item, §3.3, §4.
  • S. Yun, J. Cho, J. Eum, W. Chang, and K. Hwang (2019) An end-to-end text-independent speaker verification framework with a keyword adversarial network. In Proceedings of the INTERSPEECH, Cited by: §1.