Federated Learning using Smart Contracts on Blockchains, based on Reward Driven Approach

07/19/2021 ∙ by Monik Raj Behera, et al. ∙ J.P. Morgan

In recent years, federated machine learning has continued to gain interest and momentum in settings where insights must be drawn from data while preserving the data provider's privacy. However, one persistent challenge in the adoption of federated learning has been the lack of fair, transparent and universally agreed incentivization schemes for rewarding federated learning contributors. Smart contracts on a blockchain network provide transparent, immutable and independently verifiable proofs to all participants of the network. We leverage this open and transparent nature of smart contracts on a blockchain to define incentivization rules for the contributors, based on a novel scalar quantity - federated contribution. Such a smart contract based reward-driven model has the potential to revolutionize the adoption of federated learning in enterprises. Our contribution is two-fold: first, to show how a smart contract based blockchain can be a very natural communication channel for federated learning; second, leveraging this infrastructure, to show how an intuitive measure of each agent's contribution can be built and integrated with the life cycle of the training and reward process.

I Introduction

The concept of federated machine learning was introduced around 2016[konevcny2016federated]. It relies on the principle of remote and distributed execution of machine learning algorithms, and on the ability to share and aggregate individual models in a secure and anonymous manner. It is therefore implicit that federated machine learning depends on the availability of secure communication channels between remote participants to allow distribution of locally trained individual models.

Blockchain became popular with the launch of Bitcoin around 2009[nakamoto2012bitcoin]. Blockchain is a form of distributed ledger technology that relies on an honest majority of members in a network to validate the accuracy of the transactions executed on the network. It accomplishes this by allowing each of its members to execute a piece of Turing-complete software code (a.k.a. a smart contract) in an independent fashion, without any external influence or intervention. Although the proposed solution could be extended to other blockchains, this paper focuses primarily on Ethereum's implementation. Blockchains can therefore help the deployment of federated learning both by bringing datasets onto a unique, structured ledger (with potential privacy layers on it), and by guaranteeing the security, accuracy and correctness of the distribution of the model's parameters. This could be of particular value in compliance and anti-money laundering cases requiring the reconciliation of multiple sensitive datasets, in which fraud and anomaly detection models could improve manual audits and investigations[fortunato2010community].

However, because of the distributed nature of validation in current blockchains, unless a privacy layer is implemented, all communications between any two nodes are visible to the rest of the nodes of the network. Preserving the privacy of transactions on a blockchain, while still allowing all nodes to participate in the consensus process, is a difficult problem to solve. It is an active area of research and includes technologies such as Zero-Knowledge Proofs (ZKP). The foundations of ZKP are based on interactive proofs, as described in previous works[goldreich1991proofs] [bunz2020zether] [bowe2020zexe]. However, setting these up remains an onerous process in real-world implementations.

This poses a challenge for federated learning, which requires maintaining the privacy of each participant's machine learning models (and model gradients) and the anonymity of the model contributor. The proposed implementation solves this challenge by dynamically generating asymmetric and symmetric keys for each federated learning round (details follow in subsequent sections), with the caveat that the aggregation server node is conceptually akin to a Trusted Execution Environment (TEE), but at a consortium level[cheng2019ekiden].

However, even with a consortium-trusted aggregation server, the risk of individual nodes failing to contribute to the overall federated learning rounds continues to exist. In addition, malicious nodes could send a misleading model that skews the efficacy of the aggregated model. One potential way to address these challenges would be to have the aggregation server somehow detect such behavior and drop those contributors from the collaboration process. Unfortunately, this approach centralizes the solution and removes transparency from the overall process.

The current paper proposes a solution that avoids the above pitfalls by leveraging a unique and transparent smart contract design on blockchain to reward honest/active (and penalize malicious/under-performing) participants in the learning process, based on computing a novel scalar quantity - federated contribution. In our proposed solution, the smart contract is also responsible for reward (or penalty) specification and distribution (or fees), through immutable federated contribution records on the blockchain.

II Related work

II-A Federated Learning

Federated learning is a distributed machine learning setting where the goal is to train a high-quality global model, while training is done locally and privately on each individual participant (federated learning client). Training is carried out across a large number of clients. After the local models are trained, the improved local model gradients are sent securely over the network to the federated server for federated aggregation[li2018federated].

II-B Blockchain

There are several implementations of blockchain - Bitcoin, Ethereum, Hyperledger Fabric, etc. Ethereum[buterin2014ethereum] is a decentralized, open-source blockchain with smart contract functionality. Ether is the native cryptocurrency of the platform; after Bitcoin, it is the second-largest cryptocurrency by market capitalization. In addition to the existing public Ethereum mainnet, there exist Ethereum offerings for enterprises[swan2018blockchain] that provide certain enterprise-desired features. These include more control over the nodes in the network, higher performance, different cost profiles, node permissioning and different privacy implementations.

In general, there are three main types of blockchains[wood2014ethereum]:

  • Public Blockchain: a blockchain that anyone in the world can read, anyone in the world can send transactions to and expect to see them included if they are valid, and anyone in the world can participate in the consensus process.

  • Consortium Blockchain: a consortium blockchain is a blockchain where the consensus process is controlled by a pre-selected set of nodes.

  • Fully private Blockchain: a fully private blockchain is a blockchain where write permissions are kept centralized to one organization. Read permissions may be public or restricted to an arbitrary extent.

The solution proposed in this paper applies to Consortium (Enterprise) blockchains, where the identities of individual nodes are known to the Consortium Operator (a.k.a. Network Operator).

II-C Smart Contract on blockchain

A smart contract[bogner2016decentralised] is simply a program that runs on the Ethereum blockchain. It is a collection of code (its functions) and data (its state) that resides at a specific address on the blockchain. Smart contracts are permissionless, i.e. anyone can write a smart contract and deploy it to the network. The smart contract code is visible to all participants on the network, and any participant can independently execute that code to validate the outcome[bach2018comparative]. Smart contracts on Ethereum are written in its own programming language, called Solidity. Events on Ethereum[EthereumTutorial_2020] are well-defined ways of asynchronously exchanging data among the participants of the blockchain network. In Solidity, events[SolidityContracts_git] are dispatched as signals that smart contracts can fire. DApps, which are essentially decentralized applications, or anything connected to the Ethereum JSON-RPC API (an interface exposed by the blockchain network for connectivity and programmatic interaction), can listen to these events and act accordingly. Events can also be indexed, so that the event history can be searched later.

II-D Federated learning and blockchain

There is a growing literature on federated learning implementations through blockchain, a sign of the natural complementarity between these two technologies that we argue for. Since [zhou2020pirate] proposed using blockchain to maintain the global model with the community and reach consensus, a number of papers [kim2019blockchain, majeed2019flchain, bao2019flchain, li2020blockchain] have explored this avenue, but they mainly use the blockchain as safe and coherent storage for the global model, and fail to make full use of the potential of smart contracts both to coordinate the learning and, through that, to compute measurement functions of how each agent contributes to the global model.

[drungilas2021towards] adds an implementation of oracles to make it possible to deploy regressions on Hyperledger Fabric. While oracles can act as decentralized web services that extend the capabilities of smart contracts, we argue in this paper that having a federated learning life cycle that is exactly that of a smart contract improves the security, performance and conceptualization (especially in how contributors are distinguished and rewarded) of the federated learning task.

For the contribution measurement framework, we build on the proposal of [chen2018machine] to leverage the blockchain to evaluate updates from nodes, and potentially kick out low-rated nodes as a defense against malicious devices, and we link it to the literature on deletion-based rewarding and Shapley value computation. It is our hope to further bridge these two literatures, to be able to automatically compute different variations of Shapley values through blockchain-based smart contracts in federated learning settings. Our main contribution is to showcase how a natural infrastructure and life cycle could support this, leveraging the cryptographic, distributed computing, and consensus mechanisms within blockchain.

II-E RSA and AES algorithms

Asymmetric-key cryptographic algorithms are popular cryptographic techniques that use a public-private key pair for encryption. In the RSA algorithm[forouzan2015cryptography], the fundamental idea is to derive the public and private keys from long prime numbers, such that recovering the private key from the public key is computationally infeasible. The public key can be used to encrypt data, whereas the data can only be deciphered using the private key, which is kept privately and securely by its owner. In the case of symmetric cryptography, such as the AES algorithm[singh2013study], the same key is used for encryption and decryption; special care therefore needs to be taken to safeguard the symmetric key itself.

II-F Measuring contributions in federated learning

Recent work[9006179] describes the measurement of contributions towards improving the global model in federated learning, for both the horizontal and vertical classes of federated learning. The authors discuss a 'deletion method' for horizontal federated learning, where the change in test accuracy is considered over various iterations of training; in each iteration, the data points of a single client are removed. In this way, the degradation in model performance is measured, and the contributions of the federated clients are inferred accordingly. In the case of vertical federated learning, Shapley values are computed for each feature. Shapley values give a strong quantification of each feature's importance, followed by a mathematical approach for inferring the contributions of parties. There are, however, a multiplicity of ways in which the Shapley value can be implemented, with very disparate results, as shown in [sundararajan2020many]. On the other hand, while deletion-based methods have the merit of simplicity (whereas Shapley values rely on utility functions, and for that reason can be hard to compute), the uniform value attributed to each data point makes deletion-based rewards easily biased toward providers of outliers and irrelevant data. Our contribution aims both at drawing from the simplicity of the deletion-based approach (but by focusing on weight differences instead of training data differences) and at the conceptualization of Shapley values (as in our case the accuracy of the resulting global model can be a proxy for the agents' utility function).
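For concreteness, the deletion method can be summarized in a few lines. The sketch below is our own illustration, not code from [9006179]; train_fn and eval_fn are hypothetical placeholders for a training routine and a test-accuracy evaluator.

def deletion_contributions(client_datasets, train_fn, eval_fn):
    # Deletion method: retrain with each client's data removed and read
    # the accuracy drop as that client's contribution.
    baseline = eval_fn(train_fn(client_datasets))  # accuracy with all clients
    contributions = []
    for k in range(len(client_datasets)):
        held_out = client_datasets[:k] + client_datasets[k + 1:]
        contributions.append(baseline - eval_fn(train_fn(held_out)))
    return contributions  # larger accuracy drop => larger contribution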

III Blockchain for federated communication

Earlier works have established REST, RPC and gRPC[ventre2018sdn] as popular choices for communication between the federated server and clients. For more secure communication, various battle-tested approaches including network firewalls, SSL and token-based authentication have been leveraged. Though these methods provide a secure medium, they lack transparency in the communication medium itself, which can be established by using blockchain. This paper proposes the use of blockchain and smart contracts (Ethereum-based), which provide a decentralized mechanism of communication between peers and the federated server. Using smart contracts, communications are secured at the blockchain level, which uses various asymmetric cryptography techniques for encryption and for modification of state on the blockchain network. This also entrusts the emission of events for communication between the federated server and client nodes with an extended level of security and transparency.

In a "consortium blockchain network", where the individual participants are trusted enterprises and organizations, there remains a possibility of a few participants acting as bad actors - for example, participants that consume resources without contributing enough, or vice versa. These are some of the possible scenarios in a consortium-governed federated learning network. To manage and counter such behaviors through incentives (both reward and penalty), smart contracts provide a transparent and immutable way of maintaining contribution records on the blockchain. Based on the contribution record from the blockchain, a given participant is either rewarded or penalized through on-chain blockchain tokens, or through an off-chain mechanism established by the consortium.

III-A Life-cycle of a federated aggregation event

Federated learning in a consortium network comprises a consortium-trusted and security-hardened aggregation server. All the participants, who are clients in the federated learning ecosystem, listen for blockchain events from a smart contract (Ethereum smart contracts, as Ethereum is the blockchain network being used). Table I gives an overview of a typical federated learning round. All the events on the blockchain network regarding federated learning are triggered by the federated aggregation server. This design allows individual participants to be lightweight in terms of responsibilities, and to focus only on local model improvement. The events initiated by the federated learning server include the initial distribution of the base machine learning model, subsequent federated learning cycles, contribution announcements on the chain and contribution fee notifications.


Time | Event                                                  | Action on | Publish?
t1   | Initiate FL round                                      | Server    | Yes
t2   | Receive initiate event                                 | Client(s) | No
t3   | Encrypt local model with key received in previous step | Client(s) | No
t4   | Publish encrypted local model                          | Client(s) | Yes
t5   | Aggregate encrypted local models                       | Server    | No
t6   | Broadcast new global model                             | Server    | Yes
TABLE I: Life-cycle of a federated learning communication event. The Time column represents relative timestamps, where an increment in timestamp suggests relative, not absolute, ordering. Event is the respective event type in the communication round. The 'Action on' column shows which role, client or server, is expected to act. The 'Publish' column depicts whether the event is published as a broadcast on the blockchain.
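To make the round life-cycle concrete, the following is a minimal, hypothetical sketch of the server-side control loop for one round. The publish, collect, decrypt and aggregate callables stand in for the blockchain and cryptography plumbing described in the next sections; none of the names come from the paper.

def run_federated_round(publish, collect, decrypt, aggregate,
                        round_public_key, expected_clients):
    # t1: broadcast round initiation with the fresh RSA public key (plaintext).
    publish(event_type=b"INITIATE_ROUND", body=round_public_key,
            is_encrypted=False)
    # t2-t4: clients encrypt with fresh AES keys and publish; collect them all.
    envelopes = collect(event_type=b"LOCAL_MODEL", count=expected_clients)
    # t5: only the server holds the round's RSA private key, so only it can
    # decrypt the local models before aggregating (e.g., FedAvg).
    new_global = aggregate([decrypt(e) for e in envelopes])
    # t6: broadcast the new global model in plaintext (not sensitive).
    publish(event_type=b"GLOBAL_MODEL", body=new_global, is_encrypted=False)
    return new_global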

III-B Encryption of event data

In the previous sections, we discussed how blockchain ensures the required security at its core. Though data on a blockchain is immutable, privacy is not guaranteed, because of the very nature of how blockchains work. In a consortium network with enterprise participants, there is a high possibility that participants are not comfortable with letting peer participants know their model weight gradients. If local model weights (and gradients) are revealed, this poses a risk of revealing the statistical properties of an individual participant's data, if not the actual data.

In order to privately send model weights from a participant to the federated server on the blockchain, in a consortium network, this paper proposes asymmetric encryption using RSA cryptography. Every time a new federated learning round is published as a blockchain event through the smart contract, the federated aggregation server generates a new RSA key pair. The private key of the pair stays with the server, as it will be used in later stages to decrypt the encrypted event message cipher to plaintext. The public key is sent across the network to all the participants. Each participant generates an AES key for encrypting its machine learning model, and then uses the public key received from the federated server to encrypt the AES key[khanezaei2014framework]. Rotating the RSA keys for every new federated learning round decreases the chances of model information being compromised over the blockchain for any given participant on the network. Note that only the event data containing a participant's local model weights needs to be encrypted. Other event data, such as the global model sent from the server to the clients or the initiation of a federated round broadcast, is sent as plaintext, as it is not sensitive among the peers in the blockchain network.
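A minimal sketch of this hybrid scheme, using the Python cryptography package, is given below. The paper does not specify the AES mode or key sizes; AES-GCM with a 256-bit key and RSA-2048 with OAEP are our own illustrative assumptions.

import os
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Server: generate a fresh RSA key pair for this federated round.
server_private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
server_public_key = server_private_key.public_key()

# Client: generate a fresh AES key, encrypt the serialized local model with
# it, then encrypt the AES key with the server's round public key.
aes_key = AESGCM.generate_key(bit_length=256)
nonce = os.urandom(12)
local_model_bytes = b"...serialized model weights..."  # placeholder payload
encrypted_model = AESGCM(aes_key).encrypt(nonce, local_model_bytes, None)
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
encrypted_aes_key = server_public_key.encrypt(aes_key, oaep)

# Server: recover the AES key with the private key, then decrypt the model.
recovered_key = server_private_key.decrypt(encrypted_aes_key, oaep)
assert AESGCM(recovered_key).decrypt(nonce, encrypted_model, None) == local_model_bytes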

III-C Smart contract for communication and contribution towards federated learning

In the previous sections, smart contracts have been described as a way of executing Turing-complete programs on blockchains. Ethereum smart contracts have a concept of events, which are a simple and cheap method of emitting a broadcast message, with defined parameters, to all the nodes. Events are not considered a state change on the blockchain, hence they consume far less gas[pierro2019influence] than state-changing transactions. It is important to understand that events, whether published by the participants or by the federated server, are broadcasts. The dynamically generated encryption keys explained in the earlier section safeguard the privacy interests of the federated clients. Federated clients listen to the agreed-upon Ethereum event from a smart contract in order to receive the event and take the necessary actions, essentially making smart contracts a rudimentary pub-sub[o2007toward] communication channel.

In the current paper, smart contracts are proposed as a mechanism to maintain transparent, immutable records of the contributions which improve (or degrade) the global federated learning model. The computation that determines each individual's contribution in a federated learning round (described in a subsequent section) is performed off the blockchain, considering resource and computation constraints. After the computation is completed, the records are published on the blockchain, anonymously, for everyone to see. This makes the contributions (and, indirectly, participants' expense fees) transparent. The idea is to motivate the participants to be honest in the network, at least to a certain extent.

IV Federated aggregation and contribution

Federated learning with a centralized aggregation server can implement federated aggregation[bonawitz2019towards] in various ways, such as FedAvg and FedSGD. When the number of clients is quite large, the impact of a few clients acting as bad actors, or of clients with noisy training data, on the global model is smoothed out by the averaging algorithm. But in a consortium setting, where the number of clients is much smaller, the local weights of a federated client with noisy data or malicious intent would impact the overall weights of the federated learning model. To tackle this challenge, we propose a novel way of computing a scalar quantity, federated contribution, across the network, and using a smart contract to publish and store it on the blockchain (as discussed in the earlier section). Compared to earlier work[9006179], federated contribution establishes a way to define the contribution of each participant in a federated learning setting, whether the participants hold training data of similar statistical properties or of non-overlapping (orthogonal in feature space) statistical properties.

IV-A Federated aggregation

FedAvg[li2019convergence] is one of the popular algorithms in the federated aggregation domain, and ensures a complete and optimal solution provided the learning rate and local learning contributions are accurately accounted for. Inspired by previous works, our problem formulation for non-IID (or IID) data, which reflects a more realistic problem setup in a consortium blockchain network, is

$$\min_{w} F(w), \qquad F(w) = \sum_{k=1}^{K} p_k F_k(w)$$

where $k$ is the index of a client, $w$ is the set of model weight parameters for any generic machine learning algorithm that can be adapted to the federated learning setting, $F(w)$ is the global objective function, $F_k(w)$ is the local objective function of client $k$, $p_k$ is the learning importance factor - a scalar value determining the relative importance of an individual client's local model - and $K$ is the total number of clients. With $p_k$ as the learning importance factor in federated learning, it is assumed that

$$p_k \geq 0 \quad \text{and} \quad \sum_{k=1}^{K} p_k = 1.$$

With FedAvg as the aggregation algorithm, the individual weight updates for iteration $t+1$ at layer $l$ can be defined as

$$w_{t+1}^{k,l} = w_{t}^{l} - \eta \, \nabla F_k\big(w_{t}^{l}\big), \qquad w_{t+1}^{l} = \sum_{k=1}^{K} p_k \, w_{t+1}^{k,l}$$

where $w_{t}^{l}$ is the weight for iteration $t$ at layer $l$ and $\eta$ is the learning rate parameter.
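To make the aggregation step concrete, here is a minimal NumPy sketch of the weighted per-layer averaging above; the client count, layer shapes and importance factors are illustrative assumptions, not values from the paper.

import numpy as np

def fed_avg(client_weights, p):
    # client_weights: list of K models, each a list of per-layer arrays.
    # p: K learning importance factors, non-negative and summing to 1.
    assert np.isclose(np.sum(p), 1.0)
    num_layers = len(client_weights[0])
    return [sum(p_k * model[l] for p_k, model in zip(p, client_weights))
            for l in range(num_layers)]

# Example: three clients, two layers, equal importance.
clients = [[np.random.randn(4, 3), np.random.randn(3)] for _ in range(3)]
new_global = fed_avg(clients, p=np.array([1 / 3, 1 / 3, 1 / 3]))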

IV-B Federated learning contribution

In the previous sections, the motivation for federated contribution was discussed. Intuitively, federated contribution is a scalar quantity which depicts the deviation, or divergence, between two machine learning models. We now define federated contribution mathematically.

$$\Delta W_{k}^{l} = W_{k,t+1}^{l} - W_{t}^{l}, \qquad C_k = \left\| \left( \|\Delta W_{k}^{1}\|_F,\; \ldots,\; \|\Delta W_{k}^{L}\|_F \right) \right\|_2, \qquad \hat{C}_k = \frac{C_k}{\sum_{j=1}^{K} C_j}$$

where $C_k$ is the absolute federated contribution of client $k$, $\|\cdot\|_2$ represents the 2-norm, $\|\cdot\|_F$ represents the Frobenius norm[dhillon2005generalized], $L$ indexes the final layer's weight matrix of a generic machine learning model, $\Delta W_{k}^{l}$ represents the difference of the model weight parameter matrices for layer $l$ of client $k$, $W_{k,t+1}^{l}$ represents the model weight for layer $l$ of client $k$ at iteration $t+1$, $W_{t}^{l}$ is the model weight for layer $l$ of the global model at iteration $t$, and $\hat{C}_k$ is the relative federated contribution of client $k$.

For any generic machine learning algorithm - linear regression, logistic regression, neural networks, etc. - the primary intent is to find a weight matrix, or a set of weight matrices. These weight matrices are computed using loss functions and gradient descent approaches, in various forms depending on the algorithm. Consider the representational vector $v_k = \left( \|\Delta W_{k}^{1}\|_F, \ldots, \|\Delta W_{k}^{L}\|_F \right)$, with minimum size one: each element of $v_k$ is the Frobenius norm of $\Delta W_{k}^{l}$. Here $\Delta W_{k}^{l}$, as defined previously, represents the element-wise difference of the weight matrices (the order of subtraction doesn't matter, as the Frobenius norm squares each element). We consider the element-wise difference of the two weight matrices for each layer, as it is the inverse operation to the gradient descent weight updates, where improvements are added, as described in the problem formulation earlier. The Frobenius norm is a well-established method to find the magnitude of a matrix from the origin of a hyperspace. When we compute the Frobenius norm of $\Delta W_{k}^{l}$, it represents the magnitude of the deviation between the two weight matrices (the global weight matrix of the previous iteration and the local weight matrix of the current iteration, for layer $l$). The vector $v_k$ represents a set of magnitudes of deviation of the local model's weight matrices from the global model's weight matrices, each along an independent, individual axis. Finally, to calculate $C_k$ for client $k$, we compute the 2-norm, i.e. the Euclidean norm, of $v_k$. This represents the magnitude of the distance from the origin, which quantifies the combined deviation of the local model (a set of weights) from the global model (again, a set of weights) as a scalar quantity.

To interpret federated contribution: the federated aggregation server calculates $\hat{C}_k$ for each client. If the federated contribution value is relatively high, the given client has contributed to a higher degree. Contributing to a higher (or lower) degree practically quantifies the contribution to modifying the global model - by training on more data points, by training on data points whose statistical properties are distant from the earlier training data, or possibly by higher noise in the data. Intuitively, the divergence of the local model after training on new data points is larger when larger gradient descent updates are performed, which can result from variance in the new data, the use of better data points in the local training rounds, or a greater training size. If the divergence is relatively small, one can infer that the participant used noisy or unrelated data points, which may not be acceptable for the global federated model. Our framework thus points at a possible way to fix a key limitation of the Shapley value framework (which we hope to build the link with in a future paper) - the fact that it only provides valuations for points within a fixed data set, does not account for statistical aspects of the data, and does not give a way to reason about points outside the data set.
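A short NumPy sketch of the federated contribution defined above follows; the helper names are our own, but the computation is exactly the per-layer Frobenius norms followed by the Euclidean norm and normalization.

import numpy as np

def federated_contribution(local_weights, global_weights):
    # v_k: per-layer Frobenius norms of the (local - global) weight deltas.
    v_k = [np.linalg.norm(w_loc - w_glob)  # Frobenius norm for 2-D arrays
           for w_loc, w_glob in zip(local_weights, global_weights)]
    return float(np.linalg.norm(v_k))      # 2-norm of the deviation vector

def relative_contributions(all_local_weights, global_weights):
    c = np.array([federated_contribution(w, global_weights)
                  for w in all_local_weights])
    return c / c.sum()                     # relative contribution per client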

V Experiments

In this paper, the experiments to validate the hypothesis of using blockchain smart contracts and federated contribution were carried out by setting up permissioned Ethereum blockchain nodes in the AWS cloud. Within this network, one node acts as the federated aggregation server, and the other three nodes act as federated clients. It is a consortium blockchain setup, with each individual client node owning its own set of data. Figure 1 shows the architecture of the proposed experimental setup. The federated server node runs a Python daemon process which listens to the events generated by the smart contracts on the network. Upon receipt of an event, and based on the event type, it either sends the global model to every node as a broadcast, or computes the aggregated version of the global model from the latest locally trained gradients of the individual federated clients. On the federated clients, a similar Python application is executed, which listens to events and sends the encrypted local model as required. It also acts as a service which serves the machine learning model (exposing a remote REST endpoint which any other program can consume for prediction on input data), and is responsible for re-training on new batches of data points. The applications responsible for directly interacting with the blockchain nodes are termed "dApps", as depicted in Figure 1, which essentially means decentralized applications.

The federated aggregator server node deploys both the 'communication' and the 'contribution records' smart contracts. All of the federated client nodes use the addresses of these two public contracts as a communication channel, both for publishing events and for listening to any generated events. The interfaces of these smart contracts are discussed further below.


Fig. 1: In the above architecture diagram, individual nodes run decentralized applications, with processes to serve and re-train data models. Two smart contracts, one for communication and one for contributions, are deployed on the blockchain. The federated server is responsible for sending the dynamically generated RSA key pair's public key, which is used to encrypt each participant's newly generated AES key, used in turn for model encryption. Encrypted models are published on the blockchain. Even though all participants can receive the privately encrypted model gradients, only the federated aggregation server can decipher and use the models, as it possesses the RSA private key corresponding to the public key.

V-A Data description

In this paper, to test the hypothesis against standard data sets, the MNIST[lecun-mnisthandwrittendigit-2010] and Fashion-MNIST[xiao2017fashionmnist] data sets have been used. Each data set has 10,000 test data points, 60,000 training data points and 10 target classes. We split the training data into an initial portion for the "genesis" global model, with the remainder used as training data for the federated clients. For both data sets, we performed our experiments with the two data splits described below (a minimal split sketch follows the list) -

  1. Independent and Identical Data: The data is randomly split and distributed across all the federated clients, each receiving training data points covering all the target classes.

  2. Non-Independent and Identical Data: The data is split and distributed across all the federated clients such that each client's training data covers only a subset of the target classes.
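The NumPy sketch below illustrates the two regimes; the contiguous-block class assignment in the non-IID case is our own illustrative assumption, not necessarily the paper's exact configuration.

import numpy as np

def iid_split(x, y, num_clients):
    # Random permutation gives every client a mix of all target classes.
    idx = np.random.permutation(len(x))
    return [(x[part], y[part]) for part in np.array_split(idx, num_clients)]

def non_iid_split(x, y, num_clients):
    # Sorting by label means each client receives only a contiguous block
    # of classes, i.e. a subset of the target classes.
    order = np.argsort(y)
    return [(x[part], y[part]) for part in np.array_split(order, num_clients)]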

V-B Blockchain smart contract interface

As discussed earlier, we use two Ethereum-based smart contracts: one for communication and one for recording contributions transparently on the blockchain. The interfaces of both smart contracts are defined as follows -

// Communication : Interface
pragma solidity >=0.8.0 <0.9.0;

contract Communication {
  // Broadcast event carrying a federated learning message.
  event BCEvent(
    uint256 timestamp,
    bool is_encrypted,
    bytes event_type,
    bytes body
  );
  function publish(
    uint256 timestamp,
    bool is_encrypted,
    bytes memory event_type,
    bytes memory body
  ) public returns (uint ack) {
    // implementation emits BCEvent
  }
}

// Contribution : Interface
pragma solidity >=0.8.0 <0.9.0;

contract Contribution {
  uint len = 5; // 5 federated clients
  uint[] _clients;
  constructor() {
    _clients = new uint[](len);
  }
  function set_contribution(
    uint client_id,
    uint relative_contribution
  ) public returns (uint ack) {
    // only owner (federated server)
    // modifies state of _clients
  }
  function get_contributions()
    public view returns (uint[] memory) { }
}
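As an illustration of how a federated client dApp might consume the Communication contract, here is a minimal, hypothetical web3.py sketch; the RPC endpoint, contract address, ABI and handler are placeholders, and method names follow web3.py v6.

import time
from web3 import Web3

# Placeholders: supply the consortium node's RPC endpoint, the deployed
# Communication contract address and its compiled ABI.
w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))
communication = w3.eth.contract(address=CONTRACT_ADDRESS, abi=COMMUNICATION_ABI)

def handle_event(event_type, body, is_encrypted):
    # Illustrative handler: dispatch on the agreed event types, e.g. start
    # local training upon a round-initiation event.
    print(f"received {event_type!r}, encrypted={is_encrypted}, {len(body)} bytes")

event_filter = communication.events.BCEvent.create_filter(fromBlock="latest")
while True:
    for event in event_filter.get_new_entries():
        args = event["args"]
        handle_event(args["event_type"], args["body"], args["is_encrypted"])
    time.sleep(2)  # simple polling interval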

V-C Neural network for image classification

Since each image in MNIST and Fashion-MNIST is a grayscale image, we used an artificial neural network[deng2013new] for image classification. Figure 2 describes the architecture of the neural network in depth. We used the adam[kingma2014adam] optimizer and sparse categorical cross-entropy[louizos2017learning] as the loss function, with a fixed number of training epochs.

Fig. 2: In the architecture, the input layer contains 784 neurons, after flattening the 28x28 image. The output layer contains 10 neurons, as the data set has 10 image labels or target classes. The network contains 3 fully connected hidden layers, having 1024, 512 and 128 neurons respectively. The hidden layers use relu, and the output layer uses softmax, as the activation functions.
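A tf.keras sketch of this classifier is shown below; the layer sizes follow Figure 2, while batch size and other fit parameters are left unspecified, as in the paper.

import tensorflow as tf

# Fig. 2 classifier: flatten to 784 inputs, three dense hidden layers
# (1024, 512, 128, relu), softmax output over 10 classes.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(1024, activation="relu"),
    tf.keras.layers.Dense(512, activation="relu"),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])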

V-D Results

Using the data, blockchain network with smart contracts, and neural network architecture described in the earlier sections, we performed various experiments to test our hypothesis of computing federated contribution under various settings: higher contribution by a small group of participants, lower contribution by a subset of participants, and noise in the local data used for training by individual nodes. We also calculated the overall image classification accuracy of the aggregated model. In our experiments, we varied the size of the training samples, as it is the experimental variable. For each experimental setting, we considered results for four cases: (i) MNIST data with non-IID split, (ii) MNIST data with IID split, (iii) Fashion-MNIST data with non-IID split, and (iv) Fashion-MNIST data with IID split. A well-known and common observation across all the experimental settings is the increase in the machine learning model's accuracy with an increasing number of training samples.

Figure 3 shows the performance of the machine learning model aggregated from the various clients. We observe better and more stable performance when the data is split across all the clients in an IID manner. Figure 4 shows the federated contribution increasing with training data sample size. Also, since all the clients in this setting contributed equally, their federated contribution values are quite close. One thing to notice, evident across all experiments (and observable in the graphs), is the increase in federated contribution value with increasing training size. This validates our hypothesis that federated contribution depends on the number of weight updates, which is directly proportional to more training data (or training iterations). For distinct visuals, client 3 is always shown in green in all the visualizations.

Figure 5 depicts the experiment where client 3 under-performs by training on only a fraction of what the other participants train on. We can observe how the federated contribution value for client 3 is very low compared to the other participants. Figure 6 shows the opposite, where client 3 trains on a multiple of the data points used by the other participants, logically putting in extra effort. This shows a higher federated contribution value for client 3 relative to the others. Figure 7 is the final setting, where we added Gaussian noise to the training data of client 3, which shows a depletion in the federated contribution of client 3 relative to the others.


Fig. 3: (a) Accuracy of aggregated model in % vs training data sample size for MNIST data, split in Non-IID format (b) Accuracy of aggregated model in % vs training data sample size for MNIST data, split in IID format (c) Accuracy of aggregated model in % vs training data sample size for Fashion-MNIST data, split in Non-IID format (d) Accuracy of aggregated model in % vs training data sample size for Fashion-MNIST data, split in IID format

Fig. 4: (a) Federated contribution of equally contributing clients vs training data sample size for MNIST data, split in Non-IID format (b) Federated contribution of equally contributing clients vs training data sample size for MNIST data, split in IID format (c) Federated contribution of equally contributing clients vs training data sample size for Fashion-MNIST data, split in Non-IID format (d) Federated contribution of equally contributing clients vs training data sample size for Fashion-MNIST data, split in IID format

Fig. 5: (a) Federated contribution of highly contributing clients, except client 3, vs training data sample size for MNIST data, split in Non-IID format (b) Federated contribution of highly contributing clients, except client 3, vs training data sample size for MNIST data, split in IID format (c) Federated contribution of highly contributing clients, except client 3, vs training data sample size for Fashion-MNIST data, split in Non-IID format (d) Federated contribution of highly contributing clients, except client 3, vs training data sample size for Fashion-MNIST data, split in IID format

Fig. 6: (a) Federated contribution of lowly contributing clients, except client 3, vs training data sample size for MNIST data, split in Non-IID format (b) Federated contribution of lowly contributing clients, except client 3, vs training data sample size for MNIST data, split in IID format (c) Federated contribution of lowly contributing clients, except client 3, vs training data sample size for Fashion-MNIST data, split in Non-IID format (d) Federated contribution of lowly contributing clients, except client 3, vs training data sample size for Fashion-MNIST data, split in IID format

Fig. 7: (a) Federated contribution of highly contributing clients, except client 3 which trains with noise, vs training data sample size for MNIST data, split in Non-IID format (b) Federated contribution of highly contributing clients, except client 3 which trains with noise, vs training data sample size for MNIST data, split in IID format (c) Federated contribution of highly contributing clients, except client 3 which trains with noise, vs training data sample size for Fashion-MNIST data, split in Non-IID format (d) Federated contribution of highly contributing clients, except client 3 which trains with noise, vs training data sample size for Fashion-MNIST data, split in IID format

VI Conclusion

In this paper, we studied our proposal of using smart contracts to establish a fair, transparent, secure and immutable incentivization mechanism for federated learning in a consortium network. We proposed a novel approach to calculate a unique scalar quantity, federated contribution, which quantifies the contribution of each participant in federated learning. Federated contribution is compatible with machine learning algorithms that rely on weight parameters computed by gradient descent. We justified our proposed approach both empirically and theoretically. Future work in this area could extend federated contribution to non-gradient-descent algorithms, or to heterogeneous federated learning. With the proposed method of calculating federated contribution, and using the relative federated contribution values in a reward (or penalization) mechanism, we validated that it effectively penalizes under-performing participants, rewards over-performing participants and penalizes participants with noisy or malicious data points. This justifies our proposal of federated contribution as an adequate mechanism for quantifying participants' contributions in a blockchain-based consortium network, with the accuracy of the global model as a proxy for utility in the Shapley value framework. Future work will aim at further building the conceptual bridge between our weight-based contribution measure and Shapley values, under modified axioms that reflect the specificities of federated machine learning settings.

Acknowledgment

We would like to thank our colleagues Robert Otter, Nicolas X Zhang, Jiao Y Chang, Tulasi Das Movva and Thomas Eapen for reviews and discussions during the experimentation and evaluation. We would also like to thank J.P. Morgan Chase for supporting us in carrying out this effort.

References