1 Introduction
Federated Learning (FL) mcmahan2017communication, a privacy-by-design machine learning approach, has introduced new ways to build recommender systems (RS). Unlike traditional approaches, FL removes the need to collect and store users' private data on central servers while still making it possible to train robust recommendation models. In practice, FL distributes the model training process to the users' devices (i.e., the client or edge devices), allowing a global model to be trained from user-specific local models. Each user updates the global model locally using their personal data and sends the local model updates to a server, which aggregates them according to a predefined scheme in order to update the global model.
A prominent direction of research in this domain is Federated Collaborative Filtering (FCF) ammad2019federated; chai2019secure; dolui2019poster, which extends the standard Collaborative Filtering (CF) Hu2008 model to the federated mode. CF is one of the most frequently used matrix factorization models for generating personalized recommendations, either independently or in combination with other types of models koren2009matrix. Essentially, the CF model decomposes the user-item interaction (or rating) data into two sets of low-dimensional latent factors, namely the user-factors and item-factors, thereby capturing the user- and item-specific dependencies from the interaction data respectively. The learned factors are then used to generate personalized recommendations for the items that the users have not interacted with before.
FCF distributes parts of the model computation: the item-factors (i.e., the global model) are updated on the FL server and then distributed to each user. The user-specific factors are updated independently and locally on each device using the user's private data and the item-factors received from the server. The local model updates, in the form of gradients computed for all items on each user's device, are then transmitted to the server, where they are aggregated to update the item-factors (i.e., the update of the global model). To achieve model convergence, FCF and similar federated recommendation models require several communication rounds (of global and local model updates) between the FL server and the users. In each round, the computational payload (i.e., the data transferred via upload/download across the network between the server and users) depends on the size of the global model (here, the item-factor matrix).
Beyond the major challenges of FL systems litian2019federated; li2019federated, there exists a practical concern that arises when running large-scale federated recommender systems (FRS) in production. Considering the number of factors to be fixed, the model payload increases linearly with the number of items. Table 1 demonstrates the expected payload estimates of a global model with a total number of items between 3000 and 10 million. For a large-scale FRS comprising 100,000 items, the increasing payload becomes a key problem, not only for the users but also for the broadband/mobile internet service providers and operators. The requirement to transmit huge payloads between the FL server and users over several communication rounds imposes strict limitations on a real-world large-scale FL-based recommender system.
# Items           3912     10k      100k    500K    1 M      10 M
Payload (approx)  625 KB   1.6 MB   16 MB   80 MB   160 MB   1.6 GB
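The figures in Table 1 are consistent with transmitting one K-dimensional factor vector per item at roughly 160 bytes per item (e.g., K = 20 in 64-bit floats; the exact K and numeric precision are our assumptions, not stated in the table). A quick sanity check:

```python
def payload_bytes(n_items: int, k: int = 20, bytes_per_value: int = 8) -> int:
    """Approximate size of the global model (item-factor matrix):
    one K-dimensional float vector per item."""
    return n_items * k * bytes_per_value

# Reproduces the order of magnitude in Table 1 (K and dtype are assumptions):
assert payload_bytes(3912) == 625_920               # ~625 KB
assert payload_bytes(10_000_000) == 1_600_000_000   # ~1.6 GB
```

The payload thus grows strictly linearly in the number of items for a fixed K, which is the scaling the rest of the paper sets out to break.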
To tackle the payload challenge, we present a new payload optimization method for FRS, as shown in Figure 1. We adopt the multi-armed bandit (MAB), a classical approach to reinforcement learning, to formulate our solution for minimizing the payloads. In each communication round, our optimization method intelligently selects the part of the global model to be transmitted to all users. The selection process is guided by a bandit model with a novel reward policy well-suited for FRS. In this way, instead of transmitting (uploading/downloading) the huge payload that includes the entire global model, only a part of the global model with a smaller payload is transmitted over the FL network. The users perform the standard model updates as part of the FRS ammad2019federated; 10.1007/9783030676612_20, thus avoiding any additional optimization steps (see Figure 1). As a case study, we present the payload optimization of a traditional FCF method; however, the proposed method can be generalized to advanced deep learning-based FL recommendation systems qi2020privacy, and it can also be applied to a generic class of matrix factorization models 10.1007/9783030676612_20. We extensively compared results on three benchmark recommendation datasets, specifically Movielens, LastFM, and MIND. The findings confirm that the proposed method consistently performed better than the baseline method and achieved a 90% reduction in payload with an average recommendation performance degradation ranging from 4% to 8% for highly sparse datasets (LastFM and MIND). The contribution of this work is twofold: (1) we propose the first method to optimize the payload in FRS, and (2) we empirically demonstrate the usefulness of the proposed method by rigorously evaluating the effects of payload reduction on recommendation performance.
2 Methods
2.1 Collaborative Filtering (CF)
Given a collection of user-item interactions for $N$ users and $M$ items collected in a data matrix $\mathbf{R} \in \mathbb{R}^{N \times M}$, the standard CF koren2009matrix is defined as a matrix factorization model:

$$\mathbf{R} \approx \mathbf{P}\mathbf{Q}^{\top}. \quad (1)$$
The CF model factorizes $\mathbf{R}$ into a linear combination of low-dimensional latent item-factors $\mathbf{q}_i$ for $i = 1, \ldots, M$ and user-factors $\mathbf{p}_u$ for $u = 1, \ldots, N$, collected in the factor matrices $\mathbf{Q} \in \mathbb{R}^{M \times K}$ and $\mathbf{P} \in \mathbb{R}^{N \times K}$ respectively, where $K$ is the number of factors. The cost function optimized across all users and items is then given as:

$$J = \sum_{u=1}^{N} \sum_{i=1}^{M} c_{ui} \left( r_{ui} - \mathbf{p}_u^{\top} \mathbf{q}_i \right)^2 + \lambda \left( \sum_{u=1}^{N} \lVert \mathbf{p}_u \rVert^2 + \sum_{i=1}^{M} \lVert \mathbf{q}_i \rVert^2 \right), \quad (2)$$

where a confidence parameter $c_{ui}$ is introduced to account for the uncertainties arising from the unspecified interpretations of $r_{ui} = 0$ in the implicit feedback scenario. Specifically, $r_{ui} = 1$ denotes that user $u$ has interacted with item $i$. However, $r_{ui} = 0$ can have multiple interpretations, such as the user not liking the item, or the user being oblivious to the existence of the $i$th item Hu2008. Lastly, $\lambda$ is the L2-regularization parameter set to avoid overfitting.
2.2 Federated Collaborative Filtering (FCF)
FCF extends the classical CF model to the federated mode ammad2019federated; chai2019secure; dolui2019poster. FCF distributes parts of the model computation (Eq. 2) to the user's device, as illustrated in Figure 1. The key idea is to perform local training on the device so that the user's private interaction data (e.g., ratings or clicks) is never transferred to the central server. The global model is updated on the server after the local model updates have been received from a certain number of users. Specifically, for a particular user $u$, the federated update of the private user-factor $\mathbf{p}_u$ is performed independently, without requiring any other user's private data. The optimal solution is obtained by taking $\partial J / \partial \mathbf{p}_u$ from Eq. 2 and setting it to zero:

$$\mathbf{p}_u^{*} = \left( \mathbf{Q}^{\top} C^{u} \mathbf{Q} + \lambda \mathbf{I} \right)^{-1} \mathbf{Q}^{\top} C^{u} \mathbf{r}_u, \quad (3)$$

where $C^{u}$ is the diagonal matrix with entries $C^{u}_{ii} = c_{ui}$ and $\mathbf{r}_u$ denotes the interaction vector of user $u$.
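As a minimal sketch (our variable names, NumPy), the user-side closed-form update of Eq. 3 can be written as follows; the confidence model $c_{ui} = 1 + \alpha r_{ui}$ and the default values for $\alpha$ and $\lambda$ are illustrative assumptions:

```python
import numpy as np

def update_user_factor(r_u, Q, alpha=4.0, lam=1.0):
    """Closed-form update of one user's factor p_u (cf. Eq. 3).

    r_u : (M,) binary implicit-feedback vector of this user (private, on-device)
    Q   : (M, K) item-factor matrix received from the FL server
    alpha, lam : confidence scaling and L2 regularization (illustrative values)
    """
    M, K = Q.shape
    c_u = 1.0 + alpha * r_u                            # confidence (cf. Eq. 6)
    A = Q.T @ (c_u[:, None] * Q) + lam * np.eye(K)     # Q^T C^u Q + lambda*I
    b = Q.T @ (c_u * r_u)                              # Q^T C^u r_u
    return np.linalg.solve(A, b)
```

The solve runs entirely on the device: only `Q` crosses the network, and `r_u` never leaves the client, which is the privacy property the FCF scheme relies on.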
Importantly, the update depends on the item-factor matrix $\mathbf{Q}$, which is received from the FL server in each round of model updates. The item-factor matrix $\mathbf{Q}$, in turn, is updated on the FL server using a stochastic gradient descent approach:

$$\mathbf{q}_i \leftarrow \mathbf{q}_i - \gamma \sum_{u} \frac{\partial J_u}{\partial \mathbf{q}_i}, \quad (4)$$
for some gain parameter $\gamma$ and a number of federated model updates to be determined. A particular user $u$ computes the item gradients independently of all other users as:

$$\frac{\partial J_u}{\partial \mathbf{q}_i} = -2\, c_{ui} \left( r_{ui} - \mathbf{p}_u^{\top} \mathbf{q}_i \right) \mathbf{p}_u + 2 \lambda \mathbf{q}_i, \quad (5)$$
where the confidence $c_{ui}$ for item $i$ is defined as:

$$c_{ui} = 1 + \alpha r_{ui}. \quad (6)$$
Each user transmits the gradients of all items as local model updates to the FL server, where the gradients are aggregated to update the global model Q (see Eq. 4). The Adaptive Moment Estimation (Adam) kingma2015adam method is used in the context of FCF ammad2019federated; 10.1007/9783030676612_20 to better adapt the learning rate ($\gamma$) in support of faster convergence and greater stability. Finally, in order to compute the recommendations, the user downloads the global model from the FL server according to a predefined configuration setting. Importantly, in each FL training iteration, the model payloads, Q and its gradients, are transferred between the server and users, and vice versa. The payload scales linearly with the increasing number of items (as shown in Table 1). We next present our method for optimizing the model payloads by reducing the size of Q and its gradients to the point where it is suitable for FRS deployed in production.
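A server-side sketch of the aggregated update (Eq. 4) with an Adam step; the class name and the hyperparameter defaults here are illustrative (standard Adam defaults, not necessarily those of Table 3):

```python
import numpy as np

class AdamServer:
    """Apply aggregated item gradients to the global model Q (cf. Eq. 4)."""
    def __init__(self, Q, lr=0.1, beta1=0.9, beta2=0.99, eps=1e-8):
        self.Q, self.lr = Q, lr
        self.beta1, self.beta2, self.eps = beta1, beta2, eps
        self.m = np.zeros_like(Q)   # first moment estimate
        self.v = np.zeros_like(Q)   # second moment estimate
        self.t = 0

    def step(self, grad_sum):
        """grad_sum: sum of per-user gradients dJ_u/dQ collected this round."""
        self.t += 1
        self.m = self.beta1 * self.m + (1 - self.beta1) * grad_sum
        self.v = self.beta2 * self.v + (1 - self.beta2) * grad_sum ** 2
        m_hat = self.m / (1 - self.beta1 ** self.t)   # bias correction
        v_hat = self.v / (1 - self.beta2 ** self.t)
        self.Q -= self.lr * m_hat / (np.sqrt(v_hat) + self.eps)
        return self.Q
```

Only the aggregated gradient sum reaches this step; individual users' contributions are never inspected separately, matching the aggregation scheme described above.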
3 Payload Optimization Method for Federated Collaborative Filtering
We formulate a multi-armed bandit method to optimize the payload of the global model Q for federated recommender systems (FRS). There exist numerous challenges when optimizing payloads. First, the FCF server does not know the users' identities: each user sends updates that are aggregated without referencing any one user's identity. When optimizing the payload, we therefore cannot determine item memberships in terms of groups of users, and potentially irrelevant items may be selected for the transmitted part of the model. Second, in contrast to the standard (offline) training of models, FL training is performed online, with the federated updates arriving from users in a continuous, asynchronous fashion. In each iteration, Q is updated when the number of collected updates reaches a certain threshold θ. Several factors make FL training computationally challenging, such as a low number of users participating in the update, a low frequency of updates sent by the users, and, most importantly, lossy communication over the Internet and the related network latency. In practice, FCF model training is a complex online sequential learning problem, which motivated our choice of method for payload optimization. Consider a particular FCF-based recommendation model training setup where, at each FL iteration $t$,

1. the FL server requests a set of items (the potential arms) from the bandit model,
2. the bandit model selects a subset of items among the set of available items,
3. the FL server transmits only the part of the global model comprised of the selected items to the users (or clients),
4. each user returns feedback for the selected items in the form of their gradients.
In our context, the feedback is used to compute the quantity to be optimized, i.e., the reward. To handle the online sequential aspect of FL model training, our bandit solution is composed of two main ingredients: (1) a strategy recommending items in order to select the optimal subset, and (2) a function to infer the rewards from the feedback received from the FL users. We refer to the proposed method as FCF-BTS (throughout the manuscript) and outline the FCF-BTS algorithmic steps in Algorithm 1.
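The steps above can be sketched as one server-side FL iteration, with the bandit model and the network transport abstracted behind callables (all names here are ours, not taken from Algorithm 1):

```python
def fl_iteration(bandit, all_items, transmit, collect_gradients, m):
    """One FL round with bandit-driven payload selection (schematic).

    bandit.select(items, m)      -> list of m item ids (the sub-model to send)
    transmit(selected)           -> pushes the selected item-factors to users
    collect_gradients(selected)  -> {item_id: gradient} feedback from users
    """
    selected = bandit.select(all_items, m)    # steps 1-2: pick m arms
    transmit(selected)                        # step 3: smaller payload out
    feedback = collect_gradients(selected)    # step 4: gradients come back
    bandit.update(feedback)                   # turn feedback into rewards
    return selected, feedback
```

Note that the users' side is untouched: they receive a (smaller) item-factor matrix and return gradients exactly as in plain FCF, which is what makes the optimization a server-only plug-in.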
Formally, our bandit method for payload optimization is a tuple consisting of four elements (items, states, actions, rewards):

Items: $I^{*}$ is a subset of the items among the set of available items.

State: $S$ is the set containing the feedback (or observations) collected by the bandit model from the FL environment. Particularly, $S = \{s_t(i)\}$, where $s_t(i)$ includes the feedback that item $i$ (for $i \in I^{*}$) has received from the FL users at iteration $t$. We consider $s_t(i)$ to be the feedback that contains the local model updates (gradients).

Actions: $A$ is the set of actions suggested by the bandit model. Specifically, $A = \{a_t(i)\}$, where $a_t(i)$ denotes the action taken by the bandit to recommend item $i$ (for $i \in I^{*}$) to be included in the transmitted sub-model at FL iteration $t$.

Reward: $R$ is the reward function. Particularly, $R = \{r_t(i)\}$, where $r_t(i)$ represents the reward for item $i$ (for $i \in I^{*}$) at each FL iteration $t$. After an action $a_t(i)$ is taken by the bandit model, the user provides feedback $s_t(i)$, which is then used to estimate the reward $r_t(i)$ using Eq. 13.
3.1 Sampling Strategy
As an item-based payload selection strategy, we used the widely known Bayesian Thompson Sampling (BTS) thompson1933likelihood; thompson1935theory; chapelle2011empirical; scott2010modern; kawale2015efficient approach with Gaussian priors for the rewards. We formulated a probabilistic model to sample the next set of items from the posterior distributions, which were then used for selecting the item subset optimally. Specifically, we assumed that the reward of item $i$ follows a normal distribution with an unknown mean $\mu_i$ and fixed precision $\tau$:

$$r_t(i) \sim \mathcal{N}(\mu_i, \tau^{-1}). \quad (7)$$
The prior probability for the unknown mean $\mu_i$ of an item $i$ is also assumed to be normally distributed, with mean $\mu_0$ and precision $\tau_0$:

$$\mu_i \sim \mathcal{N}(\mu_0, \tau_0^{-1}). \quad (8)$$
The posterior probability distribution of the unknown $\mu_i$ is obtained via Bayes' theorem gelman2013bayesian:

$$P(\mu_i \mid r) \propto P(r \mid \mu_i)\, P(\mu_i), \quad (9)$$
where the updates for the posterior parameters of the prior are estimated as fink1997compendium; gelman2013bayesian:

$$\mu_0 \leftarrow \frac{\tau_0 \mu_0 + \tau\, n_i\, Q_t(a_i)}{\tau_0 + n_i \tau}, \quad (10)$$

where $n_i$ is the number of times that item $i$ has been selected so far, and

$$\tau_0 \leftarrow \tau_0 + n_i \tau. \quad (11)$$
In Eq. 10, $Q_t(a_i)$ is the estimated value of action $a_i$ at FL iteration (or time step) $t$, given by the incremental mean:

$$Q_t(a_i) = Q_{t-1}(a_i) + \frac{1}{n_i} \left[ r_t(i) - Q_{t-1}(a_i) \right], \quad (12)$$

where $r_t(i)$ (Eq. 13) is the reward obtained at FL iteration $t$ when action $a_t(i)$ was taken. Essentially, in each FL iteration $t$, we update the two posterior parameters of each selected item $i$. Next, we sample from the posterior distribution (specified in Eq. 9) before selecting the items (i.e., the arms) corresponding to the largest sampled values ordered by their expected rewards (Eq. 7). Our setting is similar to the multiple-arm selection problem in RS streeter2008online; radlinski2008learning; uchiya2010algorithms; louedec2015multiple, where numerous studies have concluded that BTS achieves a substantial reduction in running time compared to non-Bayesian, simpler sampling strategies gopalan2014thompson; broden2018ensemble.
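Eqs. 7-12 amount to a per-item Gaussian-conjugate update plus posterior sampling. A minimal sketch, with our variable names, using the sequential (one observation at a time) form of the conjugate update, which is equivalent to the batch form of Eqs. 10-11:

```python
import numpy as np

class GaussianTS:
    """Thompson Sampling with a Normal likelihood of known precision tau
    and a conjugate Normal prior per item (in the spirit of Eqs. 7-12)."""
    def __init__(self, n_items, mu0=0.0, tau0=1.0, tau=1.0, seed=0):
        self.mu = np.full(n_items, mu0)     # posterior mean per item
        self.tau = np.full(n_items, tau0)   # posterior precision per item
        self.lik_tau = tau                  # fixed likelihood precision
        self.n = np.zeros(n_items)          # times each item was selected
        self.q = np.zeros(n_items)          # running mean reward (cf. Eq. 12)
        self.rng = np.random.default_rng(seed)

    def select(self, m):
        """Sample each item's mean from its posterior, keep the m largest."""
        samples = self.rng.normal(self.mu, 1.0 / np.sqrt(self.tau))
        return np.argsort(samples)[-m:]

    def update(self, item, reward):
        """Conjugate Normal-Normal update after observing one reward."""
        self.n[item] += 1
        self.q[item] += (reward - self.q[item]) / self.n[item]   # cf. Eq. 12
        post_tau = self.tau[item] + self.lik_tau                 # cf. Eq. 11
        self.mu[item] = (self.tau[item] * self.mu[item]
                         + self.lik_tau * reward) / post_tau     # cf. Eq. 10
        self.tau[item] = post_tau
```

Because the posterior precision only grows with observations, frequently selected items get tighter posteriors and the sampler naturally shifts from exploration to exploitation.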
3.2 Reward Function
In this section, we present a novel reward function designed for FRS. At FL iteration $t$, the sampling strategy recommends the item set selected as part of the transmitted sub-model to receive feedback (model updates, i.e., gradients) from all of the users. For each selected item $i$, the reward is optimized by integrating the immediate and gradual rates of change in the gradients jointly:

$$r_t(i) = \lambda\, \frac{1}{t}\, \big\lVert \nabla_t(i) - \nabla_{t-1}(i) \big\rVert + (1 - \lambda) \left( 1 - \frac{1}{t} \right) \cos\!\big( v_t(i), v_{t-1}(i) \big), \quad (13)$$
where $\lambda$ is the regularization term. The quantities $\nabla_t(i)$ and $\nabla_{t-1}(i)$ are the gradients of item $i$ from the $t$-th and $(t-1)$-th iterations. As in Adam kingma2015adam, $v_t(i)$ records an exponential decay of the past squared gradients for an item $i$ as:

$$v_t(i) = \beta\, v_{t-1}(i) + (1 - \beta)\, \nabla_t(i)^2. \quad (14)$$
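A sketch of a reward of this shape (the exact weighting and norm in Eq. 13 are our illustrative reading of the text: λ balances the two terms and 1/t is the decaying scale):

```python
import numpy as np

def decay_sq_grad(v_prev, grad_t, beta=0.99):
    """Eq. 14: exponential decay of past squared gradients (as in Adam)."""
    return beta * v_prev + (1.0 - beta) * grad_t ** 2

def reward(grad_t, grad_prev, v_t, v_prev, t, lam=0.5):
    """Composite reward for one item (in the spirit of Eq. 13).

    grad_t, grad_prev : item gradients at iterations t and t-1
    v_t, v_prev       : decayed squared gradients from decay_sq_grad
    lam               : balances immediate vs. gradual changes
    """
    immediate = np.sum(np.abs(grad_t - grad_prev))   # dominates early rounds
    cos = np.dot(v_t, v_prev) / (
        np.linalg.norm(v_t) * np.linalg.norm(v_prev) + 1e-12)
    return lam * (1.0 / t) * immediate + (1.0 - lam) * (1.0 - 1.0 / t) * cos
```

With `lam=1.0` the reward reduces to the immediate-change term; with `lam=0.0` it follows only the long-term cosine-similarity term, matching the two extremes discussed below.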
Taking inspiration from stochastic gradient approaches, our method computes a composite reward regularized by the number of FL iterations. The first term sums the reward as a function of the absolute differences in the gradients, specifically modelling immediate changes during the initial FL iterations; its impact decreases as more rounds of updates are completed. The second term increases the reward as the cosine similarity of the gradual changes in the gradients grows with the increasing number of FL iterations. The composite reward, supplemented by the BTS strategy, aims to balance exploration and exploitation. For instance, in the beginning, the item selection depends on the rate of change in the gradients: items whose gradients change strongly are selected more often, whereas in the later phase the selection of items depends on the overall similarity of the gradients, in order to favor stable convergence in the online training of the recommendation model. Moreover, the regularization parameter $\lambda$ can be tuned to adjust the strength of the information sharing between the immediate and gradual changes. For example, setting $\lambda = 0$ restricts the method to estimating the reward from the long-term gradual changes, whereas $\lambda = 1$ pushes the function to infer the reward from the immediate changes in the gradients.

3.3 Regret
We believe that the regret of FCF-BTS can be bounded with respect to the number of FL iterations $t$. However, the existing works on FL (combined with stochastic gradient methods and BTS) do not provide sufficient tools for a proper analysis of our method. While existing approaches provide regret bounds for the BTS algorithm (using Gaussian priors) agrawal2013further, they do not assume the FL problem setting. Alternatively, an information-theoretic analysis proposed an entropy-based regret bound over time steps for an online learning algorithm using BTS russo2016information. However, that bound increases linearly with the number of actions, which is typically large in our particular problem setting. An optimal regret bound for FCF-BTS is one that has a sublinear dependency (or no dependency at all dong2018information) on the items (or arms), in addition to remaining sharp within large action spaces, to duly satisfy the constraints of a privacy-preserving FL recommendation environment.
To summarize, the proposed FCF-BTS method offers a number of advantages for production use: (i) it allows optimizing the payloads without collecting the user's private or personal information, such as the user-item interactions; (ii) the optimization of the payloads is performed on the server side, thus avoiding any additional computational overhead on the user devices; (iii) no customization is needed on the user side, as the users perform a typical federated local model update step as part of the FRS; and (iv) it enables smooth plug-in/out payload optimization without changes to the FL architecture or recommendation pipeline.
4 Related Work
The payload optimization problem and our solution to it are related to communicationefficient methodologies in federated learning. We next discuss the existing methods that promote communication efficiency and relate them to our work.
4.1 Non-Recommender Systems
For traditional FL systems, our method can be viewed as a generalized approach for effective and efficient communication at each FL round DBLP:journals/corr/KonecnyMYRSB16 without assuming additional constraints on the users (or client devices), thus supporting privacy-sensitive applications. Several recent studies have provided practical strategies, such as the sparsification of model updates han2020adaptive and Golomb lossless encoding sattler2019robust, in addition to knowledge distillation and augmentation jeong2018federated; he2020group, quantization DBLP:journals/corr/KonecnyMYRSB16, lossy compression and dropout caldas2018expanding, and subsampling of the clients saputra2019energy. From a theoretical perspective, these prior works have explored convergence guarantees with low-precision training in the presence of non-identically distributed data.
Federated Reinforcement Learning: A number of recent studies have adopted reinforcement learning, primarily to address hyperparameter optimization NEURIPS2020_6dfe08ed and to solve contextual linear bandits NEURIPS2020_4311359e in federated mode.
However, unlike our method, none of these methods address the key challenge of the largescale FRS running in production, specifically the huge payloads associated with the high number of items to be recommended.
4.2 Recommender Systems
Many studies have demonstrated promising results for FRS ammad2019federated; zhou2019privacy; chai2019secure; dolui2019poster; 10.1007/9783030676612_20; qi2020privacy; tan2020federated. The recommendation models include factorization machines and singular value decomposition tan2020federated, deep learning qi2020privacy, and matrix factorization ammad2019federated; chai2019secure; dolui2019poster. To overcome the computation and communication costs of recommendations, Chen et al. chen2018federated extended meta-learning to the federated mode. Muhammad et al. muhammad2020fedfast proposed a mechanism for better sampling of users using K-means clustering and an efficient aggregation of local training models for faster convergence, hence favoring fewer communication rounds for FL model training. However, none of these approaches address the item-dependent payload optimization problem.
Recently, Qin et al. qin2020novel proposed a 4-layer hierarchical framework to reduce the communication cost between the server and the users. Notably, their approach assumes that the user-item interaction behaviors (such as ratings or clicks) are public data that can be collected on a central server. The idea is to select a small candidate set of items for each user by sorting the items based on the recorded user-item interactions, and then to transmit the user-specific candidate set to each user in order to train the local model and perform inference. Unlike Qin et al. qin2020novel, our approach does not require recording any user-sensitive interaction data, and it solves the payload optimization problem in a standard federated setting with minimal computational overhead on the FL server. Our approach follows the widely accepted FRS setting, without imposing any additional requirement for the users to share their sensitive data, and uses only the local model updates to solve the payload optimization problem.¹

¹ Notably, we did not consider this as a baseline approach in our experiments owing to the differences in the FL architecture and the assumptions on data privacy.
To the best of our knowledge, we have proposed the first method to solve the payload optimization problem for FCF assuming an implicit feedback scenario. However, the proposed method is applicable to a wider class of FRS, particularly concerning the modelling of explicit user feedback without a loss of generality.
5 Datasets
We used three benchmark recommendation datasets to test the proposed federated payload optimization method. The datasets were processed to model implicit feedback interactions in this study. The characteristics of each of the preprocessed datasets are given in Table 2. We dropped the timestamp information from the datasets, since we only needed the user-item interactions to analyze the proposed FCF-BTS method. We selected these datasets to rigorously test FCF-BTS for two primary reasons: (i) the datasets contain a diverse set of items, ranging from 3064 to 17632, and (ii) the datasets are highly sparse in nature, which is typically anticipated in a production environment.
5.1 Movielens-1M
The Movielens-1M harper2015movielens rating dataset was made publicly available by the Grouplens research group (https://grouplens.org/datasets/movielens/). The dataset contains 1,000,209 ratings of 3952 movies made by 6040 users. The rating dataset consists of user, movie, rating, and timestamp information. The ratings are explicit, so we converted them to implicit feedback based on the assumption that the users have watched the videos that they have rated. All ratings were changed to one irrespective of their original value, and missing ratings were set to zero.
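The explicit-to-implicit conversion described above is a simple binarization; a one-line sketch (treating both missing entries and zeros as "no interaction"):

```python
import numpy as np

def to_implicit(rating_matrix):
    """Map explicit ratings to implicit feedback: any observed rating
    becomes 1; missing entries (NaN) or zeros become 0."""
    R = np.nan_to_num(np.asarray(rating_matrix, dtype=float), nan=0.0)
    return (R > 0).astype(np.float64)
```

The same transformation applies to the LastFM listening counts and needs no change for MIND, which is already binary.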
5.2 LastFM
The LastFM cantador2011second dataset was made publicly available by the Grouplens research group (https://grouplens.org/datasets/hetrec2011/). The dataset contains 92834 listening counts of 17632 music artists by 1892 users. The listening count for each user-artist pair was set to one irrespective of its original value, and missing listening counts were set to zero, to convert the data into implicit feedback.
5.3 MIND
The MIND-small wu2020mind news recommendation dataset was made publicly available on GitHub (https://msnews.github.io/). It was collected from the anonymized behavior logs of the Microsoft News website. This dataset contains the behavioral logs of 50,000 users. It is an implicit feedback dataset, where 1 refers to clicked and 0 refers to non-clicked behavior. Users with at least 5 news clicks were considered. For simplicity, we denote the MIND-small dataset with the abbreviation "MIND" throughout the manuscript.
Datasets      # Users  # Items  # Interactions  Sparsity (%)
Movielens-1M  6040     3064     914676          96.05%
LastFM        1892     17632    92834           99.78%
MIND          16026    6923     163137          99.89%
6 Experiments
To demonstrate the usefulness of the proposed bandit method, we compared the performance of FCF-BTS with three other methods. As a baseline approach, we used the FCF-Random method, which does not use bandits for item selection; instead, it selects a part of the global model comprised of items chosen at random. Furthermore, to assess the advantage of optimizing the payload in a model-driven fashion compared to a naive optimization method, we compared the FCF-BTS performance with the TopList recommendation of the most popular items to every user. In addition, we used FCF ammad2019federated as an upper-bound comparison to our FCF-BTS method. In each FL communication round, FCF (Original) transfers (uploads/downloads) the whole global model between the server and users. This provides an estimate of the recommendation performance achievable for each dataset when no payload optimization is performed in federated mode.
6.1 Hyperparameters
To ensure the fair treatment of all methods, we adopted the same hyperparameter settings for FCF (as shown in Table 3) that were found to be optimal in previous studies ammad2019federated; 10.1007/9783030676612_20. The FCF-BTS-specific hyperparameters of the prior and the regularization term of the reward were held fixed across all experiments.
Model  K
FCF    25  1  4  0.1  0.99  0.01  1e-8
The threshold parameter θ in Algorithm 1 refers to the number of federated model updates needed to update the global model. For each dataset (Movielens, LastFM, MIND), we selected θ relative to the total number of users ammad2019federated; 10.1007/9783030676612_20.
6.2 Model training and evaluation criteria
We followed the training and evaluation approach of Flanagan et al. 10.1007/9783030676612_20 and performed 3 rounds of model rebuilds. The training set of every user comprised 80% of their item interactions, selected at random; the performance metrics were then computed on the remaining 20% of interactions (the test set) for each user separately. Likewise, the users' performance metrics were aggregated to update the global metric values on the FL server. Notably, the FL server triggers the update of the global model once the number of received local updates reaches the threshold θ, implying that in each iteration only a subset of users sent their test set performance metrics along with the local model updates. At the 1000th iteration, we took the average of the previous ten global metric values to account for the biases originating from the unequal test set distributions of the users sending asynchronous updates to the FL server.
We used well-known recommendation metrics bobadilla2013recommender, namely Precision, Recall, F1, and Mean Average Precision (MAP), to evaluate our models on the top predicted recommendations, given a recommendation list length of 100. To implement these metrics, we adapted the formulation of Flanagan et al. 10.1007/9783030676612_20 (as described in their equations S2-S5). To make the recommendation metrics comparable, we further normalized the performance metrics using the theoretically best achievable metrics for each dataset. We computed the theoretically best metrics by recommending items from the test set of each user; if a user had fewer than 100 items in their test set, the recommendation list was completed with randomly added items with which the user has not interacted in the past. Likewise, the TopList performance metrics were estimated using the 100 most popular items, ranked by their interaction frequency in the training set.
Finally, we calculated two summary statistics to analyze the effect of payload reduction on recommendation performance degradation, namely "Impr%" to quantify the relative performance improvement of FCF-BTS compared to FCF-Random or TopList, and "Diff%" to compute the relative difference between the FCF-BTS and FCF (Original) performances:

$$\text{Impr\%} = \frac{\bar{x}_{\text{FCF-BTS}} - \bar{x}_{\text{other}}}{\bar{x}_{\text{other}}} \times 100, \quad (15)$$

$$\text{Diff\%} = \frac{\bar{x}_{\text{FCF}} - \bar{x}_{\text{FCF-BTS}}}{\bar{x}_{\text{FCF}}} \times 100, \quad (16)$$

where $\bar{x}$ is the mean of the recommendation metric values across the 3 model builds.
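In code, the two summary statistics (assuming Diff% is reported as a positive degradation relative to FCF, matching the signs in Table 4) are:

```python
def impr_pct(x_bts, x_other):
    """Relative improvement of FCF-BTS over a baseline mean metric (Eq. 15)."""
    return (x_bts - x_other) / x_other * 100.0

def diff_pct(x_bts, x_fcf):
    """Relative degradation of FCF-BTS w.r.t. FCF (Original) (Eq. 16)."""
    return (x_fcf - x_bts) / x_fcf * 100.0

# Cross-check against the Movielens precision row of Table 4
# (small discrepancies are expected from rounding of the reported means):
assert abs(impr_pct(0.3041, 0.2370) - 28.3) < 0.1
assert abs(diff_pct(0.3041, 0.3744) - 18.77) < 0.1
```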
7 Results
As FCF-BTS is the first payload optimization method for FRS, we used FCF-Random and TopList as the baseline comparison methods. We rigorously analyzed the effect of payload reduction on recommendation performance degradation (loss of accuracy) using FCF-BTS and FCF-Random. In particular, we analyzed the recommendation performance when 25%, 50%, 75%, 80%, 85%, 90%, 95% or 98% of the original model payload was reduced. In practice, this payload reduction implies that 75%, 50%, 25%, 20%, 15%, 10%, 5% or 2% of the total number of items was used during FL model training.
Figure 2: The effects of payload reduction on the recommendation performance degradation (loss of accuracy). The X-axis denotes the % reduction in payload of the original model. The Y-axis (left side) represents the metric values, while the % degradation compared to the original model's performance is shown on the right side. Each point denotes the average test set recommendation performance over three rounds of model rebuilds, with error bars showing the standard deviation over the mean. The proposed FCF-BTS consistently outperforms FCF-Random (baseline) and demonstrates a substantial performance gain compared to the TopList recommendations while reducing the payload by up to 90%.
The results demonstrate that FCF-BTS consistently outperforms FCF-Random (Baseline), as shown in Figure 2. We noticed a significant improvement for highly sparse datasets such as LastFM and MIND. In comparison to the upper-bound method, FCF-BTS closely matches the performance of FCF (Original) on the LastFM and MIND datasets even with up to a 90% reduction in the model payload, confirming that FCF-BTS achieves the required performance with an extremely small payload. The method gets close on the Movielens dataset with a 75% payload reduction. This finding implies that the use of bandits is beneficial for production datasets that are inherently sparse in nature. Most importantly, FCF-BTS yields substantial performance gains compared to the TopList recommendations on the LastFM dataset while using only 2% of the model payload, and shows comparable performance on Movielens and MIND when 5% of the items are used for model training.
In particular, FCF-BTS showed promising results with a 90% payload reduction for all three datasets, as shown in Table 4. On the Movielens dataset, the performance degradation for precision, recall, F1 and MAP was 18.77%, 20.19%, 19.88% and 23.06% respectively, compared to the performance achievable by the FCF (Original) model. On the other hand, FCF-BTS improved precision (28.3%), recall (27.57%), F1 (27.74%) and MAP (40.75%) relative to FCF-Random (Baseline); similarly, FCF-BTS showed incremental improvements in precision (46.53%), recall (48.19%), F1 (47.32%) and MAP (59.99%) compared to the TopList recommendations.

On the LastFM dataset, FCF-BTS had 6.12%, 5.69%, 5.93% and 8.8% lower precision, recall, F1 and MAP metrics respectively, compared to the upper-bound performance metrics. FCF-BTS showed an increase in precision (72.64%), recall (73.6%), F1 (73.1%) and MAP (98.85%) over FCF-Random (Baseline). In comparison to TopList, FCF-BTS resulted in substantially better recommendations, improving precision, recall, F1, and MAP by 164.88%, 165.14%, 164.93% and 233.44% respectively (see Table 4).
                                 Precision        Recall           F1               MAP
Movielens-1M
FCF                              0.3744±0.00582   0.3855±0.00754   0.3817±0.00566   0.2400±0.00702
FCF-BTS                          0.3041±0.00801   0.3076±0.01055   0.3058±0.00918   0.1846±0.00774
FCF-Random                       0.2370±0.01154   0.2411±0.00644   0.2394±0.00765   0.1311±0.00685
TopList                          0.2075±0.00027   0.2076±0.00052   0.2076±0.00046   0.1154±0.00014
FCF-BTS vs. FCF (Diff%)          18.77            20.19            19.88            23.06
FCF-BTS vs. FCF-Random (Impr%)   28.3             27.57            27.74            40.75
FCF-BTS vs. TopList (Impr%)      46.53            48.19            47.32            59.99
LastFM
FCF                              0.2131±0.01128   0.2124±0.01044   0.2127±0.01086   0.1328±0.00745
FCF-BTS                          0.2001±0.00523   0.2003±0.00502   0.2001±0.00512   0.1211±0.00456
FCF-Random                       0.1159±0.00487   0.1153±0.00479   0.1156±0.00482   0.0609±0.00218
TopList                          0.0755±0.00233   0.0755±0.00232   0.0755±0.00232   0.0363±0.00139
FCF-BTS vs. FCF (Diff%)          6.12             5.69             5.93             8.8
FCF-BTS vs. FCF-Random (Impr%)   72.64            73.6             73.1             98.85
FCF-BTS vs. TopList (Impr%)      164.88           165.14           164.93           233.44
MIND
FCF                              0.1108±0.00314   0.1121±0.00438   0.1110±0.00339   0.0496±0.00286
FCF-BTS                          0.1059±0.00379   0.1057±0.00386   0.1059±0.00380   0.0461±0.00264
FCF-Random                       0.0294±0.00259   0.0296±0.00281   0.0294±0.00263   0.0102±0.00112
TopList                          0.1002±0.00067   0.1003±0.00046   0.1003±0.00063   0.0418±0.00044
FCF-BTS vs. FCF (Diff%)          4.43             5.71             4.67             7.1
FCF-BTS vs. FCF-Random (Impr%)   260.06           256.1            259.32           352.46
FCF-BTS vs. TopList (Impr%)      5.67             5.32             5.58             10.39
Lastly, on the MIND dataset, the performance of FCF-BTS closely matched that of the FCF (Original) model. The relative differences in precision, recall, F1 and MAP were 4.43%, 5.71%, 4.67% and 7.1% respectively, which are small compared to the gaps shown by FCF-Random (Baseline). FCF-BTS significantly outperformed FCF-Random (Baseline), with 260.06%, 256.1%, 259.32% and 352.46% higher precision, recall, F1 and MAP respectively. Compared to TopList, FCF-BTS demonstrated incremental increases in precision (5.67%), recall (5.32%), F1 (5.58%) and MAP (10.39%).
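The Diff% and Impr% figures above are plain percentage changes between mean metric values. As a minimal sketch (the helper name `relative_change` is ours, not from the paper), the MIND precision figures can be reproduced approximately from the rounded means:

```python
def relative_change(candidate: float, reference: float) -> float:
    """Percentage change of `candidate` with respect to `reference`."""
    return (candidate - reference) / reference * 100.0

# Mean precision values on the MIND dataset (Table 4).
fcf, fcf_bts, fcf_random = 0.1108, 0.1059, 0.0294

# FCF-BTS vs. FCF (Diff%): how far FCF-BTS falls below the upper bound.
diff_vs_fcf = -relative_change(fcf_bts, fcf)           # ~4.42 (reported: 4.43)

# FCF-BTS vs. FCF-Random (Impr%): relative improvement over the baseline.
impr_vs_random = relative_change(fcf_bts, fcf_random)  # ~260.2 (reported: 260.06)
```

The small deviations from the reported values are consistent with the table's means being rounded to four decimals.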
Next, we demonstrated that the proposed FCF-BTS method converges on the optimum and closely matches the solution achieved by FCF (Original) for the sparse datasets (LastFM and MIND). Figure 3 shows that FCF (Original) reached the optimal solution in all three datasets. For the LastFM and MIND datasets, we observed that FCF-BTS required additional FL iterations to approach the upper-bound optimal solution, as shown in Figure 3. This is typically expected in any optimization method that uses only part of the whole model (fewer parameters) in each iteration. Most importantly, it validates that FCF-BTS converges on the optimal solution while using only 10% of the model payload, unlike the naive FCF-Random (Baseline) method. On the MovieLens dataset, we observed that FCF-BTS converges on the optimum in a similar number of iterations to FCF (Original); however, the performance differences are relatively large compared to the LastFM and MIND datasets. Nevertheless, Figure 3 illustrates that the convergence of FCF-BTS remains stable across the three datasets for up to 1,000 FL iterations, similar to the FCF (Original) method's convergence. In summary, our analysis confirms that the FCF-BTS solution closely matches the FCF (Original) method's optimal solution for sparse datasets, although at a different rate. With a loss in recommendation accuracy of 4% to 8% (for the highly sparse datasets) compared to the standard FCF method, FCF-BTS makes it possible to use a much smaller payload (a reduction of up to 90%) in FL model training.
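For completeness, the ranked-list metrics reported throughout this section (precision, recall, F1 and MAP at a top-K cutoff) can be sketched generically under binary relevance; this is an illustrative formulation, not the paper's exact evaluation code:

```python
def precision_recall_f1(recommended, relevant, k):
    """Precision, recall and F1 of the top-k recommended items against a relevant set."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def average_precision(recommended, relevant, k):
    """AP@k: precision accumulated at each hit position; MAP averages this over users."""
    score, hits = 0.0, 0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank
    return score / min(len(relevant), k) if relevant else 0.0

# Toy example: 2 relevant items, both retrieved within the top 4.
p, r, f1 = precision_recall_f1([1, 2, 3, 4], {1, 3}, k=4)  # 0.5, 1.0, ~0.667
ap = average_precision([1, 2, 3, 4], {1, 3}, k=4)          # (1/1 + 2/3) / 2 ~ 0.833
```

MAP over a test population is then the mean of `average_precision` across all users.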
8 Conclusion
In this study, we tackled the challenge of growing payloads faced by federated recommender systems (FRS) deployed in real-world settings. The requirement to transfer huge model payloads between the FL server and the users over several training rounds is neither practical nor feasible for an RS operating in production. We introduced an optimization method that addresses the payload challenge by selecting a part (a smaller payload) of the global model to be transmitted to all users. The selection process is guided by a bandit model optimizing a novel reward policy suitable for FRS. The proposed method was rigorously tested on three benchmark recommendation datasets, and the empirical results demonstrate that it consistently outperformed simpler, naive optimization approaches. Our method achieved a 90% reduction in payload with a minimal loss of recommendation performance, from 4% to 8%, on the highly sparse datasets. In addition, it yielded performance comparable to TopList with a 95% payload reduction in two out of three datasets. The results establish that bandit-based payload optimization can provide a similar quality of recommendation without increasing the computational cost for the users' devices participating in the FRS, particularly in production.
In future work, we intend to extend the current research in multiple directions. We presented the payload optimization of the standard FCF as a proof of concept; it would be interesting to investigate whether similar results can be achieved with larger datasets and more recent, advanced FRS methods qi2020privacy; 10.1007/978-3-030-67661-2_20. In this study, we empirically validated the usefulness of the proposed optimization method; a key next step would be to study its theoretical properties, in particular the convergence guarantees and regret bounds for the novel reward function.
Acknowledgement:
This work was supported by Helsinki Research Center, Europe Cloud Service Competence Center, Huawei Technologies Oy (Finland) Co. Ltd.