The position of smartphones and other connected devices is relevant for several location-based services, e.g., navigation, advertisement, and social networking. However, an uncontrolled disclosure of our position and movements is a severe violation of our privacy, with consequences at both the personal and societal level [Gupta2017]. Some cases related to the disclosure of position have already been brought to court, involving covert location-based surveillance by employers, extra charges for rental car clients, circumstantial evidence gathering, and location-based profiling. Location privacy (also named geolocation privacy) has been studied from a legal standpoint in recent years [king2011personal], with renewed interest following the recent General Data Protection Regulation (GDPR) in the European Union. These regulations have also highlighted the responsibilities of the mobile network operator (MNO), once considered trusted.
The evolving cellular communication standard developed by the 3rd Generation Partnership Project (3GPP) has significantly strengthened the technical tools to locate and track user equipment (UE), with the 5G network able to exploit not only millimeter waves for localization, but also artificial intelligence for its processing through the network data analytics function (NWDAF). Moreover, the MNO can disclose the position information to over-the-top (OTT) operators through the network exposure function (NEF), further threatening our location privacy. The trend will continue in beyond-5G (B5G) networks, where more precise localization techniques will be included, also using frequencies in the terahertz (THz) band [Fang20].
I-A Related Literature
Several works have addressed location privacy, in most cases focusing on the network and application layers. A pretty good privacy solution has been proposed in [DBLP:journals/corr/abs-2009-09035], where the connectivity and authentication functionalities are decoupled. Other approaches are based on k-anonymity [Li18] or differential privacy [Yin18]. To limit the leakage of information to location-based service providers, a middleware can be introduced [Beresford04]. Blockchains can also be exploited to anonymize users [Qiu20], satisfying the principle of k-anonymity privacy protection without the help of trusted third-party anonymizing servers. However, in all the existing literature the MNO has been assumed to be trusted, and location privacy is defended only against external attackers.
Only recently has a more global vision of location privacy protection been introduced, in which the MNO is also considered untrusted. That work first introduced the concept of the virtual private mobile network (VPMN), where a set of UE perform device-to-device (D2D) communications that are not accessible to the MNO, while only selected VPMN devices operate as gateways between the VPMN and the cellular network. With this approach, the MNO knows only the location of the gateways and not that of the UE.
In this paper, we consider a VPMN with multiple gateways and focus on the uplink transmission, where the VPMN UE route their packets towards the gateways, which in turn forward them to the next-generation NodeB (gNB) of the serving cell. UE packets may reach the gateways either in a single hop or by passing through other VPMN UE operating as relays. In both scenarios, a first option is to apply a maximum-flow algorithm to determine the transmission rate. However, this choice creates an imbalance of data rates at the gateways, which partially reveals the location of the VPMN UE: in fact, the maximum-rate solution uses nearby gateways more intensively. Therefore, we revise the maximum-rate problem by adding the constraint that each gateway transfers the same data rate from each UE.
In the work that introduced the VPMN, only very preliminary results on the performance of a possible VPMN are reported; here we offer a wider analysis of the VPMN in terms of the probability of connectivity of the UE in a given area, achievable rates, and localization error. Results show that the VPMN offers an interesting trade-off between rates and localization error in typical B5G cellular scenarios.
The rest of this paper is organized as follows. In Section II, we review the VPMN, consider a possible implementation, and model the channels of the links. Section III addresses the problem of routing for the VPMN, recalling maximum-rate routing and introducing location-privacy preserving routing. An in-depth evaluation of the considered VPMN is provided in Section IV, before conclusions are drawn in Section V.
II The Virtual Private Mobile Network
We focus here on a VPMN including $N$ devices, namely UE and gateways. The objective is to prevent the localization of the UE by the MNO. To this end, the UE communicate in a D2D fashion, without the intervention of the gNB. The VPMN then communicates with the cellular network of the MNO through the gateways. An example of a VPMN is shown in Fig. 1.
II-A Security Assumptions
The MNO is only aware of the list of UE in the VPMN, but it is not able to decode their communications or identify the signals over the air, i.e., associate them with the sender UE. For this latter aspect, suitable techniques should be deployed. In the uplink, when a VPMN UE has to send a packet to the cellular network, it transmits it, possibly over multiple hops within the VPMN, to the gateways, which in turn transmit it to the gNB. Note that the communications among UE are encrypted with codes not available to the gNB, to avoid the localization of the UE by the gNB. In the downlink, the gNB broadcasts each packet to the gateways, which re-encrypt it and send it to the destination UE inside the VPMN, possibly through multiple D2D hops within the VPMN. In the following, we focus on the uplink.
II-B VPMN Implementation
Here we do not consider the implementation details of the VPMN: we only observe that, at the moment, the 3GPP standard does not allow its direct implementation. Indeed, D2D communication is strictly under the control of the gNB, and devices involved in D2D transmissions are localized and identified by the gNB. We could implement the VPMN as a femtocell, where the gateway is the femto-gNB; however, only a single femto-gNB would be available, restricting us to the use of a single gateway.
A second option is to implement the VPMN with another technology, e.g., the IEEE 802.11s standard, which supports mesh WiFi networks, with the gateway UE interfacing the two networks. Although immediately deployable, this solution has the disadvantage of involving two standards designed for two different scenarios; in particular, while cellular standards handle mobility well through the handover procedure, WiFi is designed for slowly moving terminals. In any case, we envision that a VPMN can be properly defined in future versions of the cellular standard, and this work is a contribution in this direction.
II-C Channel Model
All VPMN devices have a single antenna (future studies will also consider devices with multiple antennas), and both the D2D channels among VPMN UE and the gateway-gNB channels are modeled including the effects of shadowing and path loss in an urban scenario. Two devices $i$ and $j$ (UE or gateways) are connected if their channel gain in dB, $G_{i,j}$, satisfies
$$G_{i,j} \geq \Gamma, \tag{1}$$
where $\Gamma$ represents a suitable threshold in dB. Let $d_{i,j}$ be the distance between devices $i$ and $j$.
The channel gain in dB is obtained by adding the local shadowing components at the receiver, $s_j$, and at the transmitter, $s_i$, to the path loss of the link, as
$$G_{i,j} = s_i + s_j + P_{i,j}, \tag{2}$$
where the path loss of the link is
$$P_{i,j} = -10\,\alpha \log_{10}\frac{d_{i,j}}{d_0}, \tag{3}$$
$d_0$ is a normalization factor, and $\alpha$ is the path-loss exponent. For the shadowing, we also consider its spatial correlation; in particular, the correlation of the shadowing experienced at devices $i$ and $j$ is [Book-Wiley]
$$r_{i,j} = \mathbb{E}[s_i s_j] = e^{-d_{i,j}/d_{\rm c}}, \tag{4}$$
where $d_{\rm c}$ is the decorrelation distance, dependent on the environment. We collect $r_{i,j}$ for all the devices into the $N \times N$ matrix $\mathbf{R}$. Then, let
$$\mathbf{s} = [s_1, \ldots, s_N]^T$$
be a column vector collecting the shadowing components of all devices, which can be generated as $\mathbf{s} = \mathbf{L}\mathbf{w}$, where
$$\mathbf{w} = [w_1, \ldots, w_N]^T$$
is a vector of independent zero-mean unit-variance Gaussian variables and $\mathbf{L}$ is the Cholesky factor of $\mathbf{R}$, i.e., $\mathbf{R} = \mathbf{L}\mathbf{L}^T$.
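The generation of spatially correlated shadowing through a Cholesky factor can be sketched as follows. This is a minimal illustration, not the implementation used for the results: it assumes an exponential correlation `exp(-d / d_c)` between devices at distance `d`, and the function names, the example positions, and the value of `d_c` are hypothetical.

```python
import math
import random


def correlation_matrix(positions, d_c):
    """R[i][j] = exp(-d_ij / d_c) for devices at the given (x, y) positions."""
    n = len(positions)
    return [[math.exp(-math.dist(positions[i], positions[j]) / d_c)
             for j in range(n)] for i in range(n)]


def cholesky(R):
    """Lower-triangular L such that R = L L^T (R must be positive definite)."""
    n = len(R)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(R[i][i] - acc)
            else:
                L[i][j] = (R[i][j] - acc) / L[j][j]
    return L


def correlated_shadowing(positions, d_c, rng):
    """Draw one shadowing value per device as s = L w, with w i.i.d. N(0, 1)."""
    L = cholesky(correlation_matrix(positions, d_c))
    w = [rng.gauss(0.0, 1.0) for _ in positions]
    return [sum(L[i][k] * w[k] for k in range(i + 1))
            for i in range(len(positions))]


# Hypothetical example: three devices and an assumed decorrelation distance of 20 m.
positions = [(0.0, 0.0), (10.0, 0.0), (0.0, 20.0)]
shadowing_db = correlated_shadowing(positions, 20.0, random.Random(0))
```

Since the covariance of `L w` is `L L^T`, the generated values have the prescribed correlation matrix; devices at distinct positions are needed for the matrix to be positive definite.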
III Routing for the VPMN
In the following, we concentrate on the uplink, i.e., the transmission from the VPMN UE to the gateways and then to the gNB, as it can be exploited by the MNO to localize the UE.
We consider two options for the routing of packets in the VPMN: a) multihop, where packets generated by a UE can be forwarded to a gateway over multiple hops through other UE, and b) single hop, where the UE transmit packets directly to the gateways. For the single-hop case, we observe that a UE that is not connected to any gateway is not part of the VPMN.
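Time is divided into slots, and one UE transmits in each slot to avoid interference. For the multihop case, the maximum achievable rate on the link between devices $i$ and $j$, with channel gain $G_{i,j}$ in dB, is
$$R_{i,j} = \log_2\left(1 + 10^{G_{i,j}/10}\right), \tag{5}$$
where we assumed, without loss of generality, unit-power additive white Gaussian noise. For the single-hop case we have
$$R_{i,j} = \begin{cases} \log_2\left(1 + 10^{G_{i,j}/10}\right) & j \in \{g_1, \ldots, g_K\}, \\ 0 & \text{otherwise}, \end{cases} \tag{6}$$
where $g_k$, $k = 1, \ldots, K$, are the indices of the VPMN gateways, and the only links with non-zero rates are those towards the gateways.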
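As a small illustration of the two rate definitions, the sketch below computes a Shannon-type rate from a channel gain in dB with unit noise power, and sets it to zero in the single-hop case when the receiver is not a gateway. The function names and the example gateway labels are hypothetical.

```python
import math


def link_rate(gain_db):
    """Rate (bit/s/Hz) of a link with the given channel gain in dB and unit noise power."""
    return math.log2(1.0 + 10.0 ** (gain_db / 10.0))


def single_hop_rate(gain_db, receiver, gateways):
    """Single-hop rule: only links that end at a gateway carry a non-zero rate."""
    return link_rate(gain_db) if receiver in gateways else 0.0


# Hypothetical example: a 0 dB link gives 1 bit/s/Hz towards a gateway,
# while the same link towards another UE is not used in the single-hop case.
rate_to_gateway = single_hop_rate(0.0, "g1", {"g1", "g2"})
rate_to_ue = single_hop_rate(0.0, "ue7", {"g1", "g2"})
```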
III-A Maximum-Rate Uplink Routing Algorithm
The routing solution that maximizes the achievable rate is obtained by the max-flow algorithm [cormen2009introduction] on a graph having a node for each VPMN device and edges between all connected devices. The weight of the edge between devices $i$ and $j$ is its maximum rate $R_{i,j}$.
Note that the maximum flow problem is defined between one node operating as the source and one node operating as the sink, while in the case of multiple gateways we have multiple sinks. In this case, to obtain the maximum flow solution, we add a fictitious node, denoted as super-sink, which is connected to all gateways through edges with infinite weight (rate). Then, we solve the maximum flow problem from each UE to the super-sink, using one of the several available algorithms, e.g., the Ford-Fulkerson algorithm [cormen2009introduction].
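The super-sink construction can be sketched as below. This is an illustrative example rather than the code used for the results: it uses the Edmonds-Karp variant of Ford-Fulkerson (breadth-first augmenting paths), and the device labels, link rates, and function names are hypothetical.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp on a dict-of-dicts capacity map; returns the max-flow value."""
    # Build the residual graph, adding zero-capacity reverse edges.
    residual = {source: {}, sink: {}}
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual.setdefault(u, {})[v] = residual.get(u, {}).get(v, 0.0) + c
            residual.setdefault(v, {}).setdefault(u, 0.0)
    total = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


def uplink_max_rate(link_rates, ue, gateways):
    """Max-flow rate from `ue` to a super-sink attached to every gateway."""
    super_sink = "super-sink"
    capacity = {}
    for (i, j), rate in link_rates.items():
        # Each link between two devices is usable in both directions.
        capacity.setdefault(i, {})[j] = rate
        capacity.setdefault(j, {})[i] = rate
    for g in gateways:
        capacity.setdefault(g, {})[super_sink] = float("inf")
    return max_flow(capacity, ue, super_sink)


# Hypothetical VPMN: UE "a" and "b", gateways "g1" and "g2", rates in bit/s/Hz.
rates = {("a", "b"): 3.0, ("a", "g1"): 1.0, ("b", "g1"): 2.0, ("b", "g2"): 2.0}
rate_a = uplink_max_rate(rates, "a", ["g1", "g2"])
rate_b = uplink_max_rate(rates, "b", ["g1", "g2"])
```

In this example UE "a" is limited by its two outgoing links (3 + 1), while UE "b" can also relay part of its traffic through "a".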
III-B Location Privacy Assessment
One main design objective of the VPMN is to prevent the localization of the VPMN UE by the MNO. Still, since the UE are connected (possibly over multiple hops) to the gateways and the MNO can intercept packets going through the gateways, the MNO has some information on the UE location. We now consider the localization error.
In the following, we assume that all communications in the VPMN are neither identifiable nor decodable by the MNO, i.e., the MNO cannot use them to obtain the location of the VPMN UE (note that this may require further attention in designing the communications among UE). Still, the MNO knows the location of the gateways, obtained with any of the several available localization techniques. Moreover, we assume that the MNO can intercept packets going through the gateways and associate each packet with its transmitting UE, e.g., by the IP address. Clearly, anonymization techniques operating at the IP level can limit these possibilities.
To assess the localization error, let $\mathbf{p}_i$ be the position of UE $i$, and let $\hat{\mathbf{p}}_i$ be its position as estimated by the MNO.
Localization Error With A Single Gateway
In case of a single gateway, the MNO sees all packets coming from a single position $\mathbf{p}_{g_1}$, that of the gateway. Therefore, the best estimate of any VPMN UE position is the gateway position, i.e., $\hat{\mathbf{p}}_i = \mathbf{p}_{g_1}$. For the single-hop scenario, the average localization error
$$\epsilon = \mathbb{E}\left[\|\mathbf{p}_i - \hat{\mathbf{p}}_i\|\right] \tag{7}$$
coincides with the average distance of the UE from the gateway. Still, considering the correlated shadowing model, it is not straightforward to compute the average localization error. Therefore, in Section IV we resort to simulations for its assessment.
Localization Error With Multiple Gateways
When multiple gateways are present, we can still estimate the average localization error as the average connection distance, resorting to numerical methods. However, in this case the MNO can intercept packets and identify the transmitting UE: by observing the data rates coming from the different gateways for a single UE, and knowing the location of the gateways, a better localization can be achieved. For example, in the single-hop scenario, if each VPMN UE is connected to the nearest gateway, the MNO can significantly reduce the localization error by observing from which gateway the packets come. Similar considerations hold for the multihop scenario, where having multiple gateways also allows a trilateration based on the rates observed at each gateway. This will also be confirmed by simulations in Section IV.
III-C Privacy Preserving Maximum Rate Routing
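We therefore propose a routing strategy that ensures, for each UE, that the same data rate is achieved through all the gateways, regardless of the UE position. To this end, we formulate an optimization problem to determine the rate $f^{(i)}_{m,n}$ that the traffic of UE $i$ uses on link $(m,n)$. First, letting $\rho_{i,k} = \sum_{m} f^{(i)}_{m,g_k}$ be the rate of UE $i$ through gateway $g_k$, the objective function to be maximized is the rate from UE $i$ to all the $K$ gateways, i.e., (for all UE)
$$\rho_i = \sum_{k=1}^{K} \rho_{i,k}. \tag{8}$$
Furthermore, to ensure that no information on the position of the UE is disclosed by the resulting flows through the gateways, we impose that all gateways have the same rate for each UE, i.e.,
$$\rho_{i,k} = \rho_{i,1}, \quad k = 2, \ldots, K. \tag{9}$$
We obtain the following linear programming problem:
$$\max_{\{f^{(i)}_{m,n}\}} \; \rho_i \tag{10a}$$
subject to
$$f^{(i)}_{m,n} \leq R_{m,n}, \quad \forall (m,n), \tag{10b}$$
$$\sum_{n} f^{(i)}_{m,n} \leq \sum_{n} f^{(i)}_{n,m}, \quad \forall m \notin \{i, g_1, \ldots, g_K\}, \tag{10c}$$
$$f^{(i)}_{m,n} \geq 0, \quad \forall (m,n), \tag{10d}$$
$$\rho_{i,k} = \rho_{i,1}, \quad k = 2, \ldots, K, \tag{10e}$$
where (10b) is the maximum flow constraint on the link, (10c) is the flow-conservation constraint (the outgoing rate should not be larger than the incoming rate for each intermediate UE), (10d) ensures non-negative rates, while (10e) is the constraint (9) on the same rate for each gateway and each UE.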
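In the single-hop case the effect of the equal-rate constraint can be seen without a linear programming solver. Under our reading of the constraints, a UE can only use its direct links to the gateways, so the common per-gateway rate is bounded by the weakest of these links. The sketch below compares the unconstrained and the privacy-preserving total rates; the function name and the example rates are hypothetical, and the multihop case would instead require solving the linear program.

```python
def single_hop_total_rates(direct_rates):
    """Return (unconstrained, privacy-preserving) total rates for one UE.

    direct_rates holds the rate of the direct link from the UE to each
    gateway, with 0.0 for a gateway the UE is not connected to.
    """
    # Without the privacy constraint every direct link is used at full rate.
    unconstrained = sum(direct_rates)
    # With equal rates through all gateways, the weakest link sets the common rate.
    privacy_preserving = len(direct_rates) * min(direct_rates)
    return unconstrained, privacy_preserving


# Hypothetical examples with two gateways (rates in bit/s/Hz).
both_reachable = single_hop_total_rates([3.0, 1.0])
one_unreachable = single_hop_total_rates([3.0, 0.0])
single_gateway = single_hop_total_rates([2.5])
```

When one gateway is unreachable the privacy-preserving rate drops to zero, and with a single gateway the two solutions coincide.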
IV Numerical Results
We consider an area of 100 m × 100 m, wherein the VPMN devices are uniformly randomly distributed. For channel modeling, we assume a decorrelation distance m and a path loss exponent . The path-loss normalization factor is m, corresponding to 10 dB loss every 100 m.
We now assess the performance of the VPMN according to several metrics. In particular, we evaluate the probability that all UE constitute a single VPMN (i.e., suitable connections with the gateways exist), the localization error of the MNO when the VPMN is used, and the rates provided by the routing algorithms with and without the location privacy constraint (achievable rates).
We consider here the probability that the randomly dropped devices constitute a VPMN with multihop routing, i.e., that they form a connected component, where an edge is present when condition (1) is satisfied.
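We recall that a conforming set is a set of edges that connects all the nodes. We denote by $\mathscr{C}$ the family of all conforming sets. The probability that all UE are in the same connected component is
$$P_{\rm c} = \mathbb{P}\left[\bigcup_{\mathcal{C} \in \mathscr{C}} \bigcap_{(i,j) \in \mathcal{C}} \{G_{i,j} \geq \Gamma\}\right], \tag{11}$$
where $G_{i,j}$ is the channel gain in dB between devices $i$ and $j$ and $\Gamma$ is the connection threshold. Since the gains $G_{i,j}$ are correlated Gaussian random variables, $P_{\rm c}$ is related to the cumulative distribution of a set of correlated random variables, for which no closed-form expression exists; thus, we resort to numerical methods for its evaluation.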
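One possible numerical method is a Monte Carlo estimate, sketched below under stated assumptions. Devices are dropped uniformly in a square, the shadowing has exponential spatial correlation and is generated through a Cholesky factor, and two devices are connected when shadowing plus path loss exceeds the threshold. The default parameter values `d_c`, `alpha`, and `d_0`, as well as all function names, are hypothetical placeholders rather than the values used for the reported figures.

```python
import math
import random


def cholesky(R):
    """Lower-triangular L such that R = L L^T."""
    n = len(R)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                # The floor guards against round-off for nearly coincident devices.
                L[i][j] = math.sqrt(max(R[i][i] - acc, 1e-12))
            else:
                L[i][j] = (R[i][j] - acc) / L[j][j]
    return L


def is_connected(adj):
    """Depth-first search from node 0 over a boolean adjacency matrix."""
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for v in range(len(adj)):
            if adj[u][v] and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(adj)


def connectivity_probability(n, threshold_db, trials, rng,
                             side=100.0, d_c=20.0, alpha=3.0, d_0=1.0):
    """Fraction of random drops in which all n devices form one connected component."""
    hits = 0
    for _ in range(trials):
        pos = [(rng.uniform(0.0, side), rng.uniform(0.0, side)) for _ in range(n)]
        dist = [[math.dist(p, q) for q in pos] for p in pos]
        R = [[math.exp(-dist[i][j] / d_c) for j in range(n)] for i in range(n)]
        L = cholesky(R)
        w = [rng.gauss(0.0, 1.0) for _ in range(n)]
        s = [sum(L[i][k] * w[k] for k in range(i + 1)) for i in range(n)]
        # Gain in dB: shadowing at both ends plus (negative) path loss.
        adj = [[i != j and
                s[i] + s[j] - 10.0 * alpha * math.log10(dist[i][j] / d_0) >= threshold_db
                for j in range(n)] for i in range(n)]
        hits += is_connected(adj)
    return hits / trials


# Hypothetical usage: 10 devices and a -60 dB threshold.
estimate = connectivity_probability(10, -60.0, 200, random.Random(0))
```

Reusing the same seed for different thresholds keeps the device drops and shadowing identical, so the estimated probability can only grow as the threshold is lowered.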
Fig. 2 shows the probability $P_{\rm c}$ that all devices are in the same connected component, as a function of the threshold $\Gamma$, for , 10, and 20 devices. We observe that increasing the number of devices increases the connection probability: since the devices are randomly dropped in the same area, dropping more devices brings them closer to each other, thus increasing the probability of being connected. Moreover, decreasing $\Gamma$ also increases the connection probability, with values above 0.9 for dB.
Note that $P_{\rm c}$ can also be read as the probability that a UE finds a VPMN to connect to, given that there are $N$ devices in the area. Similar results also hold for the single-hop case, not reported here.
IV-B Localization Error
We now consider the localization error for the cases of single and multiple gateways.
We consider a single gateway, placed at the center of our map with coordinates . Fig. 3 shows the average localization error given by (7), as a function of the threshold $\Gamma$ for , and UE. We observe that for small values of $\Gamma$, where all UE also form a connected component, the average localization error is very high. For smaller connected components (higher values of $\Gamma$) the localization error is reduced. For the multihop scenario, instead, a higher number of nodes yields richer connected components, in turn increasing the localization error. In the single-hop scenario, the localization error does not depend on the number of UE, as the connectivity of each UE to the VPMN does not depend on the other UE. Also, the single-hop scenario has a lower localization error than the multihop case, because the UE must be directly connected to the gateways, i.e., closer to them.
When multiple gateways are present, we have observed that the MNO can infer the position of the UE also by checking which fraction of the UE traffic goes through the various gateways. To better understand this point, we consider a scenario where all devices are on a line: two gateways (with indices $g_1$ and $g_2$) are at positions and , while
the UE are on the same line, with uniformly distributed positions in [-50, 50]. The ratio between the rates (solution of the max-flow problem) going through the two gateways is
$$\eta = \frac{\rho_{i,1}}{\rho_{i,2}}, \tag{12}$$
where $\rho_{i,k}$ is the rate of UE $i$ through gateway $g_k$; this ratio is used to infer the position of the UE. Fig. 4
shows the contour plot of the joint probability density function (PDF) $p(x, \eta)$ of the random vector $(x, \eta)$, with $x$ being the position of the UE and $\eta$ the corresponding value of the ratio (12). Both the single-hop and multihop scenarios are considered. For a given observation of $\eta$, the maximum likelihood estimate of the UE position is
$$\hat{x}(\eta) = \arg\max_{x} \frac{p(x, \eta)}{p(\eta)}, \tag{13}$$
where $p(\eta)$ is the PDF of $\eta$, which can be obtained from $p(x, \eta)$ by marginalization. With this procedure, the average localization errors for the single hop and multihop scenarios are
Clearly, in the single-hop scenario the PDF of Fig. 4 is much more concentrated (with smaller variance) than in the multihop scenario, since in the single-hop scenario it depends only on the direct links between the UE and the gateways, which in turn mostly depend on the distance of the node from the gateways through the path loss.
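The estimation step can be illustrated with an empirical (histogram) version of the joint PDF. The sketch below is an assumption-laden example, not the procedure used for the reported errors: it bins simulated (position, ratio) pairs and, for each ratio bin, returns the center of the most populated position bin. Since the marginal PDF of the ratio does not depend on the position, maximizing the joint count within a ratio bin gives the same position as maximizing the conditional PDF. The function names, bin widths, and sample values are hypothetical.

```python
import math
from collections import Counter


def fit_position_estimator(samples, x_bin, ratio_bin):
    """Map each ratio-bin index to the center of its most likely position bin.

    samples is an iterable of (position, ratio) pairs, e.g. from simulation.
    """
    counts = Counter((math.floor(x / x_bin), math.floor(r / ratio_bin))
                     for x, r in samples)
    best = {}
    for (xb, rb), c in counts.items():
        # Keep, for every ratio bin, the position bin with the largest count.
        if rb not in best or c > best[rb][1]:
            best[rb] = (xb, c)
    return {rb: (xb + 0.5) * x_bin for rb, (xb, _) in best.items()}


def estimate_position(table, ratio, ratio_bin):
    """Position estimate for an observed ratio, or None if its bin was never seen."""
    return table.get(math.floor(ratio / ratio_bin))


# Hypothetical samples: positions in metres, rate ratios dimensionless.
samples = [(-30.0, 0.2)] * 3 + [(10.0, 0.2)] + [(20.0, 1.7)] * 2
table = fit_position_estimator(samples, x_bin=10.0, ratio_bin=0.5)
near_first_gateway = estimate_position(table, 0.4, 0.5)
```

With finer bins and more samples the table approaches the estimate defined on the continuous PDF, at the cost of noisier counts per bin.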
IV-C Achievable Rate
We now consider the average rate through the gateways, namely, the average of the sum of (8) over all UE
$$\bar{\rho} = \mathbb{E}\left[\sum_{i} \rho_i\right], \tag{14}$$
where $\rho_i$ is the rate (8) of UE $i$ and the average is taken with respect to the UE positions and channel realizations for randomly dropped devices in the square area. Fig. 5 shows the average rate for , and 3 gateways, and different numbers of UE, . We compare the performance obtained with the unconstrained maximum-flow (UMF) algorithm and the privacy-preserving maximum flow (PPMF) solution of Section III-C. First, note that with a single gateway the UMF and PPMF solutions coincide, thus we show a single line per scenario. Then, we note that the single-hop scenario yields much lower rates than the multihop scenario, where links among UE are exploited to increase the rate. Lastly, we observe that introducing the privacy-preserving constraint has a significant impact on the rate, in particular for the single-hop scenario, where the direct links of a UE with all the gateways may not be available, in which case the rate is zero to avoid disclosing location information. In the multihop scenario, instead, the rate reduction is much less pronounced, making this option more attractive for implementation.
V Conclusions
For a VPMN aiming at defending the location privacy of the UE while ensuring connectivity with the cellular network, we have considered the case of multiple gateways. We have highlighted that using an internal routing algorithm that simply maximizes the VPMN data rate towards the cellular network may reveal information on the location of the VPMN UE. Therefore, we have derived a routing solution that prevents this information leakage by ensuring that data is collected from each UE at the same rate by all the gateways. We have also assessed the performance of the proposed routing solution, showing that it has a limited performance loss with respect to maximum-rate routing.