1 Introduction
Smart meters (SMs) are essential components of smart grids: they collect real-time consumption data of a household and report it to the utility provider (UP). SM measurements can be used for time-of-use pricing, trading user-generated energy, and mitigating load variations [1]. However, SM readings can also reveal details about consumers' private activities, which they may not want to share with the UP. Various techniques have been proposed in the literature to enable SM privacy [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], which can be categorized as those based on SM data manipulation and those based on demand shaping. While the former modify SM measurements [2, 3], the latter directly manipulate energy consumption by exploiting physical resources, such as a rechargeable battery (RB) [4, 5, 6, 7, 8] or a renewable energy source (RES) [9, 10, 11, 12, 13]. Manipulating the SM readings reduces the relevance of the reported values for grid management and load prediction, limiting the benefits of SMs. Moreover, the grid operator can place sensors outside a household and obtain the real consumption data, since it owns and controls the infrastructure. Demand shaping, on the other hand, tackles these issues by manipulating the real consumption. In demand shaping, the instantaneous demand of a user is supplied partially by the power grid, while the rest is provided by the RB or RES. This effectively filters the energy consumption time series, reducing the information leakage to the UP.
In [7], information-theoretic privacy with a RB is formulated as a Markov decision process (MDP). Markovian energy demand is considered, and the minimum leakage is obtained numerically through dynamic programming (DP), while a single-letter expression is obtained for independent and identically distributed (i.i.d.) demand. This approach is extended to the scenario with a RES in [9]. In [8], the privacy-cost tradeoff is examined with a RB. Due to Markovian demand and price processes, the problem is formulated as a partially observable MDP (POMDP) with belief-dependent rewards, and solved by DP for the infinite horizon.
We consider the privacy-cost tradeoff of a SM system with both a RB and a RES [8, 14]. We measure privacy as the mutual information rate between the user demand and the energy received from the grid. We define the cost as the average amount of energy received from the grid, and study the tradeoff between cost and privacy by setting their weighted sum as the objective function. We formulate the problem as a MDP with a continuous belief state, and solve it numerically by DP after quantizing the belief state. Our contribution with respect to [8] is the inclusion of a RES in the system, which provides additional privacy. While [9] also considers a RB and a RES jointly, here we study the privacy-cost tradeoff, and present numerical solutions focusing on a particular renewable energy arrival process that fully recharges the RB at random time instances. We also provide two low-complexity policies and a lower bound, which exploit the special structure of the renewable energy process. We show numerically that the policy targeting a fixed recharge period performs very close to the infinite-horizon MDP solution, providing a low-complexity alternative for practical systems.
2 System Model
We consider a discrete time system model, illustrated in Fig. 1, in which the energy demand of the user and energy requested from the grid at time slot are denoted by and , respectively, where . The RB state of charge at the beginning of time slot is denoted by :=. The RB charging and discharging process is assumed ideal without any losses (see [6] for a model with energy losses). := units of energy is generated by the RES at the beginning of each time slot ; that is, when the renewable energy arrives, it completely recharges the RB, and it can be used by the appliances only through the RB. The process is assumed to be independent of , and known by the UP. We assume that and are i.i.d. with distributions and , respectively.
The appliances’ energy demand is always satisfied; that is, , . In addition, intentional energy waste to provide privacy, or selling energy to the grid are not allowed. Therefore, the battery state of charge is updated as
(1) 
where is chosen such that .
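The energy management rules above can be illustrated with a minimal sketch. It assumes integer energy units and a battery capacity `b_max`; the function and variable names are hypothetical and not taken from the paper, and the timing convention (a renewable arrival overrides the update and leaves the battery full at the start of the next slot) is an assumption made for illustration.

```python
def feasible_grid_requests(b, x, b_max):
    """Integer grid energy values y allowed for battery level b and demand x.

    Demand must always be met (b + y - x >= 0), and energy can be neither
    wasted nor sold back (b + y - x <= b_max, y >= 0).
    """
    low = max(0, x - b)
    high = b_max - b + x
    return list(range(low, high + 1))


def next_battery_state(b, x, y, recharged, b_max):
    """Battery level at the start of the next slot.

    Assumption: a renewable arrival (recharged=True) fills the battery
    completely, regardless of its current level.
    """
    if y not in feasible_grid_requests(b, x, b_max):
        raise ValueError("grid request violates the energy management rules")
    return b_max if recharged else b + y - x
```

For instance, with an empty unit-capacity battery and one unit of demand, the grid must supply at least that unit, and may supply a second one to charge the battery.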
The amount of energy requested from the grid is determined by a randomized battery charging policy , where
is a conditional probability distribution
, which randomly decides on the amount of energy received from the grid at time , given the histories of demand :=, battery charge , energy generation , and grid energy . Our goal is to find an energy management policy, , which provides the best tradeoff between privacy and cost.
2.1 Privacy Measure
We measure the privacy of a policy over time slots by the information leakage rate, , defined as the average mutual information between the demand side load and initial RB charge , and the SM readings :
(2) 
where is known by the UP. It can be shown, similarly to [7], that there is no loss of optimality in considering policies of the form ; that is, it is sufficient to consider only the current demand and battery state. Hence, (2) can be rewritten in an additive form
(3) 
The Markovity of optimal actions and the additive objective function in (3) allow us to represent the privacy component of our problem as a MDP with state =. However, the leakage at time depends on and , resulting in a growing state space in time. Therefore, a belief state
is defined as the causal posterior probability distribution over the state space given
and :
(4) 
The control actions chosen by randomized policies are the conditional probabilities of energy received from the grid given the state and belief, and denoted by == [9].
We follow the approach in [9] for updating the belief state, and define the per-step leakage of taking action , which is incurred by the policy at each step, as
(5) 
The average leakage rate over a finite horizon , , is equal to the original formulation in (3). Given the belief and action probabilities, the average information leakage rate at time is formulated as
(6) 
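As an illustration of how a per-step leakage term of this kind can be evaluated numerically, the sketch below computes the mutual information between a state drawn from a belief vector and the grid energy drawn from a randomized action. It is a generic computation under assumed array conventions (`belief[s]` and `action[s, y]` are hypothetical names), not the authors' implementation.

```python
import numpy as np


def per_step_leakage(belief, action):
    """Mutual information in bits between the state S and the grid energy Y.

    belief: shape (n_states,), probability of each state.
    action: shape (n_states, n_outputs), action[s, y] = P(Y = y | S = s).
    """
    belief = np.asarray(belief, dtype=float)
    action = np.asarray(action, dtype=float)
    joint = belief[:, None] * action            # P(S = s, Y = y)
    p_y = joint.sum(axis=0)                     # marginal distribution of Y
    product = belief[:, None] * p_y[None, :]    # P(S = s) * P(Y = y)
    mask = joint > 0                            # terms with zero mass contribute 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / product[mask])))
```

An action that copies the state to the output leaks the full entropy of the belief, whereas an action whose rows are identical leaks nothing.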
2.2 Energy Cost
Energy cost is defined as the average amount of energy received from the grid over time slots,
(7) 
We remark that, differently from [8], we do not consider a time-varying energy unit cost, although our model can easily be extended in this direction. In the context of [8], i.e., in the absence of a RES, our cost model would result in a deterministic energy cost, independent of policy . However, in the presence of a RES, our cost model follows [12], and incentivizes the maximum exploitation of locally generated renewable energy. For example, when privacy is not a concern, a cost-minimizing policy would use the battery energy first, so as to store as much of the arriving renewable energy as possible. The energy cost averaged over a finite horizon is simply . The average per-step cost can be represented in terms of belief and action probabilities as follows:
2.3 Weighted Total Privacy Leakage and Energy Cost
We have two distinct performance measures, which are not necessarily aligned. Therefore, we define the objective function as the weighted sum of the information leakage rate and the average cost over all the feasible policies, as ,
(8) 
where is determined by the user according to her preference regarding privacy and cost. Our goal is to design a policy , which is the minimizer of the right-hand side of (8), satisfying the energy management rules. This problem can be modeled as an MDP with state and action . The corresponding Bellman equations can be written similarly to [7] and [9]. We include the instantaneous weighted objective function, , into the Bellman operator,
(9) 
where is the value function and the updated belief state is represented by . The implementation of DP for the infinite horizon is as follows:

For constant [15], the value function is time-homogeneous and defined iteratively:
(10) 
Time-homogeneous optimal policy, ,
(11)
While an exact DP solution cannot be achieved due to the continuous belief, we provide an approximate numerical solution. To be able to solve the problem numerically by DP, we discretize the belief . At each value iteration, we quantize the updated belief, , by rounding it to the closest discrete belief value.
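The quantization step can be sketched as follows. The uniform grid over the probability simplex and the Euclidean nearest-neighbor rule are assumptions made for illustration (the text only states that the belief is rounded to the closest discrete value), and the function names are hypothetical.

```python
import itertools

import numpy as np


def belief_grid(n_states, resolution):
    """All probability vectors whose entries are multiples of 1/resolution."""
    points = [
        np.array(counts) / resolution
        for counts in itertools.product(range(resolution + 1), repeat=n_states)
        if sum(counts) == resolution
    ]
    return np.stack(points)


def quantize_belief(belief, grid):
    """Return the grid point closest to `belief` in Euclidean distance."""
    distances = np.linalg.norm(grid - np.asarray(belief, dtype=float), axis=1)
    return grid[int(np.argmin(distances))]
```

The number of grid points grows combinatorially with the number of states, which is consistent with the remark in Section 5 that larger alphabets are computationally challenging.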
3 Low-Complexity Policies
Due to the special renewable energy generation process we consider here, the problem is an episodic MDP, which resets to an initial state of full RB at every renewable energy arrival. Between two consecutive energy arrivals, energy transitions occur only between the grid, the RB and the home appliances. Hence, for each time period between two charging instances, the system can be modeled as a SM with only a RB and no RES. Accordingly, we formulate a finite-horizon privacy-cost tradeoff problem for a SM system with an initially full RB, which will be used to propose a low-complexity policy as well as a lower bound for the original problem.
In the finite-horizon problem with no RES, as before, the user demand is always satisfied by imposing , , and the RB charge is updated by =. Randomized battery charging policies, , are of the form . The information leakage rate induced by the policy over a finite horizon between two consecutive energy arrivals is given by
(12) 
(13) 
Similarly to the original problem, the finite-horizon problem with no RES can also be formulated as a MDP with belief . Control actions used to determine the energy received from the grid are defined as ===. To solve the finite-horizon problem by DP, we use the method in [7] for belief updates, and express the average information leakage rate at time in terms of belief and action probabilities as follows:
(14) 
The average per-step energy cost for the finite-horizon problem is determined similarly to Section 2.2, and represented in terms of the belief and action probabilities as:
(15) 
The weighted objective function to be minimized for the finite-horizon privacy-cost tradeoff is denoted by =+. We can first quantize the belief state, and then solve the resulting MDP with a finite state space by DP, recursively applying the Bellman operator in (9) with the corresponding changes for the finite horizon.
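Once beliefs are quantized, a recursion of this kind reduces to backward induction on a finite MDP. The sketch below is a generic version under additional assumptions that are not in the paper: the action set is also discretized, and the per-step weighted cost and the transition probabilities between quantized beliefs are precomputed in the hypothetical arrays `cost[s, a]` and `transition[a, s, s_next]`.

```python
import numpy as np


def backward_induction(cost, transition, horizon):
    """Finite-horizon DP for a finite MDP with zero terminal cost.

    cost: shape (n_states, n_actions), per-step cost of action a in state s.
    transition: shape (n_actions, n_states, n_states), rows sum to one.
    Returns values[t, s] (minimum expected cost-to-go from stage t) and
    policy[t, s] (a minimizing action index).
    """
    n_states, _ = cost.shape
    values = np.zeros((horizon + 1, n_states))
    policy = np.zeros((horizon, n_states), dtype=int)
    for t in range(horizon - 1, -1, -1):
        # q[s, a] = cost[s, a] + sum over s' of transition[a, s, s'] * values[t + 1, s']
        q = cost + np.einsum("ast,t->sa", transition, values[t + 1])
        policy[t] = np.argmin(q, axis=1)
        values[t] = np.min(q, axis=1)
    return values, policy
```

Dividing `values[0]` at the initial state by the horizon gives an average per-step objective, which is the quantity compared across horizons in a scheme of this type.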
In the next subsection, using the solution of the finite-horizon problem above, we will propose a low-complexity solution for the original infinite-horizon problem with an RB and RES, and a resetting energy generation process of the form .
3.1 Threshold Policy (TP)
In TP, we fix a target horizon , and after each RB recharge instance, start employing the optimal energy management policy for this finite horizon, derived in the previous section. We follow the optimal policy for horizon until either the battery is recharged again, in which case we restart with the same policy, or we reach the time horizon . If the RB is not recharged at time
, we assume that we simply provide all the energy demand directly from the grid, resulting in full information leakage. The intuition behind this scheme follows from the law of large numbers, which suggests that the RB will be charged after
time slots with high probability. We will consider policies with a fixed time horizon of , as well as those with an optimized time horizon. Our numerical results in Section 5 show that the performance with an optimized but fixed time horizon closely follows that of the infinite-horizon solution.
3.2 Battery Conditioned Policy (BCP)
We propose another low-complexity policy, which depends only on the current input load. In BCP, when there is no demand, we allow the RB to be recharged by the grid with a probability for each battery state =, for =. On the other hand, when there is energy demand, the RB is discharged with a probability for each battery state. As before, intentional energy waste is not allowed. When there is demand in the case of an empty RB, it is entirely supplied from the grid. We choose values that minimize (3) by an exhaustive grid search on .
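A single BCP decision can be sketched as below, assuming binary demand and unit energy transfers per slot. The per-state probability lists `charge_prob` and `discharge_prob` stand in for the parameters chosen by the grid search; these names, and the unit-energy assumption, are illustrative only.

```python
import random


def bcp_grid_energy(demand, battery, b_max, charge_prob, discharge_prob, rng=random):
    """Sample the grid energy (0 or 1 unit) for one slot under a BCP-style rule.

    charge_prob[b]: probability of charging the RB from the grid when demand is 0.
    discharge_prob[b]: probability of serving the demand from the RB when demand is 1.
    """
    if demand == 0:
        # A full battery cannot absorb more energy (no intentional waste).
        if battery < b_max and rng.random() < charge_prob[battery]:
            return 1
        return 0
    # Demand of one unit: the RB may serve it only if it is not empty.
    if battery > 0 and rng.random() < discharge_prob[battery]:
        return 0
    return 1
```

Because the decision depends only on the current demand and battery level, evaluating a candidate set of probabilities requires no belief tracking, which is what makes an exhaustive search over them practical.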
4 Lower Bound
Next, we provide a lower bound on the privacy-cost tradeoff by assuming that the user non-causally knows the times at which the RES recharges the RB. In Fig. 2, these time instances are represented by consecutive arrows. The weighted sum of the finite-horizon leakage rate and average energy cost, minimized over policy , is denoted by in Fig. 2. Given i.i.d. , the probability that the RB is recharged after time slots is given by
(16) 
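For i.i.d. Bernoulli recharge events, the inter-arrival time between recharges is geometrically distributed. The sketch below assumes that (16) takes this standard form, with a hypothetical per-slot recharge probability `p_e`; the helper `truncation_length` uses a simple tail-probability tolerance, which is an illustrative stand-in rather than the truncation criterion used in the paper.

```python
def recharge_pmf(n, p_e):
    """P(the next recharge occurs exactly n slots after the previous one), n >= 1."""
    return (1.0 - p_e) ** (n - 1) * p_e


def tail_probability(n, p_e):
    """P(the inter-arrival time exceeds n slots)."""
    return (1.0 - p_e) ** n


def truncation_length(p_e, tolerance):
    """Smallest n >= 1 whose tail probability is at most `tolerance`.

    Requires 0 < p_e <= 1 and tolerance > 0.
    """
    n = 1
    while tail_probability(n, p_e) > tolerance:
        n += 1
    return n
```

Since the tail decays exponentially in n, a finite truncation length exists for any positive tolerance.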
If the RB recharge instances are known in advance, the problem reduces to the finite-horizon MDP for each inter-arrival period, and can be solved as outlined in Section 3.
Once the optimal performance is evaluated for all , the lower bound can be derived by taking their average using the probability mass function in (16):
(17) 
where the coefficient approaches zero as , while approaches the infinite-horizon privacy-cost tradeoff. For the numerical solution of the infinite sum indicated in (17), we perform the summation for finite = such that . To obtain the minimum satisfying this inequality, we first consider the worst-case information leakage rate and average energy cost, where all the demand is supplied by the grid, , and denote the lower bound by
where represents the worst-case privacy-cost tradeoff, in which and are the entropy and expected value of the demand, respectively. Hence, we choose the minimum value that satisfies . We can find a finite satisfying this inequality for any .
5 A Simple Binary Example
We consider a simple scenario with ,=, = and =. We emphasize that obtaining numerical results for larger alphabets is challenging, as the belief grows with the state space, and so does the computational complexity, also due to the quantization of the belief. For simplicity, the demand and energy generation processes are assumed to be i.i.d. with Bernoulli = and , respectively. Extension to Markovian processes is straightforward for TP and BCP; however, the MDP formulation requires including in the state, and updating the belief accordingly. We consider a privacy-cost tradeoff weight of =.
The weighted total privacy leakage and energy cost for TP, BCP and the infinite-horizon MDP are depicted in Fig. 3, together with the lower bound. The average weighted cost decreases with , since the demand can be mostly supplied by the RES, decreasing both the cost and the leakage. The lower bound is obtained from (17) evaluated over a sufficiently long . While the lower bound is not tight in general, it also shows the value of predicting the energy generation instances for optimizing privacy and cost. Two plots of TP are obtained, corresponding to different horizons. For the first TP plot, the finite horizon is set to be =. Since TP leads to full information leakage when energy arrives later than the set horizon, this approach has a higher privacy-cost tradeoff compared to the infinite-horizon DP solution of the original problem. For the second TP plot, for each value, the best horizon value is selected by searching over the set . We observed that the optimal fixed horizon is typically longer than , which reduces the probability of full leakage. Interestingly, the performance of TP with an optimized yet fixed horizon follows that of the infinite-horizon MDP solution very closely. We remark here that the curve obtained for the infinite-horizon MDP solution is an approximation as well, due to the quantization of the belief. Finally, we observe that the BCP scheme can outperform the fixed-horizon TP policy for high values.
6 Conclusions
We have studied the privacy-cost tradeoff in a SM system equipped with both a RB and a RES. Motivated by the episodic nature of the problem, we proposed a low-complexity TP policy to solve this infinite-horizon problem by solving simplified finite-horizon problems with only a RB and no RES. We also proposed the BCP policy, whose actions depend only on the demand. We numerically showed for a binary example that the fixed-horizon policy that ignores the RES process can achieve near-optimal performance. As future work, we will try to quantify or bound the gap between the two policies.
References
 [1] G. Giaconi, D. Gündüz, and H. V. Poor, “Privacy-aware smart metering: Progress and challenges,” IEEE Signal Processing Magazine, vol. 35, no. 6, pp. 59–78, Nov 2018.
 [2] F. D. Garcia and B. Jacobs, “Privacy-friendly energy-metering via homomorphic encryption,” in Proc. 6th Workshop Security and Trust Management, vol. 6710, pp. 226–238, 2017.

 [3] Y. Kim, E. C. H. Ngai, and M. B. Srivastava, “Cooperative state estimation for preserving privacy of user behaviors in smart grid,” IEEE International Conference on Smart Grid Communications (SmartGridComm), pp. 178–183, 2011.
 [4] M. Arrieta and I. Esnaola, “Smart meter privacy via the trapdoor channel,” IEEE International Conference on Smart Grid Communications (SmartGridComm), pp. 277–282, 2017.
 [5] S. Han, U. Topcu, and G. J. Pappas, “Event-based information-theoretic privacy: A case study of smart meters,” American Control Conference (ACC), pp. 2074–2079, July 2016.
 [6] R. R. Avula, T. J. Oechtering, and D. Månsson, “Privacy-preserving smart meter control strategy including energy storage losses,” arXiv e-prints, Mar. 2018.
 [7] S. Li, A. Khisti, and A. Mahajan, “Information-theoretic privacy for smart metering systems with a rechargeable battery,” IEEE Transactions on Information Theory, vol. 64, no. 5, pp. 3679–3695, 2018.
 [8] J. Yao and P. Venkitasubramaniam, “On the privacy-cost tradeoff of an in-home power storage mechanism,” in Annual Allerton Conference on Communication, Control, and Computing (Allerton), Oct 2013, pp. 115–122.
 [9] G. Giaconi and D. Gündüz, “Smart meter privacy with renewable energy and a finite capacity battery,” in IEEE 17th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), July 2016, pp. 1–5.
 [10] G. Giaconi, D. Gündüz, and H. V. Poor, “Smart meter privacy with renewable energy and an energy storage device,” IEEE Transactions on Information Forensics and Security, vol. 13, no. 1, pp. 129–142, Jan 2018.
 [11] J. Gomez-Vilardebo and D. Gündüz, “Smart meter privacy for multiple users in the presence of an alternative energy source,” IEEE Transactions on Information Forensics and Security, vol. 10, no. 1, pp. 132–141, Jan 2015.
 [12] O. Tan, D. Gündüz, and H. V. Poor, “Increasing smart meter privacy through energy harvesting and storage devices,” IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1331–1341, July 2013.
 [13] J. Chin, T. Tinoco De Rubira, and G. Hug, “Privacy-protecting energy management unit through model-distribution predictive control,” IEEE Transactions on Smart Grid, vol. 8, no. 6, pp. 3084–3093, Nov 2017.
 [14] O. Tan, J. Gomez-Vilardebo, and D. Gündüz, “Privacy-cost tradeoffs in demand-side management with storage,” IEEE Transactions on Information Forensics and Security, vol. 12, no. 6, pp. 1458–1469, June 2017.
 [15] M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, John Wiley & Sons, Inc., New York, NY, USA, 1st edition, 1994.