Energy Resource Control via Privacy Preserving Data

by Xiao Chen, et al.
Stanford University

Although the frequent monitoring of smart meters enables granular control over energy resources, it also increases the risk of leakage of private information such as income, home occupancy, and power consumption behavior that can be inferred from the data by an adversary. We propose a method of releasing modified smart meter data so specific private attributes are obscured while the utility of the data for use in an energy resource controller is preserved. The method achieves privatization by injecting noise conditional on the private attribute through a linear filter learned via a minimax optimization. The optimization contains the loss function of a classifier for the private attribute, which we maximize, and the energy resource controller's objective formulated as a canonical form optimization, which we minimize. We perform our experiment on a dataset of household consumption with solar generation and another from the Commission for Energy Regulation that contains household smart meter data with sensitive attributes such as income and home occupancy. We demonstrate that our method is able to significantly reduce the ability of an adversary to classify the private attribute while maintaining a similar objective value for an energy storage controller.




I Introduction

Traditionally, the power grid has been managed by the producers and grid operators with information primarily exchanged among the large asset owners with little feedback from its end users. However, the push for renewable energy sources has brought about the rise of distributed energy resources (DERs) that lie under the control of many smaller and disparate users causing a paradigm shift in the flow of information in the grid. The successful operation of DERs and other smart grid technologies depends on the exchange of large amounts of data from many different end users [BilevelBernstein, ZhangCoop, NavidiCdc]. Unfortunately, it may be unrealistic to assume the data will be available without consideration of the privacy concerns the data owners may face. It has been demonstrated that the increased granularity of data required for smart grid operation enables the inference of personal information [inferDr], which suggests data owners may be reluctant to exchange their data without some effort towards privacy preservation.

Studies have investigated various approaches to protecting smart meter data privacy using a number of different metrics. While detailed surveys are given in [jawurek2012sok, komninos2014survey], we briefly cover a few popular solutions here. Aggregating data or its statistics has been considered [buescher2017two, corrigan2017prio] as a way to provide user privacy, since the aggregated data does not reflect any specific meter's data above a certain aggregation size. Another approach to privatization comes from differential privacy [DworkC2008Survey], which is widely adopted in privacy mechanism design and analysis in the context of energy data [sankar2012smart, han2016event, chin2017privacy, eibl2017differential, ZhouAndersonLow2019ACC]. Specifically, [sankar2012smart, han2016event] and [chin2017privacy] proposed several frameworks for reducing the mutual information between raw data and privatized data (e.g. power profiles). The authors of [eibl2017differential] investigated the effect of differential privacy with noise injection (e.g. Laplace noise) and showed that the aggregation group size must be on the order of thousands of smart meters to retain reasonable utility. Finally, [ZhouAndersonLow2019ACC] explored how much noise must be added to the data to achieve a given level of differential privacy with the existing Laplace mechanism in the context of solving optimal power flow.

Our work differs in that it develops a methodology that learns an optimal noise injection for balancing the trade-off between privacy and data utility, thereby preserving as much utility in the data as possible. It differs from strict differential privacy because we use a general notion of privacy, namely the reduced correlation between private attributes and the data. This general notion of privacy gives us the flexibility to maintain the utility of the data while still eliminating an adversary's ability to recognize certain private attributes. Since many applications of smart meter data involve their use in optimization procedures, we define the utility as the performance achieved when such data is used for optimal control. We consider a scenario where individual owners of DERs, such as battery storage systems, wish to privatize their data before releasing it to a DER aggregator that makes optimal control decisions on their behalf, which can have applications in the context of [BilevelBernstein, ZhangCoop].

Our primary contribution is a minimax approach for generating realistic meter data that is decorrelated from sensitive attributes while limiting the performance loss of a cost-minimizing optimal control algorithm that uses battery storage. Additionally, we develop a parallelized method that can be easily incorporated into modern deep learning architectures. We evaluate the correlation between data privatized by our method and sensitive attributes, as well as the performance of a control algorithm, on two real datasets of residential power demand: one with synthetic sensitive labels and one with real labels. We demonstrate that our method is able to decrease the classification accuracy of an adversary by over 20% while keeping the performance of the optimization within 10% on both datasets.

The rest of the paper is organized as follows: we describe the energy resource control in section II, control with privatized data generated from the minimax learning algorithm in section III, experiments and results on the two datasets in section IV, and the conclusion in section V.

II Energy Resource Control

II-A Notation

We use bold letters for vectors and matrices and regular letters for scalars. Given two vectors $\mathbf{a} \in \mathbb{R}^n$ and $\mathbf{b} \in \mathbb{R}^n$, $\mathbf{a} \ge \mathbf{b}$ represents the element-wise order $a_i \ge b_i$ for $i \in [n]$, where $[n]$ denotes the set $\{1, \dots, n\}$, and $\mathbf{a} \ge 0$ means all elements in the vector are not less than the scalar zero. We make the dependence on the underlying probability distribution $p$ explicit when we write expectations (e.g. $\mathbb{E}_{x \sim p}$, where $x$ denotes a random variable). The Frobenius norm of a matrix $\mathbf{A}$ is $\|\mathbf{A}\|_F$. We write $\nabla_{\theta} L$ or $\partial L / \partial \theta$, where we typically mean differentiation of the loss function $L$ with respect to the parameter $\theta$. $\mathcal{N}$ stands for the Normal (or Gaussian) distribution and $\mathbb{R}_+$ denotes the non-negative real numbers. We use $:=$ to represent "define as." All vectors are column vectors by default unless stated otherwise in a specific context.

II-B Battery storage control

Control with deterministic demand: Consider a basic battery control problem with the goal of minimizing the energy cost given a prescribed price where is the time horizon that is typically if it is an hourly price. An uncontrollable electricity demand is specified as . We denote the decision variables for battery control to be and expand it into that represents the charging, discharging, and the amount of charge in storage, i.e. . The battery optimal control is formulated as follows (Problem1):

s.t. (1b)

The linear term (with respect to ) in the objective is the cost of electricity when there is no value for selling the energy back to the grid. This represents a situation where there are no net-metering incentives. The quadratic penalty terms and are added to protect the battery state of health over the horizon [liu2018customer]. The term is added to set the battery state to be close to the target value with as the battery size and . are hyper-parameters to control these penalties. and are the charging-in and discharging-out power capacities. The parameters and denote the charging and discharging efficiencies (between 0 and 1). Constraint (1b) indicates that the battery state in the next timestep equals the current battery state plus the net charging amount (charging and discharging combined). Constraint (1c) sets the initial state of the battery to have . To simplify the notation, we define a set . Hence, we use to succinctly express that satisfies the battery constraints. We convert problem (1) into canonical convex form in Appendix VI-B and develop a parallelized algorithm that makes use of automatic differentiation, open-source convex solvers, and PyTorch [paszke2017automatic], a popular deep learning framework.
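To make the battery model more concrete, the following minimal sketch rolls the battery state forward in the spirit of constraint (1b) and evaluates the electricity cost when exports earn nothing (no net metering). All names (`x_in`, `x_out`, `eta_in`, `eta_out`, `price`, `demand`) and the exact way the efficiencies enter the state update are assumptions made for illustration, not the notation or formulation of Problem 1.

```python
def simulate_state(x_in, x_out, s0, eta_in, eta_out):
    """Roll the battery state forward over the horizon.

    Assumed update: s[t+1] = s[t] + eta_in * x_in[t] - x_out[t] / eta_out.
    """
    states = [s0]
    for charge, discharge in zip(x_in, x_out):
        states.append(states[-1] + eta_in * charge - discharge / eta_out)
    return states


def energy_cost(price, demand, x_in, x_out):
    """Cost of electricity when energy sent back to the grid has no value."""
    total = 0.0
    for p, d, charge, discharge in zip(price, demand, x_in, x_out):
        net_load = d + charge - discharge  # demand seen by the grid
        total += p * max(net_load, 0.0)    # exports are not paid
    return total
```

For example, `simulate_state([2.0, 0.0], [0.0, 0.5], 0.0, 0.5, 0.5)` charges in the first step and fully discharges in the second. The quadratic penalty terms and power limits of Problem 1 are deliberately left out of this sketch.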

Control with stochastic demand: When determining the control with an uncertain demand, we minimize the expected cost under some demand distribution . The objective is slightly changed as follows (Problem2):

s.t. (2b)

III Control with Privatized Demand

Protecting privacy in our context means reducing the correlation between the smart meter data and the sensitive attribute of the data owner, e.g. income or square-footage of the house. We justify why such a consideration of privacy protection is useful in practice in section III-A.

III-A Revealing privacy from data

In this section, we consider a simple scenario in which the sensitive information is a binary label, such as a small or large home, that can be inferred from smart meter data. Given the raw demand and sensitive label , the adversary builds a classifier that takes in demand to estimate y with a prescribed loss function . Specifically, we assume the adversary minimizes the classification loss to infer the private information y. A popular choice of classification loss is cross-entropy loss (or log-loss) [Lin_2017_ICCV] when y is a binary variable. The classifier is parameterized by and can be a neural network that outputs an estimate of the probability of the positive label. Previous studies [beckel2014revealing, chen2018understanding] showed that estimating a sensitive label such as income or square-footage of the house reaches accuracy using features of smart meter data and models like support vector machine or random forest. We use an alternative neural network model that leverages the daily power consumption (demand) and achieve state-of-the-art accuracy in predicting the private label. More details can be found in section 
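As a small illustration of the adversary's objective, the sketch below computes the mean binary cross-entropy (log-loss) from binary labels and predicted probabilities of the positive label. The function name, argument names, and the clipping constant `eps` are assumptions for this example and are not taken from the paper's code.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean log-loss for labels in {0, 1} and predicted P(label = 1)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)
```

An uninformative classifier that always outputs 0.5 incurs a loss of log 2 per sample, which is the level a privatization scheme would ideally push the adversary towards for a balanced binary label.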


III-B Control with private demand

Our goal is to minimize energy cost while incorporating privacy protection. Specifically, we design a data generator that creates a perturbed version of the raw demand data in a way that increases the adversarial classification loss, while enabling an optimal controller to minimize the energy cost. From a modeling perspective, we have a minimax problem (Problem3):

s.t. (3b)

where the parameter is a matrix that affects the distribution of . In this case, we consider a linear transformation of Gaussian noise . Variable is the one-hot encoding of the sensitive binary label, and is a classifier that takes in the perturbed demand data and predicts the corresponding private label. The stands for utility loss. It is important to note that in the objective uses the raw demand to evaluate the cost of the control decisions determined using the perturbed demand. This represents the case where the storage unit acts on the perturbed information, but the real world value is based on the original raw data.
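The perturbation described here can be sketched as follows, assuming the altered demand is the raw demand plus a linear map of standard Gaussian noise plus a label-dependent shift selected by the one-hot label. The split into `gamma_noise` and `gamma_label`, and all function and argument names, are hypothetical choices made for illustration.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def privatize(demand, gamma_noise, gamma_label, one_hot_label, rng=random):
    """Return demand + gamma_noise @ eps + gamma_label @ one_hot_label,
    where eps is a vector of independent standard normal draws."""
    eps = [rng.gauss(0.0, 1.0) for _ in range(len(gamma_noise[0]))]
    noise = matvec(gamma_noise, eps)
    shift = matvec(gamma_label, one_hot_label)
    return [d + n + s for d, n, s in zip(demand, noise, shift)]
```

Because the label is one-hot, the product with `gamma_label` simply picks out one column, so each label value receives its own deterministic shift on top of the shared random noise.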

In order to solve the non-trivial optimization (3), we simplify the constraints and make use of adversarial training, a common technique in studies of generative adversarial networks (GAN) and their applications [goodfellow2014generative, chen2018unsupervised], which is further explained in section III-C.

We add a regularization term in the objective with an additional hyper-parameter ,


which helps convergence of the training and preserves parts of the demand that are not related to the privacy or utility loss instead of allowing them to be perturbed arbitrarily.

We can denote matrix with and . The altered demand then becomes . By denoting to be the prior distribution of one-hot labels, e.g. where is the prior probability of a positive label, we can rewrite the distortion regularization as


Equality (i) uses the fact that has zero mean. Equality (ii) expands out as column vectors and expresses . Rearranging the expressions yields equality (iii).
Therefore, we can equivalently penalize the Frobenius norm of and norm of the vector , i.e. , instead of taking the empirical mean of the demand difference when performing the regularization. To summarize, the data generator determines the filter weight and outputs the perturbed demand , while the adversary takes in the altered demand and private labels y to try to learn a classifier.
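The reason the expectation can be replaced by norms of the filter weights can be checked with a short calculation. Under the assumptions that the noise is standard normal and independent of the one-hot label, the expected squared distortion equals the squared Frobenius norm of the noise block plus the prior-weighted squared norms of the columns of the label block. The sketch below evaluates that closed form; the names `gamma_noise`, `gamma_label`, and `label_prior` are hypothetical, and the expression is a standard identity rather than a transcription of the paper's equation.

```python
def expected_distortion(gamma_noise, gamma_label, label_prior):
    """E||gamma_noise @ eps + gamma_label @ y||^2 for eps ~ N(0, I)
    independent of a one-hot label y with the given prior."""
    # The zero-mean noise removes the cross term and contributes ||gamma_noise||_F^2.
    frob_sq = sum(v * v for row in gamma_noise for v in row)
    # A one-hot y selects column k of gamma_label with probability label_prior[k].
    label_term = 0.0
    for k, prob in enumerate(label_prior):
        column_sq = sum(row[k] ** 2 for row in gamma_label)
        label_term += prob * column_sq
    return frob_sq + label_term
```

Penalizing this quantity directly avoids estimating the distortion from samples in every batch.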

III-C Minimax learning

We construct two neural networks to play the roles of the two players: one for the data generator and one for the adversary. To train the adversary, we minimize the cross-entropy loss , i.e. , which follows the loss function mentioned in section III-A. For the generator, we decouple the training into three steps. First, we leverage the loss that is passed from the adversary to update the matrix weight , i.e.


where is the hyper-parameter that penalizes the distance between and implicitly. Equality (i) uses the log-loss as the classification loss for the binary label. The next step is to use the privatized demand to determine the control by running the following optimization:

s.t. (7b)

The optimal solution of the above convex problem (7) is , or more specifically , because it is a function of the privatized demand, which is aligned with equation (3c). The third step calculates the loss, , using and the original raw demand expressed as:


We update using gradient descent with the gradient determined by the chain rule. Recall that the generator outputs a privatized demand with reduced correlation to the sensitive label that is also used to yield the storage control decisions. Those decisions are evaluated on the cost given the raw demand, thus, the Jacobian of



In the context of our storage control problem, the first term in (9) is


where is given in the Appendix equation (21), is the identity matrix, and


The second term, i.e. , in (9) hinges on automatic differentiation through a convex program [amos2017optnet, agrawal2019differentiating]. Because an optimization problem can be viewed as a function mapping the problem data to the primal and dual solutions, we can convert problem (7) to a conic form and calculate the changes of the optimal solution given the perturbations of the problem data. This approach leverages the idea of finding a zero of the residual map of a homogeneous self-dual embedding derived from the KKT conditions of the convex program [agrawal2019differentiating, ye1994nl, busseti2018solution].

The third term in (9) is


since . Thus, all three terms in equation (9) can be evaluated in the backward pass of the generator training and we can update the filter weight using stochastic gradient descent [bottou2010large]: where is the iteration step and is the learning rate.
Remark: To summarize, Step 1 shown in equation (6) updates the matrix by minimizing the negative classification loss (equivalent to maximizing the classification loss) of the adversary, while maintaining the constraint determined in (5). Step 2 calculates the optimal control of the storage using the privatized demand. In Step 3, is updated by evaluating the gradient of the energy cost given the control based on the privatized demand. The updates are expressed as


which run until convergence. We set the learning rates in each step to be equal for simplicity. The training procedure is described in Algorithm 1.

Input: Demand data , label data , learning rate , parameters , and hyper parameters
Initialize , at iteration with batch size ;
while  or has not converged do
      1 draw batches of pair from demand and label datasets (), ;
      2 Sample batch of Gaussian random vectors ;
      3 ;
      4 ;
      5 where is optimal solution of (7)
       (The expected gradient value is approximated as the sample mean of the batch.)
return and
Algorithm 1 Minimax learning
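The structure of one pass through the loop of Algorithm 1 can be sketched as three successive gradient steps: one for the adversary and two for the filter. In this minimal sketch the gradients are supplied by hypothetical callbacks (`grad_adv_theta`, `grad_privacy_gamma`, `grad_utility_gamma`), and parameters are flat lists of floats; in the paper's setting these gradients would come from back-propagation and from differentiating through the convex program, which is not reproduced here.

```python
def sgd_step(params, grads, lr):
    """One plain gradient-descent update on a flat list of parameters."""
    return [p - lr * g for p, g in zip(params, grads)]


def minimax_iteration(theta, gamma, grad_adv_theta, grad_privacy_gamma,
                      grad_utility_gamma, lr):
    """One iteration: adversary update, then two filter updates."""
    # Adversary descends on its classification loss.
    theta = sgd_step(theta, grad_adv_theta(theta, gamma), lr)
    # Step 1: the filter descends on the negative classification loss
    # plus the distortion penalty (both folded into the callback).
    gamma = sgd_step(gamma, grad_privacy_gamma(theta, gamma), lr)
    # Steps 2 and 3: the callback is assumed to solve the control problem on
    # the privatized demand and return the gradient of the resulting cost.
    gamma = sgd_step(gamma, grad_utility_gamma(gamma), lr)
    return theta, gamma
```

Using the same learning rate `lr` in all three updates mirrors the simplification stated above; separate rates would be an equally valid design.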

III-D Convergence of the filter

This subsection focuses on the stability and boundedness of the iterates in our back-propagation that leverage stochastic gradient methods (or some related variants of first-order gradient methods). Using the subgradient property [boyd2004convex, Chapter 9.1], is a subgradient of at if


and assuming is a local optimal point; when we apply the Step 1 and Step 3 updates at the -th iteration, we can obtain the following relationship


Equality (i) expands the inner product of the loss gradients and iterates using for the norm of the sum of loss gradients. The inequality (ii) uses the subgradient condition in equation (13), (both for and ). Rearranging equation (14a) and equation (14e), we get


By summing iterates up to step , we get


where (iii) is valid since we take the minimum over all iterations and (iv) is derived from the summation of equation (15). Then, arranging equation (16a) and equation (16c) gives


Thus, if the 2-norm of the vectorized version of is bounded by , and with learning rate but , the right-hand side of equation (17a) becomes . Therefore, using the gradient updates in Step 1 and Step 3 minimizes the losses and converges to a local optimal point.

IV Experiments

In this section, we evaluate the capability of our linear filter to (1) generate perturbed smart meter data that reduces the prediction accuracy of sensitive attributes; (2) maintain the minimum energy cost from an optimal control decision using the perturbed data; (3) integrate into a contemporary deep learning architecture with parallelism. The code for our experiments is available at

IV-A Setup

We build two neural networks to form the adversarial classifier and the generator. The adversarial classifier is composed of two fully connected layers with ELU (Exponential Linear Unit) activation to estimate the sensitive attribute from demand. The first layer contains the same number of neurons as the number of time steps in the meter data series used by the battery optimal controller, and the second layer has half as many neurons as the first layer and outputs a two-dimensional vector representing the probabilities of the label categories. The generator module is composed of a single linear layer that takes a standard normal random vector and the private labels as inputs, and outputs noise to be added to the original demand. The parameters of the single linear layer form matrix . Additionally, we specify to be block diagonal to reduce the number of learning parameters, i.e. where is a diagonal matrix. Given the number of columns in our weight matrix is (e.g. the for is for the solar dataset and in our residential experiments), we use uniform initialization [he2015delving] between for both the adversary and generator networks. We use 85% of the data for training and the remaining 15% for testing the performance of the filter. We set hyper-parameters throughout the experiments. The learning rate for the classifier is and the learning rate for the generator starts from and decays for every 100 steps. We present the classification accuracy to indicate the correlation, as a lower accuracy implies a lower value of mutual information [chen_xiao2019safeml] and thus less correlation between the demand and sensitive labels. We set the initial battery state of charge to 1% of its maximum energy capacity, i.e. . We use a time-of-use price structure with two tiers: a high price of $0.463 per kWh from 4pm-9pm and $0.202 per kWh for the rest of the day.
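The two-tier time-of-use tariff can be written out as a price vector for one day. The sketch below is a minimal illustration; the function name, the `steps_per_hour` argument, and the treatment of the peak window as the half-open interval from 16:00 up to (but not including) 21:00 are assumptions.

```python
def time_of_use_prices(steps_per_hour=1, high=0.463, low=0.202,
                       peak_start=16, peak_end=21):
    """Price in dollars per kWh for each step of one day.

    The high price applies to hours in [peak_start, peak_end).
    """
    prices = []
    for step in range(24 * steps_per_hour):
        hour = step // steps_per_hour
        prices.append(high if peak_start <= hour < peak_end else low)
    return prices
```

With `steps_per_hour=1` this gives a 24-entry vector for hourly data, and with `steps_per_hour=2` a 48-entry vector for half-hourly data.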

IV-B Examples

IV-B1 Integration of storage and solar generation

For our first experiment, we aggregated 24-hour demand consumption from thousands of homes into groups of 100-200 homes and added solar generation. The aggregations represent the demand seen at a secondary transformer from the perspective of a utility company. The goal is to minimize the energy cost by running the optimal charging and discharging controls for battery storage given a prescribed price. Each demand comes with a binary label indicating if the demand is from a high- or low-income group. We wish to privatize the demand before sending it to the storage operator to perform cost minimization, so the operator cannot infer any sensitive information from its customers. The left panel of Figure 1 shows that the income attribute can be easily inferred from the raw demand, as the heights of the peaks are clearly distinguishable. The right panel of Figure 1 shows that the privatized demands are perturbed such that the two labels overlap, making it harder to tell which demand belongs to the high- or low-income group.

Fig. 1: A batch of 24-hour demand with solar generation that is net negative in certain hours allowing storage to minimize the cost through an optimal charge and discharge sequence. The left panel shows the raw demand. The right panel shows the privatized demand.

However, there is a trade-off between privacy and utility when perturbing the data. We use the hyper-parameter to balance the adversarial loss and the utility loss, i.e. a smaller means less weight for privacy and more for utility, as shown in Figure 2. When increases from 8 to 128, the classification accuracy of the income label drops from 89.4% to 73%, as expected. The raw classification accuracy with zero weight is . The loss of performance of the cost minimization by using privatized demand instead of raw demand ranges from at to almost at on average, which shows that high privacy comes with a performance cost for this battery control problem.

Fig. 2: The trade-off between privacy and utility controlled by parameter , which places weight on the private attribute classification loss.

IV-B2 Deployment of storage on residential users

The second experiment considers residential customers adopting batteries to minimize their energy cost without selling excess to the grid. The control of the battery is performed by an outside program, so the owner wishes to privatize their demand before sending it to the controller. The dataset is from the Irish CER Smart Metering Project [ucd, beckel2014revealing]. We select a year of meter data for meters that contain a record indicating if they belong to a large or small home and partition it into daily sequences with 48 entries for each day. We end up with 54,478 records in total. Recall that our goal is to create altered demand that does not degrade the cost savings while removing the correlation between the demand and the attribute indicating a small or large home.
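The partitioning of a meter's readings into daily sequences of 48 entries can be sketched as follows. The function name and the choice to drop a trailing incomplete day are assumptions for this example; the preprocessing actually used for the CER data may differ.

```python
def split_into_days(readings, steps_per_day=48):
    """Split a flat list of meter readings into complete daily sequences.

    Any trailing readings that do not fill a whole day are dropped.
    """
    n_days = len(readings) // steps_per_day
    return [
        readings[i * steps_per_day:(i + 1) * steps_per_day]
        for i in range(n_days)
    ]
```

Each returned sequence then serves as one demand record paired with the meter's large-or-small-home label.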

Figure 3 depicts the trade-off between utility degradation and privacy gain for different weights on privacy loss. The accuracy of classifying large or small homes based on the raw demand is 77.5%. When we have low weight on the privacy loss (e.g. ), the classification accuracy only drops a little to 75%, with a greater sacrifice on cost saving performance (e.g. increased to 8% more cost on average). In the high privacy weight scenario, the classification accuracy drops down to 50% as desired, while the utility performance gap only increases up to 12%.

Fig. 3: The trade-off between the utility and privacy for the CER dataset [ucd]. The privacy label indicates a large or small home. weighs the privacy loss.

IV-C Parallelism

The experiments in this section are run on a six-core Intel Core i7 CPU @2.2GHz. Standard solvers such as Gurobi or Mosek, which do not support in-batch parallelism, can be computationally expensive for solving a quadratic problem. Our filter makes use of automatic differentiation for a cone program (DIFFCP) [agrawal2019differentiating] and leverages multiprocessing to speed up the forward and backward calculations.

Figure 4 displays the mean and standard deviation of the run time over 8 runs of each trial, showing that, for reasonable batch sizes, our batched module outperforms Gurobi and Mosek, which are highly tuned commercial solvers. For a minibatch size of 128, we solve all problems in an average of 1.31 seconds, whereas Gurobi takes an average of 11.7 seconds. This speed improvement for a single minibatch makes the difference between a practical and an unusable solver in the context of training a deep learning architecture.

Fig. 4: CPU run time of a batched optimization using Gurobi v8.1.0, Mosek v8.1.0.60, and our parallel module.

V Conclusion

We have presented a method for the privatization of personal data that maintains its utility in the optimal control of energy resources. Our method comprises a small linear filter that adds random noise to the data conditional on the private attributes we wish to protect. The linear filter is trained using a minimax optimization procedure that balances the trade-off between classification accuracy of the private attributes and the performance of an optimal controller. Additionally, we include a distortion penalty to preserve aspects of the data that are not specified by the utility or privacy functions in order to avoid adding arbitrary noise. We have demonstrated that this method is capable of removing the correlation between the released private data and the sensitive attributes while maintaining limited loss of the utility of the data using two datasets. Limitations of this method include the requirement to solve an optimization in the training loop, which can be computationally intensive for large problems; however, we suspect only a few iterations of the optimization are needed to achieve the desired gradients, which would dramatically reduce the computation required.

VI Appendix

VI-A Battery control details

We present a snapshot of the results for the storage control based on the raw demand and private demand. Figure 5 displays the storage control for the experiment with aggregated homes and solar generation. The upper-left and lower-left panels show the 24-hour charging and discharging decisions, with each color representing one sample in a batch. The control decisions made with raw versus privatized demand are closely aligned in general, but with different charging and discharging amounts due to the perturbation. However, such an altered charging profile does not deteriorate the minimum cost much, as we can see from the upper-right and lower-right panels of Figure 5. The cost increases by a maximum of 22 dollars given that the highest daily cost is around 390 dollars.

Fig. 5: Analysis of storage control for the aggregated homes experiment with . The upper- and lower-left panels show the charging and discharging power in kilowatts (kW). Different colored curves represent different samples in the batch. The upper-right panel shows the daily electricity cost when operating the battery using raw or private demand (x-axis is the sample number, y-axis is in dollars ($)). The lower-right panel shows a histogram of the loss gap (x-axis is the increased cost in $, y-axis is the count).

Fig. 6: Analysis of storage control for the CER data experiment with . Each panel has the same meaning on the x- and y-axes as in Figure 5.

VI-B Quadratic problem

A canonical form of the quadratic constrained minimization problem (QP) is expressed as follows:

s.t (18b)
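For orientation, one widely used way of writing a canonical QP, for example in differentiable QP layers [amos2017optnet], is sketched below. The symbols $z$, $Q$, $q$, $A$, $b$, $G$, and $h$ are assumptions chosen for this illustration and may not match the notation or numbering of problem (18).

```latex
\begin{aligned}
\min_{z} \quad & \tfrac{1}{2}\, z^\top Q z + q^\top z \\
\text{s.t.} \quad & A z = b, \\
& G z \le h,
\end{aligned}
```

Here $Q$ is positive semidefinite so that the problem is convex, the equality constraints would carry the battery dynamics and initial state, and the inequality constraints would carry the power and capacity limits.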

We first show that the basic battery storage problem can be considered as a special case of QP. We start with the 24-hour horizon storage problem in Problem1. We can express the constraints equation (1d) to equation (1f) as


We add a constraint that the net of the demand and storage is greater than or equal to 0, so we can formulate the objective as a QP. This constraint does not modify the original problem as long as it is feasible because the optimal solution will implicitly make the net of demand and storage greater than or equal to 0. The constraints in equation (1b)-equation (1c) are expressed as


with . The objective equation (1a) can be converted to a standard QP by letting


then, it is straightforward to discover that is the new form of the objective.