In 2018, the residential and commercial buildings accounted for about 40% of total U.S. energy consumption EIA2018. Even 1% of energy savings amounts to 400 trillion Btu. This is equivalent to reducing average power generation by 13 GW for the entire year. Assuming we first shut down coal-fired power plants, 100 million tons less CO will be pumped into the atmosphere in a year. With ever growing energy demands, efficient energy systems, in particular with advanced control systems, can potentially make massive positive impact on the environment.
Current challenges. Control systems in buildings are mostly rule-based and thus they are energy and cost inefficient. The use of advanced control systems that replace these rules with model-based predictive control (MPC) can potentially save more than 10% of overall energy usage Sturzenegger2016; Ma2012. MPC optimizes the performance of building energy systems taking into account weather forecast and current operating conditions while maintaining occupant comfort, and meeting required operation and safety constraints. However, MPC requires a reasonably accurate model of the building, and buildings are very complex systems to model. Thus, the traditional physics-based modeling approaches like the white-box and the grey-box techniques, although detailed, are cost and time prohibitive Sturzenegger2016. As the building characteristics change with time, the model identification must also be repeated to update the model. Moreover, such an expensive, time-consuming and complex modeling procedure is unique to every building. For all these reasons, physics-based modeling of large scale buildings suffers from practical challenges.
Combining machine learning with controls.
A promising direction that addresses the aforementioned challenges in modeling of buildings focuses on using black-box models for predictive control. The data required to learn these models are obtained from building automation systems (BAS) that store historical logs from sensors and multimeters. In the past, different machine learning algorithms have been used to learn black-box models.JainCDC2017; JainTCPS2018; SmarraAE2018; SmarraADHS2018
focus on regression tree and random forest based approaches for MPC with experiments in EnergyPlus.Buenning2019 tested this random forest based approach on a real building. JainICCPS2018 focus on the use of Gaussian processes for experiment design, stochastic MPC and online model updates with EnergyPlus as a test-bed. Zhang2019
applied model-based reinforcement learning on a building model in EnergyPlus.Gao2016 used sensitivity analysis (and not MPC) with deep neural networks for data center cooling. While data-driven MPC (MPC based on black-box models in place of physics-based) reduces the engineering effort and time required to build white and grey-box models, it poses several other challenges such as (1) quality of historical data required to learn black-box models, and (2) computational complexity of the optimization problem if the learned models are non-convex or non-differentiable, see JainICCPS2018.
In the literature, neural networks have been used for either modeling or MPC or both in different ways. A small survey that classifies these approaches can be found inAfram2017. Since most of them focus on simulated environments, we discuss here the ones that tested control performance on a real building. In Afram2017, neural networks are used to train temperature models and control effort is minimized (instead of energy consumption) for a residential HVAC system. In Huang2015, physics-based MPC is used to control a building in Terminal 1 of the Adelaide Airport. A ‘neural network feedback linearization method’ is then used to convert decision variables (energy supply) in the MPC problem to actual inputs (cooling valve operation and outdoor air damper) to the building. In the following work Huang2015b, neural networks are used to implement an optimal start-stop control rule to guarantee thermal comfort but energy optimization is not considered. To the best of the authors’ knowledge this paper shows for the first time, on a real building, that neural networks can be used to represent both energy and temperature dynamics in MPC to trade-off energy usage and occupant comfort.
Contributions. This paper makes the following contributions. First, we present an approach for predictive control based on neural networks. Using historical data from the building automation system and the weather station, we learn different neural networks that predict energy usage and zone temperatures, and then set up optimization for energy management that allows us to trade-off between energy usage and temperature setpoint tracking. Second, we demonstrate the efficacy of our approach on a 2-story building in L’Aquila, Italy that is equipped with a heating system from Mitsubishi. We show in our experiments that through supervisory control we can reduce the energy usage while keeping occupants comfortable without any modification to the existing heating system. Third, we introduce the underlying tool
that enables constrained nonlinear optimization in TensorFlow to solve the above MPC problem.
2 From predictive modeling to predictive control
Our goal is to learn neural networks of the form
where output is either energy or temperature of one of the zones, represents the disturbances, the control variables. The dynamic behavior of the output variables is captured by the lagged terms. The order of auto-regression denoted by
are hyperparameters chosen during cross validation. In the learning step, we restructure the time series of, and obtained from raw sensor data to create data samples at each time instance in the above format. For optimal decision making, we use model (1) in the MPC problem as follows
where is the control horizon, is the cost matrix, is the reference to be tracked. While (2) is an example of a tracking controller, the cost function can be changed depending upon the application.
Using this approach, for a two-story building described in Section 3, we formulate the MPC problem in Section 4, and present experimental results in Section 5. In our case, neural networks are trained in TensorFlow Tensorflow2015 with a stochastic optimization solver – Adam Kingma2014, and the optimization (2) is solved with a non-linear interior point optimization solver – IPOPT Waechter2009.
We consider a two-story building with a rooftop unit as shown on the left in Figure 1. The building is located inside the Coppito campus of the University of L’Aquila, Italy. Each floor is composed of 5 zones – 4 rooms and a small lobby, and can be independently controlled. The layout of the first floor is shown on the right in Figure 1. Enlarged layouts of both floors are attached in Appendix A. The gross area of the ground and first floor is 72 and 77, respectively.
3.1 Heating system
The building is equipped with a variable refrigerant flow heat pump (a type of rooftop unit) from Mitsubishi. The heating system comprises of (1) an outdoor unit on the roof that includes a compressor and an evaporator, and (2) an indoor unit (also called split) in each zone that includes a fan and a condenser. Heating is provided through refrigerant conduits connecting indoor and outdoor equipment. The thermal energy from the evaporation and compression phases is carried by the refrigerant. This energy is transferred by the condenser into the zones where warm air is then distributed by the fan. Additionally, each room and lobby is equipped with a temperature sensor. Power consumption of the building is measured using a multimeter. The system is configured to be programmatically controlled via M-NET (Mitsubishi network) protocol. We discuss how we extract the data from the BAS and control this building remotely in Appendix B.
3.2 Role of supervisory control
Traditional control systems in buildings rely on fixed rules for manipulation of temperature setpoints. For example, during Winter, the setpoints may be kept constant at 25C during working hours. For any chosen setpoint, the low-level controllers try to keep the zone temperatures close to the chosen setpoint. In our case, this controller is a relay controller from Mitsubishi. During Winter, when the setpoint in a zone is kept constant, the corresponding indoor unit is switched ON when the measured temperature is 1.5C below the setpoint. As the zone starts to heat up, the indoor unit is switched OFF when the measured temperature exceeds the setpoint by 0.5C. Since different rooms have different temperatures, the external unit may be kept ON for usually longer periods of time. Now, the goal of a neural network based predictive controller is to dynamically change the setpoints based on measured temperatures and external weather conditions, in order to (1) reduce the amount of time the external unit is ON, hence reduce energy consumption, or (2) provide better thermal comfort by tracking a given setpoint and reducing the variation in the measured temperature due to bang-bang behavior of the relay controller. In Section 5, we show how the dynamic changes to setpoints help achieve the aforementioned objectives.
4 Model predictive control with neural networks
This section is divided into two parts. In Section 4.1, we discuss the model training process – features (regressors) for energy and temperature models, architecture of neural networks, and performance validation of the trained models on unseen data sets. We formulate the nonlinear MPC problem and describe how to solve it in Section 4.2.
4.1 Predictive modeling
We learn different models for energy and temperature predictions. The sampling time is chosen to be min since the compressor, and hence the energy consumption, is very responsive to changes in temperature setpoints. The same dynamical models are used for the length of the horizon to derive energy and temperature states in the future. The models were trained using data from the months of October 2018 – February 2019, and October – November 2019. In total, 18 weeks of data were used for training (the building was non-operational in the remaining weeks), of which 12 weeks of data were obtained with random excitation (kept constant for 1-2 hours) in temperature setpoints between 22-28C and the remaining data were obtained with constant setpoints. The building was unoccupied during the entire duration of the experimentation.
1. Energy prediction. This model predicts the energy consumption over the next sampling time, i.e. between and is given by the expression
We modelinclude outside temperature, humidity and solar radiation, temperature measurements from all 10 zones. The weather data are obtained from the weather station provided by the Cetemps. Note that the temperature predictions for the future time steps are obtained using models in (4). The control inputs include temperature setpoints for all 10 zones and the compressor mode (boolean), in that order. We use , , and . Although the compressor mode is not a free control variable since the compressor state ON/OFF is decided based on an embedded control law in the BAS microcontroller, adding it as a feature in the model drastically improves the modeling accuracy.
2. Temperature prediction. A separate model is learned for each room to predict the temperature in that room at time , given the temperature measurements from the sensor until time :
Here, each is a neural network with only 1 hidden layer (50 neurons) and ReLU activation function . The temperature dynamics is essentially given by a piecewise affine function of all the features, and is sufficient to predict room temperatures with high accuracy. The disturbances include outside temperature, humidity and solar radiation, and control input includes the temperature setpoint for that room. We use , and .
Model validation. The statistics for absolute predictions errors with energy and temperature models are shown in Figure 2. We compare 1-step predictions from the neural networks against a naive baseline model that assumes the predictions at the next time step are same as the measurements at the current time step, i.e. in (3) and in (4
). We observe that the baseline prediction errors show heavy tails even with a small sampling time of 2 min while the probability densities for the neural networks are concentrated in the region with small errors. Thus the predictions from the neural network are more robust. This is expected, especially for energy consumption due to fast dynamics of the compressor. Note that we observe spikes in the temperature plot for the baseline case because the measurements are available with only 1 decimal precision.
4.2 Receding horizon control
The control problem is setup as a finite receding horizon optimization. The dynamical models derived in Section 4.1
serve as equality constraints over the horizon. The BAS accepts only integral values for temperature setpoints. Since solving a mixed integer non-linear program is much harder and computationally challenging to solve, we solve an approximate problem with continuous input space and then round off the solution of the optimization to the nearest integers. More precisely, at each time step, we solve an optimization problem that allows us to trade-off energy usage and setpoint tracking
Note that is the element of . The sampling time for the models is 2 min but the MPC problem is solved every 5 min, using the same inputs for the next 5 min to avoid changing temperature setpoints too frequently. The control horizon is chosen to be 10 steps (20 min). In Section 5, we show results for 2 scenarios: (1) energy minimization only by setting , and (2) temperature tracking for better occupant comfort by choosing , , and 25C.
The optimization is solved using
, a custom tool we built for constrained optimization in TensorFlow using IPOPT. IPOPT is an open source software package for large-scale nonlinear optimizationWaechter2009. To solve general non-linear (possibly non-convex) optimization problems like (5) which have inequality constraints, we needed an interface that allows us to call IPOPT from TensorFlow since the energy and temperature models were trained in TensorFlow. The tool is available at https://github.com/jainachin/tf-ipopt
. Note that training of machine learning models involving minimization of a loss function are unconstrained optimization problems.
We evaluate MPC problem (5) under two different scenarios and compare the results against a baseline controller that chooses fixed setpoints. The following three controllers are compared.
1. Baseline: relay controller with fixed setpoints. This is a relay controller that comes with the heating unit. Constant setpoint of 25C is chosen for each zone.
2. MPC-min: energy minimization only. Set and in (5). The goal is to minimize energy consumption while keeping all zone temperatures between 23-27C.
3. MPC-tracking: setpoint tracking for better occupant comfort. Set , , and C in (5). This MPC controller adjusts the setpoints to keep the measured temperature closer to the chosen reference.
5.1 MPC-tracking versus Baseline
Baseline controller was tested on December 13-15, 2019 (blue) and MPC-tracking on December 16-18, 2019 (green). Recall that Baseline controller has a maximum threshold cutoff 0.5C above and minimum threshold 1.5C below the setpoint, i.e. the relay width is 2C. This asymmetry by design is aimed to save energy and prevent excess heating. The comparison of occupant comfort (quality of temperature tracking) is shown in Figure 4
. The mean and standard deviation of measured zone temperatures are, for MPC-tracking and , for Baseline. In the case of Baseline controller, the mean zone temperature is far from 25
C (high bias) and also shows large fluctuation (high variance). By dynamically changing the setpoints, MPC-tracking keeps the zone temperature closer to 25C (low bias) and also reduces the fluctuation significantly (low variance). High bias with Baseline is attributed to the asymmetry in the design of the relay controller (see above) which can be reduced by tracking 25.5C instead. However, to reduce the variance, the relay width must be modified. This would require installation of additional hardware in each indoor unit that would decrease the relay width from 2C to 1C. MPC-tracking provides the same benefit without any changes to the existing heating system.
5.2 MPC-min versus Baseline
To evaluate the benefits of MPC-min against Baseline from Section 5.1, it is important to consider similar weather conditions, especially outside temperature. However, since only one controller can be tested at a time, a perfect comparison with the exact same weather conditions is impossible. We identified two periods of three consecutive days with similar weather conditions to compare MPC-min with Baseline. The distributions of weather disturbances on these days are shown in Figure 3. MPC-min was run on December 5-7, 2019 (orange). On average, it is warmer and less humid on the days Baseline controller was run. The energy consumption and measured temperature in one of the zones are shown in Figure 4. As expected, MPC-min pushes the zone temperature closer to the lower bound at 23C to realize energy savings, see the bottom plot in Figure 4. Due to the short horizon and internal system dynamics, sometimes the zone temperature goes below the lower bound. For clarity, we show results for only the initial 12 hour period from the days of experiments. We observe that the difference between the cumulative energy between Baseline and MPC-min is continuously increasing with time. At the end of 3 days, MPC-min consumes 18.7 kWh less energy, a 5.7% decrease over Baseline. We also note that the days when Baseline was tested were warmer so less heating will be required to achieve the same setpoint. Further, there is a potential for Baseline controller to reduce energy consumption if we change the reference temperature below 25C. However, this will come at the expense of much higher variance in the zone temperature as we discussed in Section 5.1 and thus more constraint violations. On the other hand, MPC-min serves as a supervisory controller reducing energy usage and fluctuation in zone temperature without any modifications to the heating system.
In applications like demand response and peak demand reduction where optimal decisions are sensitive to varying electricity pricing, model-based predictive control will outperform a ruled-based strategy (like fixed setpoints) by a big margin since the setpoints will need to be dynamically changed to reduce energy costs. Thus, our approach for neural networks based MPC would stand out against Baseline. Two such examples problems are designing (1) MPC-min for minimizing energy costs, and (2) MPC-tracking for following a reference demand response signal. These experiments are a part of our future work.
6 Conclusion and future work
We present an approach to learn neural networks that predict energy consumption and zone temperature dynamics in a two-story building in L’Aquila, Italy equipped with a heating system from Mitsubishi. We set up nonlinear MPC problem using these neural networks that allows us to trade-off energy savings and better occupant comfort. By dynamically changing the temperature setpoints (supervisory control), the controller reduces energy consumption while respecting comfort bounds. In a separate experiment, we also show that we can achieve better occupant comfort by reducing the variance in temperature tracking without any modifications to the existing heating system in the building. Our results make a strong case for the application of MPC with black-box models, in particular, neural network-based models, in building energy optimization.
The following directions are part of our future plan. First, we will conduct more experiments to study the trade-off between energy savings and occupant comfort. In this paper, we present results for (1) only energy optimization and (2) only temperature tracking. It would be interesting to have non-zero weights to both objectives in the same MPC problem. Second, we will study the effect of dynamic electricity pricing and how we can exploit building’s thermal inertia to save energy costs. Third, we will evaluate the performance of our controller in providing custom occupant comfort, i.e. tracking different temperatures in different rooms. Fourth, we will investigate continual learning of neural networks using model-based reinforcement learning (RL). As building properties and weather conditions change with time, the goal is to minimize the maintenance of neural networks required in manual functional tests by leveraging the exploration capabilities in RL.
Appendix A Floor level layouts of the building
Appendix B Data acquisition
The building data acquisition system consists of 3 major components:
1. Local server – This hosts a LabVIEW application that reads and writes data to Mitsubishi’s building automation system via Modbus TCP/IP protocol. The building’s HVAC system is based on Mitsubishi proprietary serial network called M-NET. In order to communicate with the sensors via LabVIEW, it was necessary to add a translator from M-NET to Modbus TCP/IP to allow OPC to act as a link between LabView and the system tags. The complete scheme is shown in Figure 7.
2. Remote Elasticsearch database in AWS cloud – This stores real-time logs for remote monitoring and visualization using Grafana. We use Elasticsearch2019, a distributed, RESTful search and analytics engine capable of handling terabytes of data together with Grafana2018, an open sourced tool for analytics and real-time monitoring. This particularly helps in running control experiments and visualizing data in real-time without being physically present in the building. The experiments for functional testing and MPC send setpoints for each room to “controller” index in Elasticsearch. Once the database is updated, the most recent setpoints are sent to the local server by the communication link below.
3. Link between local server and remote database – The purpose of this link is to sync data between the LabVIEW application and the Elastisearch database every 15 seconds. When the setpoints are updated in the “controller” index in Elasticsearch, it fetches the data and sends them to the LabVIEW application that further relays them to the BAS via Modbus TCP/IP protocol. Simultaneously, the measurements from the sensors and multimeters are read from the BAS and sent to “measurements” index in Elastisearch database.