The recent technological advances in wireless communications and computation, and their integration into networked control and cyber-physical systems (CPS) [1], open the door to a myriad of new and exciting opportunities in transportation, health care, agriculture, energy, and many others.
However, the distributed nature of CPS is often a source of vulnerability [2, 3, 4]. Security breaches in CPS can have catastrophic consequences ranging from hampering the economy by obtaining financial gain, through hijacking autonomous vehicles and drones, and all the way to terrorism by manipulating life-critical infrastructures [5, 6, 7, 8]. Real-world instances of security breaches in CPS that were discovered and made available to the public include the revenge sewage attack in Maroochy Shire, Australia [9], the Ukraine power attack [10], the German steel mill cyber-attack [11], and the Iranian uranium enrichment facility attack via the Stuxnet malware [12, 13, 14, 15, 16]. Consequently, studying and preventing such security breaches via control-theoretic methods has received a great deal of attention in recent years [17, 18, 19, 20, 21, 22, 23, 24, 25, 26].
An important and widely used class of attacks on CPS is based on the “man-in-the-middle” (MITM) attack technique (cf. [27]): an attacker takes over the physical plant’s control and sensor signals. The attacker overrides the control signals with malicious inputs in order to push the plant toward an alternative trajectory, often unstable and catastrophic. Consequently, the vast majority of CPS constantly monitor/sense the plant outputs with the objective of detecting a possible attack. The attacker, on the other hand, aims to overwrite the sensor readings in a manner that would be indistinguishable from the legitimate ones.
The simplest instance of a MITM attack is the replay attack [28, 29, 30], in which the attacker observes and records the legitimate system behavior across a long period of time and then replays it at the controller’s input; this attack is reminiscent of the notorious attack on video surveillance systems, in which previously recorded surveillance footage is replayed during a heist. A well-known example of this attack is that of the Stuxnet malware, which used an operating system vulnerability to enable a twenty-one-second-long replay attack during which the attacker is believed to have driven the speed of the centrifuges at a uranium enrichment facility toward excessively high and destructive speed levels [31]. The extreme simplicity of the replay attack, which can be implemented with zero knowledge of the system dynamics and sensor specifications, has made it a popular and well-studied topic of research [28, 29, 30, 32, 33, 34].
In contrast to the replay attack, a paradigm that follows Shannon’s maxim of Kerckhoffs’s principle, “the enemy knows the system”, was considered by Satchidanandan and Kumar [35] and Ko et al. [36]. This paradigm assumes that the attacker has complete knowledge of the dynamics and parameters of the system, which allows the attacker to construct arbitrarily long fictitious sensor readings that are statistically identical to the actual signals, without being detected.
To counter both replay and “statistical-duplicate” attacks, Mo and Sinopoli [28], and Satchidanandan and Kumar [35], respectively, proposed to superimpose a random watermark on top of the (optimal) control signal that is not known to the attacker. In this way, by testing the correlation of the subsequent measurements with the watermark signal, the controller is able to detect the attack. Thus, by superimposing watermarking at different power levels, improved detection probability of the attack can be traded for an increase in the control cost.
The two interesting models described above suffer from some shortcomings. First, in the case of a replay attack the usage of the watermarking signal is unnecessary: by taking a long enough detection window, the controller is always able to detect such an attack even in the absence of watermarks by simply testing for repetitions. A watermark is only necessary when the detection window of the controller is small compared to the recording (and replay) window of the attacker. Second, in the case of a statistical-duplicate attack, we must assume that the attacker has no access to the signal generated and applied by the controller. Since this type of attack assumes the attacker has full system knowledge, if it also has access to the control signal then it can construct fictitious sensor readings containing any watermark signal inscribed by the controller. Assuming there is no access to the control signal seems questionable for an attacker who is capable of hijacking the whole system.
The two models constitute two extremes: the replay attack assumes no knowledge of the system parameters, and as a consequence it is relatively easy to detect. The statistical-duplicate attack assumes full knowledge of the system dynamics (2), and as a consequence it requires a more sophisticated detection procedure, as well as additional assumptions to ensure it can be detected.
In the current work, we explore a model that is in between these two extremes. We assume that the controller has perfect knowledge of the system dynamics (a reasonable assumption, as the controller is tuned in for much longer and can therefore learn the system dynamics to far greater precision), while the attacker knows that the system is linear and time-invariant, but does not know the actual open-loop gain. It follows that the attacker needs to “learn” the plant first, before being capable of generating a fictitious control input. In this setting, we also consider the case when the attacker has full access to the control signals, and we investigate the robustness of different attacks to system parametric uncertainty. To determine whether an attack can be successful or not, we rely on physical limitations of the system’s learning process, similar to an adaptive control setting [37], rather than on cryptographic/watermarking techniques.
Our approach is reminiscent of parametric linear system identification (SysID), but in contrast to classical SysID our attacker is constrained to passive identification. Specifically, we consider two-phase attacks akin to the exploration and exploitation phases in reinforcement learning/multi-armed bandit problems [38, 39]: in the exploration phase the attacker passively listens and learns the system parameter(s); in the exploitation phase the attacker uses the parameter(s) learned in the first phase to try to mimic the statistical behavior of the real plant, in a similar fashion to the statistical-duplicate attack. For the case of two-phase linear attacks, we analyze the achievable performance of a least-squares (LS) estimation-based scheme and a variance detection test, along with a lower bound on the attack-detection probability under the variance detection test and any learning algorithm. We provide explicit results for the case where the duration of the exploitation phase tends to infinity. To enhance the security of the system, we also extend the results to the case of a superimposed watermark (or authentication) signal.
An outline of the rest of the paper is as follows. We set up the problem in Sec. II, and state the main results in Sec. III, with their proofs relegated to the appendix of the paper. Simulations are provided in Sec. IV. We conclude the paper and discuss the future research directions in Sec. V.
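We denote by $\mathbb{N}$ the set of natural numbers. All logarithms, denoted by $\log$, are base 2. For two real-valued functions $g$ and $h$, $g(x) = O(h(x))$ as $x \to x_0$ means $\limsup_{x \to x_0} |g(x)/h(x)| < \infty$, and $g(x) = o(h(x))$ as $x \to x_0$ means $\lim_{x \to x_0} |g(x)/h(x)| = 0$. We denote by $x_i^j = (x_i, \ldots, x_j)$ the realization of the random vector $X_i^j = (X_i, \ldots, X_j)$ for $i, j \in \mathbb{N}$, $i \le j$. $\|\cdot\|$ denotes the Euclidean norm. $\mathbb{P}_X$ denotes the distribution of the random variable $X$ with respect to (w.r.t.) probability measure $\mathbb{P}$, whereas $f_X$ denotes its probability density function (PDF) w.r.t. the Lebesgue measure, if it has one. An event happens almost surely (a.s.) if it occurs with probability one. For real numbers $a$ and $b$, $a \ll b$ means $a$ is much less than $b$, while for probability distributions $\mathbb{P}$ and $\mathbb{Q}$, $\mathbb{P} \ll \mathbb{Q}$ means $\mathbb{P}$ is absolutely continuous w.r.t. $\mathbb{Q}$. $d\mathbb{P}/d\mathbb{Q}$ denotes the Radon–Nikodym derivative of $\mathbb{P}$ w.r.t. $\mathbb{Q}$. The Kullback–Leibler (KL) divergence between probability distributions $\mathbb{P}_X$ and $\mathbb{Q}_X$ is defined as

$$D(\mathbb{P}_X \| \mathbb{Q}_X) = \begin{cases} \mathbb{E}_{\mathbb{P}}\left[ \log \dfrac{d\mathbb{P}_X}{d\mathbb{Q}_X} \right], & \mathbb{P}_X \ll \mathbb{Q}_X; \\ \infty, & \text{otherwise}, \end{cases}$$

where $\mathbb{E}_{\mathbb{P}}$ denotes the expectation w.r.t. probability measure $\mathbb{P}$. The conditional KL divergence between probability distributions $\mathbb{P}_{Y|X}$ and $\mathbb{Q}_{Y|X}$ averaged over $\mathbb{P}_X$ is defined as $D(\mathbb{P}_{Y|X} \| \mathbb{Q}_{Y|X} \,|\, \mathbb{P}_X) = \mathbb{E}_{\mathbb{P}_{\tilde{X}}}\left[ D(\mathbb{P}_{Y|X=\tilde{X}} \| \mathbb{Q}_{Y|X=\tilde{X}}) \right]$, where $(X, \tilde{X})$ are independent and identically distributed (i.i.d.). The mutual information between random variables $X$ and $Y$ is defined as $I(X; Y) = D(\mathbb{P}_{XY} \| \mathbb{P}_X \mathbb{P}_Y)$. The conditional mutual information between random variables $X$ and $Y$ given random variable $Z$ is defined as $I(X; Y | Z) = \mathbb{E}_{\mathbb{P}_{\tilde{Z}}}\left[ I(X; Y | Z = \tilde{Z}) \right]$, where $(Z, \tilde{Z})$ are i.i.d.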
II Problem Setup
We consider the networked control system depicted in Fig. 1, where the plant dynamics are described by a scalar, discrete-time, linear time-invariant (LTI) system
where , , , are real numbers representing the plant state, open-loop gain of the plant, control input, and plant disturbance, respectively, at time . The controller, at time , observes and generates a control signal as a function of . We assume that the initial condition has a known (to all parties) distribution and is independent of the disturbance sequence , which is an i.i.d. process with PDF a known to all parties. We assume that . With a slight loss of generality and for analytical purposes, we assume
Moreover, to simplify the notations, let denote the state-and-control input at time and its trajectory up to time —by
The controller is equipped with a detector that tests for anomalies in the observed history . When the controller detects an attack, it shuts the system down and prevents the attacker from causing further “damage” to the plant. The controller/detector is aware of the plant dynamics (2) and knows exactly the open-loop gain of the plant. On the other hand, the attacker knows the plant dynamics (2) as well as the plant state , and control input (or equivalently, ) at time (see Fig. 1). However, it does not know the open-loop gain of the plant.
In what follows, it will be convenient to treat the open-loop gain of the plant as a random variable (i.e., it is fixed in time), whose PDF is known to the attacker, and whose realization is known to the controller. We assume all random variables to exist on a common probability space with probability measure , and denote the probability measure conditioned on by . Namely, for any measurable event , we define
is further assumed to be independent of and .
We consider the (time-averaged) linear quadratic (LQ) control cost :
where the weights and are non-negative known (to the controller) real numbers that penalize the cost for state deviations and control actuations, respectively.
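As a rough illustration of the plant model and the cost, the following Python sketch simulates one realization of a scalar plant and evaluates a sample version of the time-averaged quadratic cost. It assumes the dynamics take the form `x[k+1] = a*x[k] + u[k] + w[k]` with a standard Gaussian disturbance, and it replaces the expectation in the LQ cost with a single-trajectory average. The names `simulate_plant`, `empirical_lq_cost`, `alpha`, and `beta`, as well as the deadbeat policy `u = -a*x`, are hypothetical choices made only for this example.

```python
import numpy as np


def simulate_plant(a, policy, horizon, rng, x0=0.0):
    """Simulate x[k+1] = a*x[k] + u[k] + w[k] (assumed form), w[k] ~ N(0, 1)."""
    x = np.empty(horizon + 1)
    u = np.empty(horizon)
    x[0] = x0
    for k in range(horizon):
        u[k] = policy(x[k])  # control computed from the current state
        x[k + 1] = a * x[k] + u[k] + rng.standard_normal()
    return x, u


def empirical_lq_cost(x, u, alpha, beta):
    """Single-trajectory average of alpha*x[k]^2 + beta*u[k]^2."""
    horizon = len(u)
    return float(np.sum(alpha * x[:horizon] ** 2 + beta * u ** 2) / horizon)


rng = np.random.default_rng(0)
a = 1.1  # hypothetical open-loop gain
x, u = simulate_plant(a, lambda xk: -a * xk, horizon=5000, rng=rng)
cost = empirical_lq_cost(x, u, alpha=1.0, beta=0.0)
```

With the deadbeat policy the state reduces to the previous disturbance sample, so with `alpha=1` and `beta=0` the empirical cost should be close to the disturbance variance.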
II-A Adaptive Integrity Attack
We define Adaptive Integrity Attacks (AIA) that consist of a passive and an active phase, referred to as exploration and exploitation, respectively.
During the exploration phase, depicted in Fig. 1(a), the attacker eavesdrops and learns the system, without altering the input signal to the controller, i.e., during this phase.
On the other hand, during the exploitation phase, depicted in Fig. 1(b), the attacker intervenes as a MITM in two different parts of the control loop with the aim of pushing the plant toward an alternative trajectory (usually unstable) without being detected by the controller: it hijacks the true measurements and feeds the controller with a fictitious input instead. Furthermore, it issues a malicious control signal to the actuator, overriding the signal that is generated by the controller, as depicted in Fig. 1(b).
Attacks that manipulate the control signal by tampering with the integrity of the sensor readings, while trying to remain undetected, are usually referred to as integrity attacks, e.g., [41]. Since in the class of attacks described above, the attacker learns the open-loop gain of the plant in a fashion reminiscent of adaptive control techniques, we refer to attacks in this class as AIA.
II-B Two-Phase AIA
While in a general AIA the attacker can switch between the exploration and exploitation phases back and forth or try to combine them together in an online fashion, in this work, we concentrate on a special class of AIA comprising only two disjoint consecutive phases as follows.
Phase 1: Exploration. As is illustrated in Fig. 1(a) for , the attacker observes the plant state and control input, and tries to learn the open-loop gain . We denote by the attacker’s estimate of the open-loop gain .
Phase 2: Exploitation. As is illustrated in Fig. 1(b) from time and onwards, the attacker hijacks the system and feeds a malicious control signal to the plant, and a fictitious sensor reading to the controller.
II-C Linear Two-Phase AIA
A linear two-phase attack is a special case of the two-phase AIA of Sec. II-B, in which the exploitation phase of the attacker takes the following linear form.
where for are i.i.d. with ; is the control signal generated by the controller, which is fed with the fictitious virtual signal by the attacker; and is the estimate of the open-loop gain of the plant at the conclusion of Phase 1.
The controller/detector, being aware of the dynamics (2) and the open-loop gain , attempts to detect possible attacks by testing for statistical deviations from the typical behavior of the system (2). More precisely, under legitimate system operation (the null hypothesis [42, Ch. 14]), the controller observation behaves according to
Note that in case of an attack, during Phase 2 (), (8) can be rewritten as
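To make the exploitation phase concrete, the following sketch generates fictitious readings with a linear recursion and forms the residuals that a controller who knows the true gain would compute. It assumes the fictitious signal evolves as `v[k+1] = a_hat*v[k] + u[k] + w_tilde[k]` with standard Gaussian `w_tilde`, and that the controller's residual is `y[k+1] - a*y[k] - u[k]`; these forms, the linear policy `u = -0.9*y`, and all function names are assumptions made for illustration only.

```python
import numpy as np


def fictitious_readings(a_hat, policy, v0, steps, rng):
    """Attacker side: v[k+1] = a_hat*v[k] + u[k] + w_tilde[k] (assumed form)."""
    v = np.empty(steps + 1)
    u = np.empty(steps)
    v[0] = v0
    for k in range(steps):
        u[k] = policy(v[k])  # the controller reacts to the fictitious reading
        v[k + 1] = a_hat * v[k] + u[k] + rng.standard_normal()
    return v, u


def controller_residuals(y, u, a):
    """Controller side: residuals y[k+1] - a*y[k] - u[k] using the true gain a."""
    return y[1:] - a * y[:-1] - u


rng = np.random.default_rng(1)
a, a_hat = 1.1, 1.0  # hypothetical true gain and attacker estimate
v, u = fictitious_readings(a_hat, lambda y: -0.9 * y, v0=1.0, steps=2000, rng=rng)
res = controller_residuals(v, u, a)

# Under these assumptions each residual is the attacker's noise sample plus
# the estimation error (a_hat - a) scaled by the fictitious reading.
attacker_noise = res - (a_hat - a) * v[:-1]
```

The decomposition in the last line shows why an imperfect estimate leaves a statistical footprint: the extra term `(a_hat - a) * v[k]` inflates the residual power seen by the controller.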
II-D Deception and Detection Probabilities
Define the hijack indicator at time as the first time index at which hijacking occurs:
At every time , the controller uses to construct an estimate of . We denote the following events.
, : There was no attack, and no attack was declared by the detector.
, : An attacker hijacked the controller observation before time but was caught by the controller/detector. In this case we say that the controller detected the attack. The detection probability at time is defined as
, : An attacker hijacked the observed signal by the controller before time , and the controller/detector failed to detect the attack. In this case, we say that the attacker deceived the controller or, equivalently, that the controller misdetected the attack [42, Ch. 3]. The deception probability at time is defined as
, : The controller falsely declared an attack. We refer to this event as false alarm. The false alarm probability at time is defined as
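The four events above can be tallied empirically from repeated experiments. The sketch below is a minimal illustration, assuming that the detection and deception probabilities are conditioned on an attack having occurred and the false alarm probability on no attack having occurred; the function names and the toy list of outcomes are hypothetical.

```python
def outcome(hijacked, alarm):
    """Label one trial given the true hijack status and the detector decision."""
    if hijacked:
        return "detection" if alarm else "deception"
    return "false_alarm" if alarm else "correct_silence"


def empirical_rates(trials):
    """Estimate conditional rates from (hijacked, alarm) pairs."""
    trials = list(trials)
    attacked = [alarm for hijacked, alarm in trials if hijacked]
    clean = [alarm for hijacked, alarm in trials if not hijacked]
    rates = {}
    if attacked:
        rates["detection"] = sum(attacked) / len(attacked)
        rates["deception"] = 1.0 - rates["detection"]
    if clean:
        rates["false_alarm"] = sum(clean) / len(clean)
    return rates


# Toy data: four attacked trials and four attack-free trials.
trials = [(True, True), (True, False), (True, True), (True, True),
          (False, False), (False, True), (False, False), (False, False)]
rates = empirical_rates(trials)
```

In this toy data set three of the four attacks raise an alarm and one of the four attack-free trials does, which gives the rates checked by the accompanying assertions.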
The controller wishes to achieve a low false alarm probability, while guaranteeing a low deception probability [42, Ch. 3] and a low control cost (6). In addition, in case of an attacker that knows (or has perfectly learned) the system gain , and replaces of (2) with a virtual signal that is statistically identical and independent of it, the controller has no hope of correctly detecting the attack.
We further define the deception, detection, and false alarm probabilities w.r.t. the probability measure , without conditioning on , and denote them by , , and , respectively. For instance, is defined as
w.r.t. a PDF of .
III Statement of the Results
We now describe the main results of this work. We start by describing a variance-based attack-detection test in Sec. III-A. We derive upper and lower bounds on the deception probability in Sec. III-B. The proofs of the results in this section are relegated to the appendix of the paper.
III-A Attack-Detection Variance Test
A simple and widely used test is the one that seeks anomalies in the variance, i.e., a test that examines whether the empirical variance of (8) is equal to . In this way, only second-order statistics of need to be known at the controller. The price, of course, is its inability to detect higher-order anomalies.
Specifically, this test sets a confidence interval of length around the expected variance, i.e., it checks whether
where is called the test time. That is, as is implied by (9), the attacker manages to deceive the controller () if
As a result, as , the probability of false alarm goes to zero. Hence, in this limit, we are left with the task of determining the behavior of the deception probability (12). We note that the asymptotic assumption simplifies the presentation of the results. Nonetheless, similar treatment can be done in the non-asymptotic case.
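The variance test can be sketched in a few lines. The code below assumes a zero-mean disturbance with known variance `var_w`, the residual form `y[k+1] - a*y[k] - u[k]`, and a tolerance `delta` interpreted as the half-width of the confidence interval; the helper `closed_loop` and all parameter values are hypothetical and serve only to produce one legitimate and one fictitious trajectory.

```python
import numpy as np


def passes_variance_test(y, u, a, var_w, delta):
    """True if the empirical residual power is within delta of var_w."""
    y = np.asarray(y, dtype=float)
    u = np.asarray(u, dtype=float)
    residuals = y[1:] - a * y[:-1] - u
    return bool(abs(np.mean(residuals ** 2) - var_w) < delta)


def closed_loop(gain, a_ctrl, steps, rng):
    """Trajectory of y[k+1] = gain*y[k] + u[k] + noise with u[k] = -a_ctrl*y[k]."""
    y = np.zeros(steps + 1)
    u = np.zeros(steps)
    for k in range(steps):
        u[k] = -a_ctrl * y[k]
        y[k + 1] = gain * y[k] + u[k] + rng.standard_normal()
    return y, u


rng = np.random.default_rng(2)
a = 1.1  # hypothetical true gain, known to the controller
y_legit, u_legit = closed_loop(a, a, 5000, rng)   # legitimate plant output
y_fake, u_fake = closed_loop(0.5, a, 5000, rng)   # fictitious output, poor estimate
legit_ok = passes_variance_test(y_legit, u_legit, a, var_w=1.0, delta=0.1)
fake_ok = passes_variance_test(y_fake, u_fake, a, var_w=1.0, delta=0.1)
```

In this example the legitimate trajectory passes the test, while the fictitious trajectory built from a badly mismatched gain inflates the residual power well beyond the tolerance and is flagged.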
III-B Bounds on the Deception Probability Under the Variance Test
In what follows, we assume that the power of the fictitious sensor reading signal ,
converges a.s. to a deterministic value as tends to infinity for some positive real number , namely,
III-B1 Lower Bound
To provide a lower bound on the deception probability , we consider a specific estimate of at the conclusion of the first phase by the attacker, assuming a controller that uses the variance test (16). To that end, we use least-squares (LS) estimation due to its efficiency and amenability to recursive update over observed incremental data, which makes it the method of choice for many applications of real-time parametric identification of dynamical systems [44, 45, 46, 37, 47, 48]. The LS algorithm approximates the overdetermined system of equations
by minimizing the Euclidean distance
to estimate (or “identify”) the plant, the solution to which is
Since we assumed for all time has a PDF, the probability that is zero. Consequently, (26) is well-defined.
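A minimal sketch of the LS estimate follows, assuming the overdetermined system is `x[k+1] - u[k] = a*x[k]` over the exploration window, so that the minimizer is the ratio of the sum of `(x[k+1] - u[k]) * x[k]` to the sum of `x[k]**2`. The function name, the simulated policy `u = -0.9*x`, and the numerical values are illustrative assumptions.

```python
import numpy as np


def ls_gain_estimate(x, u):
    """Least-squares fit of a in x[k+1] - u[k] = a*x[k] (assumed regression form)."""
    x = np.asarray(x, dtype=float)
    u = np.asarray(u, dtype=float)
    numerator = np.sum((x[1:] - u) * x[:-1])
    denominator = np.sum(x[:-1] ** 2)  # nonzero a.s. when the state has a PDF
    return float(numerator / denominator)


# Exploration phase on a simulated plant (hypothetical parameters).
rng = np.random.default_rng(3)
a_true = 1.1
steps = 2000
x = np.zeros(steps + 1)
u = np.zeros(steps)
x[0] = 1.0
for k in range(steps):
    u[k] = -0.9 * x[k]
    x[k + 1] = a_true * x[k] + u[k] + rng.standard_normal()
a_hat = ls_gain_estimate(x, u)
```

On noise-free data the estimator recovers the gain exactly; for instance, `ls_gain_estimate([1.0, 2.0, 4.0], [0.0, 0.0])` evaluates to `2.0`.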
Using LS estimation (26) achieves the following asymptotic deception probability.
Thm. 1 guarantees for the choice . An important consequence of this is that, for this choice, even without having any prior knowledge of the open-loop gain of the plant, the attacker can still carry out a successful attack.
III-B2 Upper Bound
We derive an upper bound on the deception probability for the case of a uniformly distributed over a symmetric interval . We assume the attacker knows the distribution of (including the value of ), whereas the controller knows the true value of (as before). Similar results can be obtained for other interval choices. We further note that this bound remains true for the scenario in which guarantees for the worst-case distribution need to be derived.
Let be distributed uniformly over for some , and consider any control policy and any linear two-phase AIA (7) with fictitious-sensor reading power (22) that satisfies . Then, the asymptotic deception probability when using the variance test (16) is bounded from above as
In addition, if is a Markov chain for all , then
for any sequence of probability measures , provided for all .
The bound in (29c) implies that the deception probability decreases with . This is consistent with the observation of Zames [49] (see also ) that SysID becomes harder as uncertainty about the open-loop gain of the plant increases; in our case, a larger uncertainty interval corresponds to worse estimation of the open-loop gain by the attacker, which leads, in turn, to a decrease in the deception probability achievable by the attacker.
Thm. 2 provides two upper bounds on the deception probability. The first of them (29) clearly shows that increasing the privacy of the open-loop gain (manifested in the mutual information between and the state-and-control trajectory during the exploration phase) reduces the deception probability. The second bound (30) allows freedom in choosing the auxiliary probability measure , making it a rather useful bound. An important instance is that of an i.i.d. Gaussian plant disturbance sequence ; by choosing , for this case, for all , we can rewrite the upper bound (30) in terms of as follows.
Under the assumptions of Thm. 2, if is a Markov chain for all , and is an i.i.d. Gaussian plant disturbance sequence, the following upper bound on the asymptotic deception probability holds:
While the upper bound in (29c) is valid for all control policies, the upper bound in (30), and consequently also the one in (31), is only valid for control policies where form a Markov chain for all . To demonstrate this, choose and evaluate the bounds in (29c) and (31). Clearly (32) is finite. On the other hand and hence also the upper bound in (29c), is infinite, since, given and , can be fully determined.
To increase the security of the system, at any time , the controller can add an authentication (or watermarking) signal to an unauthenticated control policy :
We refer to such a control policy as the authenticated control policy . We denote the states of the system that would be generated if only the unauthenticated control signal were applied by , and the resulting trajectory—by .
A “good” authentication signal entails little increase in the control cost (6) compared to its unauthenticated version while providing enhanced detection probability (12) and/or false alarm probability.
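The cost side of this trade-off can be illustrated numerically. The sketch below superimposes an i.i.d. zero-mean Gaussian watermark on a base policy and compares the empirical quadratic cost with and without it. The additive form `u = base(x) + watermark`, the plant form `x[k+1] = a*x[k] + u[k] + w[k]`, the unit cost weights, and every name in the code are assumptions made for this example.

```python
import numpy as np


def run_closed_loop(a, policy, steps, rng):
    """Simulate x[k+1] = a*x[k] + u[k] + w[k] with w[k] ~ N(0, 1) (assumed form)."""
    x = np.zeros(steps + 1)
    u = np.zeros(steps)
    for k in range(steps):
        u[k] = policy(x[k])
        x[k + 1] = a * x[k] + u[k] + rng.standard_normal()
    return x, u


def with_watermark(base_policy, watermark_std, rng):
    """Authenticated policy: base control plus an i.i.d. Gaussian watermark."""
    def policy(x_k):
        return base_policy(x_k) + watermark_std * rng.standard_normal()
    return policy


def average_cost(x, u, alpha=1.0, beta=1.0):
    """Single-trajectory average of alpha*x[k]^2 + beta*u[k]^2."""
    return float(np.mean(alpha * x[:-1] ** 2 + beta * u ** 2))


rng = np.random.default_rng(4)
a = 1.1  # hypothetical open-loop gain


def base(x_k):
    # Hypothetical unauthenticated policy.
    return -a * x_k


x_plain, u_plain = run_closed_loop(a, base, 5000, rng)
x_marked, u_marked = run_closed_loop(a, with_watermark(base, 1.0, rng), 5000, rng)
cost_plain = average_cost(x_plain, u_plain)
cost_marked = average_cost(x_marked, u_marked)
```

In this setup the watermark raises both the state and the control power, so `cost_marked` exceeds `cost_plain`; a useful watermark keeps that gap small relative to the detection benefit.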
In both the replay-attack [28] and the statistical-duplicate [35] models, no access to the control signal by the attacker was allowed. Thus, to improve the detection probability of the controller in case of an attack, one could add an authentication/watermarking signal, which helps the controller identify abnormalities by correlating the input watermarking signal with its contribution to the sensor reading. Yet, since in the statistical-duplicate setting full system knowledge at the attacker was assumed, if the attacker has access to the control signal, it could easily simulate the contribution of any inscribed watermarking signal to the sequence of fictitious sensor readings. In contrast, in the replay-attack setting, no system knowledge is assumed, rendering any knowledge of the control signal useless unless learning of the plant dynamics is invoked (which brings it to the realm of our work); the latter, however, takes away from the appeal of this technique, which it owes to its simplicity. Indeed, in our setup the attacker has full access to the control signal. However, in contrast to the statistical-duplicate setting, it cannot perfectly simulate the effect of the control signal as it lacks knowledge of the open-loop gain. Thus, the watermarking signal here is used for a different purpose: to impede the learning process of the attacker.
At first glance, one may envisage that superimposing any watermarking signal on top of the control policy would necessarily enhance the detectability of an attack, since the effective observations of the attacker are in this case noisier. However, it turns out that injecting a strong noise may in fact speed up the learning process, as it improves the power of the signal magnified by the open-loop gain with respect to the observed noise.
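The effect described above can be observed in a small Monte Carlo experiment. The sketch below measures the average absolute error of a least-squares gain estimate when the attacker observes a control signal carrying an i.i.d. Gaussian watermark of a given standard deviation. The plant form, the base policy `u = -a*x`, the LS formula, and all parameter values are assumptions chosen for illustration and are not taken from the simulations of this paper.

```python
import numpy as np


def mean_ls_error(watermark_std, a=1.1, length=200, trials=300, seed=0):
    """Average |a_hat - a| of an LS estimate over independent exploration runs."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(trials):
        x = np.empty(length + 1)
        u = np.empty(length)
        x[0] = rng.standard_normal()
        for k in range(length):
            # Base policy plus an i.i.d. Gaussian watermark seen by the attacker.
            u[k] = -a * x[k] + watermark_std * rng.standard_normal()
            x[k + 1] = a * x[k] + u[k] + rng.standard_normal()
        a_hat = np.sum((x[1:] - u) * x[:-1]) / np.sum(x[:-1] ** 2)
        errors.append(abs(a_hat - a))
    return float(np.mean(errors))


error_without = mean_ls_error(0.0)
error_with = mean_ls_error(3.0)
```

Because the attacker observes the control signal, the watermark only excites the state further, and in this setup `error_with` comes out smaller than `error_without`: an i.i.d. watermark of this kind helps the attacker learn rather than hindering it.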
The following corollary proposes a class of watermarking signals that provide better guarantees on the deception probability .
In this section, we compare the empirical performance of the variance-test algorithm with the developed bounds in this work as well as the replay attack.
At every time , the controller tests the empirical variance for abnormalities over a detection window of size , , with a confidence interval around the expected variance (16). When the statistical test used in the simulation, the hijack indicator , and its estimate at the controller become equivalent to the definitions of the variance test in (16), the hijack indicator in (10), and the estimate of the latter of Sec. II-D, respectively.
We use the following parameters for the simulation: , and , the open-loop gain of the plant (2) is , the entries of the plant disturbance sequence are i.i.d. standard Gaussian. The applied control policy is . The length of exploration phase, for both the replay attack and the AIA, is . We use the LS algorithm (26) of Sec. III-B1 to construct .
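A compact end-to-end sketch of such an experiment is given below: an exploration phase with LS estimation, followed by an exploitation phase judged by the variance test. All numerical values (gain `1.1`, linear policy `u = -0.9*y`, tolerance `0.1`, window lengths, number of trials) are hypothetical placeholders and are not the parameters used for the figures; the plant, attack, and test forms are the same assumed forms as in a standard scalar LTI setup with unit-variance Gaussian disturbance.

```python
import numpy as np


def attacker_deceives(a, ctrl_gain, explore_len, test_len, delta, rng):
    """Run one two-phase attack; return True if the variance test is passed."""
    # Phase 1 (exploration): observe the true loop and accumulate the LS sums.
    x = rng.standard_normal()
    num = den = 0.0
    for _ in range(explore_len):
        u = -ctrl_gain * x
        x_next = a * x + u + rng.standard_normal()
        num += (x_next - u) * x
        den += x * x
        x = x_next
    a_hat = num / den

    # Phase 2 (exploitation): feed fictitious readings built from a_hat.
    v = x
    sum_sq = 0.0
    for _ in range(test_len):
        u = -ctrl_gain * v
        v_next = a_hat * v + u + rng.standard_normal()
        r = v_next - a * v - u  # residual computed with the true gain
        sum_sq += r * r
        v = v_next
    # Deceived if the empirical residual power stays within delta of 1.
    return abs(sum_sq / test_len - 1.0) < delta


def deception_rate(explore_len, trials=100, seed=0):
    """Fraction of trials in which the attack goes undetected."""
    rng = np.random.default_rng(seed)
    hits = sum(attacker_deceives(1.1, 0.9, explore_len, 1000, 0.1, rng)
               for _ in range(trials))
    return hits / trials


rate_short = deception_rate(3)    # very short exploration phase
rate_long = deception_rate(400)   # long exploration phase
```

A longer exploration phase gives the attacker a better gain estimate, so `rate_long` is expected to exceed `rate_short` in this setup.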
Fig. 2 demonstrates the weakness of the replay attack once the controller uses a sufficiently large detection window, even in the absence of watermarking.
In contrast, when no attack is cast on the system, the alarm rate becomes the false alarm rate and is also depicted in Fig. 2. Clearly, the false alarm probability is high for small detection windows and decays to zero as the detection window becomes large, in agreement with (20).
In our second simulation, depicted in Fig. 3, we evaluate the detection rate as a function of the power of a watermarking signal. To that end, we fix the detection window to be , which guarantees a negligible false alarm probability, and use Gaussian i.i.d. zero-mean watermarks as in (33) with different power.
We studied attacks on cyber-physical systems that consist of exploration and exploitation phases, where the attacker first explores the dynamics of the plant, after which it hijacks the system by playing a fictitious sensor reading to the controller/detector while feeding a detrimental control input to the plant. Future research will address the setting of authentication systems in which neither the attacker nor the controller knows the dynamics of the plant. To that end, one needs to generate watermarking signals that simultaneously facilitate the learning of the controller and hinder the learning of the attacker.
This research was partially supported by NSF award CNS-1446891. This work has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 708932. This work was done, in part, while A. Khina was visiting the Simons Institute for the Theory of Computing.
- [1] K.-D. Kim and P. R. Kumar, “Cyber–physical systems: A perspective at the centennial,” Proceedings of the IEEE, vol. 100, no. Special Centennial Issue, pp. 1287–1308, 2012.
- [2] A. A. Cardenas, S. Amin, and S. Sastry, “Secure control: Towards survivable cyber-physical systems,” System, vol. 1, no. a2, p. a3, 2008.
- [3] H. Sandberg, S. Amin, and K. H. Johansson, “Cyberphysical security in networked control systems: An introduction to the issue,” IEEE Control Systems, vol. 35, no. 1, pp. 20–23, 2015.
- [4] Y. Mo, T. H.-J. Kim, K. Brancik, D. Dickinson, H. Lee, A. Perrig, and B. Sinopoli, “Cyber–physical security of a smart grid infrastructure,” Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.
- [5] Y. Y. Haimes, “Risk of terrorism to cyber-physical and organizational-societal infrastructures,” Public Works Management & Policy, vol. 6, no. 4, pp. 231–240, 2002.
- [6] A. S. Brown, “SCADA vs. the hackers,” Mechanical Engineering, vol. 124, no. 12, p. 37, 2002.
- [7] S. Checkoway, D. McCoy, B. Kantor, D. Anderson, H. Shacham, S. Savage, K. Koscher, A. Czeskis, F. Roesner, T. Kohno et al., “Comprehensive experimental analyses of automotive attack surfaces,” in USENIX Security Symposium. San Francisco, 2011, pp. 77–92.
- [8] K. Koscher, A. Czeskis, F. Roesner, S. Patel, T. Kohno, S. Checkoway, D. McCoy, B. Kantor, D. Anderson, H. Shacham et al., “Experimental security analysis of a modern automobile,” in Security and Privacy (SP), 2010 IEEE Symposium on. IEEE, 2010, pp. 447–462.
- [9] J. Slay and M. Miller, “Lessons learned from the Maroochy water breach,” in International Conference on Critical Infrastructure Protection. Springer, 2007, pp. 73–82.
- [10] G. Liang, S. R. Weller, J. Zhao, F. Luo, and Z. Y. Dong, “The 2015 Ukraine blackout: Implications for false data injection attacks,” IEEE Transactions on Power Systems, vol. 32, no. 4, pp. 3317–3318, 2017.
- [11] R. M. Lee, M. J. Assante, and T. Conway, “German steel mill cyber attack,” Industrial Control Systems, vol. 30, p. 62, 2014.
- [12] D. P. Fidler, “Was Stuxnet an act of war? Decoding a cyberattack,” IEEE Security & Privacy, vol. 9, no. 4, pp. 56–59, 2011.
- [13] R. Langner, “Stuxnet: Dissecting a cyberwarfare weapon,” IEEE Security & Privacy, vol. 9, no. 3, pp. 49–51, 2011.
- [14] T. Chen and S. Abu-Nimeh, “Lessons from Stuxnet,” Computer, vol. 44, no. 4, pp. 91–93, 2011.
- [15] N. Falliere, L. O. Murchu, and E. Chien, “W32.Stuxnet dossier,” White paper, Symantec Corp., Security Response, vol. 5, no. 6, p. 29, 2011.
- [16] G. McDonald, L. O. Murchu, S. Doherty, and E. Chien, “Stuxnet 0.5: The missing link,” Symantec Report, 2013.
- [17] S. Amin, A. A. Cárdenas, and S. S. Sastry, “Safe and secure networked control systems under denial-of-service attacks,” in International Workshop on Hybrid Systems: Computation and Control. Springer, 2009, pp. 31–45.
- [18] A. Cetinkaya, H. Ishii, and T. Hayakawa, “Networked control under random and malicious packet losses,” IEEE Transactions on Automatic Control, vol. 62, no. 5, pp. 2434–2449, 2017.
- [19] V. Dolk, P. Tesi, C. De Persis, and W. Heemels, “Event-triggered control systems under denial-of-service attacks,” IEEE Transactions on Control of Network Systems, vol. 4, no. 1, pp. 93–105, 2017.
- [20] F. Pasqualetti, F. Dörfler, and F. Bullo, “Attack detection and identification in cyber-physical systems,” IEEE Transactions on Automatic Control, vol. 58, no. 11, pp. 2715–2729, 2013.
- [21] C.-Z. Bai, F. Pasqualetti, and V. Gupta, “Data-injection attacks in stochastic control systems: Detectability and performance tradeoffs,” Automatica, vol. 82, pp. 251–260, 2017.
- [22] H. Fawzi, P. Tabuada, and S. Diggavi, “Secure estimation and control for cyber-physical systems under adversarial attacks,” IEEE Transactions on Automatic Control, vol. 59, no. 6, pp. 1454–1467, 2014.
- [23] Y. Mo, E. Garone, A. Casavola, and B. Sinopoli, “False data injection attacks against state estimation in wireless sensor networks,” in Decision and Control (CDC), 2010 49th IEEE Conference on. IEEE, 2010, pp. 5967–5972.
- [24] Y. Shoukry, P. Martin, Y. Yona, S. Diggavi, and M. Srivastava, “PyCRA: Physical challenge-response authentication for active sensors under spoofing attacks,” in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. ACM, 2015, pp. 1004–1015.
- [25] Y. Shoukry, M. Chong, M. Wakaiki, P. Nuzzo, A. Sangiovanni-Vincentelli, S. A. Seshia, J. P. Hespanha, and P. Tabuada, “SMT-based observer design for cyber-physical systems under sensor attacks,” ACM Transactions on Cyber-Physical Systems, vol. 2, no. 1, p. 5, 2018.
- [26] N. Bezzo, J. Weimer, M. Pajic, O. Sokolsky, G. J. Pappas, and I. Lee, “Attack resilient state estimation for autonomous robotic systems,” in Intelligent Robots and Systems (IROS 2014), 2014 IEEE/RSJ International Conference on. IEEE, 2014, pp. 3692–3698.
- [27] N. Asokan, V. Niemi, and K. Nyberg, “Man-in-the-middle in tunnelled authentication protocols,” in International Workshop on Security Protocols. Springer, 2003, pp. 28–41.
- [28] Y. Mo and B. Sinopoli, “Secure control against replay attacks,” in Communication, Control, and Computing, 2009. Allerton 2009. 47th Annual Allerton Conference on. IEEE, 2009, pp. 911–918.
- [29] Y. Mo, R. Chabukswar, and B. Sinopoli, “Detecting integrity attacks on SCADA systems,” IEEE Transactions on Control Systems Technology, vol. 22, no. 4, pp. 1396–1407, 2014.
- [30] Y. Mo, S. Weerakkody, and B. Sinopoli, “Physical authentication of control systems: Designing watermarked control inputs to detect counterfeit sensor outputs,” IEEE Control Systems, vol. 35, no. 1, pp. 93–109, 2015.
- [31] R. Langner, “To kill a centrifuge: A technical analysis of what Stuxnet’s creators tried to achieve,” 2013.
- [32] M. Zhu and S. Martínez, “On the performance analysis of resilient networked control systems under replay attacks,” IEEE Transactions on Automatic Control, vol. 59, no. 3, pp. 804–808, 2014.
- [33] F. Miao, M. Pajic, and G. J. Pappas, “Stochastic game approach for replay attack detection,” in Decision and Control (CDC), 2013 IEEE 52nd Annual Conference on. IEEE, 2013, pp. 1854–1859.
- [34] M. Zhu and S. Martínez, “On distributed constrained formation control in operator–vehicle adversarial networks,” Automatica, vol. 49, no. 12, pp. 3571–3582, 2013.
- [35] B. Satchidanandan and P. R. Kumar, “Dynamic watermarking: Active defense of networked cyber–physical systems,” Proceedings of the IEEE, vol. 105, no. 2, pp. 219–240, 2017.
- [36] W.-H. Ko, B. Satchidanandan, and P. Kumar, “Theory and implementation of dynamic watermarking for cybersecurity of advanced transportation systems,” in Communications and Network Security (CNS), 2016 IEEE Conference on. IEEE, 2016, pp. 416–420.
- [37] K. J. Åström and B. Wittenmark, Adaptive control. Courier Corporation, 2013.
- [38] R. S. Sutton and A. G. Barto, Reinforcement learning: An introduction. MIT Press Cambridge, 1998, vol. 1, no. 1.
- [39] P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, “Gambling in a rigged casino: The adversarial multi-armed bandit problem,” in Foundations of Computer Science, 1995. Proceedings., 36th Annual Symposium on. IEEE, 1995, pp. 322–331.
- [40] D. Liberzon, Calculus of variations and optimal control theory: A concise introduction. Princeton University Press, 2011.
- [41] Y. Mo, R. Chabukswar, and B. Sinopoli, “Detecting integrity attacks on SCADA systems,” IEEE Transactions on Control Systems Technology, vol. 22, no. 4, pp. 1396–1407, 2014.
- [42] E. L. Lehmann and J. P. Romano, Testing statistical hypotheses, 3rd ed. New York, NY: Springer Science & Business Media Inc., 2005.
- [43] D. Marelli and M. Fu, “Ergodic properties for multirate linear systems,” IEEE Transactions on Signal Processing, vol. 55, no. 2, pp. 461–473, 2007.
- [44] H. W. Sorenson, “Least-squares estimation: From Gauss to Kalman,” IEEE Spectrum, vol. 7, no. 7, pp. 63–68, 1970.
- [45] L. Ljung, “Consistency of the least-squares identification method,” IEEE Transactions on Automatic Control, vol. 21, no. 5, pp. 779–781, 1976.
- [46] T. L. Lai and C. Z. Wei, “Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems,” The Annals of Statistics, pp. 154–166, 1982.
- [47] M. Raginsky, “Divergence-based characterization of fundamental limitations of adaptive dynamical systems,” in Communication, Control, and Computing (Allerton), 2010 48th Annual Allerton Conference on. IEEE, 2010, pp. 107–114.
- [48] S. Shalev-Shwartz and S. Ben-David, Understanding machine learning: From theory to algorithms. Cambridge University Press, 2014.
- [49] G. Zames, “Adaptive control: Towards a complexity-based general theory,” Automatica, vol. 34, no. 10, pp. 1161–1167, 1998.
- [50] K. J. Åström and P. Eykhoff, “System identification: A survey,” Automatica, vol. 7, no. 2, pp. 123–162, 1971.
- [51] R. Durrett, Probability: Theory and examples. Cambridge University Press, 2010.
- [52] J. C. Duchi and M. J. Wainwright, “Distance-based and continuum Fano inequalities with applications to statistical estimation,” arXiv preprint arXiv:1311.2669, 2013.
- [53] Y. Yang and A. Barron, “Information-theoretic determination of minimax rates of convergence,” Annals of Statistics, pp. 1564–1599, 1999.