I Introduction
The recent technological advances in wireless communications and computation, and their integration into networked control and cyber-physical systems (CPS) [1], open the door to a myriad of new and exciting opportunities in transportation, health care, agriculture, energy, and many other domains.
However, the distributed nature of CPS is often a source of vulnerability [2, 3, 4]. Security breaches in CPS can have catastrophic consequences, ranging from hampering the economy for financial gain, through hijacking autonomous vehicles and drones, and all the way to terrorism via the manipulation of life-critical infrastructures [5, 6, 7, 8]. Real-world instances of security breaches in CPS that were discovered and made available to the public include the revenge sewage attack in Maroochy Shire, Australia [9], the Ukraine power attack [10], the German steel mill cyberattack [11], and the attack on an Iranian uranium enrichment facility via the Stuxnet malware [12, 13, 14, 15, 16]. Consequently, studying and preventing such security breaches via control-theoretic methods has received a great deal of attention in recent years [17, 18, 19, 20, 21, 22, 23, 24, 25, 26].
An important and widely used class of attacks on CPS is based on the "man-in-the-middle" (MITM) attack technique (cf. [27]): an attacker takes over the physical plant's control and sensor signals. The attacker overrides the control signals with malicious inputs in order to push the plant toward an alternative trajectory, often an unstable and catastrophic one. Consequently, the vast majority of CPS constantly monitor/sense the plant outputs with the objective of detecting a possible attack. The attacker, on the other hand, aims to overwrite the sensor readings in a manner that is indistinguishable from the legitimate ones.
The simplest instance of a MITM attack is the replay attack [28, 29, 30], in which the attacker observes and records the legitimate system behavior over a long period of time and then replays it at the controller's input; this attack is reminiscent of the notorious attack on video surveillance systems, in which previously recorded surveillance footage is replayed during a heist. A well-known example of this attack is that of the Stuxnet malware, which used an operating-system vulnerability to enable a twenty-one-second-long replay attack, during which the attacker is believed to have driven the centrifuges at a uranium enrichment facility toward excessively high and destructive speeds [31]. The extreme simplicity of the replay attack, which can be implemented with zero knowledge of the system dynamics and sensor specifications, has made it a popular and well-studied topic of research [28, 29, 30, 32, 33, 34].
In contrast to the replay attack, a paradigm that follows Shannon's maxim, "the enemy knows the system" (a restatement of Kerckhoffs's principle), was considered by Satchidanandan and Kumar [35] and Ko et al. [36]. This paradigm assumes that the attacker has complete knowledge of the dynamics and parameters of the system, which allows the attacker to construct arbitrarily long fictitious sensor readings that are statistically identical to the actual signals, without being detected.
To counter both replay and "statistical-duplicate" attacks, Mo and Sinopoli [29] and Satchidanandan and Kumar [35], respectively, proposed superimposing a random watermark, unknown to the attacker, on top of the (optimal) control signal. By testing the correlation of the subsequent measurements with the watermark signal, the controller is able to detect the attack. Thus, by superimposing watermarks at different power levels, improved attack-detection probability can be traded for an increase in the control cost.
The two models described above suffer from some shortcomings. First, in the case of a replay attack, the watermarking signal is unnecessary: by taking a long enough detection window, the controller can always detect such an attack, even in the absence of watermarks, simply by testing for repetitions. A watermark is necessary only when the detection window of the controller is small compared to the recording (and replay) window of the attacker. Second, in the case of a statistical-duplicate attack, one must assume that the attacker has no access to the signal generated and applied by the controller. Since this type of attack assumes full system knowledge at the attacker, an attacker that also had access to the control signal could construct fictitious sensor readings containing any watermark signal inscribed by the controller. Denying access to the control signal seems a questionable assumption for an attacker who is capable of hijacking the whole system.
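The repetition test alluded to above admits a very simple sketch. The following illustration is ours (all function names and parameter values are illustrative, not part of the formal setup): since a continuous-valued disturbance never repeats exactly, an exact match between a recent window of observations and any earlier window exposes a replay.

```python
import numpy as np

def replay_detected(observations, window):
    """Flag a replay attack by looking for an exact repeat of the
    last `window` samples anywhere earlier in the record."""
    x = np.asarray(observations)
    recent = x[-window:]
    for start in range(len(x) - 2 * window + 1):
        if np.array_equal(x[start:start + window], recent):
            return True  # continuous-valued noise repeats with probability zero
    return False

rng = np.random.default_rng(0)
legit = rng.standard_normal(500)       # genuine (noise-driven) readings
recorded = legit[:200]                 # what the attacker recorded
replayed = np.concatenate([legit, recorded])  # attacker replays old data
```

Here `replay_detected(legit, window=100)` is `False`, while `replay_detected(replayed, window=100)` is `True`: the detection window (100) is shorter than the attacker's recording window (200), yet an exhaustive comparison still catches the repetition, which is why a watermark only matters when such long comparisons are infeasible.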
The two models constitute two extremes: the replay attack assumes no knowledge of the system parameters, and as a consequence it is relatively easy to detect; the statistical-duplicate attack assumes full knowledge of the system dynamics (2), and as a consequence it requires a more sophisticated detection procedure, as well as additional assumptions to ensure that it can be detected.
In the current work, we explore a model that is in between these two extremes. We assume that the controller has perfect knowledge of the system dynamics (a reasonable assumption, as the controller is tuned over a much longer period and can therefore learn the system dynamics to far greater precision), while the attacker knows that the system is linear and time-invariant but does not know the actual open-loop gain. It follows that the attacker needs to "learn" the plant first, before being capable of generating a fictitious control input. In this setting, we also consider the case where the attacker has full access to the control signals, and we investigate the robustness of different attacks to parametric uncertainty about the system. To determine whether an attack can be successful or not, we rely on physical limitations of the system's learning process, similar to an adaptive control setting [37], rather than on cryptographic/watermarking techniques.
Our approach is reminiscent of parametric linear system identification (SysID), but in contrast to classical SysID, our attacker is constrained to passive identification. Specifically, we consider two-phase attacks akin to the exploration and exploitation phases in reinforcement learning/multi-armed bandit problems [38, 39]: in the exploration phase, the attacker passively listens and learns the system parameter(s); in the exploitation phase, the attacker uses the parameter(s) learned in the first phase to try to mimic the statistical behavior of the real plant, in a fashion similar to the statistical-duplicate attack. For the case of two-phase linear attacks, we analyze the achievable performance of a least-squares (LS) estimation-based scheme and a variance detection test, along with a lower bound on the attack-detection probability under the variance detection test and any learning algorithm. We provide explicit results for the case where the duration of the exploitation phase tends to infinity. To enhance the security of the system, we also extend the results to the case of a superimposed watermark (or authentication) signal. An outline of the rest of the paper is as follows. We set up the problem in Sec. II and state the main results in Sec. III, with their proofs relegated to the appendix of the paper. Simulations are provided in Sec. IV. We conclude the paper and discuss future research directions in Sec. V.
I-A Notation
We denote by $\mathbb{N}$ the set of natural numbers. All logarithms, denoted by $\log$, are base 2. We denote by $x^n = (x_1, \ldots, x_n)$ the realization of the random vector $X^n = (X_1, \ldots, X_n)$ for $n \in \mathbb{N}$; $\|\cdot\|$ denotes the Euclidean norm. $P_X$ denotes the distribution of the random variable $X$ with respect to (w.r.t.) probability measure $\mathbb{P}$, whereas $f_X$ denotes its probability density function (PDF) w.r.t. the Lebesgue measure, if it has one. An event happens almost surely (a.s.) if it occurs with probability one. For real numbers $a$ and $b$, $a \ll b$ means that $a$ is much less than $b$, while for probability distributions $P$ and $Q$, $P \ll Q$ means that $P$ is absolutely continuous w.r.t. $Q$; in that case, $\frac{dP}{dQ}$ denotes the Radon–Nikodym derivative of $P$ w.r.t. $Q$. The Kullback–Leibler (KL) divergence between probability distributions $P$ and $Q$ is defined as

$$D(P \| Q) \triangleq \mathbb{E}_P\!\left[ \log \frac{dP}{dQ} \right], \qquad (1)$$

where $\mathbb{E}_P$ denotes the expectation w.r.t. probability measure $P$. The conditional KL divergence between $P_{Y|X}$ and $Q_{Y|X}$ averaged over $P_X$ is defined as $D(P_{Y|X} \| Q_{Y|X} | P_X) \triangleq \mathbb{E}_{P_X}\!\big[ D\big( P_{Y|X}(\cdot|X) \,\big\|\, Q_{Y|X}(\cdot|X) \big) \big]$. The mutual information between random variables $X$ and $Y$ is defined as $I(X;Y) \triangleq D(P_{Y|X} \| P_Y | P_X)$, and the conditional mutual information between $X$ and $Y$ given a random variable $Z$ is defined as $I(X;Y|Z) \triangleq D(P_{Y|XZ} \| P_{Y|Z} | P_{XZ})$.
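As a concrete sanity check on definition (1) in the base-2 convention above, the following sketch (entirely illustrative; the variance values and tolerance are ours) compares a Monte Carlo estimate of $\mathbb{E}_P[\log dP/dQ]$ for two zero-mean Gaussians against the well-known closed form $\tfrac{1}{2}\big(r - 1 - \ln r\big)/\ln 2$ bits, with $r$ the ratio of the variances.

```python
import numpy as np

def kl_gauss_bits(var_p, var_q):
    """Closed-form D(N(0,var_p) || N(0,var_q)) in bits (base-2 logs)."""
    r = var_p / var_q
    return 0.5 * (r - 1.0 - np.log(r)) / np.log(2.0)

def kl_monte_carlo_bits(var_p, var_q, n=200_000, seed=1):
    """Estimate E_P[log2 dP/dQ], i.e., definition (1), by sampling from P."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, np.sqrt(var_p), size=n)
    log_p = -0.5 * x**2 / var_p - 0.5 * np.log(2 * np.pi * var_p)
    log_q = -0.5 * x**2 / var_q - 0.5 * np.log(2 * np.pi * var_q)
    return np.mean(log_p - log_q) / np.log(2.0)  # nats -> bits

exact = kl_gauss_bits(2.0, 1.0)     # about 0.221 bits
approx = kl_monte_carlo_bits(2.0, 1.0)
```

The two values agree to within the Monte Carlo error, illustrating that (1) is an expectation of a log-likelihood ratio under the first measure.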
II Problem Setup
We consider the networked control system depicted in Fig. 1, where the plant dynamics are described by a scalar, discrete-time, linear time-invariant (LTI) system

$$X_{t+1} = A X_t + U_t + W_t, \qquad (2)$$

where $X_t$, $A$, $U_t$, and $W_t$ are real numbers representing the plant state, the open-loop gain of the plant, the control input, and the plant disturbance, respectively, at time $t$. The controller, at time $t$, observes $X_t$ and generates a control signal $U_t$ as a function of the observed history. We assume that the initial condition $X_0$ has a known (to all parties) distribution and is independent of the disturbance sequence $\{W_t\}$, which is an i.i.d. process whose PDF $f_W$ is known to all parties. We assume that $W_t$ has zero mean and finite variance. With a slight loss of generality and for analytical purposes, we assume
(3) 
Moreover, to simplify the notation, we denote the state-and-control input at time $t$, and its trajectory up to time $t$, by
(4) 
The controller is equipped with a detector that tests for anomalies in the observed history. When the controller detects an attack, it shuts the system down, preventing the attacker from causing further "damage" to the plant. The controller/detector is aware of the plant dynamics (2) and knows the open-loop gain of the plant exactly. The attacker, on the other hand, knows the plant dynamics (2) as well as the plant state and control input at every time (see Fig. 1); however, it does not know the open-loop gain of the plant.
In what follows, it will be convenient to treat the open-loop gain of the plant as a random variable $A$ that is fixed in time, whose PDF is known to the attacker, and whose realization $a$ is known to the controller. We assume all random variables to exist on a common probability space with probability measure $\mathbb{P}$, and denote by $\mathbb{P}_a$ the probability measure conditioned on the realization $A = a$. Namely, for any measurable event, we define
(5) 
The open-loop gain is further assumed to be independent of the initial state and of the disturbance sequence.
We consider the (timeaveraged) linear quadratic (LQ) control cost [40]:
(6) 
where the two weights are nonnegative real numbers, known to the controller, that penalize state deviations and control actuations, respectively.
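The setup above can be exercised numerically. The following sketch is purely illustrative (the gain value, unit weights, and the dead-beat policy are our choices, not the paper's): it rolls out the scalar dynamics in the unit-actuation form $X_{t+1} = aX_t + U_t + W_t$ of (2) and evaluates the time-averaged LQ cost of (6).

```python
import numpy as np

def simulate_plant(a, control, horizon, q=1.0, r=1.0, seed=0):
    """Roll out x_{t+1} = a*x_t + u_t + w_t with i.i.d. standard-normal
    disturbance and accumulate the time-averaged LQ cost q*x^2 + r*u^2."""
    rng = np.random.default_rng(seed)
    x, cost = 0.0, 0.0
    for _ in range(horizon):
        u = control(x)
        cost += q * x**2 + r * u**2
        x = a * x + u + rng.standard_normal()
    return cost / horizon

# Dead-beat control u_t = -a*x_t cancels the state, so the residual state
# is the one-step (unit-variance) disturbance and the average cost tends
# to q + r*a^2 = 1 + a^2 for these illustrative weights.
avg_cost = simulate_plant(a=1.2, control=lambda x: -1.2 * x, horizon=50_000)
```

With `a = 1.2` the empirical average cost settles near `1 + 1.2**2 = 2.44`, matching the stationary analysis of this toy policy.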
II-A Adaptive Integrity Attack
We define Adaptive Integrity Attacks (AIA), which consist of a passive and an active phase, referred to as exploration and exploitation, respectively.
During the exploration phase, depicted in Fig. 1(a), the attacker eavesdrops and learns the system without altering the input signal to the controller, i.e., the controller observes the true plant state during this phase.
During the exploitation phase, depicted in Fig. 1(b), on the other hand, the attacker intervenes as a MITM in two different parts of the control loop, with the aim of pushing the plant toward an alternative (usually unstable) trajectory without being detected by the controller: it hijacks the true measurements and feeds the controller a fictitious input instead, and it overrides the signal generated by the controller by issuing a malicious control signal to the actuator.
Remark 1.
Attacks that manipulate the control signal by tampering with the integrity of the sensor readings, while trying to remain undetected, are usually referred to as integrity attacks, e.g., [41]. Since, in the class of attacks described above, the attacker learns the open-loop gain of the plant in a fashion reminiscent of adaptive control techniques, we refer to attacks in this class as AIA.
II-B Two-Phase AIA
While in a general AIA the attacker can switch back and forth between the exploration and exploitation phases, or combine them in an online fashion, in this work we concentrate on a special class of AIA comprising only two disjoint, consecutive phases, as follows.
Phase 1: Exploration. As illustrated in Fig. 1(a), during this phase the attacker observes the plant state and control input, and tries to learn the open-loop gain of the plant. We denote by $\hat{A}$ the attacker's estimate of the open-loop gain.
Phase 2: Exploitation. As illustrated in Fig. 1(b), from the end of Phase 1 onwards the attacker hijacks the system, feeding a malicious control signal to the plant and a fictitious sensor reading to the controller.
II-C Linear Two-Phase AIA
A linear two-phase attack is a special case of the two-phase AIA of Sec. II-B, in which the exploitation phase of the attacker takes the following linear form:
$$\tilde{X}_{t+1} = \hat{A}\, \tilde{X}_t + U_t + \widetilde{W}_t, \qquad (7)$$

where the fictitious disturbances $\widetilde{W}_t$ during the exploitation phase are i.i.d. with the same PDF as the plant disturbance; $U_t$ is the control signal generated by the controller, which is fed the fictitious sensor reading $\tilde{X}_t$ by the attacker; and $\hat{A}$ is the estimate of the open-loop gain of the plant at the conclusion of Phase 1.
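To see why an imperfect Phase-1 estimate is risky for the attacker, consider the following toy sketch (the dead-beat controller and all parameter values are our illustrative choices): it generates fictitious readings in the linear form above and measures the variance of the residual that a detector of the kind described next would monitor.

```python
import numpy as np

def innovation_variance(a_true, a_hat, T=50_000, seed=2):
    """Empirical variance of the residual the detector monitors when the
    attacker feeds fictitious readings generated with gain estimate a_hat."""
    rng = np.random.default_rng(seed)
    x, innov = 0.0, []
    for _ in range(T):
        u = -a_true * x                    # controller reacts to the fed state
        x_next = a_hat * x + u + rng.standard_normal()  # fictitious reading
        innov.append(x_next - a_true * x - u)  # residual under the null
        x = x_next
    return np.var(innov)

var_perfect = innovation_variance(1.5, 1.5)   # perfect Phase-1 estimate
var_mismatch = innovation_variance(1.5, 1.2)  # imperfect estimate
# The residual equals (a_hat - a_true)*x_t plus the attacker's noise, so
# any gain mismatch inflates the variance the detector monitors.
```

With a perfect estimate the residual variance stays at the nominal unit value; the mismatched estimate inflates it measurably, which is precisely what the variance test of the next section exploits.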
The controller/detector, being aware of the dynamics (2) and of the open-loop gain of the plant, attempts to detect possible attacks by testing for statistical deviations from the typical behavior of the system (2). More precisely, under legitimate system operation (the null hypothesis [42, Ch. 14]), the controller observation behaves according to

$$X_{t+1} - A X_t - U_t = W_t, \qquad (8)$$

i.e., the innovation sequence on the left-hand side is i.i.d. with the PDF of the plant disturbance.
II-D Deception and Detection Probabilities
Define the hijack indicator at time $t$, which equals one from the first time index at which hijacking occurs onwards:
(10) 
At every time $t$, the controller uses its observations to construct an estimate of the hijack indicator. We distinguish between the following events.

There was no attack, and no attack was declared by the detector.

An attacker hijacked the controller observation before time $t$ but was caught by the controller/detector. In this case, we say that the controller detected the attack. The detection probability at time $t$ is defined as

(11) 

An attacker hijacked the signal observed by the controller before time $t$, and the controller/detector failed to detect the attack. In this case, we say that the attacker deceived the controller or, equivalently, that the controller misdetected the attack [42, Ch. 3]. The deception probability at time $t$ is defined as

(12) 

The controller falsely declared an attack. We refer to this event as a false alarm. The false alarm probability at time $t$ is defined as

(13)
Clearly,
(14) 
The controller wishes to achieve a low false alarm probability, while guaranteeing a low deception probability [42, Ch. 3] and a low control cost (6). In addition, in the case of an attacker that knows (or has perfectly learned) the system gain and replaces the sensor reading of (2) with a virtual signal that is statistically identical to and independent of it, the controller has no hope of correctly detecting the attack.
We further define the deception, detection, and false alarm probabilities w.r.t. the probability measure $\mathbb{P}$, i.e., without conditioning on the realization of the open-loop gain, and denote them accordingly. For instance, the unconditioned deception probability is defined as

(15) 

w.r.t. the PDF of the open-loop gain.
III Statement of the Results
We now describe the main results of this work. We start by describing a variance-based attack-detection test in Sec. III-A, and then derive upper and lower bounds on the deception probability in Sec. III-B. The proofs of the results in this section are relegated to the appendix of the paper.
III-A Attack-Detection Variance Test
A simple and widely used test is one that seeks anomalies in the variance, i.e., a test that examines whether the empirical variance of the innovations (8) equals the nominal disturbance variance. In this way, only second-order statistics of the disturbance need to be known at the controller. The price is, of course, the inability to detect higher-order anomalies.
Specifically, this test sets a confidence interval around the expected variance, i.e., it checks whether

(16) 
where is called the test time. That is, as implied by (9), the attacker manages to deceive the controller if
(17)  
(18) 
Eq. (8) suggests that the false alarm probability of the variance test (16) is
(19) 
By applying Chebyshev’s inequality (see [42, Prob. 11.27]) and (3), we have
(20) 
As a result, as the test time tends to infinity, the probability of false alarm goes to zero. Hence, in this limit, we are left with the task of determining the behavior of the deception probability (12). We note that this asymptotic assumption simplifies the presentation of the results; nonetheless, a similar treatment can be carried out in the non-asymptotic case.
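The variance test and the decay of its false-alarm probability with the test window can be illustrated as follows (the threshold, window sizes, and trial counts below are arbitrary illustrative choices of ours, not values from the paper):

```python
import numpy as np

def variance_test(residuals, var_nominal, delta):
    """Declare an attack when the empirical variance of the (zero-mean)
    residuals leaves the confidence interval around the nominal variance,
    in the spirit of (16)."""
    return abs(np.mean(np.square(residuals)) - var_nominal) > delta

rng = np.random.default_rng(3)

def false_alarm_rate(window, trials=2000, delta=0.2):
    """Empirical false-alarm rate under the null: i.i.d. unit-variance noise."""
    alarms = sum(variance_test(rng.standard_normal(window), 1.0, delta)
                 for _ in range(trials))
    return alarms / trials
```

Consistent with the Chebyshev bound (20), `false_alarm_rate(50)` is substantial (the empirical variance of 50 samples fluctuates heavily), while `false_alarm_rate(5000)` is essentially zero: the false-alarm probability vanishes as the test window grows.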
III-B Bounds on the Deception Probability Under the Variance Test
In what follows, we assume that the power of the fictitious sensor-reading signal,
(21) 
converges a.s. to some deterministic positive real value as the duration of the exploitation phase tends to infinity, namely,
(22) 
Remark 2.
Assuming the control policy is memoryless, namely that the control signal depends only on the current plant state, the state process is Markov during the exploitation phase. By further assuming the ergodicity conditions required for the generalization of the law of large numbers to Markov processes [43], we deduce

(23) 

Consequently, in this case, (22) holds.
We now provide lower and upper bounds on the deception probability (12) of any linear two-phase AIA (7), where the gain estimate in (7) is constructed using any learning algorithm.
III-B1 Lower Bound
To provide a lower bound on the deception probability, we consider a specific estimate of the open-loop gain constructed by the attacker at the conclusion of the first phase, assuming a controller that uses the variance test (16). To that end, we use least-squares (LS) estimation, due to its efficiency and its amenability to recursive updates over incrementally observed data, which make it the method of choice for many applications of real-time parametric identification of dynamical systems [44, 45, 46, 37, 47, 48]. The LS algorithm approximates the overdetermined system of equations

$$X_{t+1} \approx \hat{A} X_t + U_t \qquad (24)$$

over the exploration phase, by minimizing the squared Euclidean distance

$$\sum_{t} \left( X_{t+1} - \hat{A} X_t - U_t \right)^2 \qquad (25)$$

to estimate (or "identify") the plant, the solution to which is

$$\hat{A} = \frac{\sum_{t} \left( X_{t+1} - U_t \right) X_t}{\sum_{t} X_t^2}, \qquad (26)$$

where all sums run over the exploration phase.
Remark 3.
Since we assumed that the plant disturbance has a PDF, the plant state has a PDF at every time, and hence the probability that the denominator of (26) equals zero is zero. Consequently, (26) is well-defined.
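A minimal sketch of the attacker's Phase-1 computation follows (the controller gain, horizon, and seed are illustrative choices of ours): the scalar LS estimate of (26) regresses the driven increment $X_{t+1} - U_t$ on $X_t$ over the eavesdropped trajectory.

```python
import numpy as np

def least_squares_gain(x, u):
    """Scalar LS estimate of the open-loop gain, as in (26):
    minimize sum_t (x_{t+1} - a_hat*x_t - u_t)^2 over a_hat."""
    x = np.asarray(x)
    u = np.asarray(u)
    return np.dot(x[:-1], x[1:] - u) / np.dot(x[:-1], x[:-1])

rng = np.random.default_rng(4)
a = 0.9                       # true (unknown to the attacker) gain
x, u = [0.0], []
for _ in range(20_000):       # exploration phase: passive eavesdropping
    u.append(-0.5 * x[-1])    # some stabilizing controller (our choice)
    x.append(a * x[-1] + u[-1] + rng.standard_normal())
a_hat = least_squares_gain(x, u)
```

Because the attacker sees both the state and the control signal, the controller's action drops out of the regression and `a_hat` converges to the true gain as the exploration phase lengthens, at a rate governed by the excitation of the state.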
Using the LS estimate (26), the attacker achieves the following asymptotic deception probability.
Theorem 1.
Remark 4.
Thm. 1 provides a guarantee on the asymptotic deception probability for a suitable choice of the attack parameters. An important consequence of this is that, for this choice, even without having any prior knowledge of the open-loop gain of the plant, the attacker can still carry out a successful attack.
III-B2 Upper Bound
We derive an upper bound on the deception probability for the case of an open-loop gain uniformly distributed over a symmetric interval. We assume the attacker knows the distribution of the gain (including the interval boundary), whereas the controller knows the true value of the gain (as before). Similar results can be obtained for other interval choices. We further note that this bound remains true for the scenario in which guarantees for the worst-case distribution need to be derived.

Theorem 2.
Let the open-loop gain be distributed uniformly over a symmetric interval around zero, and consider any control policy and any linear two-phase AIA (7) whose fictitious-sensor-reading power (22) satisfies the requisite condition. Then, the asymptotic deception probability when using the variance test (16) is bounded from above as
(29a)  
(29b)  
(29c) 
In addition, if the corresponding variables form a Markov chain for all times, then

(30) 

holds for any sequence of probability measures satisfying the requisite absolute-continuity condition for all times.
Remark 5.
The bound in (29c) implies that the deception probability decreases with the size of the uncertainty interval. This is consistent with the observation of Zames [49] (see also [47]) that SysID becomes harder as the uncertainty about the open-loop gain of the plant increases; in our case, a larger uncertainty interval corresponds to a worse estimate of the open-loop gain by the attacker, which leads, in turn, to a decrease in the deception probability achievable by the attacker.
Thm. 2 provides two upper bounds on the deception probability. The first, (29), clearly shows that increasing the privacy of the open-loop gain, manifested in the mutual information between the gain and the state-and-control trajectory during the exploration phase, reduces the deception probability. The second bound, (30), allows freedom in choosing the auxiliary probability measure, making it a rather useful bound. An important instance is that of an i.i.d. Gaussian plant disturbance sequence; by choosing an appropriate Gaussian auxiliary measure for all times in this case, we can rewrite the upper bound (30) as follows.
Corollary 1.
Under the assumptions of Thm. 2, if the requisite Markov-chain condition holds for all times and the plant disturbance sequence is i.i.d. Gaussian, the following upper bound on the asymptotic deception probability holds:
(31) 
where
(32) 
Remark 6.
While the upper bound in (29c) is valid for all control policies, the upper bound in (30), and consequently also the one in (31), is valid only for control policies under which the requisite Markov-chain condition holds for all times. To demonstrate this, choose a control policy that violates this condition and evaluate the bounds in (29c) and (31). Clearly, (32) is finite. On the other hand, the mutual information, and hence also the upper bound in (29c), is infinite, since, given the state and the control signal, the open-loop gain can be fully determined.
III-C Watermarking
To increase the security of the system, at any time the controller can add an authentication (or watermarking) signal to an unauthenticated control policy:

(33) 

We refer to such a control policy as an authenticated control policy. We denote the states of the system that would be generated if only the unauthenticated control signal were applied, and the resulting trajectory, accordingly.
A “good” authentication signal entails little increase in the control cost (6) compared to its unauthenticated version, while providing an improved detection probability (12) and/or false alarm probability.
Remark 7.
In both the replay-attack [29] and the statistical-duplicate [35] models, the attacker was assumed to have no access to the control signal. Thus, to improve the detection probability of the controller in case of an attack, one could add an authentication/watermarking signal, which enabled the controller to identify abnormalities by correlating the input watermarking signal with its contribution to the sensor readings. Yet, since full system knowledge at the attacker was assumed in the statistical-duplicate setting, an attacker with access to the control signal could easily simulate the contribution of any inscribed watermarking signal to the sequence of fictitious sensor readings. In contrast, in the replay-attack setting, no system knowledge is assumed, rendering any knowledge of the control signal useless unless learning of the plant dynamics is invoked (which brings it into the realm of our work); the latter, however, takes away from the appeal of this technique, which owes much to its simplicity. Indeed, in our setup the attacker has full access to the control signal. However, in contrast to the statistical-duplicate setting, it cannot perfectly simulate the effect of the control signal, as it lacks knowledge of the open-loop gain. Thus, the watermarking signal here serves a different purpose: to impede the learning process of the attacker.
At first glance, one might expect that superimposing any watermarking signal on top of the control policy would necessarily enhance the detectability of an attack, since the attacker's effective observations are noisier in this case. However, it turns out that injecting a strong noise may in fact speed up the learning process, as it increases the power of the signal, magnified by the open-loop gain, relative to the observation noise [50].
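This counterintuitive effect is easy to reproduce in simulation. In the following sketch (the dead-beat base policy and all parameter values are our illustrative choices, not the paper's watermark design), a strong i.i.d. Gaussian watermark lowers, rather than raises, the eavesdropping attacker's LS estimation error, because it excites the state and thereby improves the effective signal-to-noise ratio of the identification.

```python
import numpy as np

def attacker_ls_error(a, watermark_std, n=10_000, seed=0):
    """Absolute LS estimation error of the gain `a` incurred by an
    eavesdropping attacker, when the controller superimposes an i.i.d.
    Gaussian watermark on a dead-beat base policy (cf. (33))."""
    rng = np.random.default_rng(seed)
    x, num, den = 0.0, 0.0, 0.0
    for _ in range(n):
        u = -a * x + watermark_std * rng.standard_normal()  # watermarked control
        x_next = a * x + u + rng.standard_normal()           # plant rollout
        num += x * (x_next - u)   # the attacker observes both x and u
        den += x * x
        x = x_next
    return abs(num / den - a)

# Averaged over independent runs, the strong watermark *reduces* the
# attacker's error: under this base policy the state variance grows with
# the watermark power, sharpening the attacker's regression.
err_plain = np.mean([attacker_ls_error(1.5, 0.0, seed=s) for s in range(30)])
err_marked = np.mean([attacker_ls_error(1.5, 3.0, seed=s) for s in range(30)])
```

Here `err_marked` comes out smaller than `err_plain`, illustrating why a watermark must be designed to impede, and not inadvertently assist, the attacker's learning.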
The following corollary proposes a class of watermarking signals that provide better guarantees on the deception probability .
IV Simulation
In this section, we compare the empirical performance of the variance-test algorithm against the bounds developed in this work, as well as against the replay attack.
At every time, the controller tests the empirical variance for abnormalities over a detection window, with a confidence interval around the expected variance (16). The statistical test used in the simulation, the hijack indicator, and its estimate at the controller correspond to the variance test in (16), the hijack indicator in (10), and the estimate of the latter in Sec. II-D, respectively.
We use the following parameters for the simulation: the cost weights, the confidence-interval parameter, and the open-loop gain of the plant (2) are held fixed; the entries of the plant disturbance sequence are i.i.d. standard Gaussian; and the same control policy and exploration-phase length are used for both the replay attack and the AIA. We use the LS algorithm (26) of Sec. III-B1 to construct the attacker's estimate of the open-loop gain.
Fig. 2 demonstrates the weakness of the replay attack once the controller uses a sufficiently large detection window, even in the absence of watermarking.
In contrast, when no attack is cast on the system, the alarm rate becomes the false alarm rate, which is also depicted in Fig. 2. Clearly, the false alarm probability is high for small detection windows and decays to zero as the detection window becomes large, in agreement with (20).
In our second simulation, depicted in Fig. 3, we evaluate the detection rate as a function of the power of a watermarking signal. To that end, we fix the detection window to a value that guarantees a negligible false alarm probability, and use zero-mean i.i.d. Gaussian watermarks as in (33) at different power levels.
V Conclusions
We studied attacks on cyber-physical systems that consist of exploration and exploitation phases, in which the attacker first explores the dynamics of the plant and then hijacks the system by feeding fictitious sensor readings to the controller/detector while feeding a detrimental control input to the plant. Future research will address the setting of authentication systems in which neither the attacker nor the controller knows the dynamics of the plant. To that end, one needs to generate watermarking signals that simultaneously facilitate the learning of the controller and hinder the learning of the attacker.
Acknowledgment
This research was partially supported by NSF award CNS-1446891. This work has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 708932. This work was done, in part, while A. Khina was visiting the Simons Institute for the Theory of Computing.
References
 [1] K.-D. Kim and P. R. Kumar, “Cyber-physical systems: A perspective at the centennial,” Proceedings of the IEEE, vol. 100, no. Special Centennial Issue, pp. 1287–1308, 2012.
 [2] A. A. Cardenas, S. Amin, and S. Sastry, “Secure control: Towards survivable cyber-physical systems,” System, vol. 1, no. a2, p. a3, 2008.
 [3] H. Sandberg, S. Amin, and K. H. Johansson, “Cyber-physical security in networked control systems: An introduction to the issue,” IEEE Control Systems, vol. 35, no. 1, pp. 20–23, 2015.
 [4] Y. Mo, T. H.-J. Kim, K. Brancik, D. Dickinson, H. Lee, A. Perrig, and B. Sinopoli, “Cyber-physical security of a smart grid infrastructure,” Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.
 [5] Y. Y. Haimes, “Risk of terrorism to cyber-physical and organizational-societal infrastructures,” Public Works Management & Policy, vol. 6, no. 4, pp. 231–240, 2002.
 [6] A. S. Brown, “SCADA vs. the hackers,” Mechanical Engineering, vol. 124, no. 12, p. 37, 2002.
 [7] S. Checkoway, D. McCoy, B. Kantor, D. Anderson, H. Shacham, S. Savage, K. Koscher, A. Czeskis, F. Roesner, T. Kohno et al., “Comprehensive experimental analyses of automotive attack surfaces,” in USENIX Security Symposium. San Francisco, 2011, pp. 77–92.
 [8] K. Koscher, A. Czeskis, F. Roesner, S. Patel, T. Kohno, S. Checkoway, D. McCoy, B. Kantor, D. Anderson, H. Shacham et al., “Experimental security analysis of a modern automobile,” in Security and Privacy (SP), 2010 IEEE Symposium on. IEEE, 2010, pp. 447–462.
 [9] J. Slay and M. Miller, “Lessons learned from the Maroochy water breach,” in International Conference on Critical Infrastructure Protection. Springer, 2007, pp. 73–82.
 [10] G. Liang, S. R. Weller, J. Zhao, F. Luo, and Z. Y. Dong, “The 2015 Ukraine blackout: Implications for false data injection attacks,” IEEE Transactions on Power Systems, vol. 32, no. 4, pp. 3317–3318, 2017.
 [11] R. M. Lee, M. J. Assante, and T. Conway, “German steel mill cyber attack,” Industrial Control Systems, vol. 30, p. 62, 2014.
 [12] D. P. Fidler, “Was Stuxnet an act of war? Decoding a cyberattack,” IEEE Security & Privacy, vol. 9, no. 4, pp. 56–59, 2011.
 [13] R. Langner, “Stuxnet: Dissecting a cyberwarfare weapon,” IEEE Security & Privacy, vol. 9, no. 3, pp. 49–51, 2011.
 [14] T. Chen and S. Abu-Nimeh, “Lessons from Stuxnet,” Computer, vol. 44, no. 4, pp. 91–93, 2011.
 [15] N. Falliere, L. O. Murchu, and E. Chien, “W32.Stuxnet dossier,” White paper, Symantec Corp., Security Response, vol. 5, no. 6, p. 29, 2011.
 [16] G. McDonald, L. O. Murchu, S. Doherty, and E. Chien, “Stuxnet 0.5: The missing link,” Symantec Report, 2013.
 [17] S. Amin, A. A. Cárdenas, and S. S. Sastry, “Safe and secure networked control systems under denial-of-service attacks,” in International Workshop on Hybrid Systems: Computation and Control. Springer, 2009, pp. 31–45.
 [18] A. Cetinkaya, H. Ishii, and T. Hayakawa, “Networked control under random and malicious packet losses,” IEEE Transactions on Automatic Control, vol. 62, no. 5, pp. 2434–2449, 2017.
 [19] V. Dolk, P. Tesi, C. De Persis, and W. Heemels, “Event-triggered control systems under denial-of-service attacks,” IEEE Transactions on Control of Network Systems, vol. 4, no. 1, pp. 93–105, 2017.
 [20] F. Pasqualetti, F. Dörfler, and F. Bullo, “Attack detection and identification in cyber-physical systems,” IEEE Transactions on Automatic Control, vol. 58, no. 11, pp. 2715–2729, 2013.
 [21] C.-Z. Bai, F. Pasqualetti, and V. Gupta, “Data-injection attacks in stochastic control systems: Detectability and performance tradeoffs,” Automatica, vol. 82, pp. 251–260, 2017.
 [22] H. Fawzi, P. Tabuada, and S. Diggavi, “Secure estimation and control for cyber-physical systems under adversarial attacks,” IEEE Transactions on Automatic Control, vol. 59, no. 6, pp. 1454–1467, 2014.
 [23] Y. Mo, E. Garone, A. Casavola, and B. Sinopoli, “False data injection attacks against state estimation in wireless sensor networks,” in Decision and Control (CDC), 2010 49th IEEE Conference on. IEEE, 2010, pp. 5967–5972.
 [24] Y. Shoukry, P. Martin, Y. Yona, S. Diggavi, and M. Srivastava, “PyCRA: Physical challenge-response authentication for active sensors under spoofing attacks,” in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. ACM, 2015, pp. 1004–1015.
 [25] Y. Shoukry, M. Chong, M. Wakaiki, P. Nuzzo, A. Sangiovanni-Vincentelli, S. A. Seshia, J. P. Hespanha, and P. Tabuada, “SMT-based observer design for cyber-physical systems under sensor attacks,” ACM Transactions on Cyber-Physical Systems, vol. 2, no. 1, p. 5, 2018.
 [26] N. Bezzo, J. Weimer, M. Pajic, O. Sokolsky, G. J. Pappas, and I. Lee, “Attack resilient state estimation for autonomous robotic systems,” in Intelligent Robots and Systems (IROS 2014), 2014 IEEE/RSJ International Conference on. IEEE, 2014, pp. 3692–3698.
 [27] N. Asokan, V. Niemi, and K. Nyberg, “Man-in-the-middle in tunnelled authentication protocols,” in International Workshop on Security Protocols. Springer, 2003, pp. 28–41.
 [28] Y. Mo and B. Sinopoli, “Secure control against replay attacks,” in Communication, Control, and Computing, 2009. Allerton 2009. 47th Annual Allerton Conference on. IEEE, 2009, pp. 911–918.
 [29] Y. Mo, R. Chabukswar, and B. Sinopoli, “Detecting integrity attacks on SCADA systems,” IEEE Transactions on Control Systems Technology, vol. 22, no. 4, pp. 1396–1407, 2014.
 [30] Y. Mo, S. Weerakkody, and B. Sinopoli, “Physical authentication of control systems: Designing watermarked control inputs to detect counterfeit sensor outputs,” IEEE Control Systems, vol. 35, no. 1, pp. 93–109, 2015.
 [31] R. Langner, “To kill a centrifuge: A technical analysis of what Stuxnet’s creators tried to achieve,” 2013.
 [32] M. Zhu and S. Martínez, “On the performance analysis of resilient networked control systems under replay attacks,” IEEE Transactions on Automatic Control, vol. 59, no. 3, pp. 804–808, 2014.
 [33] F. Miao, M. Pajic, and G. J. Pappas, “Stochastic game approach for replay attack detection,” in Decision and Control (CDC), 2013 IEEE 52nd Annual Conference on. IEEE, 2013, pp. 1854–1859.
 [34] M. Zhu and S. Martínez, “On distributed constrained formation control in operator–vehicle adversarial networks,” Automatica, vol. 49, no. 12, pp. 3571–3582, 2013.
 [35] B. Satchidanandan and P. R. Kumar, “Dynamic watermarking: Active defense of networked cyber–physical systems,” Proceedings of the IEEE, vol. 105, no. 2, pp. 219–240, 2017.
 [36] W.-H. Ko, B. Satchidanandan, and P. Kumar, “Theory and implementation of dynamic watermarking for cybersecurity of advanced transportation systems,” in Communications and Network Security (CNS), 2016 IEEE Conference on. IEEE, 2016, pp. 416–420.
 [37] K. J. Åström and B. Wittenmark, Adaptive Control. Courier Corporation, 2013.
 [38] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. MIT Press, Cambridge, 1998, vol. 1, no. 1.
 [39] P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, “Gambling in a rigged casino: The adversarial multi-armed bandit problem,” in Foundations of Computer Science, 1995. Proceedings., 36th Annual Symposium on. IEEE, 1995, pp. 322–331.
 [40] D. Liberzon, Calculus of Variations and Optimal Control Theory: A Concise Introduction. Princeton University Press, 2011.
 [41] Y. Mo, R. Chabukswar, and B. Sinopoli, “Detecting integrity attacks on SCADA systems,” IEEE Transactions on Control Systems Technology, vol. 22, no. 4, pp. 1396–1407, 2014.
 [42] E. L. Lehmann and J. P. Romano, Testing Statistical Hypotheses, 3rd ed. New York, NY: Springer Science & Business Media Inc., 2005.
 [43] D. Marelli and M. Fu, “Ergodic properties for multirate linear systems,” IEEE Transactions on Signal Processing, vol. 55, no. 2, pp. 461–473, 2007.
 [44] H. W. Sorenson, “Least-squares estimation: From Gauss to Kalman,” IEEE Spectrum, vol. 7, no. 7, pp. 63–68, 1970.
 [45] L. Ljung, “Consistency of the least-squares identification method,” IEEE Transactions on Automatic Control, vol. 21, no. 5, pp. 779–781, 1976.
 [46] T. L. Lai and C. Z. Wei, “Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems,” The Annals of Statistics, pp. 154–166, 1982.
 [47] M. Raginsky, “Divergence-based characterization of fundamental limitations of adaptive dynamical systems,” in Communication, Control, and Computing (Allerton), 2010 48th Annual Allerton Conference on. IEEE, 2010, pp. 107–114.
 [48] S. Shalev-Shwartz and S. Ben-David, Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014.
 [49] G. Zames, “Adaptive control: Towards a complexity-based general theory,” Automatica, vol. 34, no. 10, pp. 1161–1167, 1998.
 [50] K. J. Åström and P. Eykhoff, “System identification – a survey,” Automatica, vol. 7, no. 2, pp. 123–162, 1971.
 [51] R. Durrett, Probability: Theory and Examples. Cambridge University Press, 2010.
 [52] J. C. Duchi and M. J. Wainwright, “Distance-based and continuum Fano inequalities with applications to statistical estimation,” arXiv preprint arXiv:1311.2669, 2013.
 [53] Y. Yang and A. Barron, “Information-theoretic determination of minimax rates of convergence,” Annals of Statistics, pp. 1564–1599, 1999.