I Introduction
A timevarying complex network with fractional dynamics has node activities over time influenced by its own history and also by activities of the other nodes. The systems with such attributes enables the modeling of coupled nonstationary processes that exhibit longterm memory [1, 2, 3, 4, 5]. A multitude of examples can be found in several application domains such as biology, and engineering, e.g. bacteria dynamics [6, 7, 8] and swarm robotics [9, 10], respectively. Nonetheless, their applicability becomes even more useful in the context of heterogeneous networks that interact geographically and across different timescales, as it often occurs in the context of cyberphysical systems (CPS).
In the context of the present manuscript, we are motivated by the recent success of fractional order dynamics in modeling spatiotemporal properties of physiological signals, such as electrocardiogram (ECG), electromyogram (EMG), electroencephalogram (EEG) and blood oxygenation level dependent (BOLD) just to mention a few [11, 12]. Despite these modeling capabilities, there is one main limitation that continues to elude scientists and engineers alike. Specifically, complex networks such as the brain, whose nodes will dynamically evolve using fractal order dynamics, are often observed locally. Meaning that some of the dynamics assessed by the models are not only due to the local interaction, but might be constrained by unknown sources, i.e., stimuli that are external to the considered network. Consequently, we propose a model that enables us to account for the existence of such unknown stimuli, and determine the model that best captures the local dynamics under such stimuli. Observe that this enhances the analysis of these systems once we have an additional feature (i.e., the stimuli) that can be the main driver of a possible abnormal behavior of the network [13].
In the context of neuroscience and medical applications, the ability to determine unknown inputs is crucial in the retrieval of the model under the socalled input artifacts [14] captured in the EEG readings due to a pulse that causes a higher variation of the potential than the baseline. Alternatively, the existence of a model to cope with the presence of unknown inputs enable the modeling of stimuli that are originated in the subcortical structures of the brain that are often neglected in the current EEG and functional magnetic resonance imaging (fMRI) that leverages the BOLD signals [15]. Thus, it is imperative to develop such framework, as well as tools that enable us to perform the analysis and design of the systems modeled by such setup, which we introduce in the current paper.
To the best of the authors’ knowledge, such framework has not been advanced in the context of discretetime fractional order dynamics. In the domain of continuoustime fractional dynamics, works like [16, 17] exist for the design of observer in the presence of unknown inputs. The closest work for the discretetime case is [18], which does not consider the case of unknown inputs. Nonetheless, the usefulness of accounting for unknown inputs in the context of linear time invariant (LTI) systems is an old topic [19, 20, 21, 22, 23, 24]. Specifically, the closest papers to the results proposed in this paper are as follows: observer with unknown inputs [20], delayed systems with unknown inputs [24], estimation of unknown inputs with sparsity constraints [25]. Notwithstanding, LTI systems are not good approximations for fractional order dynamical systems due to their limited memory capabilities. Yet, due to the numerical approximation of the highly nonlinear infinite dimension fractional order system used, we are able to obtain finite dimension closedform description that will enable us to derive results alike those attained in the context of LTI systems.
The main contributions of the present paper are threefold: (i) we present an alternating scheme that enables to determine the best estimate of the model parameters and unknown stimuli; and (ii) we provide necessary and sufficient conditions to ensure that it is possible to retrieve the state and unknown stimuli; and (iii) upon these conditions we determine a small subset of variables that need to be measured to ensure that both state and input can be recovered, while establishing suboptimality guarantees with respect to the smallest possible subset.
The remaining of the paper is organized as follows. Section II introduces the model considered in this paper and the main problems studied in this manuscript. Next, in Section III and IV, we present the solution to these problems. Finally, in Section V, we present an illustrative example of the main results using real EEG data from a wearable technology.
Ii Problem Formulation
In this section, we first describe the timevarying complex network model having fractional order dynamical growth under unknown excitations. Next, upon this model, we propose two main problems regarding analysis and design to be addressed in the present paper.
Iia System Model
We consider a linear discrete time fractionalorder dynamical model described as follows:
(1) 
where is the state, is the unknown input and
is the output vector. Also, we can describe the system by its matrices tuple
of appropriate dimensions. In what follows, we assume that the input size is always strictly less than the size of state vector, i.e., . The difference between a classic linear timeinvariant and the above model is the inclusion of fractionalorder derivative whose expansion and discretization [26] for any th state can be written as(2) 
where is the fractional order corresponding to the th state and with denoting the gamma function.
Having defined the system model, the system identification i.e. estimation of model parameters from the given data is an important step. It becomes nontrivial when we have unknown inputs since one has to be able to differentiate which part of the evolution of the complex network is due to its intrinsic dynamics and what is due to the unknown inputs. Subsequently, one of the first problems we need to address is that of system identification from the data, as described next.
IiB Datadriven Model Estimation
The fractionalorder dynamical model takes care of longrange memory which often is the property of many physiological signals. The estimation of the spatiotemporal parameters when there are no inputs to the system was addressed in [18]. But as it happens, ignoring the inputs inherently assume that the system is isolated from the external surrounding. Hence, for a model to be able to capture the system dynamics, the inclusion of unknown inputs is necessary. Therefore, the first problem that we consider is as follows.
Problem1: Given the input coupling matrix , and measurements of all states across a time horizon of length , we aim to estimate the model parameters and the unknown inputs .
IiC Sensor Selection
For the system model described by (1), where the system parameters were determined as part of the solution to the Problem1, we consider that the output measurements are collected only by a subset of sensors. In numerous applications (for example physiological systems) it happens that the sensors are dedicated, i.e., each sensor capture an individual state [27], so the measurement model can be written as
(3) 
where is the matrix constructed by selecting rows indexed by set S of the identity matrix . As an example, if all sensors are selected, i.e., , then . For selecting the best set of sensors , with knowledge of the system matrices and the given observations, we would resort to the constraint of perfect observability that is defined as follows.
Definition 1 (Perfect Observability).
A system described by (1) is called perfectly observable if given the system matrices and observations , it is possible to recover the initial state and the unknown inputs .
Subsequently, the second problem that we consider is as follows. Problem2: Determine the minimum number of sensors such that the system whose dynamics is captured by is perfectly observable from the collection of measurements collected by a subcollection of dedicated outputs, i.e.,
(4) 
Iii Model Estimation
We consider the problem of estimating , and inputs from the given limited observations , , which due to the dedicated nature of sensing mechanism is same as and under the assumption that the input matrix is known. The realization of can be application dependent and is computed separately using experimental data. For the simplicity of notation, let us denote with chosen appropriately. The prefactors in the summation in (2) grows as and, therefore, for the purpose of computational ease we have limited the summation in (2) to the first values, where is sufficiently large. Therefore, can be written as
(5) 
with the assumption that for . Using the above introduced notations and the model definition in (1), the given observations are written as
(6) 
where
is assumed to be Gaussian noise independent across space and time. For simplicity, we have assumed that each noise component has same variance, i.e.,
. Also, let us denote the system matrices as and . The vertical concatenated states and inputs during an arbitrary window of time as , respectively, and for any th state we have . For the sake of brevity, we would be dropping the time horizon subscript from the above matrices as it is clear from the context.Since the problem of joint estimation of the different parameters is highly nonlinear, we proceed as follows: (i) we estimate the fractional order using the wavelet technique described in [28]; and (ii) with known, the in (5) can be computed under the additional assumption that the system matrix is known. Therefore, the problem now reduces to estimate and the inputs
. Towards this goal, we propose the following sequential optimization algorithm similar to an expectationmaximization (EM) algorithm
[29]. Briefly, the EM algorithm is used for maximum likelihood estimation (MLE) of parameters subject to hidden variables. Intuitively, in our case, in Algorithm 1, we estimate in the presence of hidden variables or unknown unknowns . Therefore, the ‘Estep’ is performed to average out the effects of unknown unknowns and obtain an estimate of , where due to the diversity of solutions, we control the sparsity of the inputs (using the parameter ). Subsequently, the ‘Mstep’ can then accomplish MLE estimation to obtain an estimate of . The solution provided in [18] can be related to the proposed technique as follows.Remark 1.
Proof.
It is worthwhile to mention the following result regarding EM algorithm.
Theorem 1 ([30]).
The incomplete data (without unknown unknowns) likelihood is nondecreasing after an EM iteration.
Hence, the proposed algorithm being EM (detailed formulation in the Appendix A) has nondecreasing likelihood. Additionally, we have the following result about the incomplete data likelihood.
Proposition 1.
The incomplete data likelihood is bounded at each iteration .
We can comment about the convergence of the Algorithm 1 as follows.
Lemma 1.
The Algorithm 1 is convergent in the likelihood sense.
Proof.
It should be noted that convergence in likelihood does not always guarantee convergence of the parameters. But as emphasized in [31], from numerical viewpoint the convergence of parameters is not as important as convergence of the likelihood. Also the EM algorithm can converge to saddle points as exemplified in [29].
Iv Sensor Selection
Before defining the problem of sensor selection, we review the properties of fractionalorder growth system with closedform expressions for state vectors. Using the expansion of fractional order derivative in (2), we can write the state evolving equation as
(7) 
where . Alternatively, (7) can be written as
(8) 
where and for . With this definition of , we can define the matrices as follows [32]:
(9) 
Thus, we can obtain the following result.
Iva System Observability
To achieve perfect observability, i.e., to retrieve the initial state and the unknown inputs, we need system matrices and observations. While any observation matrix is sufficient for defining the perfect observability, if we set as introduced in Section II, then and are intertwined. Simply speaking, by increasing we will have more measurements acquired across time which could be used to compensate the number of measurements at each instance of time ruled by the set of sensors used.
Given the observations from to , we can represent the initial state and the unknown inputs using (10) by defining the following matrices
(11) 
and
(12) 
where and are the observation and input matrices from (1) respectively, and is as defined in (9). Having and defined and using (1), we can write the initial state and inputs in terms of the observations as
(13) 
where and .
Using (13) and the Definition 1, a necessary and sufficient condition to attain the perfect observability is obtained as follows.
Proposition 2.
The system described by (1) is perfectly observable after measurements if and only if .
Proof.
The proof follows from rewriting equation (13) as
(14) 
and, therefore, and first inputs from can be recovered if and only if . ∎
Proposition 2 will be key to formulate the constraints in the sensor selection problem as detailed in the next section.
IvB Sensor Selection Problem
Given the system matrices and first observations, the problem of sensor selection is defined as selecting the minimum number of sensors such that the system is perfectly observable. It can be mathematically written as
(15) 
where denotes the rank of matrix when and are constructed from (11) and (12) with . An analogous problem of sensor selection with no inputs is studied in [27] and it was shown to be NPhard; hence, (15) is at least as computationally challenging since it contains the former as a special case when .
Consequently, we propose a suboptimal solution to the above problem, while providing optimality guarantees within constant factor. For the discrete set , a function is called submodular if for all sets
(16) 
Also, the marginal of an element w.r.t. set is defined as . Alternatively, a set function is referred as submodular if and only if it satisfies the diminishing returns property, i.e., for all and [33]. The monotone submodular functions have a remarkable property of performance through greedy selection within constant factor of the optimality [34].
With the motivation to borrow such attributes, let us define a discrete set function as, .
Theorem 2.
The discrete set function is submodular in .
Since is submodular, we will be using a greedy selection of sensors to maximize the rank of ; in other words, greedily select sensors such that . The sensor selection algorithm is described as Algorithm 2.
Lemma 3.
The complexity of Algorithm 2 with a total of sensors and length time horizon is i.e polynomial.
Proof.
With the computation of would require at most . The algorithm being forward greedy selection has at most executions and hence the complexity of the algorithm is at most . ∎
V Experiments
We demonstrate the application of the results derived in the previous sections on physiological signals. In particular we have taken a channel EEG signal which records the brain activity of 109 subjects. The channel/electrode distribution with the corresponding labels and numbers are shown in Figure 1. The subjects were asked to perform motor and imagery tasks. The data was collected by BCI system with sampling rate of 160Hz [35, 36].
Va Model parameters estimation
The parameters of the system model and , are estimated by the application of Algorithm 1. The performance of EM algorithm like any iterative algorithm is crucially dependent on its initial conditions. For the considered example of EEG dataset, it was observed that convergence of the algorithm is fast. Further, even a single iteration was sufficient to reach the point of local maxima of the likelihood. This shows that the choice of the initial point for EM algorithm is considerably good. The input coupling matrix can be easily determined through experiments. The values predicted by the model in comparison with actual data are shown in Figure 2. The one step prediction follows very closely the actual data, but there is small mismatch in the five step prediction. The ratio of square root of mean squared error of the prediction by model with and without inputs [27] is shown in Figure 3 for total of subjects. As observed, the error ratio is less than onethird in the case when unknown inputs is considered. Therefore, the fractionalorder dynamical model with unknown inputs fits the EEG data much better than the case of no inputs. In the next part, we will use these estimated parameters to compute the set of sensors for perfect observability.
VB Sensor selection
The estimated parameters are used to construct and as written in (11) and (12) for the greedy sensor selection Algorithm 2. Upon the application of Algorithm 2, we found that roughly half of the sensors out of are sufficient enough to retrieve the initial state and unknown inputs uniquely. The selected sensors for a particular subject are marked in the Figure 1. Due to the large size and sparsity of
matrix, some values of the initial state may blow up due to the presence of very small eigenvalues. In such cases, we can first remove the unknown inputs from (
13) by multiplying both sides by , i.e., projecting signals into the orthogonal space of . We can then enforce the norm constraint of into the least squares estimation ofor, in other words, use RIDGE regression
[37], i.e.,where .
Upon the knowledge of the initial state, the unknown inputs are recovered in the similar fashion or by enforcing sparsity constraint. Figure 4 shows a comparison between the actual and retrieved initial states when using the sensor set as output of the Algorithm 2. The retrieved initial states are close to the actual values provided they are estimated in the presence of numerical errors and sparsity.
The presented experimental results show how the proposed schemes are useful in mapping the complex dynamics of brain in the presence of unknown stimuli. The same framework can be easily applied to the study of other complex time evolving networks such as the physiological dynamics systems, for example BOLD signals, EMG, ECG etc.
Vi Conclusion
In this paper, we introduced the framework of discrete fractional order dynamical systems under unknown inputs. Also, we provided tools to perform analysis and design of such systems. Specifically, for the analysis, we presented an alternating scheme that enables to determine the best estimate of the model parameters and unknown stimuli. Also, we provided necessary and sufficient conditions to ensure that it is possible to retrieve the state and unknown stimuli. Furthermore, we enable the design of sensing capabilities of such systems, and provided a mechanism to determine a small subset of variables that need to be measured to ensure that both state and input can be recovered, while establishing suboptimality guarantees with respect to the smallest possible subset.
Future research will focus on exploiting the structure of fractional order dynamical systems in the context of multicase scenarios under quantitative description of the estimation quality of the states and inputs. Such extension will enable to determine the confidence in the model obtained that would permit formal claims about the mechanism underlying in the process under study. Additionally, some of the algorithms need to be improved to be deployed in realtime applications when energy and computational resources are limited.
Vii Acknowledgment
The authors are thankful to the reviewers for their valuable feedback. G.G. and P.B. gratefully acknowledge the support by the U.S. Army Defense Advanced Research Projects Agency (DARPA) under grant no. W911NF1710076 and DARPA Young Faculty Award under grant no. N66001171 4044.
Appendix A EM Formulation
We present the detailed construction of EM like algorithm in this section. In our formulation, the observed (incomplete) data is and while is the hidden data, therefore the complete data would be . Let us consider, and at the th iteration we denote
We can enforce Laplacian prior for for sparsity (any other prior could also be used) such that . Therefore, is then derived as
We have approximated the conditional distribution as . In the final step of expectation, we can write
where is used to signify the likelihood of the complete data, and constants are ignored. For the Maximization step, we can simply write
or in other words,
where any th component of is .
Appendix B Proof of Proposition 1
Proof.
We show that the likelihood for incomplete (observed) data is bounded at each EM update step. Let us denote the likelihood of the observed data in relation to the parameter as
(17) 
which is further written as
where is the normality constant. Therefore is bounded for every iteration index . ∎
Appendix C Proof of Theorem 2
Proof.
For a given matrix , let denote the span of rows of indexed by set . Let be the rank of matrix composed by rows indexed by set , therefore . For given , and , we can write
where the last equality is written using the fact that dimension of intersection of and orthogonal space of are number of linearly independent rows in which are not in i.e. . ∎
References
 [1] F. Moon, Chaotic and Fractal Dynamics: An Introduction for Applied Scientists and Engineers. Wiley, 1992.
 [2] B. N. Lundstrom, M. H. Higgs, W. J. Spain, and A. L. Fairhall, “Fractional differentiation by neocortical pyramidal neurons,” Nature Neuroscience, vol. 11, no. 11, pp. 1335–1342, Nov 2008.
 [3] G. Werner, “Fractals in the nervous system: Conceptual implications for theoretical neuroscience,” Frontiers in Physiology, vol. 1, p. 15, 2010.
 [4] R. G. Turcott and M. C. Teich, “Fractal character of the electrocardiogram: Distinguishing heartfailure and normal patients,” Annals of Biomedical Engineering, vol. 24, no. 2, pp. 269–293, 1996.
 [5] S. Thurner, C. Windischberger, E. Moser, P. Walla, and M. Barth, “Scaling laws and persistence in human brain activity,” Physica A: Statistical Mechanics and its Applications, vol. 326, no. 34, pp. 511–521, 2003.
 [6] A. A. M. Arafa, “Fractional differential equations in description of bacterial growth,” Differential Equations and Dynamical Systems, vol. 21, no. 3, pp. 205–214, Jul 2013.
 [7] F. A. Rihan, D. Baleanu, S. Lakshmanan, and R. Rakkiyappan, “On fractional sirc model with salmonella bacterial infection,” in Abstract and Applied Analysis, vol. 2014. Hindawi Publishing Corporation, 2014.
 [8] H. Koorehdavoudi, P. Bogdan, G. Wei, R. Marculescu, J. Zhuang, R. W. Carlsen, and M. Sitti, “Multifractal characterization of bacterial swimming dynamics: a case study on real and simulated serratia marcescens,” Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, vol. 473, no. 2203, 2017.
 [9] M. S. Couceiro, R. P. Rocha, N. M. F. Ferreira, and J. A. T. Machado, “Introducing the fractionalorder darwinian pso,” Signal, Image and Video Processing, vol. 6, no. 3, pp. 343–350, Sep 2012.

[10]
M. Couceiro and S. Sivasundaram, “Novel fractional order particle swarm optimization,”
Applied Mathematics and Computation, vol. 283, pp. 36 – 54, 2016.  [11] R. L. Magin, “Fractional calculus in bioengineering,” in 2012 13th International Carpathian Control Conference, ICCC 2012, 2012.
 [12] D. Baleanu, J. A. T. Machado, and A. C. Luo, Fractional dynamics and control. Springer Science & Business Media, 2011.
 [13] A. D. Norden and H. Blumenfeld, “The role of subcortical structures in human epilepsy,” Epilepsy & Behavior, vol. 3, no. 3, pp. 219–231, 2002.

[14]
A. Delorme, T. Sejnowski, and S. Makeig, “Enhanced detection of artifacts in EEG data using higherorder statistics and independent component analysis,”
Neuroimage, vol. 34, no. 4, pp. 1443–1449, 2007.  [15] A. S. Shah, S. L. Bressler, K. H. Knuth, M. Ding, A. D. Mehta, I. Ulbert, and C. E. Schroeder, “Neural dynamics and the fundamental mechanisms of eventrelated brain potentials,” Cerebral cortex, vol. 14, no. 5, pp. 476–483, 2004.
 [16] S. Kong, M. Saif, and B. Liu, “Observer design for a class of nonlinear fractionalorder systems with unknown input,” Journal of the Franklin Institute, vol. 354, no. 13, pp. 5503 – 5518, 2017.
 [17] I. N’Doye, M. Darouach, H. Voos, and M. Zasadzinsk, “Design of unknown input fractionalorder observers for fractionalorder systems,” International Journal of Applied Mathematics and Computer Science, vol. 23, no. 3, pp. 491–500, 2013.
 [18] Y. Xue, S. Rodriguez, and P. Bogdan, “A spatiotemporal fractal model for a CPS approach to brainmachinebody interfaces,” in DATE, 2016.
 [19] B. A. Charandabi and H. J. Marquez, “A novel approach to unknown input filter design for discretetime linear systems,” Automatica, vol. 50, no. 11, pp. 2835–2839, 2014.
 [20] S. Sundaram and C. N. Hadjicostis, “Optimal state estimators for linear systems with unknown inputs,” in Decision and Control, 2006 45th IEEE Conference on. IEEE, 2006, pp. 4763–4768.
 [21] J. N. Yang and H. Huang, “Sequential nonlinear leastsquare estimation for damage identification of structures with unknown inputs and unknown outputs,” International Journal of Nonlinear mechanics, vol. 42, no. 5, pp. 789–801, 2007.
 [22] Y. Park and J. L. Stein, “Closedloop, state and input observer for systems with unknown inputs,” International Journal of Control, vol. 48, no. 3, pp. 1121–1136, 1988.

[23]
C.S. Hsieh, “On the optimality of twostage kalman filtering for systems with unknown inputs,”
Asian Journal of Control, vol. 12, no. 4, pp. 510–523, 2010.  [24] S. Sundaram and C. N. Hadjicostis, “Delayed observers for linear systems with unknown inputs,” Automatic Control, IEEE Transactions on, vol. 52, no. 2, pp. 334–339, 2007.
 [25] S. Sefati, N. J. Cowan, and R. Vidal, “Linear systems with sparse inputs: Observability and input recovery,” in American Control Conference (ACC), 2015. IEEE, 2015, pp. 5251–5257.
 [26] A. Dzielinski and D. Sierociuk, “Adaptive feedback control of fractional order discrete statespace systems,” in CIMCAIAWTIC, 2005.
 [27] Y. Xue, S. Pequito, J. R. Coelho, P. Bogdan, and G. J. Pappas, “Minimum number of sensors to ensure observability of physiological systems: a case study,” in Allerton, 2016.
 [28] P. Flandrin, “Wavelet analysis and synthesis of fractional brownian motion,” IEEE Transactions on Information Theory, vol. 38, no. 2, pp. 910–917, March 1992.
 [29] G. McLachlan and T. Krishnan, The EM Algorithm and Extensions. John Wiley & Sons, New York, 1996.
 [30] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” Journal of the Royal Statistical Society. Series B (Methodological), vol. 39, no. 1, pp. 1–38, 1977.
 [31] C. F. J. Wu, “On the convergence properties of the EM algorithm,” Ann. Statist., vol. 11, no. 1, pp. 95–103, Mar 1983.
 [32] S. Guermah, S. Djennoune, and M. Bettayeb, “Controllability and observability of linear discretetime fractionalorder systems,” Applied Mathematics and Computer Science, vol. 18, no. 2, pp. 213–222, 2008.
 [33] F. Bach, “Learning with submodular functions: A convex optimization perspective,” 2013, arXiv:1111.6453.
 [34] M. Conforti and G. Cornuejols, “Submodular set functions, matroids and the greedy algorithm: tight worstcase bounds and some generalizations of the RadoEdmonds theorem,” Discrete Applied Math, vol. 7, no. 3, pp. 251–274, 1984.
 [35] G. Schalk, D. J. McFarland, T. Hinterberger, N. Birbaumer, and J. R. Wolpaw, “Bci2000: a generalpurpose braincomputer interface (BCI) system,” IEEE Transactions on Biomedical Engineering, vol. 51, no. 6, pp. 1034–1043, June 2004.
 [36] A. L. Goldberger, L. A. N. Amaral, L. Glass, J. M. Hausdorff, P. C. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C.K. Peng, and H. E. Stanley, “Physiobank, physiotoolkit, and physionet,” Circulation, vol. 101, no. 23, pp. e215–e220, 2000.
 [37] K. P. Murphy, Machine Learning: A Probabilistic Perspective. The MIT Press, 2012.
Comments
There are no comments yet.