I Introduction
Daily activities of human body are performed through the joint functioning of biological subsystems, including nervous, muscular, respiratory, etc. Extensive attention has been devoted into developing methods for utilizing the rich information collected from human body via multiple sources, while each source of information is referred to as a modality. The major challenge of multimodal data fusion of human physiological activities is to characterize the nonlinear, timevarying interrelationship among subsystems. To address this challenge, a trending approach in recent years is to construct a mapping from multivariate time series to complex networks for modeling the dynamic patterns of interactions among components in a complex system [1, 2]
. In network models, the nodes represent the components of system and the edges represent the (dis)similarity or dependency relationships among any pairs of components (e.g., Euclidean distance, mutual information, causality, etc.). On the other side, the topological structures of network models can be described through a variety of quantitative metrics, which allows hybrid pattern recognition methods to classify the mental/physical states of the human body through physiological activities.
Despite the strong potential of network approaches for analyzing human body as a complex system, limitations still exists since the conventional distance measures are not perfectly suitable for capturing the (dis)similarity between the coupling dynamics for physiological time series [3]. Distance measures for quantifying the interrelationships among multivariate time series were introduced in previous study (for a comprehensive and most recent review, please refer to [4]). In particular, considering the nonlinear and nonstationary properties of human physiological signals, we selected recurrence plot (RP) [5] to characterize the pairwise (dis)similarity based on the chaotic behaviors of physiological time series. Through the mapping from time series to RP, quantification metrics (recurrence quantification analysis, RQA [6]) are available to describe the chaotic features of a complex system, e.g. periodicity and predictability (determinism). Moreover, Romano [7] extended RQA from univariate to multivariate time series. Feldhoff et al. developed a networkbased method to identify the intersystem coupling relationships [8]. These previous studies have enabled (1) a new class of quantitative distance measurements for nonlinear time series analysis based on their synchronization attributes; (2) the integration of networkbased analysis and multivariate pattern recognition to time series data. However, as previous studies used homogeneous temporal data collected from systems/subsystems driven by similar sources, the capability of this type of methods has not been fully investigated on complex systems consisting of multivariate time series with heterogeneous response patterns, temporal scales and resolutions. Most existing multimodal data fusion methods, to the best of our knowledge, integrate the information from multimodal physiological data based on feature or decision level.
In this study, we present a recurrenceplot based method to fuse information from multivariate time series on the signal level in order to recover the nonlinear coupling relationships among biological subsystems while preserve the heterogeneous temporal scales in physiological responses of different subsystems. Moreover, we intend to examine whether if our method is capable to characterize interrelation patterns of the human body under different emotional states using the physiological patterns represented by timevarying networks.
Ii Method
In this section, we demonstrate a network model for mapping multimodal time series to complex dynamic networks, where the nodes in the network represent modalities and edges are defined based on the directional coupling between modalities. Fig. 1 illustrates the methodological flow. The network construction includes the following steps: (1) the reconstruction from time series to timevarying graphs using timedelayed embedding theorem [10]; (2) the RP construction for each channel [5]; (3) the extension of RP to multivariate time series using joint RP (JRP) [11]; and (4) the extraction of timevarying graph theoretic network metrics for classifying different emotional states.
In order to characterize the nonlinear coupling strength among multivariate time series, Determinism (DET) and Laminarity (LAM) are extracted from JRP [11]. For a univariate time series signal, the patterns of recurrences are depicted by RP [5]
, which is a binary matrix representing the neighborhood relationships among all state vectors during a certain period of time. Sequentially, the information fusion between any pairs of channels/variables is conducted by multiplying two RPs pointtopoint, generating another binary matrix (JRP). JRP can be converted into a network by treating the binary matrix as the adjacency matrix
for a network , as shown in Fig. 1. shows the temporal dependency pattern of a system using graph representations. The mathematical details are explained in the following sections.Iia From Univariate to Multivariate Nonlinear Time Series Analysis
RP is a visualization tool to represent the temporal dependency relationships between all states in a time series data using a binary, squared matrix [5]. Suppose the state of a system (or modality) at time i and j is represented by , the recurrences can be recorded by the following binary function:
(1) 
where is a Heaviside function and a black dot is assigned to coordinates on RP if , indicating at time and the state of the system is sufficiently close (within the system threshold ).
Although the original method was developed for a single time series, variations of RP included consideration of multivariate time series from different aspects. As the goal is to best capture the dynamic coupling between multiple modalities, JRP is employed which represents when a recurrence occurs simultaneously in two or more time series [11]. For any two systems and , the JRP is obtained by the equation:
(2) 
Moreover, as JRP characterizes the simultaneous occurrence of recurrences [6], the joint recurrence quantification analysis (JRQA) defines a set of metrics based on JRP for quantifying nonlinear dynamics underlying bivariate coupling time series. In the present study, two selected metrics are computed based on JRP, which are defined as below [11]:
IiA1 Determinism (DET)
describes the predictability of systems. DET of is defined as the average length of diagonal lines with minimal length . Similarly, JDET can be computed to present how predictable the coupling behaviors are for coupling systems X, Y based on JRP . In the present study, we select the , which means a diagonal line need to have at least three points in a row.
IiA2 Laminarity (LAM)
Laminarity describes the average time of staying in a consistent state. This is an analog of DET but defined based on vertical lines. LAM is computed as the average length of vertical lines with minimal length , and comparably, JLAM is computed for . In the present study, we define the .
IiB Constructing Timevarying Graphs and Identifying Coupling Relationships
In order to capture the timevarying dynamics and perform a signallevel data fusion, a sliding window is applied to each channel throughout the time series as the first step. The segmented time series data is used for reconstructing phase space trajectory and constructing RP, which describes the recurrence patterns of each modality. Then, the intermodality connections are built based on the simultaneous recurrences of two modalities by constructing JRP using pairwise multiplications of RPs. Finally, JRQA metrics are computed in order to quantify the coupling of periodic/deterministic behaviors for any two subsystems (modalities) invariant to the original temporal scales of time series. This desirable feature enables the signallevel data fusion regardless of varying temporal scales. The resulting JRQA metrics are converted into edge weights for the corresponding pair of modalities. Note that each node can either present a channel or a modality; for the latter case, if one modality has multiple channels, the nodes of all channels are merged as one node to present the modality by taking the avereage weights of each inter or intrasystem edge. In this study, channels from the same modality are merged, such that each node represents a modality. Lastly, after constructing the network representation for each time window , a timevarying network is created by concatenating the static networks obtained by applying a sliding window throughout an dimensional time series , where each is a column vector of components.
IiC Timevarying Graph Theoretic Metrics for Multimodal Pattern Characterization
Temporal networks, or timevarying networks, are used in our model for characterizing the timeordered sequential changes among physiological subsystems [12]. Some metrics are defined as the generalization of the static network metrics, while the temporal changes are considered as an additional dimension. The selected timevarying network metrics are implemented in MATLAB 2017b using a toolbox [13]. The interpretation of selected graph theoretic metrics is as follows.
IiC1 Network efficiency
defines how efficiently information could spread throughout the network in temporal scale. It is a global measure for the entire network.
IiC2 Timevarying shortest path
defines how fast can two nodes be reachable for each other in temporal scale. The unit for steps are time intervals.
IiC3 Number of fastest paths
defines how many fastest timevarying shortest paths exist in between two nodes.
IiC4 Temporal smallworldness and temporal correlation coefficients
generalizes of smallworldness and clustering correlation coefficients to timevarying network.
IiC5 Connectedness
is a measure of reachability. Strongconnected nodes refer to two nodes which are reachable from both forward and backward temporal directions (i.e. paths from node i to j and j to i are both available in time order). Weakconnected nodes refer to two nodes which are reachable from only one temporal direction.
Iii Experimental Results
Iiia Dataset Description and Preprocessing
Our model was evaluated using five selected subjects from DEAP, a benchmark multimodal database in which EEG recordings and peripheral physiological signals were collected and published by Koelstra and colleagues [9]. A 1020 International system of 32 AgCl electrodes was used for EEG data recording at a sampling rate of 512 Hz. Multimodal physiological responses used for our study include Galvanic skin response (GSR) measured form left middle and ring finger, Zygomaticus and Trapezius Electromyography (zEMG and tEMG), Electrooculogram (EOG) and respiration (RESP).
In the experiment phase, each subject was instructed to watch music videos while their physiological responses were monitored by wearable sensors. SelfAssessment Manikin (SAM) [14] questionnaires were distributed to collect selfreported emotional scores (0 – 9) regarding these videos based on four dimensions: arousal, valence, liking, and dominance. Then, a threeclass classification problem was formulated by discretizing continuous scores of arousal and valence into low (1 – 4), medium (4 – 6) and high (6 – 9). The preprocessed signals (as described in [9]) have been downsampled to 128 Hz and segmented into 40 trials of oneminute long. In addition, EMG signals were highpass filtered at 10 Hz, and all other signals were bandpass filtered at 4 – 40 Hz. The sliding window for segmentation was five seconds with 20% overlap.
IiiB Feature Extraction
Constructing JRP from bivariate time series data requires the selection of timeembedding parameters, which is conducted as suggested in previous literature [6]
. Furthermore, since we intend to preserve as much information as possible from each channel, the JRPbased metrics are computed between each pair of channels to preserve interactions both across and within each modality. For example, the edge weight between EEG and EMG is computed by taking the overall average for all existing edges between EEG and EMG channels; the withinmodality interaction is estimated by averaging the edge weights within EEG and EMG respectively. Lastly, the timevarying graph metrics, as described in Section
IIC, are computed as features and fed to selected machine learning classifiers for training classification models.
IiiC Emotional States Classification and Validation
The selected subjects have at least 8 trials in each class (low, medium, high) of emotional states. Since many features are expected to be redundant or irrelevant to the prediction target,
Norm regularized logistic regression model (LASSO,
[15]) is used to simultaneously perform feature selection with model fitting by forcing small coefficients to be zeros. Two models are trained and evaluated independently for arousal and valence.
Classification performances are summarized in Table I. The best model for classifying valence uses JDET as edge weights for the graphs and achieves 52.50% for threeclass classification, while the best model for classifying arousal achieves 57.9% using JLAM. Fivefold crossvalidation is used for evaluating the generalizability on unobserved data. Our algorithm is implemented in MATLAB 2017b using customized code, while thirdparty toolboxes are used for computing features and training/testing classification models [16, 13, 17].
IiiD Discussions
As an example, the average patterns of dynamic synchronization among modalities for one subject (s02) are shown in Fig. 2. Note that each individual may have different connectivity patterns for emotional states. An interesting observation is that EMG has rarely been connected to any other modalities during all conditions. GSR and EOG appear to have more connections in high arousal scenarios, while respiration and ECG have more connections in low arousal scenarios. These patterns may reflect the physiological responses associated with psychological excitement or calm.
Compared to two previous studies [18, 19] for threeclass (high, medium, low) classification tasks on the same dataset, our model is comparable on classifying the emotional states in terms of the classification accuracy while considering the random selection level as 33.33% (as shown in Table I). While previous studies used more conventional and blackbox models, the extracted network features are useful to recognize the underlying system states and our model based on a linear LASSO classifier allows for intuitive insights into physiological responses of human affective states from a dynamic network perspective.
It is noteworthy that the proposed model produced more misclassified cases in the medium level than in the low and high levels. Given that the emotional states are labeled based on selfreport SAM scores, a potential source of variation is the subjective scoring of human subjects especially for the neutral states. It is extremely difficult to train a model on a small sample when some cases located on the boundary of two classes are very close. Another limitation of our method is the connectedness of timevarying graphs. The computation of some graph theoretic metrics would become problematic when the graph is disconnected; therefore, the threshold selection is critical when converting the distance metrics to binary metrics.
Iv Conclusion
In this study, we demonstrated a novel method for building timevarying networks as representations for time series signals. After extracting temporal graphtheoretic features based on both spatial and temporal coupling dynamics, the classification model was applied to recognize the emotional states of the complex human body system consisting of multimodal physiological time series. Our future work would extend the current model to a general approach which uses timevarying networks to characterize complex systems with various types of coupling relationships, allowing applications to different domains. Besides, the efficiency of information fusion in multiresolution, multimodal complex systems by approximate computations in network construction and metrics computation will be investigated more specifically.
References
 [1] Z.K. Gao, Y.X. Yang, W.D. Dang, Q. Cai, Z. Wang, N. Marwan, S. Boccaletti, and J. Kurths, “Reconstructing multimode networks from multivariate time series,” EPL (Europhysics Letters), vol. 119, no. 5, p. 50008, 2017.
 [2] J. M. Amigó and Y. Hirata, “Detecting directional couplings from multivariate flows by the joint distance distribution,” Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 28, no. 7, p. 075302, 2018.
 [3] S. Spiegel, J.B. Jain, and S. Albayrak, “A recurrence plotbased distance measure,” Translational Recurrences, p. 1, 2014.
 [4] Z.K. Gao, M. Small, and J. Kurths, “Complex network analysis of time series,” EPL (Europhysics Letters), vol. 116, no. 5, p. 50001, 2017.
 [5] J.P. Eckmann, S. O. Kamphorst, and D. Ruelle, “Recurrence plots of dynamical systems,” EPL (Europhysics Letters), vol. 4, no. 9, p. 973, 1987.
 [6] N. Marwan, N. Wessel, U. Meyerfeldt, A. Schirdewan, and J. Kurths, “Recurrenceplotbased measures of complexity and their application to heartratevariability data,” Physical review E, vol. 66, no. 2, p. 026702, 2002.

[7]
M. C. Romano, M. Thiel, J. Kurths, and C. Grebogi, “Estimation of the direction of the coupling by conditional probabilities of recurrence,”
Physical Review E, vol. 76, no. 3, p. 036211, 2007.  [8] J. H. Feldhoff, R. V. Donner, J. F. Donges, N. Marwan, and J. Kurths, “Geometric detection of coupling directions by means of intersystem recurrence networks,” Physics Letters A, vol. 376, no. 46, pp. 3504–3513, 2012.
 [9] S. Koelstra, C. Muhl, M. Soleymani, J.S. Lee, A. Yazdani, T. Ebrahimi, T. Pun, A. Nijholt, and I. Patras, “Deap: A database for emotion analysis; using physiological signals,” IEEE Transactions on Affective Computing, vol. 3, no. 1, pp. 18–31, 2012.
 [10] F. Takens, “Detecting strange attractors in turbulence,” Dynamical Systems and Turbulence, Warwick 1980, pp. 366–381, 1981.
 [11] M. C. Romano, M. Thiel, J. Kurths, and W. von Bloh, “Multivariate recurrence plots,” Physics letters A, vol. 330, no. 34, pp. 214–223, 2004.
 [12] V. Nicosia, J. Tang, C. Mascolo, M. Musolesi, G. Russo, and V. Latora, “Graph metrics for temporal networks,” in Temporal networks. Springer, 2013, pp. 15–40.
 [13] A. E. Sizemore and D. S. Bassett, “Dynamic graph metrics: Tutorial, toolbox, and tale,” NeuroImage, 2017.
 [14] M. M. Bradley and P. J. Lang, “Measuring emotion: the selfassessment manikin and the semantic differential,” Journal of behavior therapy and experimental psychiatry, vol. 25, no. 1, pp. 49–59, 1994.
 [15] R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society. Series B (Methodological), pp. 267–288, 1996.

[16]
M. Soleymani, F. VillaroDixon, T. Pun, and G. Chanel, “Toolbox for emotional feature extraction from physiological signals (teap),”
Frontiers in ICT, vol. 4, p. 1, 2017.  [17] J. Qian, T. Hastie, J. Friedman, R. Tibshirani, and N. Simon, “Glmnet for matlab. 2013,” URL http://www. stanford. edu/~ hastie/glmnet_matlab, 2013.

[18]
S. Y. Chung and H. J. Yoon, “Affective classification using bayesian classifier and supervised learning,” in
Control, Automation and Systems (ICCAS), 2012 12th International Conference on. IEEE, 2012, pp. 1768–1771. 
[19]
S. Tripathi, S. Acharya, R. D. Sharma, S. Mittal, and S. Bhattacharya, “Using deep and convolutional neural networks for accurate emotion classification on deap dataset.” in
AAAI, 2017, pp. 4746–4752.
Comments
There are no comments yet.