1 Introduction
Lateral ankle sprains, in which an individual rolls the foot inward and damages the lateral ankle structures, are among the most common orthopedic injuries Garrick (1977). It is reported that up to 70% of individuals may eventually develop chronic ankle instability (CAI) following a significant ankle sprain Anandacoomarasamy and Barnsley (2005). CAI is characterized by persistent pain and swelling, episodes of the ankle giving way, and recurrent ankle sprains Hertel (2002). Individuals with CAI have been reported to have diminished physical activity Hubbard-Turner and Turner (2015) and lower quality of life Houston et al. (2015). More problematically, emerging evidence has linked CAI to the future development of irreversible, post-traumatic ankle osteoarthritis Golditz et al. (2014). The high prevalence and associated medical costs make CAI a significant health concern Soboroff et al. (1984).
Despite the clinical significance of CAI, its etiology remains unclear. It has been hypothesized that CAI may be caused by peripheral issues such as reduced ankle proprioception Garn and Newton (1988); Forkin et al. (1996) and weak peroneal muscles Bosien et al. (1955); Tropp (1986). However, counterevidence has shown that individuals with CAI do not necessarily have reduced proprioception Refshauge et al. (2000) and that ankle strength deficits are not highly correlated with chronic ankle instability Kaminski and Hartsell (2002). More recently, researchers have started to question whether central control issues also contribute to CAI Hass et al. (2010). A main reason is that individuals with CAI often show deviations in walking and running gaits Monaghan et al. (2006); Delahunt et al. (2006).
Human locomotion requires control at the central level. Theories such as the central pattern generator have been proposed to explain how the central nervous system controls gait Duysens and Van de Crommert (1998); Dimitrijevic et al. (1998). Previous studies have found increased ankle inversion Monaghan et al. (2006); Delahunt et al. (2006), increased ankle plantarflexion Chinn et al. (2013); Drewes, McKeon, Kerrigan and Hertel (2009), and altered spatial and temporal parameters Gigi et al. (2015) during walking and/or running in individuals with CAI. These findings suggest that the central control of gait is altered in these patients and support the view that CAI is not only a peripheral issue but also a central one.
Gait represents a complex control problem in which redundant degrees of freedom must be constrained and coordinated to create a smooth pattern Van Emmerik et al. (2005). Thus, examining interjoint or intersegment coordination may generate further insight into the central control issue in individuals with CAI. A review of the literature revealed that most gait studies on CAI focused on examining individual joints rather than coordination Moisan et al. (2017). A few studies did examine whether CAI is associated with coordination changes during gait, but they only looked at coordination between two segments Herb et al. (2014); Drewes, McKeon, Paolini et al. (2009) or two joints Yen et al. (2017). A possible reason is that the current measurements of coordination, such as vector coding and continuous relative phase, can only quantify the coordination between two body components Hamill et al. (2000).
Recently, with advanced biosensor technology, studies have focused on functional data analysis of complex multivariate biosensor data in various applied domains such as biology, brain neuroscience, and gait disorders Duhamel et al. (2006); Ullah and Finch (2013). The objective is to characterize the interrelationships of multiple dynamic units as a whole. Most of these studies focused on dimensionality reduction. To reduce the dimension of the original multivariate data, it is reasonable to separate the original data into multiple low-dimensional subspaces associated with different classes or functions Bahadori et al. (2015). Some nonparametric methods have been developed to find the latent structure in functional data Xiang et al. (2013), such as quantitative trait locus mapping based on LASSO Z. Li et al. (2017) and the linear manifold modeling method Chiou and Müller (2014). Other methods, such as functional principal component analysis and subspace learning Kadkhodaeian and Hossein-Zadeh (2012), have been developed to lower the data dimension and describe the correlation structure of high-dimensional signals. Subspace learning efficiently transforms high-dimensional data into a low-dimensional subspace Elhamifar and Vidal (2013) while reducing the effect of noise in the data Wang and Xu (2013). In addition, structured sparse subspace clustering has been developed for detecting both the affinity and the segmentation from data C.-G. Li et al. (2017). This property makes it widely used for studying various multivariable functional data, such as learning the dynamic and functional brain connectivity from functional magnetic resonance imaging (fMRI) data Kadkhodaeian and Hossein-Zadeh (2012).
In this study, the overarching goal is to identify critical information (measurements) that can summarize and/or represent the motion behaviors of individuals with CAI, using biosensor data on the multi-joint coordination of bilateral hip, knee, and ankle joints. We present the multi-joint coordination as an interconnected network, as it is hypothesized that interactions exist between body components (hip, knee, and ankle) that result from CAI. We propose an analytic framework that first learns and extracts the network patterns of joint subspaces from multivariate functional data, and then develops a decision model using a support vector machine (SVM) to validate the patterns and distinguish between individuals with CAI and a healthy cohort.
The organization of this paper is as follows. In Section 2, we review relevant research on the clinical diagnosis and assessment of ankle sprains, followed by a review of quantitative methods for functional data analysis. In Section 3, we describe the data acquisition of our designed experiment and then present the proposed analytic framework, which involves both unsupervised and supervised learning of multivariate functional data. In Section 4, computational results are demonstrated and compared between conventional data extraction and our method. Finally, the conclusion and future work are addressed in Section 5.
2 Related Work
2.1 Clinical Diagnosis and Assessment of Ankle Sprains
Traditional clinical assessment of CAI is mainly based on self-report. For example, Hiller et al. developed the Cumberland Ankle Instability Tool (CAIT) to assess subjective symptoms of CAI Hiller et al. (2006). Martin et al. developed a self-report survey called the Foot and Ankle Ability Measure (FAAM) that has been widely used in CAI assessment Martin et al. (2005). Another self-report assessment for CAI is the Ankle Instability Instrument developed by Docherty et al. Docherty et al. (2006). These tools have been recommended by the International Ankle Consortium to determine whether an individual has CAI Gribble et al. (2014). However, self-report answers may be subject to a number of biases (e.g., social desirability bias) that could affect the reliability and accuracy of the measures. To address this problem, these self-report tools should be used in conjunction with objective measures.
Previous studies have used lab tools to objectively measure ankle proprioception, balance control, ankle muscle strength, and the reaction time to ankle inversion perturbation in individuals with CAI, but the results were often inconsistent Holmes and Delahunt (2009); Gutierrez et al. (2009). For example, some studies reported that individuals with CAI have delayed reaction time to inversion perturbation Konradsen and Ravn (1990); Karlsson and Andreasson (1992), but others did not find similar results Ebig et al. (1997). Recently, more studies have examined walking and/or running patterns in individuals with CAI, and the results have suggested that CAI may be associated with deviations in these patterns Monaghan et al. (2006); Delahunt et al. (2006); Chinn et al. (2013); Drewes, McKeon, Kerrigan and Hertel (2009); Gigi et al. (2015); Yen et al. (2017). Walking and/or running could therefore be used as a motor task to objectively differentiate healthy individuals from those with CAI.
Existing gait research on individuals with CAI often focuses on examining the ankle kinematics Monaghan et al. (2006); Delahunt et al. (2006); Chinn et al. (2013). However, gait is a complex motor task that involves all leg joints. Interjoint coordination could be a parameter used to determine whether an individual has CAI. For example, Yen et al. used a vector coding method to examine hip-ankle coupling during walking and found that the coupling pattern differed between healthy individuals and those with CAI Yen et al. (2017). A weakness of this study was that the coordination measure was limited to hip-ankle coupling on the affected side in the frontal plane, and deviations may exist in other couplings that were not measured. This weakness was due to a limitation of current methods for quantifying a coordination pattern. Predominant methods such as vector coding and continuous relative phase can only quantify the coupling between two kinematic trajectories Hamill et al. (2000).
2.2 Pattern Recognition and Analysis of Multivariate Functional Data
Pattern recognition and analysis of multivariate functional data aim to gain insight into functional data through statistical and machine learning methods. Traditionally, statistical methods are used to test the coupling and grouping information of functional data. For example, one study investigated Parkinson's disease using paired t-tests and ANOVA on fMRI and kinematic data van der Stouwe et al. (2015). However, such statistical methods fail to recognize the dynamic patterns in functional data. A biomechanical gait study found that functional data analysis can detect the location and magnitude of differences between functions, while statistical methods such as ANOVA only provide information about discrete data points Park et al. (2017). Instead of focusing on detecting differences between gait movements, another study modeled the motion patterns of human activities based on kinematic data Janidarmian et al. (2015). The authors studied the patterns of 33 activities and applied classification algorithms, including SVM and neural networks, to data segments directly. Their research suggested that these parametric methods provide promising activity pattern recognition. In addition to parametric methods, the k-nearest neighbors algorithm (KNN) is a nonparametric method commonly used to classify time series data. Research has suggested that, with a proper similarity measure such as dynamic time warping (DTW), KNN can outperform other statistics-based models on time series data Lin (); Kim and Provost (2013). Instead of a classification approach, some studies focused on modeling the dynamics of the system. For example, one study used a hidden Markov model (HMM) for gait phase detection and walking/jogging discrimination Mannini and Sabatini (2012). Combined with a statistical approach, the significant discriminative patterns between walking and jogging were recognized with 99% accuracy.
Some graphical models, originally used in image and video processing, have been applied to multivariate functional data analysis, since these models have the advantage of analyzing the structure of correlated variables. Our research adopts a similar concept, called sparse subspace clustering (SSC), to recognize the cross-correlations of multivariate functions Elhamifar and Vidal (2013); Wang and Xu (2013). To recognize nonlinear relationships, applying a kernel trick to SSC has been suggested Patel and Vidal (2014); Yin et al. (2016). The results suggested that kernel sparse subspace clustering (KSSC) can improve recognition performance for complex data. Recently, another study used a similar modeling method on human motion data to detect and classify gestures and achieved promising results Zhang et al. (2018). Based on the same self-expression assumption as SSC, a research group developed a sparse self-expressive decomposition to perform sparse matrix factorization and showed that this method provides promising results for feature selection and clustering on multiple functional datasets Dyer et al. (2015).
3 Materials and Methods
3.1 Data Acquisition and Processing
In this study, we acquired motion data during running from 48 subjects in total: 25 subjects with CAI and 23 normal controls. There were 37 females (19 subjects with CAI and 18 normal controls) and 11 males (6 subjects with CAI and 5 normal controls). Subjects were recruited from the Northeastern University community. Screening exams were conducted for all potential subjects to determine their suitability for the study based on the inclusion and exclusion criteria. We followed the selection standards endorsed by the International Ankle Consortium to set the inclusion criteria for CAI Gribble et al. (2014). All subjects with CAI had a significant ankle sprain at least one year before being enrolled in this study. In addition, they scored 24 or lower on the CAIT and had more than one episode of the ankle giving way in the past 6 months.
All qualified subjects were asked to participate in a single session, in which they ran on a treadmill for one minute. Their running performance was captured by a 7-camera 3D motion capture system (Qualisys AB, Sweden). Reflective markers were placed on the major bony landmarks of the pelvis and the left and right legs in order to create biomechanical models of the bilateral hip, knee, and ankle. The recorded marker data were then analyzed using Visual3D (C-Motion, MD). The resultant variables are bilateral hip, knee, and ankle position changes in each of the sagittal, frontal, and transverse planes over time. Each joint trajectory was segmented into cycles (defined as the interval between two consecutive initial contacts of the same foot) for further analysis. On average, each session contains around forty cycles, depending on the subject's running posture. Figure 1 shows the motion data collected from the motion capture system for two subjects, one with CAI and one normal control. For each joint, there are three motion time series along the x-axis, y-axis, and z-axis, representing the sagittal, frontal, and transverse planes.
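As a rough illustration of the cycle segmentation described above, the following sketch splits one joint trajectory at the initial-contact frames. It also applies the length normalization, detrending, and z-scoring that the text describes next. The function names, the use of linear interpolation, the linear detrend, and the synthetic trajectory are all assumptions made for illustration; the source does not specify which interpolation or detrending method was used.

```python
import numpy as np

def segment_cycles(signal, contacts):
    """Split one joint trajectory into cycles between consecutive initial contacts."""
    return [signal[s:e] for s, e in zip(contacts[:-1], contacts[1:])]

def normalize_cycle(cycle, n_points=84):
    """Resample a cycle to n_points, remove its linear trend, and z-score it."""
    cycle = np.asarray(cycle, dtype=float)
    old_t = np.linspace(0.0, 1.0, len(cycle))
    new_t = np.linspace(0.0, 1.0, n_points)
    resampled = np.interp(new_t, old_t, cycle)          # linear interpolation (assumed)
    slope, intercept = np.polyfit(new_t, resampled, 1)  # least-squares line (assumed detrend)
    detrended = resampled - (slope * new_t + intercept)
    std = detrended.std()
    return detrended / std if std > 0 else detrended

# Toy data: a synthetic trajectory with hypothetical initial-contact frames.
t = np.arange(300)
angle = np.sin(2 * np.pi * t / 75.0) + 0.01 * t
contacts = [0, 75, 150, 225, 300]
cycles = [normalize_cycle(c) for c in segment_cycles(angle, contacts)]
```

Stacking the 18 normalized channels of one cycle would give one 18-by-84 matrix per cycle under these assumptions.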
There are eighteen channels in each motion dataset. Channels 1-3 represent the sagittal, frontal, and transverse planes of the right hip, channels 4-6 represent the sagittal, frontal, and transverse planes of the left hip, and so on. Since the number of time points in a running cycle varies across subjects, we first applied an interpolation method to realign all cycles to a length of 84 time points, which is the length of the longest running cycle in the dataset. Then, the motion time series were detrended and z-scored to reduce variability between subject postures and potential human errors during data collection. As a result, each subject is represented by about 40 matrices, one per running cycle, each holding the 18 channels over 84 time points.
3.2 Proposed Method
In this section, we propose an analytic framework consisting of two phases to recognize patterns across multiple channels of the joint coordination system that discriminate and predict subjects with CAI. In the first phase, we develop subspace learning methods to find significant clustering (connectivity) patterns for individual subjects. In the second phase, we develop an SVM-based classification model that uses the estimated connectivity measures as input features for the binary classification task between the CAI cohort and the normal cohort. Figure 2 illustrates the concept of our proposed framework.
3.2.1 Sparse Subspace Learning
Let us consider a set of $N$ multivariate motion time series $\{y_i\}_{i=1}^{N}$ in a joint coordination system ($N = 18$), which is considered as a dynamic system of multiple units, where each unit is assumed to interact linearly or nonlinearly with the others at different degrees of connectivity and can be represented in certain subspaces. Each subspace is represented by a set of basis functions; that is, the motion time series can be linearly represented by the basis functions, and functions in the same subspace are highly correlated while those in different subspaces have no or weak correlations. Note that in this study, the terms channel and function are interchangeable.
Let us assume that every subspace is self-expressive. This means that each data point in the high-dimensional space can be expressed as a linear combination of other points in the same lower-dimensional subspace. Therefore, every function is represented by a linear combination of the other functions in the same subspace:
$$ y_i = Y c_i, \quad c_{ii} = 0, \tag{1} $$
where $Y = [y_1, \dots, y_N]$ collects the original functions and $c_i = [c_{i1}, \dots, c_{iN}]^\top$. In this model, $c_{ii}$ is set to zero to avoid the trivial solution of representing a function by itself. Ideally, $c_i$ is a sparse solution of Equation (1), whose nonzero entries correspond to the coefficients of the linear representation. For such a system, there are infinitely many solutions, so we restrict the solution by minimizing the following objective function:
$$ \min_{c_i} \|c_i\|_1 \quad \text{s.t.} \quad y_i = Y c_i, \; c_{ii} = 0. \tag{2} $$
We introduce a matrix $C = [c_1, \dots, c_N]$; then the above problem can be written as
$$ \min_{C} \|C\|_1 \quad \text{s.t.} \quad Y = YC, \; \operatorname{diag}(C) = 0. \tag{3} $$
Since real data are usually corrupted by noise or errors, we assume $y_i = y_i^0 + e_i$, where $y_i^0$ is the real signal function and $e_i$ is noise with zero mean, independent across functions. We use $E = [e_1, \dots, e_N]$ to represent the noise and errors. The model above can be reformulated as follows:
$$ \min_{C, E} \|C\|_1 + \frac{\lambda}{2} \|E\|_F^2 \quad \text{s.t.} \quad Y = YC + E, \; \operatorname{diag}(C) = 0, \tag{4} $$
where $\lambda$ is the penalty parameter for the noise.
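To make the noisy self-expressive model concrete, here is a minimal sketch that approximately solves a penalized form of it, min ||C||_1 + (lam/2)||Y - YC||_F^2 with diag(C) = 0, by proximal gradient descent (ISTA). This is a swapped-in solver chosen for brevity, not the ADMM procedure the paper uses; the function names, the value of `lam`, and the toy data (two orthogonal two-dimensional subspaces) are assumptions for illustration.

```python
import numpy as np

def soft(V, tau):
    """Element-wise soft-thresholding."""
    return np.sign(V) * np.maximum(np.abs(V) - tau, 0.0)

def ssc_ista(Y, lam=20.0, n_iter=500):
    """Approximately solve min ||C||_1 + lam/2 ||Y - YC||_F^2 s.t. diag(C) = 0."""
    n = Y.shape[1]
    G = Y.T @ Y                                      # Gram matrix of the functions
    step = 1.0 / (lam * np.linalg.norm(Y, 2) ** 2)   # 1 / Lipschitz constant of the gradient
    C = np.zeros((n, n))
    for _ in range(n_iter):
        grad = lam * (G @ C - G)                     # gradient of the quadratic term
        C = soft(C - step * grad, step)              # proximal step for the l1 norm
        np.fill_diagonal(C, 0.0)                     # enforce diag(C) = 0
    return C

def objective(Y, C, lam):
    return np.abs(C).sum() + 0.5 * lam * np.linalg.norm(Y - Y @ C, "fro") ** 2

# Toy data: columns 0-2 lie in span{e1, e2}, columns 3-5 in span{e3, e4}.
s = 1.0 / np.sqrt(2.0)
Y = np.array([[1, 0, s, 0, 0, 0],
              [0, 1, s, 0, 0, 0],
              [0, 0, 0, 1, 0, s],
              [0, 0, 0, 0, 1, s]], dtype=float)
C = ssc_ista(Y, lam=20.0)
```

In this toy case the recovered coefficients stay within each subspace, which is the block structure that subspace clustering relies on.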
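The model above can efficiently extract linear relations between functions. However, in most cases the dominant connections are nonlinear. To extract the dominant nonlinear relations, we use the kernel trick to map the functions into a higher-dimensional feature space and reformulate our model as kernel sparse subspace clustering (KSSC). The kernel trick is a commonly used technique in machine learning to avoid intensive computation in a high-dimensional feature space. Let $\kappa(\cdot, \cdot)$ be the kernel function and $K$ the kernel matrix with entries $K_{ij} = \kappa(y_i, y_j)$, where $y_i$ is the $i$-th function. In our study, we use the Gaussian kernel $\kappa(y_i, y_j) = \exp(-\|y_i - y_j\|_2^2 / (2\sigma^2))$. Then we can rewrite Equation (4) as follows Patel and Vidal (2014):
$$ \min_{C} \|C\|_1 + \frac{\lambda}{2} \operatorname{tr}\left(K - 2KC + C^\top K C\right) \quad \text{s.t.} \quad \operatorname{diag}(C) = 0. \tag{5} $$
When the linear kernel $\kappa(y_i, y_j) = y_i^\top y_j$ is used, solving the model in Equation (5) is equivalent to solving the one in Equation (4).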
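The equivalence under a linear kernel can be checked numerically: with K = Y^T Y, the trace term tr(K - 2KC + C^T K C) equals the squared Frobenius residual ||Y - YC||_F^2. The sketch below verifies this on random data and also builds a Gaussian kernel matrix. The bandwidth `sigma`, the 2*sigma^2 parameterization, and the random inputs are assumptions for illustration.

```python
import numpy as np

def gaussian_kernel(Y, sigma=1.0):
    """Gaussian kernel matrix between the columns (functions) of Y."""
    sq = np.sum(Y ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Y.T @ Y)   # squared pairwise distances
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def kernel_residual(K, C):
    """Data-fit term of the kernelized model: tr(K - 2KC + C^T K C)."""
    return np.trace(K - 2.0 * K @ C + C.T @ K @ C)

rng = np.random.default_rng(0)
Y = rng.standard_normal((84, 18))     # 84 time points, 18 channels
C = rng.standard_normal((18, 18))
np.fill_diagonal(C, 0.0)

K_lin = Y.T @ Y                       # linear kernel
lhs = kernel_residual(K_lin, C)
rhs = np.linalg.norm(Y - Y @ C, "fro") ** 2

K_rbf = gaussian_kernel(Y, sigma=10.0)
```

Only the kernel matrix enters the objective, so swapping `K_lin` for `K_rbf` changes the model without ever forming the high-dimensional feature map.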
3.2.2 Model Optimization
We rewrite the model in Equation (4) by introducing an auxiliary matrix $A$ Elhamifar and Vidal (2013) and add two penalty terms to make the objective function convex in terms of the variables $(C, A)$. Note that Equation (6) has the same solution as Equation (5). We can then use the ADMM algorithm to solve this convex problem:
(6) 
By applying Lagrangian multipliers, we obtain a new formulation of the optimization with a vector and a matrix as follows:
(7) 
ADMM is an algorithm for solving convex optimization problems by breaking them into smaller subproblems that are easier to solve Boyd (). This approach consists of a set of iterations, given in the following pseudocode. Note that in steps 2 and 3, $T_\eta(\cdot)$ is the shrinkage-thresholding operator applied to each element of a matrix, defined as $T_\eta(v) = (|v| - \eta)_+ \operatorname{sgn}(v)$. The operator $(\cdot)_+$ returns its argument if it is nonnegative and zero otherwise.
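The shrinkage-thresholding operator described above is short enough to write out directly. The following sketch applies it element-wise to a small matrix; the function name `shrink` and the example values are illustrative only.

```python
import numpy as np

def shrink(V, eta):
    """Element-wise shrinkage: sgn(v) * max(|v| - eta, 0)."""
    return np.sign(V) * np.maximum(np.abs(V) - eta, 0.0)

V = np.array([[-3.0, -0.5],
              [0.5, 3.0]])
S = shrink(V, 1.0)   # entries with |v| <= 1 are set to zero, the others move toward zero by 1
```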
After reaching the stopping criterion and obtaining the sparse clustering matrix $C$, we calculate an affinity matrix $W = |C| + |C|^\top$ for further pattern analysis in the next part.
For parameter optimization, we use grid search, which computes the result under different parameter sets and then compares the results with each other. We use this method to find the best penalty parameters set, for and 1 for , for SSC learning, and this parameter set generates the highest cross-validation accuracy.
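A grid search of the kind described above can be sketched as an exhaustive loop over parameter combinations. Everything here is hypothetical: the parameter names `lam` and `rho`, the candidate values, and the toy scoring function stand in for the cross-validation accuracy that the study would compute, and none of the numbers reflect the values actually selected.

```python
from itertools import product

def grid_search(score_fn, grid):
    """Evaluate score_fn on every parameter combination and keep the best one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_score(lam, rho):
    """Stand-in for cross-validation accuracy (peaks at lam=10, rho=1 by construction)."""
    return -((lam - 10.0) ** 2) - (rho - 1.0) ** 2

grid = {"lam": [1.0, 10.0, 100.0], "rho": [0.1, 1.0, 10.0]}
best_params, best_score = grid_search(toy_score, grid)
```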
3.2.3 Feature Extraction and Representation
For the input features used in the pattern classification task in the next section, we propose two ways to form subspaces in which motion patterns are recognized in the joint coordination system. First, we consider individual column vectors of the sparse clustering matrix (see CM1 in Figure 2). Each column vector forms a subspace of the nonzero coefficients of the other channels associated with channel $i$. We then aggregate the column vectors from all cycles of all subjects to form input features in a new subspace, giving one set of input features per channel, and each set is examined separately.
Second, from the affinity matrix, which is a symmetric matrix representing the network connectivity across all channels (see CM2 in Figure 2), we employ a classical clustering technique, density-based spatial clustering of applications with noise (DBSCAN) Ester et al. (1996), to find clustering subspaces in which each cluster is formed of channels that interact with each other closely and dynamically; that is, the connectivities between them are strong. DBSCAN discovers clusters of input channels based on two parameters. The first is a distance threshold that determines the neighborhood of a given sample, and the second is the minimum number of samples required to form a neighborhood. In this study, we set the distance threshold equal to the median distance and the minimum number of samples to 4. DBSCAN starts with an arbitrary channel and includes all other channels that are directly density-reachable as its neighborhood if the two thresholds are satisfied. This process iterates until no new channels are added to the neighborhood. Note that this is computationally fast since there are only 18 channels. After determining the major cluster that dominates the network, we perform sparse subspace learning (Section 3.2.1) again on the channels included in this cluster and obtain an updated clustering matrix of the major cluster (see FR2 in Figure 2). The elements in the upper-triangular part of this updated matrix are then the network features in the new subspace.
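The DBSCAN procedure described above can be sketched on a precomputed distance matrix as follows. This is a minimal textbook-style implementation written for illustration, not the authors' code. How the study converts affinities into distances is not stated in the text, so the sketch takes a distance matrix `D` as given and demonstrates it on toy one-dimensional points with hypothetical `eps` and `min_pts` values.

```python
import numpy as np

def dbscan_precomputed(D, eps, min_pts):
    """Cluster samples from a distance matrix D; label -1 marks noise."""
    n = D.shape[0]
    labels = np.full(n, -1)
    visited = np.zeros(n, dtype=bool)
    # Each neighborhood includes the sample itself (distance 0).
    neighbors = [np.flatnonzero(D[i] <= eps) for i in range(n)]
    cluster = 0
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        if len(neighbors[i]) < min_pts:
            continue                      # not a core sample; may join a cluster later
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster       # density-reachable sample joins the cluster
            if visited[j]:
                continue
            visited[j] = True
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # expand only through core samples
        cluster += 1
    return labels

# Toy data: two tight groups and one isolated point.
x = np.array([0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3, 20.0])
D = np.abs(x[:, None] - x[None, :])
labels = dbscan_precomputed(D, eps=0.35, min_pts=3)
```

For the 18-channel network, one could pass the channel-by-channel distance matrix with `eps` set to its median off-diagonal value and `min_pts=4`, matching the settings stated in the text.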
3.2.4 Pattern Classification
The learned sparse clustering matrices in Section 3.2.1 summarize and represent the hidden patterns in the joint coordination system. We propose to train a linear SVM model with the feature sets to perform the binary classification task between the CAI and normal control (NC) cohorts, which is used to validate the effectiveness of the learned features. SVM is considered one of the robust supervised learning methods in various applied problems with high-dimensional data Hearst et al. (1998). The idea of SVM is to find a hyperplane with a maximum margin that separates the samples of the two classes (i.e., CAI versus NC). Let us have a training dataset $\{(x_i, t_i)\}_{i=1}^{n}$ of samples represented by feature vectors $x_i$ and binary class labels $t_i \in \{-1, +1\}$ associated with the samples. A hyperplane is defined as
$$ w^\top x + b = 0, \tag{8} $$
where $w$ is a normal weight vector and $b$ is a scalar called the bias. A sample $x$ is assigned to class $+1$ if $w^\top x + b > 0$; otherwise, it is assigned to class $-1$ if $w^\top x + b < 0$. To find an optimal hyperplane $(w, b)$, one can solve a constrained optimization problem that maximizes the margin between the classes:
$$ \min_{w, b, \xi} \; \frac{1}{2} \|w\|_2^2 + c \sum_{i=1}^{n} \xi_i \quad \text{s.t.} \quad t_i (w^\top x_i + b) \ge 1 - \xi_i, \; \xi_i \ge 0, \; i = 1, \dots, n, \tag{9} $$
where $\xi_i$ are slack variables and $c$ is the cost parameter. In this work, we solve this problem using the widely used library LIBSVM Chang and Lin (2011) with the default setting for the parameter $c$. For the sake of interpretability, a linear kernel is used in the SVM throughout the computational experiments.
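To illustrate the decision rule and the soft-margin objective, the sketch below evaluates both for a fixed, hand-picked hyperplane. It does not train an SVM and does not call LIBSVM; the weights, bias, cost value, and toy samples are assumptions chosen so the arithmetic is easy to follow.

```python
import numpy as np

def svm_predict(X, w, b):
    """Assign +1 when w.x + b > 0 and -1 otherwise (ties go to -1 here)."""
    return np.where(X @ w + b > 0, 1, -1)

def svm_primal_objective(X, t, w, b, c=1.0):
    """0.5*||w||^2 + c * sum of hinge losses (slack variables at their optimum)."""
    hinge = np.maximum(0.0, 1.0 - t * (X @ w + b))
    return 0.5 * (w @ w) + c * hinge.sum()

# Toy samples and a hypothetical hyperplane.
X = np.array([[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]])
t = np.array([1, -1, 1])
w = np.array([1.0, 0.0])
b = 0.0

pred = svm_predict(X, w, b)
obj = svm_primal_objective(X, t, w, b, c=1.0)
```

The third sample is classified correctly but lies inside the margin, so it contributes a slack of 0.5 to the objective.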
4 Results on Chronic Ankle Instability
In this section, we compare our proposed subspace clustering methods (SSC and KSSC) with traditional methods, including statistical features, Pearson correlation, and principal component analysis (PCA). For statistical features, we use the mean and variance as input features of the SVM classification model. Pearson correlation is a measure of the linear relationship between two variables. Similar to our clustering matrix, calculating pairwise correlations generates a correlation matrix. PCA is a dimension reduction technique that projects a set of correlated variables onto a set of uncorrelated variables (called principal components) in a new feature space. We use the number of principal components that accounts for 99.9 percent of the variance in the original signal. Then we calculate the mean and variance of the components to form a feature vector. Finally, we train a linear SVM model with these extracted features for classification using leave-one-subject-out cross-validation; each subject, with about 40 sample cycles, is left out for testing in turn.
4.1 Results for Column Vector based Subspace
Under the self-expressive assumption made in the modeling part, every column vector in the clustering matrix represents the coefficient vector of a combination of the other joints. To test the motion pattern, we can therefore treat every column vector in the clustering matrix as a feature vector. We use leave-one-subject-out cross-validation to find the column vector that best represents the system pattern, and we define two accuracy measures to verify the pattern we obtain. First, we define the hit rate as $n_{\mathrm{hit}} / n$, where $n_{\mathrm{hit}}$ is the number of cycle predictions matching the true label and $n$ is the total number of samples. The second measure is the subject prediction accuracy, which is the proportion of correctly classified subjects. After calculating the accuracy measures for the SVM models of all column vectors, we select the column vector giving the highest voting accuracy as our final motion pattern.
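The leave-one-subject-out protocol and the hit rate can be sketched as follows. The helper names and the toy subject identifiers are illustrative; in the study each held-out fold would contain all cycles of one subject.

```python
import numpy as np

def loso_splits(subject_ids):
    """Yield (subject, train_indices, test_indices), holding out one subject at a time."""
    subject_ids = np.asarray(subject_ids)
    for s in np.unique(subject_ids):
        test = np.flatnonzero(subject_ids == s)
        train = np.flatnonzero(subject_ids != s)
        yield s, train, test

def hit_rate(y_true, y_pred):
    """Fraction of cycle-level predictions that match the true label."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

# Toy example: six cycles belonging to three subjects.
subject_ids = [0, 0, 1, 1, 1, 2]
splits = list(loso_splits(subject_ids))
rate = hit_rate([1, 1, 0, 0], [1, 0, 0, 0])
```

Splitting by subject rather than by cycle keeps cycles from the same person out of both the training and testing sets at once.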
Table 1 provides a performance summary, including training and testing accuracy. The feature vectors extracted by our proposed methods, SSC and KSSC, outperform those of the other methods. Pearson correlation performs slightly better than random assignment (50% hit rate), while the statistical method performs worse than random assignment. Although there are large gaps (around 20-40%) between testing and training accuracies for all of the methods above, such gaps are commonly seen in cross-subject validation because of the high variation between samples from different subjects. Limited by the number of subjects, it is difficult to eliminate the bias between subjects with different body conditions, such as BMI, gender, and dominant and affected leg. The gap is not related to overfitting of the SVM model. To keep our results statistically meaningful (based on a relatively large sample size), we cannot perform a similar analysis on a smaller subset, for instance, only females with a right ankle injury and within a certain BMI range.
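Table 1. Testing and training hit rates.

| Measure | Statistic | Correlation | PCA (99.9% variance) | SSC | KSSC |
|---|---|---|---|---|---|
| Testing Hit Rate | 46.56% | 58.69% | 51.66% | 61.24% | 63.83% |
| Training Hit Rate | 80.54% | 92.71% | 71.02% | 73.16% | 80.94% |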
Moreover, to make predictions in a more practical way, we classify a testing subject based on a majority voting scheme: a subject is predicted and labeled as CAI if more than 50 percent of the subject's sample cycles are recognized as abnormal. The results are shown in Table 2. Compared with the previous results in Table 1, the classification performance of our methods has improved by 5-10%, as expected. Our observation indicates that a large proportion of the misclassified sample cycles comes from misclassified subjects. One possible reason is that a bias exists between different subjects, and another is that some subjects may be sample outliers that cannot be correctly classified. In this case, most of the running cycles from these outliers may be misclassified, which leads to the low test accuracy in Table 1.
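Table 2. Voting-based testing and training accuracy.

| Measure | Statistic | Correlation | PCA (99.9% variance) | SSC | KSSC |
|---|---|---|---|---|---|
| Voting Testing Accuracy | 54.18% | 63.64% | 51.06% | 72.34% | 81.82% |
| Voting Training Accuracy | 80.54% | 100.00% | 77.79% | 86.26% | 100.00% |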
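The majority voting scheme behind the voting accuracies above can be sketched in a few lines. The label coding (1 for CAI, 0 for normal control), the helper names, and the toy predictions are assumptions for illustration; a subject with exactly half of its cycles flagged is not labeled CAI, following the "more than 50 percent" rule.

```python
import numpy as np

def vote_subjects(cycle_pred, subject_ids):
    """Label a subject 1 (CAI) if more than half of its cycles are predicted 1."""
    cycle_pred = np.asarray(cycle_pred)
    subject_ids = np.asarray(subject_ids)
    return {int(s): int(np.mean(cycle_pred[subject_ids == s]) > 0.5)
            for s in np.unique(subject_ids)}

def subject_accuracy(subject_pred, subject_true):
    """Proportion of subjects whose voted label matches the true label."""
    correct = sum(subject_pred[s] == subject_true[s] for s in subject_true)
    return correct / len(subject_true)

# Toy cycle-level predictions for three subjects.
cycle_pred = [1, 1, 0, 0, 0, 1, 1, 0]
subject_ids = [0, 0, 0, 1, 1, 1, 2, 2]
votes = vote_subjects(cycle_pred, subject_ids)
acc = subject_accuracy(votes, {0: 1, 1: 0, 2: 1})
```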
To find out which joint may contribute significant information for distinguishing between the CAI and normal control cohorts, we further examine individual column vectors. The results are shown in Figure 3. Channel 3 (transverse plane of the right hip) is the most distinguishable and results in 72% accuracy, followed by channel 17 (frontal plane of the left ankle). It is interesting that most subjects with CAI can be distinguished from the hip position rather than the ankle position. Motion abnormality at the knees alone has little influence on the running pattern of subjects.
As mentioned, the subject bias is hard to eliminate. To reduce its impact, we refined our model by combining the column vectors in descending order of their individual voting accuracy in Figure 3. In this way, the combined vector can cover different features across different subjects. Because KSSC already extends the feature space and entangles different joints, we choose SSC instead of KSSC to avoid overcounting the cross-relations between column vectors.
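Table 3. Hit rate and voting test accuracy by number of combined vectors.

| # of vectors combined | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Hit Rate | 68.01% | 70.20% | 77.68% | 78.25% | 77.33% |
| Voting Test Accuracy | 72.73% | 72.73% | 77.27% | 86.36% | 90.91% |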
Figure 4 plots the voting accuracy with respect to the number of vectors combined. The accuracy peaks when 4 or 5 vectors are combined. After that, the accuracy starts to drop due to overfitting. Table 3 lists part of the numeric results in Figure 4, including the hit rate and voting accuracy. Compared with the result of KSSC in Table 2, the performance of the combined vectors is close to that of a single grouping vector derived from KSSC. This also suggests that KSSC picks up cross-relations between functions.
Our experiments show that KSSC gives the highest accuracy among all the algorithms. However, the clustering matrix from KSSC is hard to interpret, since the Gaussian kernel used in the kernel trick extends the dimension to infinity. This means that all 18 joints in the original space are entangled with each other at every possible scale. This property makes it easier for KSSC to detect nonlinear relationships between joints, but it also makes the clustering matrix of KSSC harder to interpret than that of SSC. Therefore, in the next section, we analyze the results derived from SSC rather than KSSC.
4.2 Results for Network based Subspace
Besides column-based pattern classification, we propose another method to analyze the clustering matrix derived from SSC. In this network-based method, we treat every joint in the matrix as a node and the value of each element in the matrix as a directed weighted edge. In this way, our clustering matrix or the Pearson correlation matrix becomes a weighted directed network representing the motion pattern of subjects. To reduce the dimension of the data and find a subcluster that contains most of the information about the motion pattern, we apply a clustering algorithm, DBSCAN, to find two subclusters in the original data. As introduced in Section 3.2.3, this algorithm is a density-based clustering method that groups joints together with their nearby neighbors. In our experiment, DBSCAN separates the joint network into two subclusters, each containing nine joints. To compare the performance of our clustering matrix with the traditional method, we also run DBSCAN on the Pearson correlation matrix under the same parameter settings and cluster all joints into two subclusters.
To determine the dominant subcluster and compare subclusters derived from different methods, we apply the SSC method to the selected joints in the same cluster and compare the classification results based on their subclustering matrices. Since the affinity matrix is symmetric, the upper triangular part (without the diagonal) contains all the linear relation coefficients. With nine joints, this gives 36 elements, which is not a large number of features compared with the size of our dataset, about 2000 entries. Using the upper triangular part (without the diagonal) as the input feature vector for a linear SVM should therefore not cause overfitting. Accordingly, we train an SVM classification model with the upper triangular part of our subclustering matrix (without the diagonal) as input features. Table 4 summarizes the voting accuracy for those subclusters. Note that subclusters 1 and 2 are derived from the clustering matrix via SSC, and subclusters 3 and 4 are complementary subclusters derived from the Pearson correlation matrix. Our experiment shows that subcluster 1 achieves a higher accuracy than its complement, subcluster 2. Subcluster 1 contains 9 joints: numbers 1 to 8 and number 16. Among these, joints 1 to 6 are located on the hips, joints 7 and 8 on the right knee, and joint 16 on the left ankle. This result indicates that subcluster 1 is the dominant subcluster in the motion pattern, where the difference between subjects with CAI and normal controls is most significant. Compared with subclusters that only consider hips, knees, or ankles, subcluster 1 also contains more pattern information. For subclusters obtained from Pearson correlation, subcluster 4 achieves the highest voting accuracy. The joints in subcluster 4 are 3, 4, 5, 6, 8, 11, 13, 15, and 16, most of which are located on the hips and ankles. Notably, both dominant subclusters cover the hips.
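| Feature source | Subcluster | Voting accuracy |
|---|---|---|
| Different positions | Hips | 65.96% |
| Different positions | Knees | 44.68% |
| Different positions | Ankles | 55.32% |
| Clustering matrix from SSC | Subcluster 1 | 70.21% |
| Clustering matrix from SSC | Subcluster 2 | 55.32% |
| Pearson correlation matrix | Subcluster 3 | 54.55% |
| Pearson correlation matrix | Subcluster 4 | 68.18% |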
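The feature construction described above can be sketched as follows. The helper names `upper_triangle` and `majority_vote` are hypothetical, and the voting rule is an assumption: the sketch supposes that each subject contributes several classified entries and that the subject-level label is the most frequent prediction, which is one common reading of "voting accuracy" rather than a confirmed detail of the experiments.

```python
from collections import Counter


def upper_triangle(M):
    """Flatten the strict upper triangle of a square matrix, row by row."""
    n = len(M)
    return [M[i][j] for i in range(n) for j in range(i + 1, n)]


def majority_vote(predictions):
    """Collapse per-entry predictions into one label per subject.

    predictions: iterable of (subject_id, predicted_label) pairs.
    Ties go to the label that appears first for that subject.
    """
    by_subject = {}
    for subject_id, label in predictions:
        by_subject.setdefault(subject_id, []).append(label)
    return {sid: Counter(votes).most_common(1)[0][0]
            for sid, votes in by_subject.items()}


# A 9 x 9 placeholder matrix (entry value 10*i + j) yields 9 * 8 / 2 = 36 features.
M = [[10 * i + j for j in range(9)] for i in range(9)]
features = upper_triangle(M)
print(len(features))

# Hypothetical per-entry predictions for two subjects.
votes = majority_vote([("s1", "CAI"), ("s1", "CAI"), ("s1", "control"),
                       ("s2", "control")])
print(votes)
```

Each 36-element vector would then serve as one training sample for the linear SVM.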
To quantify the importance of the joints in a subcluster, we use the weight that the SVM model assigns to every element of the upper triangular matrix. A larger absolute value indicates a higher weight and greater importance. In Figure 5, the five elements with the highest absolute values are circled in red. An element with a high absolute value corresponds to a strong connection between the two joints it relates. The strongest connection in subcluster 1, shown in Figure 5 (a), is between joints 1 and 5 (the sagittal plane of the right hip and the frontal plane of the left hip), which is a connection within the hips. In Figure 5 (b), the subcluster based on correlation, the joints in the dominant subset change, but the hips still contribute the most significant connections. Interestingly, the connections to joint 8 (the frontal plane of the right knee) are among the highest-weighted in both SVM models. Together, the two figures suggest that the difference between normal controls and the CAI group may be most apparent in the hips. To identify subjects with CAI, we therefore suggest examining other joint positions as well, e.g., the movement pattern of the hips.
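The ranking of connections by absolute weight can be sketched as below. The function `top_connections` and the weight values are hypothetical; the sketch only assumes that the weight vector follows the same row-by-row upper-triangle ordering as the input features, so that each weight can be mapped back to a pair of joints.

```python
def top_connections(weights, joint_ids, k=5):
    """Return the k joint pairs with the largest absolute weight.

    weights: one weight per strict-upper-triangle element, in row-by-row order.
    joint_ids: the joint numbers of the subcluster, in matrix order.
    """
    n = len(joint_ids)
    pairs = [(joint_ids[i], joint_ids[j])
             for i in range(n) for j in range(i + 1, n)]
    if len(pairs) != len(weights):
        raise ValueError("expected %d weights, got %d" % (len(pairs), len(weights)))
    ranked = sorted(zip(pairs, weights), key=lambda pw: abs(pw[1]), reverse=True)
    return ranked[:k]


# Joints of subcluster 1 as listed in the text; the weights are made up.
joint_ids = [1, 2, 3, 4, 5, 6, 7, 8, 16]
weights = [0.0] * 36
weights[3] = -2.0   # position of the pair (1, 5)
weights[0] = 1.0    # position of the pair (1, 2)
top = top_connections(weights, joint_ids, k=2)
print(top)
```

Sorting by absolute value matters because a strongly negative weight is as informative for the classifier as a strongly positive one.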
By analyzing the subclusters extracted by DBSCAN, we find that they provide useful information about the dominant function of, and the connectivity between, multiple joints. Our results indicate that the hips have a strong influence on running patterns. While running, subjects with CAI may use the hip muscles to adjust their posture and protect the ankle from sprains. The network-based analysis in Section 4.2 identifies a dominant subcluster that substantially reduces the dimension of the original data while retaining most of the useful information, as reflected by its higher classification performance.
5 Conclusion
In this work, we studied a physical injury problem caused by ankle sprains, which can develop into long-term, chronic instability in daily movement. While most recent studies have focused on applying traditional statistical methods to biomechanical gait data, we proposed a new analytic method that quantitatively learns the functional subspace representing the original multivariate gait motion data acquired from a 3D motion capture system. We assumed that the bilateral ankle, knee, and hip joints influence one another and modeled the multi-joint coordination system as a network. The proposed subspace clustering algorithm learns a significant joint subspace (model) in a lower-dimensional space to characterize the original motion system. An SVM classification model was trained on the extracted network-based features and validated on subjects with CAI and normal controls using leave-one-subject-out cross-validation. We obtained classification performance 10% higher than that of the statistic-based features (which resulted in 55–60% accuracy). Moreover, our analysis found that there is indeed a joint effect among these joints. In the studied subjects, the motion instability caused by ankle sprains appeared most clearly at the hip joints. Subjects with CAI might use the hips to stabilize and balance during movement. These results show the potential of the proposed model to support decisions in diagnosis, treatment, and rehabilitation. In future work, we will investigate the anomaly detection problem, that is, detecting when motion behavior begins to deteriorate during movement in individual subjects with CAI. A dynamic modeling approach is suggested to obtain a functional clustering tensor. Because our current study does not consider the geometric structure of multivariate motion data, incorporating geometric information may improve the understanding of motion patterns in the physical body system.
References
Anandacoomarasamy, A. and Barnsley, L. (2005). Long term outcomes of inversion ankle injuries. British Journal of Sports Medicine, 39(3), e14.
Bahadori, MT., Kale, D., Fan, Y. and Liu, Y. (2015). Functional subspace clustering with application to time series. pp. 228–237.
Bosien, WR., Staples, OS. and Russell, SW. (1955). Residual disability following acute ankle sprains. JBJS, 37(6), 1237–1243.
Boyd, S. (n.d.). Alternating direction method of multipliers.
Chang, CC. and Lin, CJ. (2011). LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST), 2(3), 27.
Chinn, L., Dicharry, J. and Hertel, J. (2013). Ankle kinematics of individuals with chronic ankle instability while walking and jogging on a treadmill in shoes. Physical Therapy in Sport, 14(4), 232–239.
Chiou, JM. and Müller, HG. (2014). Linear manifold modelling of multivariate functional data. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(3), 605–626. http://dx.doi.org/10.1111/rssb.12038
Delahunt, E., Monaghan, K. and Caulfield, B. (2006). Altered neuromuscular control and ankle joint kinematics during walking in subjects with functional instability of the ankle joint. The American Journal of Sports Medicine, 34(12), 1970–1976.
Dimitrijevic, MR., Gerasimenko, Y. and Pinter, MM. (1998). Evidence for a spinal central pattern generator in humans. Annals of the New York Academy of Sciences, 860(1), 360–376.
Docherty, CL., Gansneder, BM., Arnold, BL. and Hurwitz, SR. (2006). Development and reliability of the ankle instability instrument. Journal of Athletic Training, 41(2), 154.
Drewes, LK., McKeon, PO., Kerrigan, DC. and Hertel, J. (2009). Dorsiflexion deficit during jogging with chronic ankle instability. Journal of Science and Medicine in Sport, 12(6), 685–687.
Drewes, LK., McKeon, PO., Paolini, G., Riley, P., Kerrigan, DC., Ingersoll, CD. and Hertel, J. (2009). Altered ankle kinematics and shank-rearfoot coupling in those with chronic ankle instability. Journal of Sport Rehabilitation, 18(3), 375–388.
Duhamel, A., Devos, P., Bourriez, JL., Preda, C., Defebvre, L. and Beuscart, R. (2006). Functional data analysis for gait curves study in Parkinson's disease. Studies in Health Technology and Informatics, 124, 569–574.
Duysens, J. and Van de Crommert, HW. (1998). Neural control of locomotion; Part 1: The central pattern generator from cats to humans. Gait & Posture, 7(2), 131–141.
Dyer, EL., Goldstein, TA., Patel, R., Kording, KP. and Baraniuk, RG. (2015). Self-expressive decompositions for matrix approximation and clustering. arXiv preprint arXiv:1505.00824.
Ebig, M., Lephart, SM., Burdett, RC., Miller, MC. and Pincivero, DM. (1997). The effect of sudden inversion stress on EMG activity of the peroneal and tibialis anterior muscles in the chronically unstable ankle. Journal of Orthopaedic & Sports Physical Therapy, 26(2), 73–77.
Elhamifar, E. and Vidal, R. (2013). Sparse subspace clustering: Algorithm, theory, and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(11), 2765–2781.
Ester, M., Kriegel, HP., Sander, J. and Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (pp. 226–231). AAAI Press. http://dl.acm.org/citation.cfm?id=3001460.3001507
Forkin, DM., Koczur, C., Battle, R. and Newton, RA. (1996). Evaluation of kinesthetic deficits indicative of balance control in gymnasts with unilateral chronic ankle sprains. Journal of Orthopaedic & Sports Physical Therapy, 23(4), 245–250.
Garn, SN. and Newton, RA. (1988). Kinesthetic awareness in subjects with multiple ankle sprains. Physical Therapy, 68(11), 1667–1671.
Garrick, JG. (1977). The frequency of injury, mechanism of injury, and epidemiology of ankle sprains. The American Journal of Sports Medicine, 5(6), 241–242.
Gigi, R., Haim, A., Luger, E., Segal, G., Melamed, E., Beer, Y., ... Elbaz, A. (2015). Deviations in gait metrics in patients with chronic ankle instability: a case control study. Journal of Foot and Ankle Research, 8(1), 1.
Golditz, T., Steib, S., Pfeifer, K., Uder, M., Gelse, K., Janka, R., ... Welsch, G. (2014). Functional ankle instability as a risk factor for osteoarthritis: using T2-mapping to analyze early cartilage degeneration in the ankle joint of young athletes. Osteoarthritis and Cartilage, 22(10), 1377–1385.
Gribble, PA., Delahunt, E., Bleakley, C., Caulfield, B., Docherty, C., Fourchet, F. and others (2014). Selection criteria for patients with chronic ankle instability in controlled research: a position statement of the International Ankle Consortium. Br J Sports Med, 48(13), 1014–1018.
Gutierrez, GM., Kaminski, TW. and Douex, AT. (2009). Neuromuscular control and ankle instability. PM&R, 1(4), 359–365.
Hamill, J., Haddad, JM. and McDermott, WJ. (2000). Issues in quantifying variability from a dynamical systems perspective. Journal of Applied Biomechanics, 16(4), 407–418.
Hass, CJ., Bishop, MD., Doidge, D. and Wikstrom, EA. (2010). Chronic ankle instability alters central organization of movement. The American Journal of Sports Medicine, 38(4), 829–834.
Hearst, MA., Dumais, ST., Osuna, E., Platt, J. and Scholkopf, B. (1998). Support vector machines. IEEE Intelligent Systems and Their Applications, 13(4), 18–28.
Herb, CC., Chinn, L., Dicharry, J., McKeon, PO., Hart, JM. and Hertel, J. (2014). Shank-rearfoot joint coupling with chronic ankle instability. Journal of Applied Biomechanics, 30(3), 366–372.
Hertel, J. (2002). Functional anatomy, pathomechanics, and pathophysiology of lateral ankle instability. Journal of Athletic Training, 37(4), 364.
Hiller, CE., Refshauge, KM., Bundy, AC., Herbert, RD. and Kilbreath, SL. (2006). The Cumberland ankle instability tool: a report of validity and reliability testing. Archives of Physical Medicine and Rehabilitation, 87(9), 1235–1241.
Holmes, A. and Delahunt, E. (2009). Treatment of common deficits associated with chronic ankle instability. Sports Medicine, 39(3), 207–224.
Houston, MN., Hoch, JM. and Hoch, MC. (2015). Patient-reported outcome measures in individuals with chronic ankle instability: a systematic review. Journal of Athletic Training, 50(10), 1019–1033.
Hubbard-Turner, T. and Turner, MJ. (2015). Physical activity levels in college students with chronic ankle instability. Journal of Athletic Training, 50(7), 742–747.
Janidarmian, M., Roshan Fekr, A., Radecka, K., Zilic, Z. and Ross, L. (2015). Analysis of motion patterns for recognition of human activities. In Proceedings of the 5th EAI International Conference on Wireless Mobile Communication and Healthcare (pp. 68–72).
Kadkhodaeian, BS. and Hossein-Zadeh, GA. (2012). Subspace-based identification algorithm for characterizing causal networks in resting brain. NeuroImage, 60(2), 1236–1249.
Kaminski, TW. and Hartsell, HD. (2002). Factors contributing to chronic ankle instability: a strength perspective. Journal of Athletic Training, 37(4), 394.
Karlsson, J. and Andreasson, GO. (1992). The effect of external ankle support in chronic lateral ankle joint instability: an electromyographic study. The American Journal of Sports Medicine, 20(3), 257–261.
Kim, Y. and Provost, EM. (2013). Emotion classification via utterance-level dynamics: A pattern-based approach to characterizing affective expressions. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 3677–3681). 10.1109/ICASSP.2013.6638344
Konradsen, L. and Ravn, JB. (1990). Ankle instability caused by prolonged peroneal reaction time. Acta Orthopaedica Scandinavica, 61(5), 388–390.
Li, CG., You, C. and Vidal, R. (2017). Structured sparse subspace clustering: A joint affinity learning and subspace clustering framework. 26(6), 2988–3001.
Li, Z., Guo, B., Yang, J., Herczeg, G., Gonda, A., Balázs, G., ... Merilä, J. (2017). Deciphering the genomic architecture of the stickleback brain with a novel multilocus gene-mapping approach.
Lin, J. (n.d.). Pattern recognition in time series.
Mannini, A. and Sabatini, AM. (2012). Gait phase detection and discrimination between walking–jogging activities using hidden Markov models applied to foot motion data from a gyroscope. Gait & Posture, 36(4), 657–661. http://www.sciencedirect.com/science/article/pii/S0966636212002342 https://doi.org/10.1016/j.gaitpost.2012.06.017
Martin, RL., Irrgang, JJ., Burdett, RG., Conti, SF. and Swearingen, JMV. (2005). Evidence of validity for the Foot and Ankle Ability Measure (FAAM). Foot & Ankle International, 26(11), 968–983.
Moisan, G., Descarreaux, M. and Cantin, V. (2017). Effects of chronic ankle instability on kinetics, kinematics and muscle activity during walking and running: A systematic review. Gait & Posture, 52, 381–399.
Monaghan, K., Delahunt, E. and Caulfield, B. (2006). Ankle function during gait in patients with chronic ankle instability compared to controls. Clinical Biomechanics, 21(2), 168–174.
Park, J., Seeley, MK., Francom, D., Reese, CS. and Hopkins, JT. (2017). Functional vs. traditional analysis in biomechanical gait data: An alternative statistical approach. Journal of Human Kinetics, 60(1), 39–49.
Patel, VM. and Vidal, R. (2014). Kernel sparse subspace clustering. pp. 2849–2853. 10.1109/ICIP.2014.7025576
Refshauge, KM., Kilbreath, SL. and Raymond, J. (2000). The effect of recurrent ankle inversion sprain and taping on proprioception at the ankle. Medicine and Science in Sports and Exercise, 32(1), 10–15.
Soboroff, SH., Pappius, EM. and Komaroff, AL. (1984). Benefits, risks, and costs of alternative approaches to the evaluation and treatment of severe ankle sprain. Clinical Orthopaedics and Related Research, 183, 160–168.
Tropp, H. (1986). Pronator muscle weakness in functional instability of the ankle joint. International Journal of Sports Medicine, 7(5), 291–294.
Ullah, S. and Finch, CF. (2013). Applications of functional data analysis: A systematic review. BMC Medical Research Methodology, 13(1), 43.
van der Stouwe, AMM., Toxopeus, C., de Jong, B., Yavuz, P., Valsan, G., Conway, B., ... Maurits, N. (2015). Muscle co-activity tuning in Parkinsonian hand movement: disease-specific changes at behavioral and cerebral level. Frontiers in Human Neuroscience, 9, 437. https://www.frontiersin.org/article/10.3389/fnhum.2015.00437
Van Emmerik, RE., Hamill, J. and McDermott, WJ. (2005). Variability and coordinative function in human gait. Quest, 57(1), 102–123.
Wang, YX. and Xu, H. (2013). Noisy sparse subspace clustering. I-89.
Xiang, D., Qiu, P. and Pu, X. (2013). Nonparametric regression analysis of multivariate longitudinal data. Statistica Sinica, 23(2), 769–789.
Yen, SC., Chui, KK., Corkery, MB., Allen, EA. and Cloonan, CM. (2017). Hip-ankle coordination during gait in individuals with chronic ankle instability. Gait & Posture, 53, 193–200.
Yin, M., Guo, Y., Gao, J., He, Z. and Xie, S. (2016). Kernel sparse subspace clustering on symmetric positive definite manifolds. pp. 5157–5164. 10.1109/CVPR.2016.557
Zhang, C., Yan, H., Lee, S. and Shi, J. (2018). Dynamic multivariate functional data modeling via sparse subspace learning. arXiv preprint arXiv:1804.03797.