Accelerometer-Based Gait Segmentation: Simultaneously User and Adversary Identification

10/11/2019
by   Yujia Ding, et al.

In this paper, we introduce a new gait segmentation method based on accelerometer data and develop a new distance function between two time series, showing novelty and effectiveness in simultaneously identifying user and adversary. Compared with the commonly used neural network methods, our approach uses geometric features to extract walking cycles more precisely and employs a new similarity metric to conduct user-adversary identification. This new technology for simultaneously identifying user and adversary contributes to cybersecurity beyond user-only identification. In particular, the new technology is applied to walking data recorded by cell phones and achieves an accuracy of 98.79% for 6-class classification (user-adversary identification) and 99.06% for binary classification (user-only identification). In addition to walking signals, our approach works on walking-up, walking-down and mixed walking signals. This technology is feasible for both large and small data sets, overcoming current challenges facing neural networks, such as tuning a large number of hyper-parameters for large data sets and lacking training data for small data sets. In addition, the new distance function developed here can be applied to signal analysis more generally.


1 Introduction

In today’s technologically advanced, data-rich world, cell phone usage is woven into nearly every aspect of human life. As we enjoy the major benefits provided by cell phones, the rapidly growing use of information networks brings with it a rise in malicious cyber activities, and hence provides strong reasons for cybersecurity. Evidence has shown that approaches based on cell phone data containing both accelerometer and gyroscope signals usually provide good performance [neverova2016learning]. However, a fair number of cell phones, especially low-priced ones, record no gyroscope data. This paper proposes new techniques for gait extraction from the accelerometer signal only; identification performance can be improved by adding gyroscope data when it is available. Moreover, our gait analysis is applied to human motion identification for strong user authentication on smart phones. The current authentication process for smart phones is largely based on weak static credentials, such as passwords or swipe patterns on touchscreens. Such credentials usually have many vulnerabilities through which an attacker can easily gain access to the cell phone. Therefore, this paper studies authentication methods that aim to continuously protect the cell phone while simultaneously detecting the adversary. In addition to advancing knowledge on cell phone security, the methods provided in this paper have several broader impacts: the gait segmentation approach works for other human motion analysis, and the new time series distance, a new way of measuring similarity between signals, can be applied in other signal analysis.
The main parts of this paper are: (a) a data pre-processing step based on the gait segmentation method, which removes non-walking activities and accurately segments the walking series into gait cycles, discussed in Section 2; (b) a new measurement of signal distance that translates human vision into machine language, proposed in Section 3; (c) experimental studies of segmentation as well as a fast, stable and accurate user-adversary identification, presented in Section 4; (d) a summary and conclusion on the impact of this research, given in Section 5.

Gait cycle segmentation or classification is a widely used pre-processing step in studying human walking. In particular, high accuracy of gait cycle segmentation is extremely important for increasing the performance of human authentication. Another crucial use of gait segmentation is to monitor or diagnose specific diseases such as Parkinson’s disease or peripheral neuropathy. Many systems can be used to collect gait signals, such as vision-based systems, goniometers, Inertial Measurement Units (IMUs) and foot pressure sensors, among which IMUs are the main focus of this paper. Many gait cycle segmentation methods have been studied based on different types of signals recorded from different types of sensors. Some gait segmentation methods are applied to the identification of contact events, such as heel strike and toe-off, which can be particularly useful in providing online assistance during walking; different types of signals have been used, such as kinematic data only from the knee and hip joints [kalinowska2019data] or the angular velocity of the lower limb [grimmer2019stance]. Some Hidden Markov Model based machine-learning methods have been investigated for gait segmentation with pressure sensors, see [crea2012development, de2012gait]. Gait segmentation based on foot-switch signals is studied in [agostini2013segmentation]. A comparison of four gait segmentation methods, including peak detection from event-based methods, two variations of dynamic time warping from template matching methods, and hierarchical hidden Markov models from machine learning methods, is made to examine Parkinsonian gait in [haji2018segmentation]. Pathological gait abnormality detection and segmentation are studied in [khan2019pathological]. In the literature, zero-crossing and the Lomb-Scargle periodogram are widely used for cycle extraction; the former is based on the waveform [khan2019pathological, sugandhi2019overlap] and the latter detects and characterizes periodic signals in unevenly sampled data [vanderplas2018understanding]. A robust algorithm for gait cycle segmentation based on a peak detection approach is provided in [jiang2017robust]. Many accelerometer-based gait analyses are summarized in [D2010].

Existing gait segmentation algorithms have achieved outstanding performance, but show deficiencies when dealing with very noisy data and gait signals mixed with other activities. Therefore, the accelerometer-based segmentation method proposed in this paper aims to address these problems and to improve the performance of user-adversary identification or, in particular, the accuracy of smart phone authentication. A gait cycle obtained by our algorithm begins when one foot touches the ground and ends when the same foot touches the ground again, as shown in Figure 1 for the x-axis of the accelerometer data. Several geometric features of the accelerometer-based gait cycle are summarized as follows:

  1. Both the start and end points of the gait cycle are minimal peaks with relatively large angles, where the angle is defined in Equation (3).

  2. The values at the start and end points of the gait cycle are lower than most of the points in the series and, in most cases, are the lowest within a cycle. See the left plot in Figure 1 for an example of a common gait cycle; an exception is given in the right plot in Figure 1.

  3. The start and end points of the cycle are on critical (dominated) lines. The critical line is the longest vector in the cycle that ignores small peaks, as indicated by the dashed lines in Figure 1.

The robust algorithm for gait cycle segmentation provided in [jiang2017robust], however, cannot deal with cases that deviate from feature 2, where the values of the edge points are not the lowest within a cycle; see for example the right plot in Figure 1. Our proposed gait segmentation approach is more reliable when dealing with real walking signals that involve the above problems, among others, and results in more accurate segmentation.

Figure 1: Gait cycle on x-axis of accelerometer data for walking. Left (a): data from HAPT volunteer 30; gait cycle starts with point (65, 0.65) and ends with point (125, 0.68). Right (b): data from HAPT volunteer 3; gait cycle starts with point (259, 0.74) and ends with point (323, 0.79); the lowest value in the cycle is at point (285, 0.67).

Admittedly, the notion of segmentation accuracy needs clarification. We can readily judge by eye whether a segmentation satisfies our criteria. However, teaching a computer to determine whether a cut is good under the same criteria is a difficult task; misjudgments are common and better approaches are urgently needed. We propose a simple and fast signal distance measurement that plays the role of translating human vision into computer language. This measurement provides the score used to tune the three hyper-parameters involved in the segmentation and, as a result, determines the accuracy of the segmentation. The segmentation technique and the signal distance investigated in this paper are not limited to the user-adversary identification problem. Many applications in human motion analysis and other types of signal analysis require high-accuracy methods such as those described here to address real-world problems.

The vision-based human motion analysis in [zeng2019vision] automatically identifies human behaviour from a given image or a sequence of images. A human motion recognition method based on a Kalman random forest algorithm model is studied in [yi2019human] to improve the accuracy and efficiency of the tracking algorithm. Many neural-network based methods for recognizing human motion intention are investigated in the literature, see e.g. [wu2019neural, lang2019joint, neverova2016learning, oukrich2019daily, furuya2019personal]. A method for positive identification of a smartphone user’s identity is presented in [damavsevivcius2016smartphone], using the user’s gait characteristics and applying the Random Projections method for feature dimensionality reduction.

Relying on the proposed segmentation approach and signal distance, the accuracy of our user-adversary identification improves substantially even with simple and naive classification methods. In contrast to existing methods, we develop a new distance function for our classification method, based on novel techniques for extracting archetypes from the training data set that identify user and adversary behaviors. Specifically, we extract the archetypes that represent individual walking behavior by simple clustering methods based on our signal distance. Comparisons are then made between new signal data and the known archetypes of the candidates, which include the user and adversaries. The overall accuracy obtained shows that our methods perform well in cell phone authentication and theft detection.

2 Accelerometer-based segmentation algorithm

In this section, a detailed gait segmentation algorithm is introduced based on the geometry features discussed in Section 1. The flow chart in Figure 2 briefly introduces the process of segmentation as well as the organization of this section.

[Flowchart: Raw signal → Pre-processing (normalization; pre-cutting) → Finer processing (finer cutting) → Find best cycle → Find cycles on the left and on the right → Output segmentation]
Figure 2: Segmentation algorithm flowchart

In Subsection 2.1, several pre-cutting detection approaches are applied to obtain a rough segmentation, followed by Subsection 2.2, where a finer segmentation is produced. Next, a best cycle is determined from the finer segmentation in Subsection 2.3. Finally, Subsection 2.4 describes how to obtain the final segmentation. For an example of the segmentation process, see Figure 3. In the following, let denotes the raw signal series, which has data points in total and be the value of the th data on the signal. Our goal is to find the segmentation of a given signal, where a segmentation is represented by all the cycles on the signal or, in other words, all points that divide the signal into cycles. We call these dividing points "cuts". Here a cycle is the part of the signal between two neighboring cuts.

Figure 3: X-axis full accelerometer segmentation processes on 1st period of volunteer 9 in HAPT data set with walking signal. 1st plot (a): pre-processing (Section 2.1); 2nd plot (b): finer processing (Section 2.2); 3rd plot (c): find best cycle (Section 2.3); 4th (d) and 5th plot (e): find cycles on the left and right of the best cycle (Section 2.4).

2.1 Pre-processing

Features 2 and 3 suggest how to identify cutting positions. We are mostly interested in points that lie on critical (dominated) lines and at the same time have relatively smaller y-values than others. We use the peak detection method provided by SciPy to roughly find all minimal peaks of the raw signal; beforehand, normalization is required to standardize different signals so that the peak detection method can be applied under the same criteria. The minimal peaks obtained are the pre-cuts of the signal.

  1. First, we scale the signal to have the range of . The th scaled data is:

    (1)
  2. Second, we shift the scaled signal to have mean. The th -mean data is:

    (2)
  3. Last, we apply the peak detection in scipy on the normalized signals with the same criteria, to obtain the pre-cuts. Guaranteed by (1) and (2), height feature involved in the peak detection method in Scipy can be set to be for all signals. Moreover, the width feature is set to be for simplicity, since eventually the cycle cutting points are expected to be not very concentrated.

The result of the pre-cutting process is a vector of points for with Here denotes the position of the th data in and denotes the value of the th data in . The mappings involved in (1) and (2) are invertible and have no change on the time axis of the signal, so the positions of the th pre-cut is , which is the same on , and .
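
As a rough illustration of the pre-processing steps above, the following Python sketch normalizes a signal and collects its local minima as pre-cuts. It is only a sketch under stated assumptions: the target range [0, 1], the zero mean and the `max_height` value are illustrative choices rather than the values used in the paper, and a plain local-minimum scan stands in for SciPy's peak detection.

```python
from statistics import mean


def normalize(signal):
    """Scale to an assumed range [0, 1], then shift to an assumed mean of 0."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        raise ValueError("constant signal cannot be scaled")
    scaled = [(v - lo) / (hi - lo) for v in signal]
    m = mean(scaled)
    return [v - m for v in scaled]


def pre_cuts(signal, max_height=0.0):
    """Return indices of local minima whose normalized value is at most max_height.

    max_height is a hypothetical stand-in for the height criterion in the text.
    """
    z = normalize(signal)
    cuts = []
    for i in range(1, len(z) - 1):
        is_local_min = z[i] < z[i - 1] and z[i] <= z[i + 1]
        if is_local_min and z[i] <= max_height:
            cuts.append(i)
    return cuts


# Toy signal (not HAPT data) with three clear minima.
demo = [0.9, 0.7, 0.8, 1.0, 0.85, 0.65, 0.9, 1.1, 0.95, 0.7, 0.9]
print(pre_cuts(demo))
```

Because both normalization steps are monotone and leave the time axis unchanged, the indices returned are also valid positions on the raw signal.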

2.2 Finer processing

The cuts are supposed to be the points where a gait cycle starts or ends, so the angles of the peaks in the accelerometer plots are supposed to be relatively large, which is due to feature 1 of the gait cycle discussed in Section 1. Therefore, we pick peaks with larger angles as candidates for the segmentation cuts. Recall that the th pre-cut point is , the closest points left and right to on are and respectively. The angle of data point is defined as the angle between vectors and :

(3)

Then we pick cuts from pre-cuts with angles above the median of the pre-cut angles. Now the finer processed cuts are denoted as for with Here denotes the position of the th point on and denotes the value of the th pre-cut point on .
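
The angle filter described in this subsection can be sketched in Python as follows. This is an illustrative sketch, not the authors' code: it assumes unit spacing on the time axis (so the two vectors are (-1, Δy) and (1, Δy)), assumes every pre-cut is an interior index, and uses the hypothetical names `peak_angle` and `finer_cuts`.

```python
import math
from statistics import median


def peak_angle(signal, i):
    """Angle in radians at point i between the vectors to its two neighbours."""
    left = (-1.0, signal[i - 1] - signal[i])
    right = (1.0, signal[i + 1] - signal[i])
    dot = left[0] * right[0] + left[1] * right[1]
    norm = math.hypot(*left) * math.hypot(*right)
    # Clamp to [-1, 1] to guard against rounding before taking arccos.
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def finer_cuts(signal, cuts):
    """Keep the pre-cuts whose angle is above the median of all pre-cut angles."""
    angles = [peak_angle(signal, i) for i in cuts]
    threshold = median(angles)
    return [i for i, a in zip(cuts, angles) if a > threshold]


# Toy signal with minima at indices 1, 4 and 7.
toy = [3, 0, 3, 2, 1, 2, 5, 0, 1]
print(finer_cuts(toy, [1, 4, 7]))
```

Note that the angle depends on the relative scaling of the time and value axes, so in practice it would be computed on the normalized signal.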

2.3 Find best cycle

Based on the finer cuts picked in last subsection 2.2, cycles involved in are denoted by

where is the th cycle of the finer cuts with and . Moreover, the th cycle cuts are and . And the lengths of cycles are

where is the cycle’s length with . The best cycle is the cycle that best represents the signal’s gait features in the sense of statistical distribution; it is used as a reference to find the rest of the cycles on the signal. Let and be the values of histogram and the bin edges of respectively. Let be the length of the best cycle .

We are going to find the best cycle on the signal in at most three steps. See Algorithm 1 for details.

  1. Attempt to search for the best cycle that satisfies

    • A cycle has length that is among the non-zero frequency cycle lengths of the finer cuts.

    • The length of the cycle needs to be close to the hypothesized gait length , where hypothesized gait is a hyper-parameter to be tuned or given. Normally is the average gait length, but it can be modified to obtain different goals such as half cycle segmentation.

    If the two requirements of a best cycle can be satisfied by any cycle of the finer cuts, we will use that cycle as the best cycle and skip the next step. Otherwise, go to the next step.

  2. In this step, we are going to modify the finer cuts based on two situations below to achieve the requirements in Step 1.

    • If is much smaller than the gait length , we are going to combine each nearby pair of the finer cuts to make the average cycle length of finer cuts larger than before.

    • If is much larger than the gait length , we go back to Section 2.2 and pick more finer cuts.

    Either of the two modifications will renew the finer cuts. Then search for the best cycle on the new finer cuts based on the following conditions:

    • A cycle has length that is closest to the largest frequency of cycle lengths of the finer cuts.

    • The length of the cycle needs to be close to the hypothesized gait length .

    If no best cycle obtained, repeat Step 2 until one is found.

Up to this point, we have found the best cycle on the signal. Next, we will discuss how to find the rest of cycles on the signal based on the features of .

Output:
Input:
;
for   do
       ;
       if  then
             ;
             break;
            
       end if
      if  then
             break;
            
       end if
      
end for
n=1;
while  or  do
       if  then
             for ;
             ;
             ;
             ;
            
      else
             if n==6 then
                  ;
                  break;
             end if
             Repeat Section 2.2 but pick peaks with angles above th quantile;

             and has new indexes;
             ;
             ;
            
       end if
      ;
       .
end while
Algorithm 1 Find best cycle
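
Step 1 of the best-cycle search can be sketched in Python as below. This is a simplified illustration under assumptions, not a transcription of Algorithm 1: exact length counts replace the histogram bins, the relative tolerance `tol` is a hypothetical stand-in for "close to", and the fallback of Step 2 (merging or re-picking finer cuts) is left out.

```python
from collections import Counter


def find_best_cycle(cuts, gait_length, tol=0.2):
    """Return (start, end) of a candidate best cycle, or None if none qualifies.

    cuts        -- sorted positions of the finer cuts
    gait_length -- hypothesized gait length (a hyper-parameter)
    tol         -- hypothetical relative tolerance for "close to"
    """
    cycles = list(zip(cuts[:-1], cuts[1:]))
    freq = Counter(end - start for start, end in cycles)
    # Try cycle lengths from most to least frequent.
    for length, _count in freq.most_common():
        if abs(length - gait_length) <= tol * gait_length:
            # Return the first cycle that has this length.
            return next(c for c in cycles if c[1] - c[0] == length)
    return None


cuts_demo = [0, 25, 75, 126, 176, 226, 250]
print(find_best_cycle(cuts_demo, gait_length=50))
```

When the function returns `None`, the text above prescribes modifying the finer cuts and searching again.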

2.4 Find all cycles

It remains to cut the rest of the signal into cycles to obtain full cycle segmentation, which is denoted by

where and the cutting positions are

for . In this section, we mainly introduce the approach of finding other cycles to the left of the best cycle , and the right-hand side cycles can be found easily by a similar approach, so it is omitted. Two steps are included in searching to the left:

  1. Find cycles based on the finer-picked peaks on the left-hand side. Since the finer cuts are very likely to be correct cutting positions, it is worthwhile to scan these cuts one by one and reuse the previously calculated results. For instance, the left-hand side peak right next to is . Let the potential cut position on the finer cuts be and is now at the th cut of finer cuts, or . So the potential cycle length is and the left over signal length is . There are three conditions listed in the following:

    • If is close to , the two finer cuts are kept as a cycle, i.e. . Then update and for next iteration.

    • If is much smaller than , is disregarded and update and for next iteration.

    • If is much larger than , then find the possible cycle in between the two finer cuts and update to . Then update and record the left over signal length .

  2. Since the finer cuts may not reach the very left end of the signal, we are going to keep searching for cycles left to the finer cuts if there is any.

The above steps are iteratively repeated until the end of the signal is reached. There are two hyper-parameters and involved to be given or tuned where controls how similar the two cycles are and controls the range of searching for local minimum. Details will be shown in Algorithm 2.

Output:
Input:
;
;
;
while   do
       with and ;
       ;
       ;
       if  then
             if The correlation of and is less than  then
                  break;
             end if
            ;
             ;
            
       else if  then
             ;
            
       else
             or positions that divide from the right into pieces with length ;
             for  do
                  ;
                   if The correlation of and is less than  then
                        break;
                   end if
                   ;
                  
             end for
            
       end if
      
end while
while  do
       with and ;
       ;
       ;
       if The correlation of and is less than  then
            break;
       end if
       ;
      
end while
Algorithm 2 Find cycles on the left of
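
The three conditions used when scanning to the left of the best cycle can be sketched in Python as follows. This is a reduced illustration under assumptions: the relative tolerance `tol` is a hypothetical stand-in for "close to", and the correlation check and the local-minimum search of Algorithm 2 are omitted, so overly long gaps are simply divided from the right into pieces of the best cycle length.

```python
def scan_left(finer_cuts, best_start, best_len, tol=0.2):
    """Return the sorted cut positions found to the left of the best cycle.

    finer_cuts -- positions of the finer cuts
    best_start -- left cut of the best cycle
    best_len   -- length of the best cycle
    tol        -- hypothetical relative tolerance for "close to"
    """
    accepted = [best_start]
    current = best_start
    left_cuts = sorted((c for c in finer_cuts if c < best_start), reverse=True)
    for cut in left_cuts:
        # Much longer than best_len: insert cuts from the right, best_len apart.
        while current - cut > (1 + tol) * best_len:
            current -= best_len
            accepted.append(current)
        # Close to best_len: keep the finer cut as a cycle boundary.
        if abs((current - cut) - best_len) <= tol * best_len:
            accepted.append(cut)
            current = cut
        # Otherwise the gap is much shorter than best_len and the cut is skipped.
    return sorted(accepted)


print(scan_left([12, 115, 140, 190], best_start=240, best_len=50))
```

The same routine, mirrored, would cover the search to the right of the best cycle.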

3 New signal distance

We propose a new way of calculating distance between two signals. Let and be two signals and and are points on and respectively. Let be a permutation of such that

The basic idea is to find the interpolations of

on and , denoted by and with . Then the distance between and is defined as

Note that here we use Euclidean distance to formalize ; in fact, any distance can be used here, depending on the situation. The details are shown in Algorithm 3. This signal distance plays a crucial role in all of our signal analysis. In particular, we are going to show how it is applied in segmentation and human motion identification.

Output:
Input:
;
for   do
       ;
       ;
end for
Algorithm 3 Signal distance
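
The idea of this section can be sketched in Python as below: merge the time points of both signals, sort them, interpolate both signals on the merged grid and take the Euclidean distance of the interpolated values. The sketch makes assumptions that the text does not confirm: linear interpolation, strictly increasing time stamps, end values held constant outside a signal's range, duplicate time points merged, and no normalization by the number of points.

```python
import math
from bisect import bisect_right


def interpolate(times, values, t):
    """Linearly interpolate a signal at time t, holding end values outside its range."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    j = bisect_right(times, t)  # times[j - 1] <= t < times[j]
    t0, t1 = times[j - 1], times[j]
    w = (t - t0) / (t1 - t0)
    return values[j - 1] + w * (values[j] - values[j - 1])


def signal_distance(times_a, values_a, times_b, values_b):
    """Euclidean distance between two signals resampled on their merged time points."""
    merged = sorted(set(times_a) | set(times_b))
    total = 0.0
    for t in merged:
        diff = interpolate(times_a, values_a, t) - interpolate(times_b, values_b, t)
        total += diff * diff
    return math.sqrt(total)


# A triangle sampled at 3 points versus a flat signal sampled at 5 points.
d = signal_distance([0, 2, 4], [0, 2, 0], [0, 1, 2, 3, 4], [0, 0, 0, 0, 0])
print(d)
```

Because both signals are resampled on a common grid, the two inputs do not need the same number of samples, which is what allows cycles of different lengths to be compared.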

4 Experimental study

All the experimental results in this section are based on the HAPT data set (http://archive.ics.uci.edu/ml/datasets/Smartphone-Based+Recognition+of+Human+Activities+and+Postural+Transitions#). In total, 30 volunteers aged 19 to 48 years took part in the data recording; each wore a smartphone (Samsung Galaxy S II) on the waist and performed a protocol of activities during the experiment. Among all the activities, walking, walking downstairs and walking upstairs are the main subject of this paper. More specifically, the 3-axial accelerometer and gyroscope data for each of these activities are studied.

Section 4.1 shows the results of cutting the data set with the algorithm proposed in Section 2. The user-adversary identification in Section 4.2 continues to use the segmented data. Training data are clustered to extract individual archetypes or features, while each testing cycle is compared with all volunteers’ archetypes extracted in the training process for classification. Each volunteer may have several separate periods of data recorded for each activity; several periods are studied to capture the full picture of the archetypes for an individual.

4.1 Segmentation

The segmentation algorithm discussed in Section 2 is applied in this section on the HAPT data set. Recall that there are three hyper-parameters involved in the segmentation algorithm: hypothesized gait length , cycle similarity threshold and local minimum searching range . The segmentation structure tunes the positions of the cuts on the signal by the three hyper-parameters on a score defined as

where is a shifting parameter that can bring the two signal and as close as possible, and in our case, should be in the range of . Eventually, the segmentation we obtained is . Based on our basic statistical analysis, here for a full gait cycle, is normally around 50, which follows the value of less than . We are going to set to cut all possible cycles in the signal.

We apply the segmentation on the x-axis accelerometer of walking, walking up, walking down and mixed walking activities for each volunteer. Some results are included in Figure 4 and Figure 5. Figure 4 indicates that the segmentation algorithm works for different activities: walking, walking up and walking down. All the periods of walking, walking up and walking down in the HAPT data set can be perfectly cut by our proposed approach, except for the case shown in Figure 5.

Figure 4: X-axis full accelerometer segmentation on 1st period of volunteer 1 in HAPT data set with different activities. Top (a): walking; middle (b): walking up; bottom (c): walking down.

Figure 5: X-axis full accelerometer segmentation for walking. Top (a): period 1 of volunteer 15 in HAPT data set; middle (b): period 2 of volunteer 15 in HAPT data set; bottom (c): period 2 of volunteer 27 in HAPT data set.

As shown in the top and middle plots of Figure 5, the segmentation might not be consistent across periods, depending on which leg steps forward first in the experiment. Even though both segmentations are individually perfect, the inconsistency might become a problem in applications such as the user-adversary identification in the next section. Our way of addressing this problem in the user-adversary identification process is to include each of the different cuttings as an archetype for the individual. This problem can also be resolved by half segmentation of the gait cycle: we set the hypothesized gait length smaller than that of a full cycle and go through the same process. See the first and second plots in Figure 6.

Figure 6: X-axis accelerometer segmentation for walking. 1st plot (a): half segmentation of period 1 of volunteer 15 in HAPT data set; 2nd plot (b): half segmentation of period 2 of volunteer 15 in HAPT data set; 3rd plot (c): half segmentation of period 2 of volunteer 27 in HAPT data set; 4th plot (d): full segmentation of period 2 of volunteer 27 in HAPT data set with positive .

Another problem is shown in the bottom plot of Figure 5. Although a given period of a given activity is recorded in one session with the same volunteer, the signal is sometimes not stable. In this example, the first part of the signal looks very different from the last part, resulting in some errors in segmentation. Again, half segmentation can resolve this problem; see the third plot in Figure 6. An alternative is to use a bigger cycle similarity threshold, which guarantees that the similarity between cycles is maintained; see the last plot in Figure 6. This alternative is more suitable for applications that detect walking behavior by cutting off the non-walking signals. The way we deal with this problem in the next section is to repeat the latter alternative to extract different patterns of the gait cycle as additional archetypes for the individual user.

Our segmentation algorithm also works in the more complicated situation where all types of walking cycles are mixed. By iteratively running our segmentation algorithm, we can separately segment the different types of walking signals and cut off the non-walking or noise signal. For example, we manually combined the x-axis accelerometer walking, walking-up and walking-down signals of volunteer 1 in the HAPT data set and then applied our proposed algorithm to the new mixed data; the segmentation results are shown in Figure 7.

Figure 7: X-axis accelerometer segmentation for mixed walking signal of volunteer 1 in HAPT data set. Top (a): manually combined mixed walking signal; bottom (b): iterative segmentation results.

4.2 User-adversary identification

In this section, multi-class classification is performed to identify user and adversary based on the well-cut x-axis accelerometer gait segmentation from Section 4.1. The cut cycles are split into training (80%) and testing (20%) sets for each volunteer. Cycles in the training set are used to extract the archetypes for each individual. The approach to obtaining the archetypes is shown in Algorithm 4; the main idea is to cluster the training cycles into classes such that cycles within each class are guaranteed to have signal distance less than a threshold . The threshold controls the speed and accuracy of the classification process; a high speeds up the process but results in lower accuracy. Archetypes are obtained by taking the "average" of the cycles in each class using the new signal distance. As a result, each volunteer has several archetypes, denoted as where . See Figure 8 for an example of archetypes of volunteer 1 with . Note that all the experiments in this paper use to obtain more archetypes and thus higher performance. It is important to mention that each test sample is a single cycle in the testing process; it is assigned to the volunteer whose archetype is closest to the test cycle under the signal distance. Our single-cycle test method is in demand in the cybersecurity field, since user authentication has to be done within a few seconds.

Output:
Input: ,
j=1;
while M>0 do
       ;
       ;
       for   do
             if  then
                   ;
                   ;
                   ;
                   ;
                   ;
                   ;
                   Delete from ;
                  
             end if
            
       end for
      .
end while
Algorithm 4 Signal distance based clustering
Figure 8: Archetypes of volunteer 1 based on the x-axis accelerometer signal; the threshold .
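
The archetype extraction and the single-cycle test described in this section can be sketched in Python as follows. This is a simplified illustration, not Algorithm 4 itself: clusters are grown greedily around a seed cycle, so members are within the threshold of the seed rather than of every other member; a plain Euclidean distance on equal-length cycles stands in for the signal distance of Section 3; the archetype is taken as the pointwise mean; and all names and the toy data are hypothetical.

```python
import math


def euclidean(a, b):
    """Stand-in distance for equal-length cycles (the paper uses its signal distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def extract_archetypes(cycles, threshold, dist=euclidean):
    """Greedy clustering: each cluster gathers the cycles within threshold of a seed."""
    remaining = list(cycles)
    archetypes = []
    while remaining:
        seed = remaining.pop(0)
        members, rest = [seed], []
        for c in remaining:
            (members if dist(seed, c) < threshold else rest).append(c)
        remaining = rest
        # Archetype = pointwise mean of the cluster members.
        archetypes.append([sum(col) / len(members) for col in zip(*members)])
    return archetypes


def classify(cycle, archetypes_by_user, dist=euclidean):
    """Label one test cycle with the user who owns the closest archetype."""
    return min(
        (dist(cycle, a), user)
        for user, archs in archetypes_by_user.items()
        for a in archs
    )[1]


# Toy training cycles for two hypothetical people.
train = {
    "user": [[0, 1, 0], [0, 1.1, 0], [1, 0, 1]],
    "adversary": [[5, 5, 5], [5, 5.2, 5]],
}
archetypes = {name: extract_archetypes(c, threshold=0.5) for name, c in train.items()}
print(classify([0.9, 0.1, 1.0], archetypes))
```

A smaller threshold yields more archetypes per person, which matches the trade-off between speed and accuracy discussed above.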

4.2.1 Results

We discuss the results from several experiments under different scenarios in this subsection. Testing samples are single cycle based. Note that all the following experiments are conducted based on full-gait x-axis accelerometer segmentation, see Section 4.1 for details.

Experiment 1: 6, 10, and 30 classes user-adversary identification (classification) using x-axis accelerometer and x-axis gyroscope respectively for walking signal only.

Results are shown in Table 1. We run 6-, 10-, and 30-class classification on x-axis accelerometer and x-axis gyroscope walking signals respectively to identify the user and the adversary simultaneously. The x-axis accelerometer or x-axis gyroscope walking signal is segmented by the x-axis accelerometer based segmentation. As the number of classes, i.e. the number of volunteers involved in the experiment, increases, more potential adversaries are considered, which adds complexity to the classification system. This is reflected in the accuracy decreasing from around 0.97 to between 0.88 and 0.90 as the number of classes rises from 6 to 30. All the performance results in the table are the mean values and standard errors over 20 repetitions of randomly picking 6, 10, or 30 out of the 30 volunteers in the HAPT data set. In Figure 9, confusion matrices of the classification results are shown for 4 groups, each containing 6 randomly picked volunteers. Although multi-class classification is more complicated than binary classification, our technology still retains very high accuracy.

Besides the multi-class classification to identify user and adversary, we also run a binary classification for user-only identification; the training process is kept the same, while the testing set includes the test cycles of the true user and 2 randomly picked test cycles from each of the 29 potential adversaries. The average test accuracies are 0.9744 and 0.9789 for accelerometer and gyroscope signals respectively; these results are better than those in [neverova2016learning, damavsevivcius2016smartphone], which have accuracies of 0.9302 and 0.943 respectively.

Number of    Accelerometer                          Gyroscope
classes      ACC      PPV      TPR      F1          ACC      PPV      TPR      F1
6            0.9649   0.9730   0.9558   0.9592      0.9706   0.9759   0.9620   0.9647
             (0.008)  (0.006)  (0.009)  (0.008)     (0.007)  (0.006)  (0.009)  (0.008)
10           0.9348   0.9490   0.9247   0.9270      0.9394   0.9528   0.9284   0.9326
             (0.008)  (0.006)  (0.009)  (0.009)     (0.008)  (0.006)  (0.009)  (0.009)
30           0.8789   0.8986   0.8677   0.8684      0.9000   0.9190   0.8835   0.8890
Table 1: Identification (classification) with different numbers of classes on x-axis accelerometer and x-axis gyroscope walking signals respectively. The signal type "accelerometer" means the identification is applied to the x-axis accelerometer signal cut by the x-axis accelerometer based segmentation; the signal type "gyroscope" means the identification is applied to the x-axis gyroscope signal cut by the x-axis accelerometer based segmentation. Each time we randomly pick 6, 10 or 30 out of the 30 volunteers to do the identification; the performance measurements are mean values over 20 repetitions of the process, with standard errors given in parentheses. "ACC": accuracy; "PPV": precision; "TPR": recall; "F1": harmonic mean of precision and recall.

Figure 9: Confusion matrices for 6-class classification. The classification is applied to the x-axis accelerometer walking signal cut by the x-axis accelerometer-based segmentation. Top left (a): volunteers involved are 22, 25, 21, 16, 28, 7; accuracy and F1 score are 0.9714 and 0.9538, respectively. Top right (b): volunteers involved are 20, 16, 11, 13, 4, 23; accuracy and F1 score are 0.9750 and 0.9761, respectively. Bottom left (c): volunteers involved are 30, 4, 12, 6, 17, 23; accuracy and F1 score are 0.9500 and 0.9440, respectively. Bottom right (d): volunteers involved are 28, 29, 13, 7, 15, 30; accuracy and F1 score are 0.9500 and 0.944 respectively. Accuracy and F1 score are 1 and 1 respectively.
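The measurements ACC, PPV, TPR and F can be derived from a confusion matrix such as those in Figure 9. The sketch below is one plausible way to do so and is not confirmed to be the exact procedure used here: it assumes rows are true classes and columns are predicted classes, and it assumes macro-averaging (per-class values averaged with equal weight), which the text does not state. The function name `metrics_from_confusion` is hypothetical.

```python
def metrics_from_confusion(cm):
    """Compute ACC and macro-averaged PPV, TPR, F from a square matrix.

    cm[i][j] = number of test cycles of true class i predicted as class j
    (orientation assumed for this sketch).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    acc = sum(cm[i][i] for i in range(n)) / total
    ppv, tpr, f_scores = [], [], []
    for i in range(n):
        tp = cm[i][i]
        predicted = sum(cm[r][i] for r in range(n))  # column sum
        actual = sum(cm[i])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        ppv.append(p)
        tpr.append(r)
        # Per-class harmonic mean of precision and recall.
        f_scores.append(2 * p * r / (p + r) if (p + r) else 0.0)
    return {
        "ACC": acc,
        "PPV": sum(ppv) / n,
        "TPR": sum(tpr) / n,
        "F": sum(f_scores) / n,
    }
```

Under macro-averaging, the averaged F is generally not equal to the harmonic mean of the averaged PPV and TPR, because the harmonic mean is taken per class before averaging.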
Experiment 2: 6, 10, 30 classes user-adversary identification (classification) using 3-axis accelerometer and 3-axis gyroscope respectively for walking signal only.

Results are shown in Table 2. This experiment is similar to Experiment 1 but uses three axes of the accelerometer or three axes of the gyroscope signal in the classification process. Each test cycle (containing 3 axes as a group) is assigned to the volunteer that has the closest archetype (from the 3 axes) to at least one of the axes of the test cycle. All results improve on those obtained using x-axis data only. The binary classification, set up in the same way as in Experiment 1, provides an accuracy of 0.9906 for both accelerometer-type and gyroscope-type classification.
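The 3-axis labeling rule can be illustrated with the following sketch. Several assumptions are made that are not taken from the text: each axis of the test cycle is compared only with the archetype of the same axis, cycles and archetypes have been resampled to equal length, and a plain Euclidean distance stands in for the signal distance proposed in this paper. The names `euclidean` and `classify_cycle` and the dictionary layout are hypothetical.

```python
from math import sqrt


def euclidean(a, b):
    """Placeholder distance; the paper's own signal distance would go here."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_cycle(test_cycle, archetypes, distance=euclidean):
    """Assign a 3-axis test cycle to the volunteer with the closest archetype.

    test_cycle: dict axis name -> sequence of samples
    archetypes: dict volunteer id -> dict axis name -> archetype sequence
    The winner is the volunteer achieving the smallest distance on any axis.
    """
    best_id, best_distance = None, float("inf")
    for volunteer_id, axes in archetypes.items():
        for axis, archetype in axes.items():
            d = distance(test_cycle[axis], archetype)
            if d < best_distance:
                best_id, best_distance = volunteer_id, d
    return best_id
```

Taking the minimum over axes means that a single well-matching axis is enough to decide the label, which is one reading of "closest to at least one of the axes".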

Number of     Accelerometer                       Gyroscope
classes       ACC      PPV      TPR      F        ACC      PPV      TPR      F
6             0.9812   0.9838   0.9833   0.9811   0.9879   0.9900   0.9889   0.9880
              (0.006)  (0.005)  (0.005)  (0.006)  (0.005)  (0.004)  (0.004)  (0.004)
10            0.9798   0.9848   0.9783   0.9785   0.9767   0.9822   0.9775   0.9766
              (0.005)  (0.003)  (0.005)  (0.005)  (0.005)  (0.004)  (0.005)  (0.005)
30            0.9573   0.9659   0.9556   0.9537   0.9573   0.9659   0.9556   0.9537
Table 2: Identification (classification) with different numbers of classes on 3-axis accelerometer and 3-axis gyroscope walking signals, respectively. The signal type "accelerometer" means that identification is applied to the 3-axis accelerometer signal cut by the x-axis accelerometer-based segmentation; the signal type "gyroscope" means that identification is applied to the 3-axis gyroscope signal cut by the x-axis accelerometer-based segmentation. In each repetition we randomly pick 6, 10, or 30 of the 30 volunteers for identification; the performance measurements are mean values over 20 repetitions of this process, with standard errors given in parentheses. "ACC": accuracy; "PPV": precision; "TPR": recall; "F": harmonic mean of precision and recall.
Experiment 3: 6 classes user-adversary identification (classification) using 3-axis accelerometer and 3-axis gyroscope respectively for walking up, walking down and mixed walking signals.

Results are shown in Table 3. The experiments on walking-up and walking-down signals are similar to Experiment 1 with 6 classes, but with the activity replaced by walking up and walking down, respectively. The mixed walking experiment uses the iterative segmentation presented in Figure 7; the classification is applied to the three axes of the accelerometer or gyroscope signal together for mixed walking activities.

Signal          Accelerometer                       Gyroscope
type            ACC      PPV      TPR      F        ACC      PPV      TPR      F
Walking up      0.9423   0.9632   0.9417   0.9388   0.9385   0.9569   0.9375   0.9336
                (0.015)  (0.009)  (0.015)  (0.016)  (0.013)  (0.010)  (0.013)  (0.014)
Walking down    0.8140   0.7715   0.7936   0.7604   0.8408   0.7768   0.7992   0.7692
                (0.032)  (0.042)  (0.032)  (0.038)  (0.020)  (0.034)  (0.025)  (0.030)
Mixed walking   0.7170   0.7444   0.6936   0.6928   0.7903   0.8163   0.7857   0.7834
                (0.025)  (0.025)  (0.025)  (0.024)  (0.014)  (0.013)  (0.014)  (0.013)
Table 3: Identification for different activities using accelerometer-based full-cycle segmentation of the walking signal. The signal type "accelerometer" means that identification is applied to the 3-axis accelerometer signal cut by the accelerometer-based segmentation; the signal type "gyroscope" means that identification is applied to the 3-axis gyroscope signal cut by the accelerometer-based segmentation. In each repetition we randomly pick 6 of the 30 volunteers for identification; the performance measurements are mean values over 20 repetitions of this process, with standard errors given in parentheses. "ACC": accuracy; "PPV": precision; "TPR": recall; "F": harmonic mean of precision and recall.

5 Discussion and Conclusion

The technology developed here would have a great impact on cell phone user-adversary identification and cybersecurity. In contrast to existing segmentation methods such as zero crossing and the Lomb-Scargle periodogram, we cut at critical points based on geometric features integrated with statistical and iterative methods. Our method is more robust and scalable. Moreover, the new segmentation method is applicable to different walking-type signals such as walking, walking upstairs, walking downstairs and mixed walking. Further research involves applying the proposed technology to more flexible walking signals, such as walking with the cell phone in hand, in a pocket or in a bag. The signal distance proposed in this paper provides a new way of measuring the difference between signals, which opens the possibility of translating the way human vision distinguishes signals into machine language. It can be used broadly in signal processing analysis, which suggests many possible directions for further research. The user-adversary identification algorithm we propose readily detects whether certain walking behaviors come from the user or not. More precisely, it can determine who the imposter is, provided that the potential imposters are given. In contrast to Neural Network methods, which require tuning a large number of hyper-parameters, our approach involves only four parameters, each with a clear geometric meaning; these parameters can be tuned or set based on geometric analysis combined with statistics. We obtain an accuracy of 0.9879 for 6-class classification on the walking signal using 3 axes of both the accelerometer and gyroscope data; the accuracy is 0.9812 when using only 3 axes of the accelerometer data; we still keep an accuracy of 0.9649 or 0.9706 when using only the x-axis of the accelerometer or the x-axis of the accelerometer and gyroscope data, respectively.
Even when the classification is over all possible classes (30 classes) in HAPT, our approach still retains an accuracy of 0.9573. When the multi-level classification degenerates to binary classification, we obtain an accuracy of around 0.99 for the walking signal. It is worth pointing out that the walking-up and walking-down data sets are very small, yet our experiments still show solid results, whereas there is not enough data to train Neural Network methods. In summary, the techniques developed here achieve high accuracy in user-adversary identification even with multi-level classification, which is crucial in cybersecurity.

References