ActiLabel: A Combinatorial Transfer Learning Framework for Activity Recognition

03/16/2020 ∙ by Parastoo Alinia, et al. ∙ Washington State University

Sensor-based human activity recognition has become a critical component of many emerging applications ranging from behavioral medicine to gaming. However, an unprecedented increase in the diversity of sensor devices in the Internet-of-Things era has limited the adoption of activity recognition models across different domains. We propose ActiLabel, a combinatorial framework that learns structural similarities between the events in an arbitrary domain and those of a different domain. The structural similarities are captured through a graph model, referred to as the dependency graph, which abstracts away details of activity patterns in the low-level signal and feature space. The activity labels are then autonomously learned by finding an optimal tiered mapping between the dependency graphs. Extensive experiments based on three public datasets demonstrate the superiority of ActiLabel over state-of-the-art transfer learning and deep learning methods.


1 Introduction

Human activity recognition (HAR) systems are crucial components in health monitoring and personalized behavioral medicine. HAR systems use machine learning algorithms to detect physical activities based on the data collected from wearable and mobile sensors [16, 15]. Such systems are usually designed based on labeled training data collected in a particular domain, such as with a specific sensor modality, wearing site, or user. A significant challenge with existing HAR systems is that a baseline machine learning model trained in a specific setting (i.e., source) performs poorly in new settings [25, 22]. This challenge has limited the scalability of sensor-based HAR systems, given that collecting sufficiently large amounts of labeled sensor data for every possible domain is a time-consuming, labor-intensive, and often infeasible process.

We introduce ActiLabel, a combinatorial framework that learns machine learning models in a new domain (i.e., target) without the need to manually collect any labels. A unique attribute of ActiLabel is that it examines structural relationships between activity events (i.e., classes/clusters) in two different domains and uses this information for target-to-source mapping. Such structural relationships allow us to compare the two domains at a higher level of abstraction than the common feature space, and therefore enable knowledge transfer across radically diverse domains. We hypothesize that even under severe cross-domain spatial and temporal uncertainties (i.e., significant distribution shift), physical activities exhibit similar structural dependencies across different domains, mainly due to the physical and physiological underpinnings of human health monitoring.

To the best of our knowledge, our work is the first study that develops a combinatorial approach for structural transfer learning. Our notable contributions can be summarized as follows. (i) We introduce a combinatorial optimization formulation for transfer learning; (ii) we devise methodologies for constructing a network representation of wearable sensor readings, referred to as the network graph; (iii) we design algorithms that perform community detection on the network graph to identify core activity clusters; (iv) we introduce an approach to construct a dependency graph based on the core activity clusters identified on the network graph; (v) we propose a novel multi-layer matching algorithm for mapping target-to-source dependency graphs; (vi) we conduct an extensive assessment of the performance of ActiLabel for cross-modality, cross-subject, and cross-location activity learning using real sensor data collected with human subjects.

2 Background and Related Work

2.1 Transfer Learning

Transfer learning (TL) is the ability to extend the knowledge in one setting to another, nonidentical but related, setting. We refer to the previous setting as the source domain. The sensor data captured in this domain is referred to as the source dataset, which is fully labeled in our case. The new state of the system, which may exhibit radical changes from the source domain, is referred to as the target domain, where we intend to label the sensor data autonomously [5, 14]. Depending on the availability of labels in the source and target, one can categorize TL techniques into three groups. Inductive TL is where the source is fully labeled and there are a few labeled samples in the target. In transductive TL, which is the focus of this paper, labels are available in the source, but there is no label in the target. Unsupervised TL is where labels are available in neither the target nor the source domain [23, 18]. Prior research also proposed a deep convolutional recurrent neural network to automate the process of feature extraction and to capture general patterns from activity data [13]. However, deep learning models have not shown promising performance in highly diverse domains, such as cross-modality knowledge transfer. For example, previous research achieved only % accuracy in recognizing human gestures using deep learning with computationally intensive algorithms across sensors of different modalities [26, 7]. More advanced models combine knowledge of transfer and deep learning [24]. There have been studies that transfer different layers of deep neural networks across different domains. In one study, a cross-domain deep transfer learning method was introduced that achieved % accuracy with four activity classes for cross-location and cross-subject knowledge transfer [22]. Unlike our transductive transfer learning approach in this paper, these approaches fall within the category of inductive transfer learning, where some labeled instances are required in the target domain.

2.2 Graph Theory Definitions

k-Nearest Neighbor (k-NN) graphs are commonly used to classify unknown events using feature representations. During the classification process, features are extracted from unknown events, which are then classified based on the features of their k nearest neighbors [4, 10]. The k-NN graph of a dataset is obtained by connecting each data point to its k closest points in the dataset under a distance metric. A symmetric k-NN graph connects two points only if each lies in the other's k-nearest neighborhood.
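As a concrete sketch, the symmetric (mutual) k-NN construction described above can be implemented in a few lines. The function name is ours, and the cosine-distance choice anticipates the network graph defined in Section 3.1.1; this is an illustration, not the authors' implementation.

```python
import numpy as np

def symmetric_knn_edges(X, k=3):
    """Build a symmetric (mutual) k-NN edge list over the rows of X.

    An edge (i, j) exists only when i is among j's k nearest neighbors
    AND j is among i's, under cosine distance.
    """
    unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    dist = 1.0 - unit @ unit.T              # cosine distance matrix
    np.fill_diagonal(dist, np.inf)          # exclude self-loops
    nn = np.argsort(dist, axis=1)[:, :k]    # each point's k nearest indices
    neigh = [set(row) for row in nn]
    return sorted((i, j) for i in range(len(X)) for j in neigh[i]
                  if i < j and i in neigh[j])
```

For four points forming two tight pairs and k = 1, only the two mutual-neighbor edges survive the symmetry condition.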

Community detection algorithms are widely used to identify clusters in large-scale network graphs [8]. Recent research suggests that detecting communities in a network representation of data can yield higher clustering performance than traditional clustering algorithms [17, 3]. We define several essential notions related to community detection in network graphs in the following.

Definition 1 (Cut).

Given a graph G(V, E) and communities C = {c_1, ..., c_m}, the "cut" between communities c_i and c_j is defined as the number of edges with one end in c_i and the other end in c_j. That is,

Cut(c_i, c_j) = |{(u, v) ∈ E : u ∈ c_i, v ∈ c_j}| (1)

Definition 2 (Community Density).

Given a graph G(V, E) and communities C = {c_1, ..., c_m} within the graph G, the "community density" D(c_i) for community c_i is defined as the number of edges with both ends residing in c_i:

D(c_i) = |{(u, v) ∈ E : u ∈ c_i, v ∈ c_i}| (2)

Definition 3 (Community Size).

Given a graph G(V, E) and communities C = {c_1, ..., c_m} within the graph G, the "community size" S(c_i) for community c_i is defined as the number of vertices that reside in c_i:

S(c_i) = |{v ∈ V : v ∈ c_i}| (3)
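The three definitions above translate directly into code; below is a minimal sketch over an edge list and a vertex-to-community map (function and variable names are ours, for illustration):

```python
def cut(edges, comm, a, b):
    """Cut(a, b) of Definition 1: edges with one end in a, the other in b (a != b)."""
    return sum(1 for u, v in edges if {comm[u], comm[v]} == {a, b})

def density(edges, comm, a):
    """D(a) of Definition 2: edges with both ends inside community a."""
    return sum(1 for u, v in edges if comm[u] == comm[v] == a)

def size(comm, a):
    """S(a) of Definition 3: number of vertices assigned to community a."""
    return sum(1 for c in comm.values() if c == a)
```

For a path 0–1–2–3–4 with communities A = {0, 1, 2} and B = {3, 4}, Cut(A, B) = 1, D(A) = 2, and S(A) = 3.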

3 ActiLabel

We propose ActiLabel to solve the problem of labeling sensor observations in an arbitrary setting (i.e., target) based on the labeled observations in another setting (i.e., source) even when the source and target observations follow highly diverse distributions. ActiLabel aims to create a labeled dataset in the target by transferring the knowledge from the labeled source observations such that the labeling error is minimized.

Assigning a label to each sensor observation in the target domain can be viewed as a mapping problem where sensor observations in the target domain are mapped to labeled observations in the source domain. ActiLabel finds an optimal mapping between the two domains; the mapping, however, is performed at a much higher level of abstraction than the traditional feature level. To this end, mapping in ActiLabel is done from groups of similar target observations, called core clusters, to known activity classes in the source domain. The goal of this optimization problem is to minimize the mapping cost/error.

The overall approach in ActiLabel is illustrated in Figure 1. As summarized in Algorithm 1, the design process in ActiLabel involves the following steps, where we refer to the first two steps as graph modeling and the next two steps as optimal label learning. (i) Network graph construction from sensor readings in both domains (Figure 1-a); (ii) core cluster identification given the network graphs in both domains (Figure 1-b); (iii) dependency graph construction based on the core clusters and network graphs in both domains (Figure 1-c); (iv) optimal label learning by mapping the target dependency graph to the source dependency graph (Figures 1-d, 1-e, and 1-f).

Figure 1: An overview of the ActiLabel framework including graph modeling and optimal label learning.
1: Input: X_t, unlabeled target dataset; {X_s, Y_s}, labeled source dataset
2: Result: Labeled target dataset, {X_t, Y_t}
3: Graph Modeling: (Section 3.1)
4:   Construct network graphs in both domains; (Section 3.1.1)
5:   Identify core clusters in both domains; (Section 3.1.2)
6:   Build dependency graphs; (Section 3.1.3)
7:   Extract structural relationships among the core clusters in both domains;
8: Optimal Label Learning: (Section 3.2)
9:   Perform graph-level min-cost mapping from target to source;
10:   Assign labels to the observations in the target;
11:   Train activity recognition model in the target using the new labels;
Algorithm 1: ActiLabel

3.1 Graph Modeling

We construct dependency graphs that capture structural dependencies among the events (i.e., physical activities) in both target and source domains. The dependency graphs are then used in optimal label learning to label sensor observations and generate a training dataset in the target domain. As shown in Figure 1, our graph modeling consists of three phases: (i) network graph construction; (ii) core cluster identification; and (iii) dependency graph construction. This section elaborates on each phase.

3.1.1 Network Graph Construction

We initially build a network representation of the sensor observations based on symmetric k-nearest-neighboring to quantify the amount of similarity between pairs of observations.

Definition 4 (Network Graph).

The network graph G(V, E) is a symmetric k-NN graph whose vertices are feature representations of the sensor observations and whose distance function is the cosine similarity between the feature vectors.

3.1.2 Core Cluster Identification

To identify core clusters in ActiLabel, we propose a graph-based clustering algorithm similar to the approach in prior research [2]. We refer to this approach as core cluster identification (CCI), which runs on the network graph (,) in two steps. (i) Partitioning the network graph into multiple communities of approximately the same vertex size using a greedy community detection technique. (ii) Merging the communities with the highest similarity score based on their dendrogram structure.

The similarity between communities c_i and c_j is measured as the ratio of the number of edges between the two communities (i.e., Cut(c_i, c_j)) to the average number of edges that reside within the two communities. Therefore, the similarity score of (c_i, c_j) is given by

Sim(c_i, c_j) = Cut(c_i, c_j) / ((D(c_i) + D(c_j)) / 2) (4)

where D(c_i) and D(c_j) denote the number of edges that reside in c_i and c_j, respectively. Note that the similarity score is defined such that it is not adversely influenced by the size of the communities in unbalanced datasets.
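A sketch of the CCI merge step, assuming the similarity score of Eq. (4); the function and variable names are illustrative, not from the paper:

```python
def similarity(edges, comm, a, b):
    """Eq. (4): cut between two communities over their average internal edge count."""
    cut_ab = sum(1 for u, v in edges if {comm[u], comm[v]} == {a, b})
    dens = {c: sum(1 for u, v in edges if comm[u] == comm[v] == c) for c in (a, b)}
    avg = (dens[a] + dens[b]) / 2 or 1      # guard against edgeless communities
    return cut_ab / avg

def merge_most_similar(edges, comm):
    """One CCI merge step: fuse the pair of communities with the highest score."""
    labels = sorted(set(comm.values()))
    pairs = [(a, b) for i, a in enumerate(labels) for b in labels[i + 1:]]
    a, b = max(pairs, key=lambda p: similarity(edges, comm, *p))
    return {v: (a if c == b else c) for v, c in comm.items()}
```

In practice this step is repeated over the dendrogram of communities until the desired number of core clusters remains.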

3.1.3 Dependency Graph Construction

To capture high-level structural relationships among sensor observations, we devise a structural dependency graph where the core clusters identified previously represent vertices of the dependency graph.

Definition 5 (Dependency Graph).

Given a network graph G(V, E) and core clusters C = {c_1, ..., c_m} obtained from the network graph, we define the dependency graph G_d(V_d, E_d, W_v, W_e) as a weighted directed complete graph as follows. Each vertex v_i ∈ V_d is associated with a core cluster c_i; thus, |V_d| = m. Each vertex v_i is assigned a weight given by

w(v_i) = D(c_i) / S(c_i) (5)

where D(c_i) and S(c_i) refer to the cluster density and cluster size, respectively, of core cluster c_i. Each edge e_ij ∈ E_d, associated with core clusters c_i and c_j, is assigned a weight given by

w(e_ij) = Cut(c_i, c_j) / S(c_i) (6)
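The construction of Definition 5 can be sketched as follows. Note that the exact normalizations of the vertex and edge weights are our reconstruction of Eqs. (5) and (6), not code from the paper:

```python
def dependency_graph(edges, comm):
    """Build dependency-graph weights from core clusters.

    Vertex weight: intra-cluster edge density per vertex (our reading of Eq. 5).
    Edge weight: cut toward the other cluster, normalized by the source
    cluster's size (our reading of Eq. 6); asymmetric, hence directed.
    """
    labels = sorted(set(comm.values()))
    dens = {c: sum(1 for u, v in edges if comm[u] == comm[v] == c) for c in labels}
    size = {c: sum(1 for x in comm.values() if x == c) for c in labels}
    vertex_w = {c: dens[c] / size[c] for c in labels}
    edge_w = {(a, b): sum(1 for u, v in edges
                          if {comm[u], comm[v]} == {a, b}) / size[a]
              for a in labels for b in labels if a != b}
    return vertex_w, edge_w
```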
1: Input: G_d^t and G_d^s, dependency graphs for the target and source domains
2: Result: Labeled target dataset, {X_t, Y_t}
3: Construct bipartite graph B_e using edge components;
4: Obtain bipartite mapping M_e on B_e;
5: Construct bipartite graph B_v on vertex components;
6: Obtain bipartite mapping M_v on B_v;
7: Construct bipartite graph B using M_e and M_v;
8: Obtain bipartite mapping OptMapping on B;
9: Assign source labels to appropriate core clusters in target using OptMapping;
Algorithm 2: Optimal Label Learning

3.2 Optimal Label Learning

Algorithm 2 summarizes the steps for optimal label learning. The goal of optimal label learning is to find a mapping from the dependency graph in the target domain to that of the source domain while minimizing the mapping error. We refer to this optimization problem as min-cost dependency graph mapping and define it as follows.

Problem 1 (Min-Cost Dependency Graph Mapping).

Let G_d^s and G_d^t denote the dependency graphs obtained from the datasets in the source and target domains, respectively. The min-cost dependency graph mapping problem is to find a mapping from G_d^t to G_d^s such that the cost of the mapping is minimized.

Problem 1 can be viewed as a combinatorial optimization problem that finds an optimal mapping in a two-tier fashion: (i) it initially performs component-level mappings where vertex-wise and edge-wise mappings are found between source and target dependency graphs; and (ii) it then uses the component-level mappings to reach a consensus about the optimal mapping for the problem as a whole. Such a two-level mapping problem can be represented using the objective in (7).

minimize Σ_{i,j} x_ij (1 − M_ij / N) (7)

where x_ij indicates whether vertex v_i in the target is mapped to vertex v_j in the source, M_ij represents the number of mappings between v_i and v_j obtained through the component-level optimization, and N is a normalization factor equal to the total number of component-wise mappings. The objective in (7) minimizes the mapping cost at the graph level and, therefore, can be viewed as the objective for Problem 1.

We build a weighted complete bipartite graph on the components of the dependency graphs to find the minimum-cost mapping. Figure 1-d illustrates such a bipartite graph, where the nodes on the left shore represent components (e.g., node weights) of the target dependency graph and the nodes on the right shore are associated with the corresponding components of the source dependency graph.

In constructing the bipartite graph, a weight is assigned to the edge that connects node i on the target shore to node j on the source shore. This weight represents the actual mapping cost and is given by

cost(i, j) = |w_i^t − w_j^s| (8)

where w_i^t and w_j^s are the weight values associated with component i in the target domain and component j in the source domain, respectively. We note that these weights can be computed using (5) for vertex-wise mapping and (6) for edge-wise mapping. We also note that if the numbers of components in the source and target are not equal, we can add dummy nodes to one shore of the bipartite graph to create a complete and balanced bipartite graph.

We use the Hungarian algorithm (a widely used weighted bipartite matching algorithm with O(n³) time complexity) [9] to identify an optimal mapping from the nodes on the left shore of the bipartite graph to the nodes on the right shore.
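A minimal sketch of this step using SciPy's assignment-problem solver (`linear_sum_assignment`), with |w_i^t − w_j^s| as the mapping cost of (8); the wrapper name is ours:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def map_components(target_w, source_w):
    """Min-cost bipartite mapping of target components onto source components."""
    cost = np.abs(np.subtract.outer(target_w, source_w))  # |w_i^t - w_j^s|
    rows, cols = linear_sum_assignment(cost)              # Hungarian-style solver
    return dict(zip(rows.tolist(), cols.tolist()))
```

For component weights [0.1, 0.9] in the target and [0.85, 0.15] in the source, the optimal mapping is {0: 1, 1: 0}.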

The last step is to assign the labels mapped to each cluster to the target observations within that cluster. A classification model is trained on the labeled target dataset for physical activity recognition.

3.3 Time Complexity Analysis

Lemma 1.

The graph modeling in ActiLabel has a time complexity of O(n²), where n denotes the number of sensor observations.

Proof.

The proof is omitted for brevity. ∎

Lemma 2.

The optimal label learning phase in ActiLabel has a time complexity of O(c³ + n), where n denotes the number of sensor observations and c represents the number of classes.

Proof.

The proof is omitted for brevity. ∎

Theorem 1.

The time complexity of ActiLabel is quadratic in the number of sensor observations, i.e., O(n²).

Proof.

Assuming that the number of classes, c, is much smaller than the number of sensor observations, n (i.e., c ≪ n), the proof follows from Lemma 1 and Lemma 2. ∎

4 Experimental Setup

4.1 Datasets

We use three sizeable human activity datasets to evaluate the performance of ActiLabel. We refer to these datasets as PAMAP2, a physical activity monitoring dataset used in [19]; DAS, a daily & sports activity dataset used in [1]; and Smartsock, a dataset containing ankle-worn sensor data used in [6]. Table 1 provides a summary of the datasets utilized in this study.

Dataset    #Sub.  #Act.  #Sample    Sensors                      Locations
PAMAP2     9      24     3,850,505  ACC, GYR, HR, TMP, ORI, MAG  C, H, A
DAS        8      19     1,140,000  ACC, GYR, MAG                LA, RA, LL, RL, T
Smartsock  12     12     9,888      ACC, STR                     A
Table 1: Brief description of the datasets utilized for activity recognition. The sensor modalities include accelerometer: ACC, gyroscope: GYR, magnetometer: MAG, temperature: TMP, orientation: ORI, heart rate: HR, stretch sensor: STR, and locations are chest: C, ankle: A, hand: H, left arm: LA, left leg: LL, right arm: RA, right leg: RL, torso: T.
Figure 2: Performance comparison between core cluster identification in ActiLabel and standard clustering and community detection algorithms.

4.2 Comparison Methods

We compare the performance of ActiLabel with the following algorithms. (i) Baseline, which learns a shallow classifier in the source domain and deploys it for activity recognition in the target domain. (ii) Deep Convolutional LSTM (ConvLSTM) [13], which learns a deep classifier in the source domain and applies it for activity recognition in the target domain. (iii) DirectMap, which directly maps the centroids of the clusters in the target to activity classes in the source domain to create a labeled dataset in the target. (iv) Upper-bound, which learns a classifier assuming that the actual labels are available in the target domain. We assess the performance of ActiLabel under three scenarios: (i) cross-modality transfer, when sensors in the two domains have different modalities (e.g., accelerometer and gyroscope); (ii) cross-subject transfer, across two different human subjects; and (iii) cross-location transfer, when the target and source locations of the wearable sensor differ.

4.3 Implementation Details

The datasets are divided into training, test, and validation parts with no overlap to avoid bias. We extracted an exhaustive set of time-domain features from a sliding window of size 2 seconds with 25% overlap. The extracted features include the mean value, peak amplitude, entropy, and energy of the signal, which have been shown to be useful for human physical activity estimation using inertial sensor data [11, 21]. We reduce the feature dimension using the UMAP algorithm [12] before clustering.
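The windowing and feature step can be sketched as follows (2-second windows with 25% overlap, as above); the feature set here is a simplified stand-in for the paper's exhaustive set, and the function name is ours:

```python
import numpy as np

def extract_features(signal, fs, win_sec=2.0, overlap=0.25):
    """Slide a win_sec window with the given overlap over a 1-D signal and
    compute mean, peak amplitude, histogram entropy, and energy per window."""
    win = int(win_sec * fs)
    step = int(win * (1 - overlap))         # 25% overlap -> 75% stride
    feats = []
    for start in range(0, len(signal) - win + 1, step):
        w = signal[start:start + win]
        hist, _ = np.histogram(w, bins=10)
        p = hist / hist.sum()
        entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
        feats.append([w.mean(), np.abs(w).max(), entropy, np.sum(w ** 2)])
    return np.array(feats)
```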

We analyzed the effect of the hyper-parameter k in the k-NN network graph on the performance of CCI as measured by NMI and purity. As shown in Figure 3, NMI achieved its highest value (i.e., for PAMAP2, for DAS, and for Smartsock) when k was set to % or % of the network graph size. This translates into k = for PAMAP2 and Smartsock and for the DAS dataset.

Figure 3: Performance of CCI versus parameter k in network graph construction: (a) NMI; (b) purity.

5 Results

We analyzed the effect of the hyper-parameter k in the k-NN network graph on the performance of core cluster identification as measured by normalized mutual information (NMI) and clustering purity [20]. The results identify k = for PAMAP2 and Smartsock and for the DAS dataset as optimal values.

5.1 Performance of Core Cluster Identification

As shown in Figure 2, CCI outperforms state-of-the-art clustering and community detection algorithms. The NMI for the competing methods ranged from 0.37–0.65 for PAMAP2, 0.25–0.77 for DAS, and 0.52–0.76 for Smartsock. The proposed algorithm CCI increased NMI to 0.67, 0.87, and 0.85 for PAMAP2, DAS, and Smartsock datasets, respectively. Note that the clustering was generally more accurate for Smartsock and DAS datasets because PAMAP2 contained data from sensor modalities (e.g., temperature) that might not be a good representative of the activities of interest.
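For reference, the two metrics used above are straightforward to compute: purity by hand, and NMI via scikit-learn's `normalized_mutual_info_score`. A brief sketch (the `purity` helper is ours):

```python
import numpy as np
from sklearn.metrics import normalized_mutual_info_score  # NMI, as reported above

def purity(labels_true, labels_pred):
    """Clustering purity: each cluster votes for its majority ground-truth class."""
    truth, pred = np.asarray(labels_true), np.asarray(labels_pred)
    total = sum(np.bincount(truth[pred == c]).max() for c in np.unique(pred))
    return total / len(truth)
```

A perfect clustering yields purity 1.0 and NMI 1.0; purity alone can be inflated by many tiny clusters, which is why both metrics are reported.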

5.2 Labeling Accuracy of ActiLabel

In this section, we report the labeling accuracy of the DWMatching algorithm proposed in Section 3.2, defined as the ratio of correctly labeled observations to all observations. The labeling accuracy of DWMatching mainly depends on the purity of the activity clusters and on the similarity between the data distributions in the source and target.

5.2.1 Cross-Modality

As shown in Figure 4(a), the accelerometer, gyroscope, magnetometer, and orientation modalities achieve higher labeling accuracy (i.e., ) as the target sensor across all three datasets. In PAMAP2, the labeling accuracy drops to the range %–% when the orientation and heart rate sensors are the target modality, which indicates weak clustering of their observations into the activity classes and a data distribution that diverges from other modalities such as the accelerometer. In Smartsock, DWMatching achieves labeling accuracy between the accelerometer and the stretch sensor.

Figure 4: Labeling accuracy of ActiLabel for the cross-modality scenario: (a) PAMAP2; (b) DAS; (c) Smartsock.

5.2.2 Cross-Location

Mappings between the same or similar body locations, such as "chest to chest" and "left arm to right arm", achieve high labeling accuracies (i.e., ). The labeling accuracy between dissimilar locations in the DAS dataset, such as "left leg to right arm" and "left arm to right leg", drops to the range . Although the chest, ankle, and hand are dissimilar body locations, mappings between them in the PAMAP2 dataset achieve labeling accuracy since the data at each location comes from a rich collection of sensor modalities that provides sufficient information about the inter-event structural similarities captured by our label learning algorithm. The cross-location transfer does not apply to the Smartsock dataset since it contains only one sensor location.

Figure 5: Labeling accuracy of ActiLabel for the cross-location scenario: (a) PAMAP2; (b) DAS.

5.3 Performance of Activity Recognition

Table 2 shows the activity recognition performance of ActiLabel as well as the algorithms under comparison, including baseline (BL), deep convolutional LSTM (CL), DirectMap (DM), and upper-bound (UB), as discussed previously. We report the F1-score for each method as it better represents performance on unbalanced datasets.

5.3.1 Cross-Modality Transfer

We examined transfer learning across accelerometer, gyroscope, magnetometer, orientation, temperature, heart rate, and stretch sensor modalities. The cross-modality results in Table 2 reflect average performance over all possible cross-modality scenarios. The baseline and ConvLSTM performed poorly over all three datasets, which reflects the diverse distribution of data across sensors of different modalities. The DirectMap approach achieved F1-scores of 40.4%, 44.8%, and 66.0% on PAMAP2, DAS, and Smartsock, respectively. ActiLabel outperformed the competing algorithms, in particular DirectMap, by 18.9%, 21.4%, and 6.7% for PAMAP2, DAS, and Smartsock, respectively.

5.3.2 Cross-Location Transfer

We examined transfer learning among chest, ankle, hand, arm, leg, and torso locations. The cross-location results in Table 2 represent average values over all possible transfer scenarios. The relatively low F1-scores of the baseline and ConvLSTM algorithms can be explained by the high level of diversity between the source and target domains during cross-location transfer learning. DirectMap and ActiLabel both outperformed the baseline and ConvLSTM models. Specifically, DirectMap and ActiLabel achieved 63.4% and 70.8% F1-scores for PAMAP2, and 60.7% and 68.4% F1-scores for DAS.

5.3.3 Cross-Subject Transfer

The DirectMap approach and ActiLabel obtained F1-scores of 85.4% and 82.7% in PAMAP2, 79.0% and 80.3% in DAS, and 82.6% and 80.0% in Smartsock, respectively. Since there is a limited amount of data for each subject, ActiLabel could not capture high-level structures in the data; therefore, it could not beat the state-of-the-art in all cases. All the algorithms achieved higher F1-scores compared to the cross-location and cross-modality scenarios. This observation suggests that data variation among different subjects can be normalized using techniques such as feature scaling and feature selection before classification.

Scenario        Dataset    BL    CL    DM    AL    UB
Cross-modality  PAMAP2     7.8   8.1   40.4  59.3  80.8
                DAS        9.3   8.2   44.8  66.2  86.1
                Smartsock  16.2  12.8  66.0  72.7  84.2
Cross-location  PAMAP2     14.3  12.7  63.4  70.8  93.2
                DAS        13.2  12.4  60.7  68.4  89.8
Cross-subject   PAMAP2     65.8  61.9  85.4  82.7  98.1
                DAS        67.1  56.8  79.0  80.3  92.5
                Smartsock  59.8  61.8  82.6  80.0  95.2
Average                    31.6  29.3  63.4  71.9  89.9
Table 2: Activity recognition performance (F1-score). BL: baseline; CL: deep convolutional LSTM; DM: DirectMap; AL: ActiLabel; UB: upper-bound.

6 Conclusion

We introduced ActiLabel, a computational framework with combinatorial optimization methodologies for transferring physical activity knowledge across highly diverse domains. ActiLabel extracts high-level structures from sensor observations in the target and source domains and learns labels in the target domain by finding an optimal mapping between the dependency graphs of the two domains. ActiLabel provides consistently high accuracy for cross-domain knowledge transfer in various learning scenarios. Our extensive experimental results showed that ActiLabel achieves average F1-scores of 66.1%, 69.6%, and 81.0% for cross-modality, cross-location, and cross-subject activity recognition, respectively. These results suggest that ActiLabel outperforms the competing algorithms in cross-modality and cross-location learning while remaining competitive with the state-of-the-art in cross-subject learning.

References

  • [1] B. Barshan and M. C. Yüksek (2014) Recognizing daily and sports activities in two open source machine learning environments using body-worn sensor units. The Computer Journal 57 (11), pp. 1649–1667. Cited by: §4.1.
  • [2] T. Barton, T. Bruna, and P. Kordik (2019) Chameleon 2: an improved graph-based clustering algorithm. ACM Transactions on Knowledge Discovery from Data (TKDD) 13 (1), pp. 10. Cited by: §3.1.2.
  • [3] V. D. Blondel, J. Guillaume, R. Lambiotte, and E. Lefebvre (2008) Fast unfolding of communities in large networks. Journal of statistical mechanics: theory and experiment 2008 (10), pp. P10008. Cited by: §2.2.
  • [4] J. Chen, H. Fang, and Y. Saad (2009) Fast approximate knn graph construction for high dimensional data via recursive lanczos bisection. Journal of Machine Learning Research 10 (Sep), pp. 1989–2012. Cited by: §2.2.
  • [5] D. Cook, K. D. Feuz, and N. C. Krishnan (2013) Transfer learning for activity recognition: a survey. Vol. 36, pp. 537–556. Cited by: §2.1.
  • [6] R. Fallahzadeh, M. Pedram, and H. Ghasemzadeh (2016) Smartsock: a wearable platform for context-aware assessment of ankle edema. In 2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 6302–6306. Cited by: §4.1.
  • [7] C. Feichtenhofer, A. Pinz, and A. Zisserman (2016) Convolutional two-stream network fusion for video action recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1933–1941. Cited by: §2.1.
  • [8] L. N. Ferreira and L. Zhao (2016) Time series clustering via community detection in networks. Information Sciences 326, pp. 227–242. Cited by: §2.2.
  • [9] H. W. Kuhn (1955) The hungarian method for the assignment problem. Naval research logistics quarterly 2 (1-2), pp. 83–97. Cited by: §3.2.
  • [10] M. Maier, U. V. Luxburg, and M. Hein (2009) Influence of graph construction on graph-based clustering measures. In Advances in neural information processing systems, pp. 1025–1032. Cited by: §2.2.
  • [11] A. Mannini and A. M. Sabatini (2010) Machine learning methods for classifying human physical activity from on-body accelerometers. Sensors 10 (2), pp. 1154–1175. Cited by: §4.3.
  • [12] L. McInnes, J. Healy, and J. Melville (2018) Umap: uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426. Cited by: §4.3.
  • [13] F. Ordóñez and D. Roggen (2016) Deep convolutional and lstm recurrent neural networks for multimodal wearable activity recognition. Sensors 16 (1), pp. 115. Cited by: §2.1, §4.2.
  • [14] S. J. Pan and Q. Yang (2010) A survey on transfer learning. IEEE Transactions on knowledge and data engineering 22 (10), pp. 1345–1359. Cited by: §2.1.
  • [15] A. Pantelopoulos, N. G. Bourbakis, et al. (2010) A survey on wearable sensor-based systems for health monitoring and prognosis.. IEEE Trans. Systems, Man, and Cybernetics, Part C 40 (1), pp. 1–12. Cited by: §1.
  • [16] L. Piwek, D. A. Ellis, S. Andrews, and A. Joinson (2016) The rise of consumer health wearables: promises and barriers. PLoS medicine 13 (2), pp. e1001953. Cited by: §1.
  • [17] M. Puxeddu, M. Petti, F. Pichiorri, F. Cincotti, D. Mattia, and L. Astolfi (2017) Community detection: comparison among clustering algorithms and application to eeg-based brain networks. In 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 3965–3968. Cited by: §2.2.
  • [18] I. Redko, A. Habrard, and M. Sebban (2016) Theoretical analysis of domain adaptation with optimal transport. External Links: 1610.04420 Cited by: §2.1.
  • [19] A. Reiss and D. Stricker (2012) Introducing a new benchmarked dataset for activity monitoring. In Wearable Computers (ISWC), 2012 16th International Symposium on, pp. 108–109. Cited by: §4.1.
  • [20] E. Rendón, I. M. Abundez, C. Gutierrez, S. D. Zagal, A. Arizmendi, E. M. Quiroz, and H. E. Arzate (2011) A comparison of internal and external cluster validation indexes. In Proceedings of the 5th WSEAS International Conference on Computer Engineering and Applications, pp. 158–163. Cited by: §5.
  • [21] R. Saeedi, B. Schimert, and H. Ghasemzadeh (2014) Cost-sensitive feature selection for on-body sensor localization. In Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct Publication, pp. 833–842. Cited by: §4.3.
  • [22] J. Wang, V. W. Zheng, Y. Chen, and M. Huang (2018) Deep transfer learning for cross-domain activity recognition. In Proceedings of the 3rd International Conference on Crowd Science and Engineering, ICCSE’18, New York, NY, USA, pp. 16:1–16:8. External Links: ISBN 978-1-4503-6587-1, Link, Document Cited by: §1, §2.1.
  • [23] K. Weiss, T. M. Khoshgoftaar, and D. Wang (2016) A survey of transfer learning. Vol. 3, pp. 9. Cited by: §2.1.
  • [24] Q. Yang (2017) When deep learning meets transfer learning. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, CIKM ’17, New York, NY, USA, pp. 5–5. External Links: ISBN 978-1-4503-4918-5, Link, Document Cited by: §2.1.
  • [25] C. Zhang, L. Zhang, and J. Ye (2012) Generalization bounds for domain adaptation. In Advances in neural information processing systems, pp. 3320–3328. Cited by: §1.
  • [26] G. Zhu, L. Zhang, P. Shen, and J. Song (2017) Multimodal gesture recognition using 3-d convolution and convolutional lstm. IEEE Access 5, pp. 4517–4524. Cited by: §2.1.