Learning-to-Learn Personalised Human Activity Recognition Models

06/12/2020 ∙ by Anjana Wijekoon, et al. ∙ Robert Gordon University

Human Activity Recognition (HAR) is the classification of human movement, captured using one or more sensors either as wearables or embedded in the environment (e.g. depth cameras, pressure mats). State-of-the-art methods of HAR rely on having access to a considerable amount of labelled data to train deep architectures with many trainable parameters. This becomes prohibitive when tasked with creating models that are sensitive to personal nuances in human movement, which are explicitly present when performing exercises. In addition, it is not possible to collect training data to cover all possible subjects in the target population. Accordingly, learning personalised models with few data remains an interesting challenge for HAR research. We present a meta-learning methodology for learning to learn personalised HAR models, with the expectation that the end-user need only provide a few labelled data instances but can benefit from the rapid adaptation of a generic meta-model. We introduce two algorithms, Personalised MAML and Personalised Relation Networks, inspired by existing Meta-Learning algorithms but optimised for learning HAR models that are adaptable to any person in health and well-being applications. A comparative study shows significant performance improvements against the state-of-the-art Deep Learning algorithms and the Few-shot Meta-Learning algorithms in multiple HAR domains.




1 Introduction

Machine Learning research in Human Activity Recognition (HAR) has a wide range of high impact applications in gait recognition, fall detection, orthopaedic rehabilitation and general fitness monitoring. HAR involves the task of learning reasoning models to recognise activities from human movements inferred from streams of sensor data. A data instance in a HAR dataset contains sensor data streams collected from individuals (i.e. person data). Unavoidably, sensor streams capture personal traits and nuances in some activity domains more than others, typically with activities that involve greater degrees of freedom. Learning a single reasoning model to recognise the set of activity classes in a HAR task can be challenging because of the need for personalisation.

We propose it is more intuitive to treat a “person-activity” pair as the class label. This means, for example, that if two people were to perform the same movement activity, their data should be treated as separate classes because the activity is executed by two different people. Accordingly, each person’s data can be viewed as a dataset in its own right, and the HAR task involves learning a reasoning model for the person. Learning from only specific persons’ data has shown significant performance improvements in early research with both supervised learning and active learning methods Tapia et al. (2007); Longstaff et al. (2010). But these methods require considerable amounts of data obtained from the end-user, periodical end-user involvement and model re-training. In addition, current state-of-the-art Deep Learning algorithms require a large number of labelled data instances to avoid under-fitting.

Here we adopt the “person-activity” classes idea but attempt to learn with a limited number of data instances per class. This can be viewed as a Few-shot classification scenario Vinyals et al. (2016); Snell et al. (2017), where the aim is to learn with one or few data instances per class. Few-shot classification is commonly evaluated with datasets such as Omniglot (1623 classes) and MiniImagenet (100 classes), which have many classes but only a limited number of data instances for each class. Meta-Learning is arguably the state-of-the-art in Few-shot classification for image recognition Finn et al. (2017); Nichol et al. (2018). In a nutshell, Meta-Learning is described as learning-to-learn, where a wide range of tasks abstract their learning to a meta-model, such that it is transferable to any unseen task. Meta-Learning algorithms such as MAML Finn et al. (2017) and Relation Networks (RN) Sung et al. (2018) are grounded in theories of Metric Learning and parametric optimisation, and are capable of learning generalised models that are rapidly adaptable to new tasks with only a few instances of data.

The concept of learning-to-learn aligns well with personalisation, where modelling a person can be viewed as a single task and the meta-model must help learn a model that is rapidly adaptable to a new person. We propose Personalised Meta-Learning to create personalised models by leveraging a small amount of sensing data (i.e. calibration data) extracted from a person. Accordingly, in this paper we make the following contributions:

  1. formalise Personalised Meta-Learning and propose two Personalised Meta-Learning Algorithms, Personalised MAML and Personalised RN;

  2. perform a comparative evaluation with 9 HAR datasets representing a wide range of activity domains to evidence the utility of Personalised Meta-Learning algorithms over conventional Deep Learning and Few-shot Meta-Learning algorithms;

  3. analyse train and test results of personalised vs. conventional meta-learners to understand how personalisation enhances the ability of meta-learners to adapt a generalised model at deployment; and

  4. present an exploratory study of hyper-parameter selection for personalised meta-learners.

Overall, we observe that the performance improvements achieved by Personalised Meta-Learning are possible using simple parametric models with a limited number of trainable parameters that only require a limited amount of labelled data compared to conventional DL models. The rest of the paper is organised as follows: Section 2 explores past research and challenges in the area of Personalised HAR. Section 3 introduces the state-of-the-art in Meta-Learning, and our proposed approach and algorithms for Personalised Meta-Learning are presented in Section 4. Next we present our comparative study, including datasets and evaluation methodology, in Section 5. In Section 6 we compare how personalisation improves the performance of Meta-Learners and in Section 7 we explore a number of hyper-parameters for optimal performance of Personalised Meta-Learners. Finally, a discussion on practical implications, limitations and planned future work is presented in Section 8 and we present our conclusions in Section 9.

2 Related Work

Human Activity Recognition (HAR) is an active research challenge, where Deep Learning (DL) methods claim the state-of-the-art in many application domains Wijekoon et al. (2019); Wang et al. (2019); Ordóñez and Roggen (2016); Yao et al. (2017). Learning a generalised reasoning model adaptable to many user groups is a unique transfer learning challenge in the HAR domain. Sensors capture many personal nuances, which are most prominent in application domains such as Exercises or Activities of Daily Living (ADL), leading to poor performance. Given access to large quantities of end-user data, early research has achieved improved performance by learning personal models Tapia et al. (2007); Berchtold et al. (2010). Follow-on work attempts to reduce the burden on the end-user by adopting semi-supervised Longstaff et al. (2010); Miu et al. (2015), active learning Longstaff et al. (2010) and multi-task Sun et al. (2012) methods that rely on periodical model re-training and continuous user involvement post-deployment.

Recent advancements in Few-shot Learning are adopted as an approach to personalisation in Personalised Matching Networks Wijekoon et al. (2020); Vinyals et al. (2016). Personalised Matching Networks learn a parametric model that is learning to match, leveraging a few data instances from the same user. At deployment, the network successfully transfers the learning to new users given only a few labelled data instances for matching, obtained through one-time micro-interactions. This approach avoids post-deployment re-training and only requires a few data instances from the end-user. While this method achieves significant performance improvement over conventional methods in HAR, we believe moderated post-deployment re-training can be beneficial for improving personalisation.

Meta-Learning is an interesting approach that will facilitate Few-shot Learning and post-deployment re-training for adaptation. Meta-Learning or “Learning-to-learn” is the learning of a generalised classification model that is transferable to new learning tasks with only a few instances of labelled data. In recent research it is interpreted and implemented mainly in three approaches; firstly, “learning to match” approach implemented by Relation Networks (RN) Sung et al. (2018); secondly, model-specific approach like SNAIL Mishra et al. (2017); and finally, optimisation based algorithms such as MAML Finn et al. (2017) and Reptile Nichol et al. (2018).

MAML, including its variants such as First-Order MAML Finn et al. (2017) and Reptile Nichol et al. (2018), is an optimisation based Meta-Learning algorithm that learns a generalised model rapidly adaptable to any new task. Notably, these algorithms are model-agnostic, which increases the adaptability in new domains where different feature representations are preferred. In contrast, there exist model-specific Meta-Learning algorithms, such as SNAIL Mishra et al. (2017) and MANN Santoro et al. (2016), where Meta-Learning is achieved using specific neural network constructs such as LSTM and Neural Turing Machine Graves et al. (2014). Model-agnostic methods are preferred in a HAR setting, where heterogeneous sensor modalities or modality combinations may require specific feature representation learning methods. Relation Networks (RN) Sung et al. (2018) are “learning to match” by learning similarity relationships. While there are significant commonalities between the Few-shot Learning algorithm Matching Networks (MN) Vinyals et al. (2016) and RN, RN is not limited by a distance metric for similarity calculation. The parametric model for similarity learning in RN enhances the learning from a Few-shot classifier to a generalised meta-model.

Like the optimisation based methods and unlike the model-specific methods, RN is also model-agnostic: the architecture is modularised Sung et al. (2018) such that the feature representation learning can be adapted to suit the HAR task. In contrast to the other model-agnostic and model-specific methods, RN has the potential to perform Open-ended HAR, by modelling the classification task as a matching task (i.e. no softmax layer with fixed class length) similar to Open-ended Matching Networks Wijekoon et al. (2020). In this paper we explore personalisation of Meta-Learning for HAR with two algorithms, MAML and RN. Personalised Meta-Learners will only require limited interaction with the end-user to obtain a few data instances and facilitate optional post-deployment re-training, which are both essential features for personalised HAR.

3 Meta-Learning

Figure 1: Meta-Learning Tasks with Omniglot Dataset

Meta-Learning learns a meta-model trained over many tasks, where a task is equivalent to a “data instance” or “labelled example” in conventional Machine Learning (ML). In practice, meta-learning is implemented as an optimisation over the set of tasks to learn a generalised model (i.e. the meta-model) that can be rapidly transferred to new, seen or unseen tasks. The concept of Meta-Learning is implemented in many branches of ML, such as Few-shot Learning, Reinforcement Learning and Regression, and here our focus is Few-shot Learning.

In Few-shot classification, Meta-Learning can be seen as optimisation of a parametric model over many few-shot tasks (i.e. meta-train). More formally, a task is a few-shot learning problem with a set of train and test data instances, referred to as the “support set” and the “query set” respectively. The number of instances in a support set is the number of classes in the task multiplied by the number of representatives from each class (often also referred to as the shots in k-shot learning). For example, in Figure 1, a task support set contains distinct digits each of which forms a class (here there are 5 classes with 1 representative each). Each task contains an equal number of classes but not necessarily the same subset of classes. Typically the query set has no overlap with the support set, similar to a train/test split in supervised learning; and unlike the support set, the composition of the query set need not be constrained to represent all classes.
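The task composition described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `sample_task`, its parameters and the toy label list are all hypothetical, and instances are represented only by their indices.

```python
import random
from collections import defaultdict

def sample_task(labels, n_way, k_shot, n_query, rng):
    """Return (support, query) as lists of instance indices for one few-shot task."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    # Pick the subset of classes that defines this task.
    task_classes = rng.sample(sorted(by_class), n_way)
    support, pool = [], []
    for c in task_classes:
        shuffled = rng.sample(by_class[c], len(by_class[c]))
        support.extend(shuffled[:k_shot])   # exactly k_shot representatives per class
        pool.extend(shuffled[k_shot:])      # leftovers are candidates for the query set
    # The query set is disjoint from the support set and need not cover every class.
    query = rng.sample(pool, min(n_query, len(pool)))
    return support, query

# Toy data (hypothetical): 10 classes with 4 instances each.
labels = [c for c in range(10) for _ in range(4)]
support, query = sample_task(labels, n_way=5, k_shot=1, n_query=6, rng=random.Random(0))
```

With `n_way=5` and `k_shot=1` the support set holds five instances, one per class, mirroring the 5-class, 1-shot example of Figure 1.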

Once the meta-model is trained using the meta-train tasks, it is tested using the meta-test tasks. A meta-test task has a similar composition to a meta-train task, in that it also has a support set and a query set. Unlike traditional classifier testing, with meta-testing we use the support set in conjunction with the trained meta-model to classify instances in the query set. For few-shot learning there are two common meta-learning approaches: the adaptation optimised algorithms such as MAML Finn et al. (2017) and Reptile Nichol et al. (2018); and the similarity optimised learning of Relation Networks Sung et al. (2018).

Figure 2: Model-Agnostic Meta-Learning

MAML Finn et al. (2017) is a versatile Meta-Learning algorithm applicable to any neural network model optimised with Gradient Descent (GD). The adaptation optimised Meta-Learner MAML is illustrated in Figure 2. In each training iteration, a set of tasks is sampled and each task optimises its task-model with its support set using one or a few steps of GD, referred to as gradient steps. The meta-model is then optimised using GD with the losses computed by the optimised task-models against their respective query sets, referred to as the Meta-update. This process is repeated with many task samples, to learn a generic model prototype that can be rapidly adapted to a new task. A task not seen during training uses its support set to train a parametric model, initialised by the meta-model, for a few gradient steps referred to as meta-gradient-steps. Thereafter, the adapted model is used to classify instances in its query set.
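For reference, the two updates can be written in the standard notation of Finn et al. (2017). The symbols are assumptions taken from that paper rather than from this one: \theta is the meta-model, \theta_i' the task-model of task \mathcal{T}_i, \alpha and \beta the task and meta step sizes, and \mathcal{L}_{\mathcal{T}_i} the task loss, evaluated on the support set in the first line and on the query set in the second. A single gradient step is shown for brevity.

```latex
% Task adaptation: one gradient step on the support set of task T_i
\theta_i' = \theta - \alpha \nabla_{\theta} \mathcal{L}_{\mathcal{T}_i}\left(f_{\theta}\right)

% Meta-update: query-set losses of the adapted task-models
\theta \leftarrow \theta - \beta \nabla_{\theta} \sum_{\mathcal{T}_i} \mathcal{L}_{\mathcal{T}_i}\left(f_{\theta_i'}\right)
```

First-Order MAML keeps the same form but treats \theta_i' as a constant with respect to \theta when taking the meta-gradient, which avoids second derivatives.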

Figure 3: Relation Network

Relation Network (RN) Sung et al. (2018) is a Few-shot Meta-Learning algorithm that “learns-to-match”, that is, it learns a non-linear similarity function for matching. RN has a similar goal to other Meta-Learners, of generalising over many tasks. The meta-task design for RN is as described in Figure 1. However, with RN, each train-task consists of a number of meta-training instances, created by combining each instance from the query set with the support set. During training, RN learns to match each query instance to a matching instance in the support set. As illustrated in Figure 3, the feature representation model learns a feature representation for each instance; next, each instance in the support set is paired with the query instance. The relation model then learns to estimate the similarity of the paired instances. The network is optimised end to end such that we learn a feature representation model and a relation model that collectively maximise the similarity between pairs that belong to the same class. A meta-test task, not seen during training, can use the RN to match a query instance to an instance in its support set, and therein use the class of the matched support instance as the predicted class label.

4 Methods

Given a dataset, Human Activity Recognition (HAR), like any other supervised learning problem, is the learning of the feature mapping between data instances and activity classes, where each class label is from the set of activity classes.

In comparison to image or text classification, with HAR, each data instance in the dataset belongs to a person. Given the set of data instances obtained from each person, the dataset is the collection of data instances from the population (Equation 2). As before, all data instances in the dataset will belong to a class in the set of activity classes, except for special tasks such as open-ended HAR where the set of classes is not fully specified at training time.


4.1 Personalised Meta-Learning for HAR

Figure 4: Personalised Meta-Learning Tasks design for HAR

Personalised Meta-Learning for Human Activity Recognition (HAR) can be seen as the learning of a meta-model from a population while treating activity recognition for a person as an independent task. We propose the task design in Figure 4 for Personalised Meta-Learning. Given a dataset of a population, we create tasks such that each “person-task” only contains data from a specific person. For the support set, we randomly select a number of labelled data instances from the person, stratified across activity classes, such that there is an equal number of representatives for each class. We follow a similar approach when selecting a query set for the person-task. Given that existing HAR datasets are not strictly Few-shot Learning datasets, there can be a few to many data instances available to be sampled for the query set. In comparison to the Meta-Learning task design (Section 3), each “person-task” is learning to classify the set of “person-activity” class labels.

A dataset is split for training and testing using a person-aware evaluation methodology, such as Leave-One-Person-Out or Persons Hold-out, where one or a few persons form the meta-test person-tasks and the rest form the meta-train person-tasks. At test time, the test person provides a few seconds of data for each activity class while being recorded by the recommended sensor modalities, which forms the support set of the person-task. Thereafter, the meta-model, in conjunction with the support set, predicts the class label for each query data instance in the query set.
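A minimal sketch of the person-task design and a Leave-One-Person-Out split is given below. It is an illustration under assumptions, not the authors' code: the helper names `make_person_task` and `lopo_splits`, the tuple layout of the data and the toy activities are all hypothetical. Unlike a conventional few-shot task, every task here draws from a single person and covers all of that person's activity classes.

```python
import random
from collections import defaultdict

def make_person_task(instances, person, k_shot, rng):
    """instances: list of (person_id, activity, features). Returns (support, query) for one person."""
    by_activity = defaultdict(list)
    for pid, activity, x in instances:
        if pid == person:                       # a person-task only sees one person's data
            by_activity[activity].append(x)
    support, query = [], []
    for activity in sorted(by_activity):
        xs = rng.sample(by_activity[activity], len(by_activity[activity]))
        support += [(x, activity) for x in xs[:k_shot]]   # stratified: k_shot per activity
        query += [(x, activity) for x in xs[k_shot:]]     # remaining instances of the class
    return support, query

def lopo_splits(persons):
    """Yield (meta_train_persons, meta_test_person) pairs for Leave-One-Person-Out."""
    for held_out in persons:
        yield [p for p in persons if p != held_out], held_out

# Toy data (hypothetical): 3 persons x 2 activities x 4 windows; features are placeholder ints.
data = [(p, a, i) for p in ("p1", "p2", "p3") for a in ("squat", "lunge") for i in range(4)]
folds = list(lopo_splits(["p1", "p2", "p3"]))
support, query = make_person_task(data, "p2", k_shot=1, rng=random.Random(0))
```

Each fold holds out one person for the meta-test person-task, so the reported accuracy would be averaged over as many folds as there are persons.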

It is noteworthy that, contrary to conventional Meta-Learning, all personal models and the meta-model are learning to classify the same set of activity classes, but of different persons (i.e. “person-activity”). Therefore, it is seen as a Few-shot Meta-Learning classification problem with as many classes as there are activity classes. Personalised Meta-Learning is a methodology adaptable to any Meta-Learning algorithm to perform personalised HAR, and next we show how with two Meta-Learning algorithms, MAML and RN.

4.2 Personalised MAML

Personalised MAML for HAR is adaptation optimised to learn the generic model prototype (i.e. meta-model), such that it is adaptable to any new person encountered at test time. Task design for Personalised MAML follows the process described in Section 4.1, where we select a support set and a query set for the person-task. The number of instances in the support set determines the number of instances that need to be requested from a new person during testing. Thus, we keep the support set small, similar to a few-shot learning scenario. We use all remaining data instances from each class in the query set. More formally, given a number of instances per “person-activity”, the support set takes a small number of them per class and the query set takes the rest. The meta-model learning is influenced by the loss generated by the query set of each person-task. Therefore we evaluate a range of values in an exploratory study. We present the training of Personalised MAML in Algorithm 1.

A meta-test person provides a few instances per class, forming the support set that is used to train the personalised classification model, initialised by the meta-model, for a number of meta-gradient-steps. This process can be seen as the personalised model adaptation from the meta-model. Thereafter, the class label is predicted for each incoming test data instance using the personalised model, as in Algorithm 2.

Personalised MAML is model-agnostic, with the opportunity to use a diverse range of neural networks for feature representation learning. This is advantageous for HAR applications where heterogeneous sensor modalities or modality combinations are used. We note that we refer to First-Order MAML Finn et al. (2017) when implementing Personalised MAML, which is computationally less intensive, yet achieves comparable performance to MAML Finn et al. (2017).

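Require: HAR dataset; distribution over persons
Require: step sizes, batch size and gradient-steps hyper-parameters
  randomly initialise the meta-model
  while not done do
     Sample a batch of persons
     for all sampled persons do
        Select the support set of the person-task
        for each gradient step do
           Evaluate the loss of the task-model w.r.t. the support set
           Compute adapted parameters with gradient descent
        end for
        Select the query set of the person-task
        Evaluate the loss of the adapted task-model w.r.t. the query set
     end for
     Meta-update: update the meta-model with gradient descent using the query set losses
  end while
Algorithm 1 Personalised MAML Training

Require: Support set for test person obtained via micro-interactions
Require: Query set; Meta-model
  Initialise the personalised model with the meta-model
  for each meta-gradient-step do
     Evaluate the loss of the personalised model w.r.t. the support set
     Compute adapted parameters with gradient descent
  end for
  for all query instances do
     predict the class label using the personalised model
  end for
Algorithm 2 Personalised MAML Testing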
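The training and testing loops of Algorithms 1 and 2 can be sketched with NumPy as follows. This is a hedged illustration, not the authors' implementation: it uses the first-order approximation, a linear softmax classifier in place of the paper's networks, and synthetic "persons" whose class centres carry a personal offset. All function names, step sizes and data shapes are assumptions made for the example.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def loss_and_grad(W, X, y):
    """Cross-entropy loss and its gradient for a linear softmax classifier."""
    p = softmax(X @ W)
    n = len(y)
    loss = -np.log(p[np.arange(n), y] + 1e-12).mean()
    p[np.arange(n), y] -= 1.0
    return loss, X.T @ p / n

def adapt(W, Xs, ys, alpha, steps):
    """Inner loop: a few gradient steps on one person's support set."""
    W = W.copy()
    for _ in range(steps):
        _, g = loss_and_grad(W, Xs, ys)
        W -= alpha * g
    return W

def fomaml_train(tasks, dim, n_classes, alpha=0.1, beta=0.1, steps=3, iters=200, batch=2, seed=0):
    """tasks: list of (Xs, ys, Xq, yq), one tuple per meta-train person."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.01, size=(dim, n_classes))    # meta-model
    for _ in range(iters):
        meta_grad = np.zeros_like(W)
        for t in rng.choice(len(tasks), size=batch, replace=False):
            Xs, ys, Xq, yq = tasks[t]
            W_task = adapt(W, Xs, ys, alpha, steps)
            # First-order approximation: query-set gradient taken at the adapted weights.
            _, g = loss_and_grad(W_task, Xq, yq)
            meta_grad += g
        W -= beta * meta_grad / batch                     # meta-update
    return W

def toy_person(rng, k_shot=2, n_query=20, n_classes=3, dim=4):
    """Synthetic person: shared class centres plus a personal offset (illustrative only)."""
    centres = np.eye(n_classes, dim) * 3.0 + rng.normal(scale=0.5, size=(n_classes, dim))
    def draw(n):
        y = np.repeat(np.arange(n_classes), n)
        return centres[y] + rng.normal(scale=0.3, size=(len(y), dim)), y
    Xs, ys = draw(k_shot)
    Xq, yq = draw(n_query)
    return Xs, ys, Xq, yq

rng = np.random.default_rng(1)
train_tasks = [toy_person(rng) for _ in range(6)]
W_meta = fomaml_train(train_tasks, dim=4, n_classes=3)
Xs, ys, Xq, yq = toy_person(rng)                          # unseen test person
W_person = adapt(W_meta, Xs, ys, alpha=0.1, steps=3)      # Algorithm 2: personalised adaptation
accuracy = (softmax(Xq @ W_person).argmax(axis=1) == yq).mean()
```

The same `adapt` function serves both the inner loop during training and the adaptation at test time, which is the property that lets a new person personalise the meta-model from a handful of labelled windows.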

4.3 Personalised RN

Personalised Relation Networks learn a matching that is generalisable to new persons encountered at test time. The design of person-tasks follows the methodology in Section 4.1, where the support set and the query set are selected from the same person. Similar to Personalised MAML, we select a small number of data instances per class to create the support set, and we select all remaining data instances of the class to create the query set. We create a meta-training instance for Personalised RN by combining each data instance in the query set with the support set. Therefore, the matching is always performed against the person's own data in the support set. This method is similar to the personalisation methods described for Personalised Matching Networks Wijekoon et al. (2020). At test time, a meta-test person provides a support set with representatives for each activity class. This support set is thereafter combined with each test query instance for predicting the class label using the Personalised RN model, as in Algorithm 4. We present the training of Personalised RN in Algorithm 3. A Personalised RN has two parametric modules, one for feature representation learning and one for similarity learning (Figure 3), and similar to Personalised MAML, the feature representation learning module can be configured to suit heterogeneous sensor modalities or modality combinations.

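Require: HAR dataset; distribution over persons
Require: step size hyper-parameter
  randomly initialise the feature representation model and the relation model
  while not done do
     Sample a person
     Select the support set and the query set of the person-task
     for all query instances do
        Create a train data instance (the query instance combined with the support set)
     end for
     Evaluate the loss w.r.t. the train data instances
     Update the feature representation model and the relation model
  end while
Algorithm 3 Personalised RN Training

Require: Support set for test person obtained via micro-interactions
Require: Query set; Relation Network Model
  for all query instances do
     predict the class label using the Relation Network Model and the support set
  end for
Algorithm 4 Personalised RN Testing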
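The inference step of Algorithm 4 can be sketched as a NumPy forward pass. This is an assumption-laden illustration rather than the paper's network: the weights are random and untrained, the layer sizes are arbitrary, and the helper `relation_predict` is hypothetical. It only shows the mechanics of embedding, pairing by concatenation and scoring.

```python
import numpy as np

def dense(x, W, b):
    return x @ W + b

def relation_predict(query, support_x, support_y, params):
    """Return the label of the support instance with the highest relation score."""
    We, be, Wr1, br1, Wr2, br2 = params
    embed = lambda x: np.maximum(dense(x, We, be), 0.0)        # feature module: dense + ReLU
    fs = embed(support_x)                                      # (n_support, h)
    fq = np.repeat(embed(query[None, :]), len(fs), axis=0)     # query embedding, one copy per pair
    pairs = np.concatenate([fs, fq], axis=1)                   # (n_support, 2h)
    hidden = np.maximum(dense(pairs, Wr1, br1), 0.0)           # relation module hidden layer
    scores = 1.0 / (1.0 + np.exp(-dense(hidden, Wr2, br2)))    # sigmoid relation score per pair
    scores = scores.ravel()
    return support_y[int(scores.argmax())], scores

# Hypothetical sizes: 12 input features, 8 embedding units, 6 relation hidden units.
rng = np.random.default_rng(0)
d, h, r = 12, 8, 6
params = (rng.normal(scale=0.1, size=(d, h)), np.zeros(h),
          rng.normal(scale=0.1, size=(2 * h, r)), np.zeros(r),
          rng.normal(scale=0.1, size=(r, 1)), np.zeros(1))
support_x = rng.normal(size=(5, d))        # 1-shot support set covering 5 activities
support_y = np.arange(5)
query = rng.normal(size=d)
pred, scores = relation_predict(query, support_x, support_y, params)
```

Because every query is paired with every support instance, the cost of one prediction grows with the size of the support set, which is consistent with the longer inference time the paper reports for the 5-shot setting than for the 1-shot setting.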

5 Comparative Study

We compare the performance of the Personalised Meta-Learning algorithms, Personalised MAML and Personalised RN, against a number of baselines and state-of-the-art algorithms, as listed below:

DL:

Best performing DL algorithm from benchmark performances published in Wijekoon et al. (2019)

MN:

Matching Networks from Vinyals et al. (2016); Few-shot Learning classifier

MAML:

Model-Agnostic Meta-Learner Finn et al. (2017) (detailed in Section 3); state-of-the-art for Few-shot Image classification

RN:

Relation Networks Sung et al. (2018) (detailed in Section 3); state-of-the-art for Few-shot Image classification

Personalised MN:

Personalised Matching Networks from Wijekoon et al. (2020); Few-shot Learning classifier, state-of-the-art for personalised HAR

Personalised MAML (Ours):

Personalised MAML introduced in Section 4.2

Personalised RN (Ours):

Personalised Relation Networks introduced in Section 4.3

5.1 Datasets and Pre-processing

We use three data sources to create 9 datasets in single modality sensing. Both Personalised MAML and Personalised RN are model-agnostic, such that the feature representation learning models are interchangeable to suit any modality combination.

MEx (https://archive.ics.uci.edu/ml/datasets/MEx) is a Physiotherapy Exercises dataset compiled with 30 participants performing 7 exercises. A participant performs one exercise for only 60 seconds. A depth camera (DC), a pressure mat (PM) and two accelerometers on the wrist (ACW) and the thigh (ACT) provide four sensor data streams, creating four datasets. The PAMAP2 dataset (http://archive.ics.uci.edu/ml/datasets/pamap2+physical+activity+monitoring) contains 8 Activities of Daily Living recorded with 8 participants. Three accelerometers on the hand (H), the chest (C) and the ankle (A) provide three sensor data streams, creating three datasets. selfBACK (https://github.com/rgu-selfback/Datasets) is a HAR dataset with 6 ambulatory and 3 stationary activities. These activities are recorded with 33 participants using two accelerometers on the wrist (W) and the thigh (T), creating two datasets.

A sliding window method is applied on each sensor data stream to obtain data instances. A window size of 5 seconds is applied for all 9 datasets, and overlaps of 3, 1 and 2.5 seconds for the data sources MEx, PAMAP2 and selfBACK resulted in 30, 76 and 88 data instances per person-activity on average. A few pre-processing steps are applied on data instances, specific to their sensor modalities. DC and PM modalities use a reduced frame rate, and the DC frame size is also reduced. For accelerometer data, a DCT feature transformation is applied on every 1 second data slice of each axis and the 60 most prominent DCT coefficients are selected. The resulting input sizes for the DC, PM and AC modalities are given in Section 5.2.
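The accelerometer pre-processing can be sketched as below. Several details are assumptions made for the example and are not stated in the text: a 100 Hz tri-axial stream, "most prominent" read as the first 60 (lowest-frequency) coefficients, and an unnormalised DCT-II written out by hand so the sketch depends only on NumPy.

```python
import numpy as np

def sliding_windows(stream, rate_hz, window_s=5.0, overlap_s=2.5):
    """Split a (n_samples, n_axes) stream into overlapping fixed-length windows."""
    size = int(window_s * rate_hz)
    step = int((window_s - overlap_s) * rate_hz)
    return [stream[s:s + size] for s in range(0, len(stream) - size + 1, step)]

def dct2(x):
    """Unnormalised DCT-II of a 1-D signal: y[k] = 2 * sum_i x[i] * cos(pi * k * (2i + 1) / (2n))."""
    n = len(x)
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    return 2.0 * (np.cos(np.pi * (2 * i + 1) * k / (2 * n)) @ x)

def dct_features(window, rate_hz, n_coeff=60):
    """DCT per 1-second slice and per axis, keeping the first n_coeff coefficients of each."""
    feats = []
    for start in range(0, len(window), rate_hz):
        for axis in range(window.shape[1]):
            feats.append(dct2(window[start:start + rate_hz, axis])[:n_coeff])
    return np.concatenate(feats)

# Synthetic stand-in for one 60-second exercise recording at an assumed 100 Hz, 3 axes.
stream = np.random.default_rng(0).normal(size=(6000, 3))
windows = sliding_windows(stream, rate_hz=100, window_s=5.0, overlap_s=3.0)
features = dct_features(windows[0], rate_hz=100)
```

Under these assumptions a 5-second window yields 5 slices x 3 axes x 60 coefficients = 900 features, which agrees with the accelerometer input size of 900 quoted in Section 5.2.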

5.2 Implementation

Table 1: Best performing DL Network Architectures of the 9 datasets (columns: Datasets, Architecture). Legend: TimeDistributed layer, Convolutional layer (number of kernels and kernel size), MaxPooling layer (pool size), Dense layer (number of units), BatchNormalisation.
Figure 5: Network Architectures

DL benchmark results with the MEx datasets are published in Wijekoon et al. (2019) and for comparability we implement the same network architectures and evaluate with the best performing baselines for PAMAP2 and selfBACK datasets. We detail the selected network architectures in Table 1.

We use the network in Figure 5(a) as the feature representation learning model in the MN, Personalised MN, RN and Personalised RN algorithms, where the input sizes for the Accelerometer, Depth camera and Pressure mat modalities are reshaped to 900, 960 and 1280 respectively. MAML and Personalised MAML use the network in Figure 5(b) to learn the feature representations, where the inputs for the Accelerometer, Depth camera and Pressure mat modalities are single-channel arrays. The relation module uses the network in Figure 5(c), where the input is twice the size of the output of the feature representation module (see Figure 3 for the complete network architecture).

We use the best performing hyper-parameter settings for and from Wijekoon et al. (2020) where the networks are 5-shot classifiers, trained for 20 epochs with early stopping using Categorical Cross-entropy as the objective. We train and similar to , but using Mean Squared Error as the objective function Sung et al. (2018) where , , trained for 300 epochs and apply early stopping. Our initial empirical evaluations showed that using and trained using Categorical Cross-entropy yields comparable results and achieves model convergence faster, compared to using Mean Squared Error. and are using Categorical Cross-entropy as the objective function and use , and for training and testing. We use and , and is trained for 100 epochs. All models are trained with the Adam optimiser and the Meta-Learning models do not use mini-batching.

5.3 Evaluation Methodology

We follow the person-aware evaluation methodology, Leave-One-Person-Out (LOPO), in our experiments. We leave out data from one person to create meta-test tasks and use the rest to create meta-train tasks. We note that during testing, even the conventional algorithms (MN, MAML and RN) preserve the personalisation aspect because of the LOPO evaluation strategy, where only one user is present in the meta-test tasks. The meta-train and meta-test tasks are created while maintaining class balance; accordingly we report the accuracy of each experiment averaged over the number of person folds. The LOPO evaluation methodology requires a non-parametric statistical significance test, as it produces results that are not normally distributed. We use the Wilcoxon signed-rank test for paired samples to evaluate the statistical significance at 95% confidence and highlight the significantly improved performances in bold text.

5.4 Results

Algorithm                  MEx (accelerometer)  MEx (accelerometer)  MEx (visual)  MEx (visual)
DL                         0.9015               0.6335               0.8720        0.7408
MN                         0.9073               0.4620               0.5065        0.6187
Personalised MN            0.9155               0.6663               0.9342        0.8205
MAML                       0.8673               0.6525               0.9629        0.9283
RN                         0.9327               0.7279               0.8189        0.8145
Personalised MAML (Ours)   0.9106               0.6834               0.9795        0.9408
Personalised RN (Ours)     0.9436               0.7719               0.9205        0.8520
Table 2: Personalised Meta-Learner performance comparison for Exercise Recognition

Table 2 presents the comparison of performances obtained by the algorithms in Section 5 for the Exercise Recognition (ExRec) task using the four datasets derived from MEx. As expected, personalised Meta-Learning models significantly outperformed conventional DL and Meta-Learning models in all four experiments. Notably, the two datasets with accelerometer data recorded their best performance with Personalised RN, while the two datasets with visual data recorded their best performance with Personalised MAML. It is noteworthy that the Personalised Few-shot Learning algorithm, Personalised MN, achieves performance comparable to a Personalised Meta-Learner on one MEx dataset and outperforms a Personalised Meta-Learner on another. When comparing conventional Meta-Learners (i.e. MAML, RN) and the Personalised Few-shot Learner, Personalised MN, we highlight that Personalised MN models achieve comparable performance to, or significantly outperform, at least one conventional Meta-Learner in all four experiments. These results further confirm the importance of personalisation for ExRec.

Table 3 presents results for Ambulatory, Stationary and ADL activities using the 5 datasets from PAMAP2 and selfBACK. Similar to ExRec, Personalised Meta-Learning models have significantly outperformed conventional DL models with both ambulatory and stationary activity data. Notably, two experiments with ADL data have significantly outperformed DL models with at least one of the Personalised Meta-Learner models (both Personalised MAML and Personalised RN on one PAMAP2 dataset, and Personalised RN on another). However, Personalised Meta-Learner models fail to outperform DL models on the remaining PAMAP2 dataset. All Personalised RN models significantly outperform their original counterpart, RN, and significantly outperform Personalised MAML in four experiments, with the exception of one PAMAP2 dataset where performance is comparable with Personalised MAML. While two of the five experiments significantly outperform Personalised MN, three experiments fail to outperform Personalised MN. But all experiments achieve their best performance with a personalised algorithm, further confirming the significance of Personalisation in different domains of HAR. Personalised MN’s use of a simpler similarity metric (such as cosine) has proven to be sufficient for PAMAP2 in particular, compared to the sophisticated similarity model learnt with Personalised RN.

Algorithm                  selfBACK  selfBACK  PAMAP2  PAMAP2  PAMAP2
DL (Best)                  0.7880    0.6997    0.7505  0.7878  0.8075
MN                         0.8392    0.7669    0.6625  0.7536  0.7361
Personalised MN            0.9124    0.8653    0.7484  0.8548  0.8330
MAML                       0.8398    0.7532    0.7593  0.7626  0.6830
RN                         0.9334    0.8276    0.7818  0.8170  0.7527
Personalised MAML (Ours)   0.8625    0.8075    0.8037  0.7822  0.7256
Personalised RN (Ours)     0.9487    0.8528    0.7868  0.8294  0.7761
Table 3: Personalised Meta-Learner performance comparison for Ambulatory, Stationary and ADL Activity Recognition

Overall, considering all personalisation algorithms, we find that experiments with visual data prefer the optimisation based meta-learning algorithm (i.e. Personalised MAML) and experiments with time-series data prefer learning to compare methods (i.e. Personalised RN and Personalised MN). It is noteworthy that Personalised RN and Personalised MN use a 1-dense layer network (Figure 5) and Personalised MAML uses a 1-convolution layer network for feature representation learning while achieving significant performance improvements. While meta-learning contributes to this improvement, it is further enhanced by the personalised learning strategies. These results highlight that Meta-Learners and Personalisation are positively contributing towards eliminating the need for parametric models with many deep layers that require a large labelled data collection for training. This is highly significant in the domain of HAR, where even a comprehensive data collection fails to cover all possible personal nuances and traits that a reasoning model may encounter in the future.

Evidently, model adaptation by re-training with a few data instances at test time significantly improved meta-model performance for Personalised MAML, suggesting that the learnt meta-model was transferable; obtaining an activity label for a given test query is then a simple inference task using the adapted model. While Personalised RN does not require model re-training, obtaining the activity class label for a given query involves a more complex inference process: each data instance in the user-provided support set and the query instance are converted to feature vectors and later concatenated (as described in Section 4.3) to obtain the relation scores and derive the activity class label. We calculate the average time elapsed for obtaining an activity label for a MEx query data instance, using both algorithms on a computer with 8GB RAM and a 3.1 GHz dual-core processor. While Personalised MAML takes 0.0156 milliseconds for a single classification task, Personalised RN takes 2.4982 milliseconds in a 1-shot setting and 3.7218 milliseconds in a 5-shot setting. This is an important difference when selecting an algorithm to operate on an edge device with limited resources, where a high response rate is necessary while maintaining performance.
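The comparison-based inference described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the single linear relation layer `W_r`, the random placeholder weights and the choice to average relation scores per class are all assumptions made for illustration. What it does show is why the cost of one classification grows with the support set, since every query is paired with every support instance.

```python
import numpy as np

def embed(x, W_f):
    """Hypothetical feature extractor: one dense layer followed by ReLU."""
    return np.maximum(0.0, x @ W_f)

def relation_scores(query, support_x, W_f, W_r):
    """Score the query against every support instance.

    Each support embedding is concatenated with the query embedding and
    passed through a (placeholder) linear relation layer and a sigmoid.
    """
    q = embed(query, W_f)                        # shape (d,)
    s = embed(support_x, W_f)                    # shape (n_support, d)
    pairs = np.concatenate([s, np.tile(q, (len(s), 1))], axis=1)
    return 1.0 / (1.0 + np.exp(-(pairs @ W_r)))  # shape (n_support,)

def classify(query, support_x, support_y, W_f, W_r):
    """Return (predicted class, number of pairwise comparisons)."""
    scores = relation_scores(query, support_x, W_f, W_r)
    classes = np.unique(support_y)
    # Assumption: a class score is the mean relation score of its instances.
    class_scores = np.array([scores[support_y == c].mean() for c in classes])
    return classes[int(np.argmax(class_scores))], len(scores)

# Toy setup: 3 activity classes, 5 support instances each, 8 input features.
rng = np.random.default_rng(0)
n_classes, k_shot, n_features, d = 3, 5, 8, 4
W_f = rng.normal(scale=0.1, size=(n_features, d))  # untrained placeholder
W_r = rng.normal(scale=0.1, size=2 * d)            # untrained placeholder
support_x = rng.normal(size=(n_classes * k_shot, n_features))
support_y = np.repeat(np.arange(n_classes), k_shot)
query = rng.normal(size=n_features)

label, n_comparisons = classify(query, support_x, support_y, W_f, W_r)
```

With these toy sizes a single query needs 15 pairwise comparisons (3 classes times 5 instances), against 3 in a 1-shot setting, which is the kind of growth that lengthens the per-query response time.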

6 Conventional vs. Personalised Meta-Learners

Here we look closer at training aspects to understand how personalisation improves the performance of Meta-Learners using the MEx dataset. We select MEx because it is representative of a few-shot learning HAR dataset with only 30 data instances for each “person-activity” class.

6.1 MAML vs. Personalised MAML

We first investigate the performance improvements achieved by Personalised MAML after using the Personalised Meta-Learning methodology against MAML. Accordingly we compare three algorithms: MAML, where meta-train and test tasks are created disregarding any person identifiers; Personalised MAML, as described in Section 4.2; and Person-aware MAML. Here Person-aware MAML can be seen as a lazy personalisation of MAML where a meta-train task comprises data instances selected from a set of persons. The support set contains, , data instances for each activity class, where data for any given activity must be obtained from a single person, but different activity classes may obtain data from different persons. The query set will have data from a single person who may not have been selected to form the support set. This method still preserves the concept of “person-activity” only at the class label level, but not over the entire support set level. We visualise the impact of model adaptation at test time using these different algorithms in both, , and , settings on the MEx dataset.
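The three task-construction strategies compared here can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the toy data layout (a dictionary keyed by hypothetical `(person, activity)` pairs) and the `k_shot` / `k_query` parameters are all introduced for the example, and the personalised variant simply follows the "person-activity" idea of drawing an entire task from one person.

```python
import random

def make_toy_data(persons, activities, n_per_class=30):
    """Map each (person, activity) pair to a list of instance ids."""
    return {(p, a): [(p, a, i) for i in range(n_per_class)]
            for p in persons for a in activities}

def conventional_task(data, activities, k_shot, k_query, rng):
    """Person-agnostic task: pool every person's data for each activity."""
    support, query = {}, {}
    for a in activities:
        pool = [x for (_, act), xs in data.items() if act == a for x in xs]
        chosen = rng.sample(pool, k_shot + k_query)
        support[a], query[a] = chosen[:k_shot], chosen[k_shot:]
    return support, query

def person_aware_task(data, persons, activities, k_shot, k_query, rng):
    """Each activity's support data comes from one person (possibly a
    different person per activity); the query set comes from one person."""
    query_person = rng.choice(persons)
    support, query = {}, {}
    for a in activities:
        support_person = rng.choice(persons)
        support[a] = rng.sample(data[(support_person, a)], k_shot)
        # Avoid reusing support instances if the same person was drawn.
        candidates = [x for x in data[(query_person, a)] if x not in support[a]]
        query[a] = rng.sample(candidates, k_query)
    return support, query

def personalised_task(data, persons, activities, k_shot, k_query, rng):
    """Support and query sets are all drawn from a single person."""
    person = rng.choice(persons)
    support, query = {}, {}
    for a in activities:
        chosen = rng.sample(data[(person, a)], k_shot + k_query)
        support[a], query[a] = chosen[:k_shot], chosen[k_shot:]
    return support, query

rng = random.Random(0)
persons = ["p1", "p2", "p3"]
activities = ["squat", "lunge"]
data = make_toy_data(persons, activities)
support, query = personalised_task(data, persons, activities, 5, 5, rng)
```

The only difference between the three functions is where the person identifier is respected: nowhere, per activity class, or across the whole task.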

Figure 6: MAML vs. Person-aware MAML vs. MAML with MEx when
Figure 7: MAML vs. Person-aware MAML vs. MAML with MEx when

Figures 6 and 7 compare MAML, Personalised MAML and Person-aware MAML using MEx in and settings respectively. Here we plot test-person accuracy (y-axis) evaluated at every 10 meta-train epochs (2nd row of the x-axis); at each of these evaluation points, the meta-test support set is used to adapt the current meta-model for a further 10 meta-gradient steps (1st row of the x-axis). During the adaptation steps we also record accuracy using the meta-test query. Through this process we can observe the impact of the partially optimised general meta-model when being adapted for personalisation at test time (or at deployment) at different stages of optimisation.

Personalised MAML and Person-aware MAML significantly outperformed MAML in both settings. When comparing Personalised MAML and Person-aware MAML, the Personalised MAML algorithm achieves a more generalised meta-model even without performing meta-gradient steps for meta-model adaptation (0 on the 1st row of the x-axis); this is most significant in the setting. These observations verify the advantage of creating personalised tasks: even the Person-aware algorithm, where each task contains data from multiple people but each “person-activity” class only contains data from one person, has clear benefits. Accordingly, personalised algorithms ensure that the task-models are trained for a set of “person-activity” classes instead of “activity” classes. Personalised MAML, where all “person-activity” data belongs to the same person, provides further generalisation with rapid adaptation.

Another indication of the significance of personalisation is found when investigating performance over the training epochs. While MAML improves overall performance as the meta-model trains, meta-test accuracy before adaptation (every meta-gradient step), declines consistently. This is most significant when , which indicates that the meta-model learned with MAML is not generalised when an activity class in a meta-train task support set contains data from multiple people. In comparison, the meta-model learned with Personalised MAML performs well on meta-test tasks, even before adaptation.
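The evaluation protocol behind these plots (adapt a copy of the current meta-model on the meta-test support set for 10 gradient steps, recording query accuracy before and after each step) can be sketched as below. This is only an illustration under assumptions: a linear softmax classifier stands in for the actual network, plain gradient descent stands in for the optimiser, and the names `adaptation_curve`, `lr` and the toy data are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def accuracy(W, b, X, y):
    return float((np.argmax(X @ W + b, axis=1) == y).mean())

def adaptation_curve(W, b, Xs, ys, Xq, yq, steps=10, lr=0.5):
    """Adapt a copy of the meta-model on the support set (Xs, ys) and
    return query accuracy at step 0 (before adaptation) and after each step."""
    W, b = W.copy(), b.copy()              # leave the meta-model untouched
    Y = np.eye(W.shape[1])[ys]             # one-hot support labels
    curve = [accuracy(W, b, Xq, yq)]
    for _ in range(steps):
        P = softmax(Xs @ W + b)
        W -= lr * Xs.T @ (P - Y) / len(Xs) # cross-entropy gradient w.r.t. W
        b -= lr * (P - Y).mean(axis=0)     # cross-entropy gradient w.r.t. b
        curve.append(accuracy(W, b, Xq, yq))
    return curve

# Toy test person with two well-separated activity classes.
Xs = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
ys = np.array([0, 0, 1, 1])
Xq = np.array([[2.5, 2.5], [-2.5, -2.5]])
yq = np.array([0, 1])
meta_W, meta_b = np.zeros((2, 2)), np.zeros(2)

curve = adaptation_curve(meta_W, meta_b, Xs, ys, Xq, yq)
```

In a full experiment this function would be called at every tenth meta-train epoch with the current meta-model, so that the first entry of each curve tracks how well the meta-model generalises before any adaptation.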

6.2 RN vs. Personalised RN

Similarly we compare the performance of the two algorithms Relation Networks (RN) and Personalised RN to understand the effect of personalisation on network training and testing. For this purpose we train the two algorithms with the MEx dataset in two settings, and , for 300 epochs, and evaluate the model at every 10 epochs using meta-test tasks.

Figure 8: RN vs. Personalised RN with MEx; meta-model tested at every 10 meta-train epochs

Figures 8(a) and 8(b) plot the test accuracy on meta-test tasks obtained at every 10 meta-train epochs for the two algorithms RN and Personalised RN in the two settings and . It is evident that personalisation has stabilised the meta-training process, where the meta-test model performs consistently better with Personalised RN models. In contrast, meta-test evaluation on the RN models is erratic, especially evident when . When training RN in the setting, a task is created by disregarding the person parameter; as a result, an activity class contains data instances from more than one person, and learning similarities to many people has adversely affected the learning of the meta-model. Similarly, in the setting, when a task contains only one data instance per class, learning from one's own data with Personalised RN is advantageous in comparison to RN, where the data instance for a class is from one person but not strictly similar to the query person.

7 Personalised Meta-Learner Hyper-parameter selection

We explore three hyper-parameters of Personalised Meta-Learners using the 4 datasets from MEx for Exercise Recognition. The 4 MEx datasets give us the opportunity to compare how different modalities recorded for the same set of activities respond in different hyper-parameter settings when adapting to new unseen persons.

7.1 Meta-train query set size comparison for MAML

First we explore the most effective value for training . Originally, experiments used for image classification Finn et al. (2017). As shown in Algorithm 1 line 8, determines how many data instances are considered in the meta-update loss calculation, which later affects the meta-model learning. Given that each meta-train task for now belongs to a specific person, we expect to find the effect of using a smaller or larger number of data instances in meta-task evaluation. Accordingly we explore three values, 5, 10 and , where the is all available data instances for a “person-activity” class (30 on average for all MEx datasets).

Both settings, and , achieved comparable performances with all three values. This suggests that the meta-update is not affected adversely by creating personalised tasks, where the meta-update uses the loss computed from a batch of person-specific task models. We plot the meta-test performance during the 10 meta-gradient steps of meta-model adaptation to visualise the effect of learning a meta-model from different values in Figure 9 (here is set to 5). Both MEx and MEx achieve similar performances with all three meta-models post-adaptation, but the most generalised meta-model is learned when using . This is seen more significantly with the MEx experiment, where outperforms other variants before and during model adaptation (before: 6.55% difference with MEx, 4.8% difference with MEx). These results suggest that limiting in each meta-train task improves generalisation of the meta-model for later adaptation with persons not seen during model training.
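To make concrete where the meta-train query set size enters the meta-update, the following sketch performs one meta-update over a batch of person-specific tasks. It is not the paper's Algorithm 1: it swaps in the first-order approximation of MAML (the query gradient is taken at the adapted weights and applied directly to the meta-weights), uses a bias-free linear softmax model, and the names `fomaml_meta_step` and `query_size` are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def ce_grad(W, X, y):
    """Gradient of the mean cross-entropy loss of a linear softmax model."""
    Y = np.eye(W.shape[1])[y]
    return X.T @ (softmax(X @ W) - Y) / len(X)

def fomaml_meta_step(meta_W, tasks, query_size, rng,
                     inner_lr=0.1, meta_lr=0.1, inner_steps=1):
    """One first-order meta-update over a batch of person-specific tasks.

    tasks: list of (Xs, ys, Xq, yq) tuples, one per person.
    query_size: how many query instances feed the meta-update loss.
    """
    meta_grad = np.zeros_like(meta_W)
    for Xs, ys, Xq, yq in tasks:
        fast_W = meta_W.copy()
        for _ in range(inner_steps):              # task-level adaptation
            fast_W -= inner_lr * ce_grad(fast_W, Xs, ys)
        n = min(query_size, len(Xq))              # limit the query set
        idx = rng.choice(len(Xq), size=n, replace=False)
        meta_grad += ce_grad(fast_W, Xq[idx], yq[idx])
    return meta_W - meta_lr * meta_grad / len(tasks)

# One toy person-specific task with two activity classes.
Xs = np.array([[2.0, 2.0], [-2.0, -2.0]])
ys = np.array([0, 1])
Xq = np.array([[3.0, 3.0], [2.5, 2.5], [-3.0, -3.0], [-2.5, -2.5]])
yq = np.array([0, 0, 1, 1])

rng = np.random.default_rng(0)
meta_W = np.zeros((2, 2))
new_W = fomaml_meta_step(meta_W, [(Xs, ys, Xq, yq)], query_size=2, rng=rng)
```

Varying `query_size` between a small value and all available instances mirrors the comparison made in this section, with the caveat that the full second-order update may behave differently from this first-order stand-in.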

(a) PM
(b) ACT
Figure 9: MAML: Meta-model adaptation with meta-models trained with different sizes

7.2 Support set size comparison for MAML

Next we perform an exploratory study using a range of values. Finding the balance between and performance of is important because, at deployment, the test-person is expected to provide amount of data instances per activity class; therefore it is desirable to keep to a required minimum. We perform experiments with all 4 datasets from MEx for values 1, 3, 5, 7 and 10.

(a) Results of different with each ExRec dataset
(b) Meta-model adaptation with meta-models trained with different values using MEx dataset
Figure 10: MAML: Exploring meta-training with different

Figure 10(a) plots meta-test performances obtained with the four MEx datasets (ACT, ACW, DC and PM). Increasing consistently improves performance up to and yields decreased or similar performance when with all datasets. A significant performance improvement is observed when increasing from 1 to 3. While modalities ACT and ACW achieve their highest performance with , modalities DC and PM gradually improve performance even at . The meta-test accuracy at every meta-gradient update with increasing for MEx and MEx for a randomly selected person is visualised in Figure 10(b). These figures indicate that meta-gradient adaptation converges faster when is smaller. Overall they validate that increasing improves meta-test performance before, during and after adaptation, most significantly when increasing from 1 to 3.

7.3 Support set size comparison for RN

Finally we explore different values to find the optimal value for . Similar to , determines how many data instances are required from a new person for the optimal personalisation of the model; therefore, we try to find the balance between performance improvement and . Accordingly we create 10 experiments with each MEx dataset where ranges from 1 to 10.

Figure 11: RN: Exploring meta-training with different for each ExRec dataset

We plot accuracy against different settings for each dataset in Figure 11. The figure clearly indicates that different sensor modalities prefer different values. Both MEx and MEx exhibit dome-shaped behaviour in increasing settings where after it is detrimental to the network to have many examples of the same activity class for comparison. In contrast, the MEx dataset shows continuous performance improvement with increasing and at it starts to dome like the MEx and MEx modalities. We note that MEx behaves differently to others when increasing ; while there is a domed behaviour for from 2 to 8, performances when and are outliers that we will further investigate in future.

As mentioned in Section 5.4, with the Personalised RN algorithm, a larger value not only increases the amount of data instances requested from a test person, but also increases the memory and computational requirements. Larger settings increase the number of comparisons, so a single classification task takes longer to perform. Overall, we find exhibit a proper balance between performance and memory requirements.

8 Discussion

The comparative results from Section 5.4 show that while Personalised RN achieves the best performance with many HAR datasets, its response rate is 248 times slower compared to Personalised MAML. An HAR algorithm should be able to recognise activities as they are performed in real-time for the best user-experience, and the processor and memory requirements along with the response time are crucial considerations for edge device deployment. In comparison, Personalised MAML requires post-deployment model re-training, which requires the algorithm to run in a development-friendly environment using libraries like TensorFlow Lite or PyTorch Mobile.

A limitation of Personalised MAML, and MAML in general, is the inability to perform open-ended HAR. Both MAML and RN perform Zero-shot Learning for image classification Finn et al. (2017); Sung et al. (2018) for a fixed class length. Specifically, MAML is restricted to performing multi-class classification with a conventional soft-max layer; for instance, 5 outputs for a 5-class (5-way) classification task. Open-ended HAR requires dynamic expansion of the decision layer as the person adds new activities in addition to the activities that are already included. Few-shot classifiers such as Matching Networks (MN) Vinyals et al. (2016) do not have a strict decision layer, which inspired Open-ended MN Wijekoon et al. (2020) for Open-ended HAR. Similarities of Relation Networks (RN) to MN present the opportunity to improve Open-ended HAR using RN, which we will explore in future.

When a Personalised Meta-Learning model is trained and embedded in the fitness application, there is an initial configuration step that is required for collecting the calibration data (i.e. support set) of the end-user. The end-user will be instructed to record a few seconds of data for each activity using the sensor modalities synchronised with the fitness application. This is similar to demographic configurations users perform when installing new fitness applications (on-boarding). Thereafter this support set will be used by the algorithm either to re-train the model (Personalised MAML) or for comparison (Personalised RN). Both Personalised MAML and Personalised RN provide the opportunity to supply new calibration data to improve performance if the physiology of the user changes. Such changes include a gait change, a disability or a dramatic weight change that affects their personal activity rhythms.

9 Conclusion

In this paper, we presented Personalised Meta-Learning, a methodology for learning to optimise for personalisation of Human Activity Recognition (HAR) using only a few labelled data. This is achieved by treating the “person-activity” pair in a HAR dataset as a class label, where each class now only has a few instances of data for training. Accordingly, we implement Personalised Meta-Learning with two Meta-Learning algorithms for few-shot classification, Personalised MAML and Personalised Relation Networks, where a meta-model is learned such that it can be rapidly adapted to any person not seen during training. Both algorithms require only a few instances of calibration data from the end-user to personalise the meta-model, where at deployment, Personalised MAML uses calibration data for adaptation with model re-training and Personalised RN uses calibration data directly for matching (without re-training). Our evaluation with 9 HAR datasets shows that both algorithms achieve significant performance improvements in a range of HAR domains while outperforming conventional Deep Learning, Few-shot Learning and Meta-Learning algorithms. We highlight that personalisation achieves higher model generalisation compared to non-personalised Meta-Learners, which results in faster model adaptation. Importantly, we find that while Personalised RN outperforms Personalised MAML with a majority of HAR datasets, Personalised MAML is significantly faster than Personalised RN due to the gains over paired matching.


  • M. Berchtold, M. Budde, D. Gordon, H. R. Schmidtke, and M. Beigl (2010) Actiserv: activity recognition service for mobile phones. In International Symposium on Wearable Computers (ISWC) 2010, pp. 1–8. Cited by: §2.
  • C. Finn, P. Abbeel, and S. Levine (2017) Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pp. 1126–1135. Cited by: §1, §2, §2, §3, §3, §4.2, item MAML:, §7.1, §8.
  • A. Graves, G. Wayne, and I. Danihelka (2014) Neural turing machines. arXiv preprint arXiv:1410.5401. Cited by: §2.
  • B. Longstaff, S. Reddy, and D. Estrin (2010) Improving activity classification for health applications on mobile devices using active and semi-supervised learning. In 2010 4th International Conference on Pervasive Computing Technologies for Healthcare, pp. 1–7. Cited by: §1, §2.
  • N. Mishra, M. Rohaninejad, X. Chen, and P. Abbeel (2017) A simple neural attentive meta-learner. arXiv preprint arXiv:1707.03141. Cited by: §2, §2.
  • T. Miu, P. Missier, and T. Plötz (2015) Bootstrapping personalised human activity recognition models using online active learning. In 2015 IEEE International Conference on Computer and Information Technology; Ubiquitous Computing and Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing, pp. 1138–1147. Cited by: §2.
  • A. Nichol, J. Achiam, and J. Schulman (2018) On first-order meta-learning algorithms. arXiv preprint arXiv:1803.02999. Cited by: §1, §2, §2, §3.
  • F. J. Ordóñez and D. Roggen (2016) Deep convolutional and lstm recurrent neural networks for multimodal wearable activity recognition. Sensors 16 (1), pp. 115. Cited by: §2.
  • A. Santoro, S. Bartunov, M. Botvinick, D. Wierstra, and T. Lillicrap (2016) Meta-learning with memory-augmented neural networks. In International conference on machine learning, pp. 1842–1850. Cited by: §2.
  • J. Snell, K. Swersky, and R. Zemel (2017) Prototypical networks for few-shot learning. In Advances in neural information processing systems, pp. 4077–4087. Cited by: §1.
  • X. Sun, H. Kashima, and N. Ueda (2012) Large-scale personalized human activity recognition using online multitask learning. IEEE Transactions on Knowledge and Data Engineering 25 (11), pp. 2551–2563. Cited by: §2.
  • F. Sung, Y. Yang, L. Zhang, T. Xiang, P. H. Torr, and T. M. Hospedales (2018) Learning to compare: relation network for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1199–1208. Cited by: §1, §2, §2, §2, §3, §3, item RN:, §5.2, §8.
  • E. M. Tapia, S. S. Intille, W. Haskell, K. Larson, J. Wright, A. King, and R. Friedman (2007) Real-time recognition of physical activities and their intensities using wireless accelerometers and a heart rate monitor. In 2007 11th IEEE international symposium on wearable computers, pp. 37–40. Cited by: §1, §2.
  • O. Vinyals, C. Blundell, T. Lillicrap, D. Wierstra, et al. (2016) Matching networks for one shot learning. In Advances in neural information processing systems, pp. 3630–3638. Cited by: §1, §2, §2, item MN:, §8.
  • J. Wang, Y. Chen, S. Hao, X. Peng, and L. Hu (2019) Deep learning for sensor-based activity recognition: a survey. Pattern Recognition Letters 119, pp. 3–11. Cited by: §2.
  • A. Wijekoon, N. Wiratunga, and K. Cooper (2019) MEx: multi-modal exercises dataset for human activity recognition. arXiv preprint arXiv:1908.08992. Cited by: §2, item DL:, §5.2.
  • A. Wijekoon, N. Wiratunga, S. Sani, and K. Cooper (2020) A knowledge-light approach to personalised and open-ended human activity recognition. Knowledge-based systems 192, pp. 105651. Cited by: §2, §2, §4.3, item MN:, §5.2, §8.
  • S. Yao, S. Hu, Y. Zhao, A. Zhang, and T. Abdelzaher (2017) Deepsense: a unified deep learning framework for time-series mobile sensing data processing. In Proceedings of the 26th International Conference on World Wide Web, pp. 351–360. Cited by: §2.