Ensemble Machine Learning Model Trained on a New Synthesized Dataset Generalizes Well for Stress Prediction Using Wearable Devices

by Gideon Vos, et al.

Introduction. We investigate the generalization ability of models built on datasets containing a small number of subjects, recorded in single study protocols. Next, we propose and evaluate methods for combining these datasets into a single, large dataset. Finally, we propose and evaluate an ensemble technique that combines gradient boosting with an artificial neural network, and measure its predictive power on new, unseen data.

Methods. Sensor biomarker data from six public datasets were utilized in this study. To test model generalization, we developed a gradient boosting model trained on one dataset (SWELL) and tested its predictive power on two datasets previously used in other studies (WESAD, NEURO). Next, we merged four small datasets (SWELL, NEURO, WESAD, UBFC-Phys) to provide a combined total of 99 subjects. In addition, we utilized random sampling combined with another dataset (EXAM) to build a larger training dataset consisting of 200 synthesized subjects. Finally, we developed an ensemble model that combines our gradient boosting model with an artificial neural network, and tested it on two additional, unseen publicly available stress datasets (WESAD and Toadstool).

Results. Our method delivers a robust stress measurement system capable of achieving 85 [...] 25 [...]

Conclusion. Models trained on small, single study protocol datasets do not generalize well for use on new, unseen data and lack statistical power. Machine learning models trained on a dataset containing a larger number of varied study subjects capture physiological variance better, resulting in more robust stress detection.
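The merging and ensemble steps described under Methods can be illustrated with a minimal sketch. The abstract does not state how the datasets were aligned or how the two models' outputs were combined, so everything here is an assumption made for illustration: the record layout, the helper names (`merge_datasets`, `soft_vote`, `to_labels`), and the choice of weighted probability averaging (soft voting) as the combination rule. The probabilities are toy values standing in for the outputs of a gradient boosting model and a neural network.

```python
def merge_datasets(datasets):
    """Combine per-study records into one list.

    `datasets` maps a study name to a list of record dicts that each carry
    a "subject" key (hypothetical layout). Subject IDs are prefixed with the
    study name so subjects from different studies cannot collide.
    """
    merged = []
    for study, records in datasets.items():
        for rec in records:
            merged.append({**rec, "subject": f"{study}:{rec['subject']}", "source": study})
    return merged


def soft_vote(prob_a, prob_b, weight_a=0.5):
    """Weighted average of two models' predicted stress probabilities.

    Soft voting is an assumed combination rule, not one taken from the paper.
    """
    if len(prob_a) != len(prob_b):
        raise ValueError("both models must score the same samples")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(prob_a, prob_b)]


def to_labels(probs, threshold=0.5):
    """Turn probabilities into binary labels (1 = stressed, 0 = not stressed)."""
    return [int(p >= threshold) for p in probs]


# Toy records: two studies that both happen to use the subject ID "S1".
toy = {
    "SWELL": [{"subject": "S1", "hr": 72.0, "label": 0}],
    "WESAD": [{"subject": "S1", "hr": 95.0, "label": 1}],
}
merged = merge_datasets(toy)

# Toy probabilities standing in for the two trained models' outputs.
gbm_probs = [0.9, 0.2]
ann_probs = [0.7, 0.4]
ensemble_probs = soft_vote(gbm_probs, ann_probs)
predictions = to_labels(ensemble_probs)
```

Prefixing subject IDs matters when evaluation is grouped by subject, since otherwise two different people with the same ID in different studies would be treated as one. The `weight_a` parameter is a tunable assumption; equal weighting is only a neutral starting point.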




Machine Learning for Stress Monitoring from Wearable Devices: A Systematic Literature Review

Introduction. The stress response has both subjective, psychological and...

Baby Llama: knowledge distillation from an ensemble of teachers trained on a small dataset with no performance penalty

We present our proposed solution to the BabyLM challenge [arXiv:2301.117...

On the Generalizability of ECG-based Stress Detection Models

Stress is prevalent in many aspects of everyday life including work, hea...

Forecasting COVID-19 spreading trough an ensemble of classical and machine learning models: Spain's case study

In this work we evaluate the applicability of an ensemble of population ...

Semi-Supervised Learning and Data Augmentation in Wearable-based Momentary Stress Detection in the Wild

Physiological and behavioral data collected from wearable or mobile sens...

An optimized hybrid solution for IoT based lifestyle disease classification using stress data

Stress, anxiety, and nervousness are all high-risk health states in ever...

Comparative Study on the Effects of Noise in ML-Based Anxiety Detection

Wearable health devices are ushering in a new age of continuous and noni...
