Smart Fog: Fog Computing Framework for Unsupervised Clustering Analytics in Wearable Internet of Things

12/25/2017 ∙ by Debanjan Borthakur, et al. ∙ University of Rhode Island

The increasing use of wearables in smart telehealth generates heterogeneous medical big data. Cloud and fog services process these data to assist clinical procedures, and IoT-based e-healthcare has greatly benefited from efficient data processing. This paper proposes and evaluates the use of low-resource machine learning on Fog devices kept close to the wearables for smart healthcare. In state-of-the-art telecare systems, the signal processing and machine learning modules are deployed in the cloud to process physiological data. We developed a prototype of Fog-based unsupervised machine learning for big data analysis that discovers patterns in physiological data. We employed Intel Edison and Raspberry Pi as the Fog computers in the proposed architecture. We performed validation studies on real-world pathological speech data from in-home monitoring of patients with Parkinson's disease (PD). The proposed architecture employed machine learning to analyze pathological speech data obtained from smartwatches worn by the patients with PD. Results showed that the proposed architecture is promising for low-resource clinical machine learning. It could be useful for other applications within wearable IoT for smart telehealth scenarios by translating machine learning approaches from the cloud backend to edge computing devices such as Fog.




1 Introduction

As described in [1], Fog is a new architecture for computing, storage, control and networking that brings these services closer to end users. In simple words, it decentralizes services to the edge of the network. Placing computation and control closer to the sensors makes Fog an attractive alternative to the cloud. In our proposed Smart Fog architecture, we leveraged the idea of Fog for speech signal processing in telehealth monitoring. Speech signal processing and machine learning are fundamental building blocks for the detection and evaluation of speech disorders such as dysarthria in patients with Parkinson’s disease, which affects a significant portion of the world population. Telehealth monitoring is very effective for speech-language pathology, and smart devices like EchoWear [2] can be useful in such situations. Several findings indicate a relationship between dysarthria, speech prosody, and acoustic features. As the authors of [3] mention, the dysarthria that accompanies Parkinson’s disease is characterized by monotony of speech, reduced stress, variable rate, imprecise consonants, and a breathy and harsh voice. The authors of [4, 5] suggested that speakers with severe dysarthria exhibit extreme F0 variation and range. Another important acoustic feature for dysarthria is the amplitude of the speech uttered by patients with Parkinson’s disease. In [6], the authors report reduced vocal intensity in hypokinetic dysarthria in Parkinson’s disease. This paper presents a Fog computing architecture, SmartFog, that relies on unsupervised clustering to discover patterns in pathological speech data obtained from patients with Parkinson’s disease (PD). The patients with PD wear a smartwatch while performing speech exercises at home. The speech data are routed to the Fog computer via a nearby tablet or smartphone. The Fog computer extracts loudness and fundamental frequency features for quantifying pathological speech.
The speech features were normalized and processed with k-means clustering. When an abnormal change in the features is observed, the results are uploaded to the cloud; otherwise, the data are processed only locally. In this way, the Fog device can make a "smart" decision about when to upload data to the cloud backend. We developed two prototypes, one using Intel Edison and one using Raspberry Pi, and used both for a comparative analysis of computation time. Both systems were tested on real-world pathological speech data from telemonitoring of patients with Parkinson’s disease. The increasing use of wearables in smart telehealth systems has led to the generation of large volumes of medical data [7, 8, 9, 10]. Telehealth services leverage these data to assist clinical procedures. This paper suggests the use of low-resource machine learning on Fog devices kept close to the wearable for smart telehealth. In traditional telecare systems, the signal processing and machine learning modules that process physiological data are deployed in the cloud. In our analysis, we chose the average fundamental frequency (F0) in hertz and the average intensity in decibels as inputs to K-means clustering. The algorithm, which runs on the Fog platform, clusters the unlabeled data into groups of similar samples. One use of this analysis is real-time phenotypic sub-grouping of Parkinson’s patients based on the clusters.
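The upload decision described in this section can be illustrated with a minimal sketch. The paper does not state how an "abnormal change" is detected, so the z-score rule, the threshold of 3.0, the function names, and the baseline values below are all assumptions made for illustration only.

```python
from statistics import mean, stdev


def z_scores(sample, baseline):
    """Normalize one (f0_hz, intensity_db) sample against baseline recordings."""
    scores = []
    for dim, value in enumerate(sample):
        column = [row[dim] for row in baseline]
        scores.append((value - mean(column)) / stdev(column))
    return scores


def should_upload(sample, baseline, threshold=3.0):
    """Return True when any normalized feature deviates beyond the threshold.

    The threshold is a hypothetical parameter, not a value from the paper.
    """
    return any(abs(z) > threshold for z in z_scores(sample, baseline))


# Hypothetical baseline of (average F0 in Hz, average intensity in dB) pairs.
baseline = [(118.0, 62.0), (122.0, 64.0), (120.0, 63.0), (119.0, 61.0), (121.0, 65.0)]

typical = should_upload((120.5, 63.5), baseline)   # close to the baseline
abnormal = should_upload((180.0, 50.0), baseline)  # far from the baseline
print(typical, abnormal)
```

With a rule of this kind, only the second sample would be sent to the cloud backend; the first would stay on the Fog device.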

Figure 2: K-means clustering plot

2 Related Works

2.1 Telehealth and Associated Challenges

The Fog architecture shifts computation, networking, and storage to the edge of the network. Various authors have described different architectures for Fog. FIT, as described in [11], has the following components: (1) smartwatch; (2) Fog computer; and (3) cloud backend. In [12], the authors conceptualize the Wearable Internet of Things (WIoT) in terms of its design, function, and applications. The paper [13] demonstrates Fog Data, a service-oriented architecture for Fog computing. This literature emphasizes the importance and versatility of Fog computing. The challenges facing IoT that are described in [1] include stringent low-latency requirements, which IoT applications such as gaming and virtual reality demand. Limited network bandwidth and resource-constrained devices are further challenges for the emerging field of IoT. Hence the importance of Fog, which distributes computing, control, storage and networking functions closer to the end user [1].

2.2 Big data and Telehealth

Telehealth utilizes recent developments in big data in the biomedical and healthcare context. Fields such as medical and health informatics, translational bioinformatics, and sensor informatics can benefit from personalized information drawn from a diverse range of data sources [14, 15]. The authors of [13] propose, validate and evaluate the Fog Data architecture for Fog computing. The proposed architecture is built around a low-power embedded computer that carries out data mining and analysis on data collected from various wearable sensors used in telehealth applications. The authors of [16] describe the European project ’PERFORM’, a sophisticated multi-parametric system for the continuous, effective assessment and monitoring of motor status in Parkinson’s disease and other neurodegenerative diseases. It provides a telehealth system for remote monitoring of patients with Parkinson’s disease. The paper also summarizes the technical performance of the system and the feedback received from patients in terms of usability and wearability. In our work, we used Parkinson’s speech data analysis for the proposed Smart Fog framework.

2.3 Wearable Internet of Things for Telehealth

The IoT device that interacts with the fog node is composed of sensors capable of collecting data and transmitting it wirelessly. IoT allows objects to be handled remotely across the network. The versatility of IoT makes it suitable for smart grids, smart homes, smart cities and wearable health monitoring systems. This paper focuses on the health aspect of IoT. Integration with the internet gives each IoT device an IP address for communication. Big data and the Internet of Things work together, and we tried to leverage this relationship in our proposed architecture. We used Raspberry Pi and Intel Edison as the Fog computing devices for the analysis discussed in this paper. The Fog Interface described in [11, 17] is a low-power embedded computer that acts as a smart interface between the smartwatch and the cloud. It is used for collection, storage, and processing of the data before sending features to secure cloud storage. The Raspberry Pi is a series of credit card-sized single-board computers that has gained much popularity owing to its small size and multipurpose utility. It has an ARM-compatible central processing unit and an on-chip graphics processing unit.

2.4 Fog Computing

Cloud computing provides shared computer processing and data analysis; in other terms, the Cloud is a hub of computing resources such as computer networks, servers, storage, and services. The availability of high-capacity networks, low-cost computers, and storage devices makes the Cloud a service in high demand among users seeking high computing power. The Cloud can interact with the IoT device via the fog node. This paper concentrates on fog computing, which gives users higher computing power at the instrument end. Reliance on fog helps cut the costs associated with the Cloud to an extent [18, 19].

3 Fog-based Low-Resource Machine Learning

3.1 Feature Extraction

Feature engineering is the initial step in any machine learning analysis. It is the process of selecting appropriate data metrics to input as features into a machine learning algorithm. In K-means clustering analysis, selecting features that capture the variability of the data is essential for the algorithm to find groups based on similarity. Our subjects were patients with Parkinson’s disease, and the features chosen were the average fundamental frequency (F0) and the average amplitude of the speech utterance. Speech data were collected from the patients, and 164 speech samples were considered for analysis. These samples comprised sound files with utterances of a short /a/, a long /a/, a normal then high-pitched /a/, a normal then low-pitched /a/, and phrases. The feature extraction is done with the Praat scripting language [20]. For pitch, the algorithm performs acoustic periodicity detection on the basis of an accurate autocorrelation method. To calculate the intensity, the values in the sound are first squared and then convolved with a Gaussian analysis window. The intensity is expressed in decibels.
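The two features can be illustrated with a simplified sketch in plain Python. This is not Praat's implementation: the pitch estimate simply picks the lag with the highest normalized autocorrelation, and the intensity is a single Gaussian-weighted average of the squared samples evaluated at the centre of the recording. The function names, the 75 to 500 Hz search range, the window width, the 2e-5 Pa reference pressure, and the synthetic test tone are assumptions for illustration.

```python
import math


def estimate_f0(signal, sample_rate, f_min=75.0, f_max=500.0):
    """Estimate F0 as the lag with the highest normalized autocorrelation."""
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    energy = sum(x * x for x in signal)
    best_lag, best_value = lag_min, -1.0
    for lag in range(lag_min, lag_max + 1):
        acf = sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))
        value = acf / energy
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


def intensity_db(signal, reference_pressure=2e-5):
    """Gaussian-weighted mean of squared samples, expressed in decibels."""
    n = len(signal)
    center = (n - 1) / 2.0
    sigma = n / 6.0  # assumed window width, not a Praat setting
    weights = [math.exp(-0.5 * ((i - center) / sigma) ** 2) for i in range(n)]
    weighted_power = sum(w * x * x for w, x in zip(weights, signal)) / sum(weights)
    return 10.0 * math.log10(weighted_power / reference_pressure ** 2)


# Synthetic stand-in for a sustained /a/: a 200 Hz tone, 0.1 s at 8 kHz.
sample_rate = 8000
tone = [0.02 * math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(800)]

f0 = estimate_f0(tone, sample_rate)
level = intensity_db(tone)
print(f"F0 = {f0:.1f} Hz, intensity = {level:.1f} dB")
```

Averaging such per-utterance values gives one (F0, intensity) pair per speech sample, which is the form of input the clustering step expects.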

3.2 K-means Clustering

K-means clustering is a type of unsupervised learning that is used for exploratory analysis of unlabeled data. K-means is a method of vector quantization and is used extensively in data mining. The goal of the algorithm is to find groups in the data, with the number of groups represented by the variable K. The algorithm works iteratively to assign each data point to one of K groups based on the features that are provided. The inputs to the algorithm are the features and the value of K. K centroids are initially selected at random, and the algorithm then iterates until convergence. The algorithm aims to minimize the squared error function J.
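The objective J is not written out in the text. In its standard form, assumed here rather than taken from the authors' notation, it is the within-cluster sum of squared distances, where $x_i$ is a feature vector, $S_j$ is the set of points assigned to cluster $j$, and $c_j$ is the center of that cluster:

```latex
J = \sum_{j=1}^{K} \sum_{x_i \in S_j} \lVert x_i - c_j \rVert^{2},
\qquad
c_j = \frac{1}{\lvert S_j \rvert} \sum_{x_i \in S_j} x_i
```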

Here, the Euclidean distance between each data point and its cluster center is used. Feature engineering is an essential part of this algorithm. The authors of [22] use an optimized K-means that clusters statistical properties, such as the variance of the probability density functions, of the clustered features. In [23], the authors applied clustering to a database of feature vectors derived from Malay digit utterances. The features extracted in [23] were the Mel-Frequency Cepstral Coefficients (MFCC). In our work, we chose the average fundamental frequency and the average intensity, extracted from the speech files, as the features for K-means clustering.
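The iterative procedure described in this section can be sketched in plain Python. This is a generic Lloyd-style k-means written for illustration, not the authors' script; the function names, the z-score standardization, the fixed random seed, and the six (F0, intensity) pairs are assumptions.

```python
import random
from statistics import mean, pstdev


def standardize(points):
    """Z-score each feature so F0 (Hz) and intensity (dB) share one scale."""
    columns = list(zip(*points))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return [tuple((v - m) / s for v, (m, s) in zip(p, stats)) for p in points]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def kmeans(points, k, max_iter=100, seed=0):
    """Return (centroids, labels, J) after iterating to convergence."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # K randomly chosen starting centroids
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(tuple(mean(c) for c in zip(*members)) if members
                           else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    objective = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, objective


# Hypothetical (average F0 in Hz, average intensity in dB) pairs.
samples = [(110.0, 60.0), (112.0, 61.0), (108.0, 59.0),
           (180.0, 72.0), (182.0, 73.0), (178.0, 71.0)]
features = standardize(samples)
centroids, labels, objective = kmeans(features, k=2)
print(labels, round(objective, 4))
```

Standardizing first keeps the feature with the larger numeric range from dominating the Euclidean distance.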

4 Results & Discussions

For our analysis, we used 164 speech samples whose utterances comprise a short /a/, a long /a/, a normal then high-pitched /a/, a normal then low-pitched /a/, and phrases. The features chosen are the average fundamental frequency and the average intensity. Feature extraction is done using Praat [20], an acoustic analysis software, with Praat scripts that use the standard pitch and intensity algorithms described above. The results are shown in the form of plots. The k-means clustering analysis is done in the Python programming language. The plots show the clusters of the speech samples used in the analysis; different colors represent different mutually exclusive groups. The analysis is done with 2, 3, and 4 clusters, i.e., with k set to 2, 3, and 4, respectively.
Figure (a) shows the k-means clustering plot for 2 clusters, shown in different colors. The Python script is run on Raspberry Pi and Intel Edison to generate the results.
Figure (b) displays the k-means clustering plot for 4 clusters, designated with four different colors in a 3D plot. In k-means clustering, each observation belongs to the cluster with the nearest mean. We used k-means for feature learning performed on the fog device.
Figure (c) shows the k-means clustering plot for 3 clusters, with different colors, in 3D. The Python script run on Raspberry Pi and Intel Edison was used to generate the results displayed in the figure.

4.1 Performance Comparison

The Raspberry Pi provides a low-cost computing terminal. The Edison is a deeply embedded IoT computing module. The Edison and the Raspberry Pi differ in processor speed and power consumption. The machine learning algorithms were run on both devices, and their run time, average CPU usage and memory usage were measured.

Figure 3: A comparison of Intel Edison and Raspberry pi.

Figure 3 shows the comparison of the Intel Edison and Raspberry Pi fog devices. The ideal system will minimize runtime, maximize CPU usage, and use a modest amount of memory. The Raspberry Pi either outperformed or matched the Edison on each of these criteria. The Raspberry Pi was not capable of generating graphical output for this type of analysis within a real-time response threshold of 200 ms. However, without the need for complex graphics, the Raspberry Pi met the threshold, clocking in at 160 ms.
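Measurements of this kind can be collected with Python's standard library. The sketch below is one possible approach and is not the measurement method used in the paper: `profile` and `workload` are hypothetical names, the workload is a stand-in for the clustering script, CPU usage is approximated as CPU time divided by wall-clock time, and memory tracing with `tracemalloc` adds overhead that inflates the reported run time.

```python
import time
import tracemalloc

REAL_TIME_THRESHOLD_MS = 200.0  # response threshold quoted in the text


def profile(func, *args, **kwargs):
    """Run func once and report wall time, CPU share and peak Python memory."""
    tracemalloc.start()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = func(*args, **kwargs)
    cpu_s = time.process_time() - cpu_start
    wall_s = time.perf_counter() - wall_start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    stats = {
        "wall_ms": wall_s * 1000.0,
        "cpu_percent": 100.0 * cpu_s / wall_s if wall_s > 0 else 0.0,
        "peak_kib": peak_bytes / 1024.0,
    }
    stats["within_threshold"] = stats["wall_ms"] <= REAL_TIME_THRESHOLD_MS
    return result, stats


def workload():
    """Hypothetical stand-in for the clustering script being timed."""
    return sum(i * i for i in range(10000))


result, stats = profile(workload)
print(stats)
```

Running the same script on each device and comparing the resulting dictionaries gives a like-for-like view of run time, CPU usage and memory.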

5 Conclusions

Unlike cloud computing, Fog computing emphasizes proximity to end users, along with local resource pooling, reduced latency, better quality of service and better user experiences. This paper relied on a Fog computer for low-resource machine learning. As a use case, we employed K-means clustering on clinical speech data obtained from patients with Parkinson’s disease (PD). The proposed Smart Fog architecture can be useful for health problems such as speech disorders and for clinical speech processing in real time, as discussed in this paper. Fog computing reduces dependence on Cloud services as big data become increasingly available. More aspects of the proposed architecture can be investigated in the future. We expect Fog architecture to be crucial in shaping the way big data are handled and processed in the near future.

6 Acknowledgement

The authors would like to thank the George and Anne Ryan Institute for Neuroscience for their support and help.


  • [1] M. Chiang and T. Zhang, “Fog and iot: An overview of research opportunities,” IEEE Internet of Things Journal, vol. 3, no. 6, pp. 854–864, 2016.
  • [2] H. Dubey, J. C Goldberg, M. Abtahi, L. Mahler, and K. Mankodiya, “EchoWear: smartwatch technology for voice and speech treatments of patients with parkinson’s disease,” in Proceedings of the conference Wireless Health. ACM, 2015.
  • [3] S. Zhao, F. Rudzicz, L. G. Carvalho, C. Márquez-Chin, and S. Livingstone, “Automatic detection of expressed emotion in parkinson’s disease,” in IEEE ICASSP, 2014.
  • [4] T. H. Falk, W. Chan, and F. Shein, “Characterization of atypical vocal source excitation, temporal dynamics and prosody for objective measurement of dysarthric word intelligibility,” Speech Communication, vol. 54, no. 5, pp. 622–631, 2012.
  • [5] R. Patel, K. C. Hustad, K. P. Connaghan, and W. Furr, “Relationship between prosody and intelligibility in children with dysarthria,” Journal of medical speech-language pathology, vol. 20, no. 4, 2012.
  • [6] C. R. Watts, “A retrospective study of long-term treatment outcomes for reduced vocal intensity in hypokinetic dysarthria,” BMC Ear, Nose and Throat Disorders, vol. 16, no. 1, pp. 2, 2016.
  • [7] H. Dubey, N. Constant, A. Monteiro, M. Abtahi, D. Borthakur, L. Mahler, Y. Sun, Q. Yang, and K. Mankodiya, “Fog computing in medical internet-of-things: Architecture, implementation, and applications,” in Handbook of Large-Scale Distributed Computing in Smart Healthcare. 2017, Springer International Publishing AG.
  • [8] H. Dubey, R. Kumaresan, and K. Mankodiya, “Harmonic sum-based method for heart rate estimation using ppg signals affected with motion artifacts,” Journal of Ambient Intelligence and Humanized Computing, pp. 1–14, 2016.
  • [9] H. Dubey, M. R. Mehl, and K. Mankodiya, “Bigear: Inferring the ambient and emotional correlates from smartphone-based acoustic big data,” in IEEE First International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE), Washington, DC, 2016, pp. 78–83, doi: 10.1109/CHASE.2016.46.
  • [10] H. Dubey, J. C. Goldberg, K. Mankodiya, and L. Mahler, “A multi-smartwatch system for assessing speech characteristics of people with dysarthria in group settings,” in IEEE 16th International Conference on e-Health Networking, Applications and Services (Healthcom), 2015.
  • [11] A. Monteiro, H. Dubey, L. Mahler, Q. Yang, and K. Mankodiya, “Fit: A fog computing device for speech tele-treatments,” in IEEE Smart Computing (SMARTCOMP), 2016.
  • [12] S. Hiremath, G. Yang, and K. Mankodiya, “Wearable internet of things: Concept, architectural components and promises for person-centered healthcare,” in Mobihealth Conference. IEEE, 2014.
  • [13] H. Dubey, J. Yang, N. Constant, A. M. Amiri, Q. Yang, and K. Mankodiya, “Fog data: Enhancing telehealth big data through fog computing,” in Fifth ASE BigData 2015, Kaohsiung, Taiwan. ACM.
  • [14] J. Andreu-Perez, C. C. Y. Poon, R. D. Merrifield, S. T.C. Wong, and G. Yang, “Big data for health,” IEEE journal of biomedical and health informatics, vol. 19, no. 4, pp. 1193–1208, 2015.
  • [15] R. K. Barik, H. Dubey, C. Misra, D. Borthakur, N. Constant, S. A. Sasane, R. K. Lenka, B. S. P. Mishra, H. Das, and K. Mankodiya, “Fog assisted cloud computing in era of big data and internet-of-things: Systems, architectures and applications,” in Cloud Computing for Optimization: Foundations, Applications, Challenges, p. 23. Springer, 2018.
  • [16] J. Cancela, M. Pastorino, M. T. Arredondo, and O. Hurtado, “A telehealth system for parkinson’s disease remote monitoring. the perform approach,” in 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2013, pp. 7492–7495.
  • [17] N. Constant, D. Borthakur, M. Abtahi, H. Dubey, and K. Mankodiya, “Fog-assisted wiot: A smart fog gateway for end-to-end analytics in wearable internet of things,” in The 23rd IEEE Symposium on High Performance Computer Architecture (HPCA 2017), Austin, Texas, USA, 2017.
  • [18] R. Barik, H. Dubey, R. K. Lenka, K. Mankodiya, T. Pratik, and S. Sharma, “Mistgis: Optimizing geospatial data analysis using mist computing,” in International Conference on Computing Analytics and Networking (ICCAN 2017). Springer, 2017.
  • [19] R. K. Barik, H. Dubey, R. K. Lenka, N.V.R. Simha, S. A. Sasane, C. Misra, and K. Mankodiya, “Fog computing-based enhanced geohealth big data analysis,” in 2017 International Conference on Intelligent Computing and Control (I2C2). IEEE, 2017.
  • [20] Paulus Petrus Gerardus Boersma et al., “Praat, a system for doing phonetics by computer,” Glot international, vol. 5, 2002.
  • [21] Christopher M Bishop, Neural networks for pattern recognition, Oxford university press, 1995.
  • [22] H. A. Kadhim, L. Woo, and S. Dlay, “Novel algorithm for speech segregation by optimized k-means of statistical properties of clustered features,” in IEEE Progress in Informatics and Computing (PIC) Conference, 2015.
  • [23] S Majeed, H Husain, S Samad, and A Hussain, “Hierarchical k-means algorithm applied on isolated malay digit speech recognition,” International proceedings of computer science & information technology, vol. 34, pp. 33–37, 2012.