Malaria is one of the most severe public health problems in the developing world. The World Health Organization estimated that there were 212 million malaria cases worldwide in 2015, leading to 429,000 related deaths Organization (2016). Vector-control efforts have achieved significant improvement in the past few decades Bhatt et al. (2015). However, the effect of malaria interventions remains poorly understood due to the absence of reliable surveillance data.
Malaria is transmitted through the bite of an infected Anopheles mosquito. Among the approximately 3,600 species of mosquitoes, only about 60 out of the 450 Anopheles species transmit malaria (i.e., are vectors) Neafsey et al. (2015); Wilkerson et al. (2015). The ability to quickly detect the presence of these mosquito species is therefore crucial for control programmes and targeted intervention strategies.
As sensor-rich embedded devices, smartphones provide a perfect platform for environmental sensing Lane et al. (2010). They are programmable and equipped with cheap yet powerful sensors, such as microphones, GPS, digital compasses and cameras. The touch screen is ideal for displaying real-time feedback to the user and inputting peripheral information in data acquisition. The built-in WiFi and cellular access make them extremely useful for data streaming or synchronisation. These desirable properties have enabled a wide range of applications of smartphones in environmental monitoring Guo et al. (2015).
The identification of disease-carrying mosquitoes by their flight tones has been researched for more than half a century Jr. and Kahn (1949); Raman et al. (2007). However, to the best of our knowledge, there is no acoustic mosquito sensing pipeline that is cost-effective and deployable in large-scale field studies. The HumBug project (humbug.ac.uk) aims to accomplish real-time acoustic mosquito detection using low-cost smartphones, both to alert users to the presence of vector species and guide control programmes, and to relate such detections to geographic variables such as vegetation type and climate. Following an initial proof-of-concept phase, the system is going through field tests to improve and evaluate the detection model with more field data.
In this paper, we describe our efforts in developing a real-time mosquito detection app that acts both as an early warning device and an integral part of an automatic data acquisition pipeline. Firstly, we present the machine learning algorithms deployed on low-cost smartphones for live detection; subsequently, we propose the use of a citizen science platform to enable crowdsourcing of data labels for improving the accuracy of the detection algorithm.
2.1 Data acquisition
To enable acoustic detection of mosquitoes, an easy-to-use data acquisition system is needed to retrieve and transmit data, both for training detection models and for live prediction of mosquito presence. The MozzWear Android app we developed provides a simple graphical user interface for sound recording and data synchronisation (Figure 1(a)). It currently supports the “Record only” and “Record and detect” functions, and will support a “Record on detection” function. Both current functions record sound; “Record and detect” additionally launches the detection module and displays real-time predictions of mosquito presence (Figure 1(b)). Users of the app may have prior information to enter at the data collection stage, e.g. peripheral information about the environment or species categories for caged mosquitoes. The app provides a pull-down menu (Figure 1(c)) for users to enter such information (if available). All recordings and the corresponding peripheral information are synchronised to the online HumBug project database through WiFi or cellular connections.
2.2 Classification pipeline
The key objective of the classification algorithms for our specific task is high detection accuracy with a low computational and memory load. We describe our adopted feature extraction methods, two-stage multi-species classification algorithm, as well as the model training strategy.
Audio information is often more distinctive in the frequency domain. Mel-frequency cepstral coefficients (MFCCs) are among the most widely used audio features in speech recognition and acoustic scene classification Barchiesi et al. (2015). MFCCs provide a compact representation of the spectral envelope, obtained by applying the discrete cosine transform (DCT) to the log power spectrum grouped into frequency bands according to the mel scale. This usually compresses the high-dimensional spectrum into a much lower-dimensional space, e.g. 13 coefficients, which makes MFCCs well suited to low-power devices. In our experiments comparing 11 common audio features (omitted here), MFCCs led to the best detection accuracy.
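To make the feature extraction concrete, the following is a minimal pure-Python sketch of the textbook MFCC recipe (windowing, power spectrum, mel filterbank, log, DCT) applied to a single 0.1-second clip. It is an illustration only, not the MozzWear implementation: the 8 kHz sample rate, the 26 mel filters, the Hamming window, and all function names are assumptions made for the example.

```python
import cmath
import math


def hz_to_mel(f):
    """Convert frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    # Precompute the n complex roots of unity used by the DFT.
    twiddle = [cmath.exp(-2j * math.pi * t / n) for t in range(n)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * twiddle[(k * i) % n] for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters whose centres are equally spaced on the mel scale."""
    nyquist = sample_rate / 2.0
    top = hz_to_mel(nyquist)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    # Edge frequencies expressed as (fractional) spectrum bin positions.
    edges = [f / nyquist * (n_bins - 1) for f in edges_hz]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for b in range(n_bins):
            if lo < b <= mid:
                row.append((b - lo) / (mid - lo))   # rising slope
            elif mid < b < hi:
                row.append((hi - b) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, n_filters=26, n_coeffs=13):
    """Return n_coeffs MFCCs for one frame (n_filters is an assumed setting)."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(spec), sample_rate)
    log_energy = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                  for row in bank]
    # DCT-II of the log mel energies; keep the first n_coeffs coefficients.
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_energy))
            for c in range(n_coeffs)]


# Usage: a synthetic 600 Hz tone, 0.1 s long at an assumed 8 kHz sample rate.
SAMPLE_RATE = 8000
tone = [math.sin(2.0 * math.pi * 600.0 * i / SAMPLE_RATE) for i in range(800)]
coeffs = mfcc(tone, SAMPLE_RATE)
```

Each 800-sample clip is reduced to 13 numbers, which is the kind of compression that keeps the downstream classifier cheap on a low-end phone. A production implementation would use an FFT rather than the quadratic-time DFT shown here.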
Two-stage multi-species classification:
We adopt a two-stage classification paradigm for detection. In the first stage, a binary support vector machine (SVM) Cortes and Vapnik (1995) is used to detect the presence of mosquitoes. Through the so-called kernel trick, the SVM implicitly maps the features into a higher-dimensional space in which the classes are easier to separate.
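For reference, the standard textbook form of the kernel SVM decision function is given below. The notation is ours and the kernel $K$ is left generic, since the specific kernel is not stated here.

```latex
f(\mathbf{x}) = \operatorname{sign}\!\left( \sum_{i=1}^{n} \alpha_i \, y_i \, K(\mathbf{x}_i, \mathbf{x}) + b \right),
\qquad
K(\mathbf{x}_i, \mathbf{x}) = \langle \phi(\mathbf{x}_i), \phi(\mathbf{x}) \rangle
```

Here $\mathbf{x}_i$ are the training feature vectors, $y_i \in \{-1, +1\}$ their labels, $\alpha_i \ge 0$ the learned dual coefficients, $b$ the bias, and $\phi$ the implicit mapping into the higher-dimensional space. Only the kernel values are ever computed, never $\phi$ itself.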
Once a mosquito is identified, a multi-species classification using the one-versus-one multi-class strategy with the SVM is used in the second stage to identify the exact mosquito species. We will soon introduce an option of cost-sensitive SVMs in the first stage to provide a more direct control on the false positive rate Li et al. (2017).
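The control flow of the two stages, with one-versus-one majority voting in the second stage, can be sketched as follows. This is a hedged illustration: the detector and pairwise classifiers are stand-in callables on a one-dimensional toy feature, not trained SVMs, and the names `two_stage_predict`, `one_vs_one_predict`, and the species labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(x, species, pairwise):
    """Majority vote over all pairwise classifiers.

    pairwise[(a, b)] is a callable that returns either a or b for input x.
    Ties are broken in favour of the class that received its first vote earliest.
    """
    votes = Counter(pairwise[(a, b)](x) for a, b in combinations(species, 2))
    return votes.most_common(1)[0][0]


def two_stage_predict(x, detector, species, pairwise):
    """Stage 1: mosquito vs. background. Stage 2: species, only if detected."""
    if not detector(x):
        return "background"
    return one_vs_one_predict(x, species, pairwise)


# Toy stand-ins (assumptions for illustration, not trained models).
prototypes = {"sp_a": 1.0, "sp_b": 2.0, "sp_c": 3.0}
species = sorted(prototypes)


def make_pair(a, b):
    # Pairwise rule: pick whichever of the two prototypes is closer to x.
    return lambda x: a if abs(x - prototypes[a]) <= abs(x - prototypes[b]) else b


pairwise = {(a, b): make_pair(a, b) for a, b in combinations(species, 2)}


def detector(x):
    return x > 0.0


predictions = [two_stage_predict(x, detector, species, pairwise)
               for x in (-1.0, 0.9, 2.1)]
```

With K species the second stage evaluates K(K-1)/2 pairwise classifiers, but only for clips that pass the first stage, so background audio incurs the cost of a single binary decision.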
We explored different training strategies to maximise test accuracy given only a small amount of training data. Training data are scarce because labelling recordings is a manual and time-consuming process.
We divide recordings into audio clips with a duration of 0.1 seconds (which we call samples), and assign labels to each clip according to data tagging results from humans. A balanced training dataset is first created so that each class has the same number of training samples. The random sampling strategy, which randomly samples short audio clips without replacement in the balanced training set, was found to produce the best detection performance (detailed results are again omitted).
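A minimal sketch of this data preparation is given below: recordings are cut into 0.1-second clips, classes are downsampled to equal size, and clips are drawn at random without replacement for training. The function names, the 8 kHz sample rate, and in particular the `train_fraction` value are placeholders chosen for the example, not settings reported in this paper.

```python
import random


def split_into_clips(signal, sample_rate, clip_seconds=0.1):
    """Cut a recording into non-overlapping clips; a trailing partial clip is dropped."""
    clip_len = int(round(sample_rate * clip_seconds))
    n_clips = len(signal) // clip_len
    return [signal[i * clip_len:(i + 1) * clip_len] for i in range(n_clips)]


def balance(clips_by_class, rng):
    """Downsample every class (without replacement) to the smallest class size."""
    n = min(len(clips) for clips in clips_by_class.values())
    return {label: rng.sample(clips, n) for label, clips in clips_by_class.items()}


def random_split(balanced, train_fraction, rng):
    """Randomly assign clips of each class to train or test, without replacement."""
    train, test = [], []
    for label, clips in balanced.items():
        shuffled = rng.sample(clips, len(clips))
        k = int(round(train_fraction * len(shuffled)))
        train.extend((clip, label) for clip in shuffled[:k])
        test.extend((clip, label) for clip in shuffled[k:])
    return train, test


# Usage with synthetic data; integers stand in for audio clips.
rng = random.Random(0)
clips = split_into_clips(list(range(8000)), sample_rate=8000)   # 1 s -> 10 clips
toy = {"mosquito": list(range(100)), "background": list(range(1000, 1062))}
balanced = balance(toy, rng)
train, test = random_split(balanced, train_fraction=0.5, rng=rng)  # placeholder fraction
```

Seeding the generator explicitly makes each split reproducible, which matters when results are averaged over many repeated trials.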
2.3 Data tagging and crowdsourcing
Our current data labels were obtained through collective data tagging by the project team members. Data tagging involves marking the segments of audio clips in which mosquito sound can be heard. With large-scale field deployment, the number of recordings requiring data tagging will exceed the capacity of the experts and researchers in the HumBug project. We therefore turn to crowdsourcing, creating a project on Zooniverse (zooniverse.org), the world’s largest citizen science platform Simpson et al. (2014), to solicit labels from millions of volunteers. Volunteers listen to short sound clips, view the corresponding spectrograms, and decide whether each clip contains mosquito sound (Figure 2). The detection results described in Section 2.2 are used to filter the data uploaded to Zooniverse: by only uploading audio clips predicted to contain mosquito sound, we may accelerate the data tagging process.
3 Experiments and Results
We report off-line detection results produced with a dataset containing audio recordings collected at the Centers for Disease Control and Prevention (CDC) in the US and at the US Army Military Research Unit in Kisumu, Kenya (USAMRU-K). The CDC dataset contains sound recordings from 20 mosquito species. The number of recordings is small for most of these species: only 6 species have at least 62 samples of 0.1 seconds each. We therefore include acoustic data from these 6 species in the examined dataset. The USAMRU-K dataset contains sound recordings of the An. gambiae species. We create a balanced dataset by downsampling the data from each of these 7 mosquito species and from the background recordings, so that each class contains 62 samples. The balanced dataset simplifies training and makes prediction results on the test set easier to interpret.
We conducted 100 trials, each sampling a different copy of the balanced dataset under a different random initialisation, so that both the data split and the algorithm's random seed differ across trials. The “random sampling” training strategy described previously is adopted to train the detection algorithm with of samples in the balanced dataset. All tests were conducted with MozzWear installed on Alcatel One Touch 4009X mobile phones, available at a network operator’s shop in the UK for £20. The app is therefore suitable for low-end devices, easing global deployment.
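The repeated-trial protocol can be sketched as a small harness that reseeds a random generator per trial and then summarises per-class accuracy by its mean and standard deviation. Everything here is illustrative: `evaluate_once` is a hypothetical callable standing in for "split, train, test", the stub below returns made-up accuracies, and the use of the sample (n-1) standard deviation is an assumption.

```python
import random
import statistics


def run_trials(evaluate_once, n_trials=100, base_seed=0):
    """Run evaluate_once(rng) n_trials times, each with a differently seeded rng.

    evaluate_once must return a dict mapping class label -> test accuracy.
    Returns a dict mapping class label -> (mean accuracy, sample SD).
    """
    per_class = {}
    for trial in range(n_trials):
        rng = random.Random(base_seed + trial)   # new seed per trial
        for label, accuracy in evaluate_once(rng).items():
            per_class.setdefault(label, []).append(accuracy)
    return {label: (statistics.mean(values), statistics.stdev(values))
            for label, values in per_class.items()}


def stub_evaluate(rng):
    """Placeholder for split/train/test; the numbers are synthetic, not results."""
    return {"class_1": 0.80 + rng.uniform(-0.05, 0.05),
            "class_2": 0.70 + rng.uniform(-0.05, 0.05)}


summary = run_trials(stub_evaluate)
```

Deriving every trial's seed from a single base seed keeps the whole experiment reproducible while still varying the split and initialisation between trials.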
We report statistics of detection performance on test data over the 100 trials in Table 1. Average detection accuracies for different mosquito species vary from 0.68 to 0.92. The standard deviations (SD) of the detection accuracies over the 100 trials are low for most classes. The p-values calculated based on the area under the curve (AUC) of the macro-average of the receiver operating characteristic (ROC) curves for each class are less than in all trials. Detection accuracies for Anopheles mosquitoes, which are malaria vectors, are impressive, albeit lower than those for the other examined mosquito species. These results demonstrate that the current detection model, trained with a limited amount of data, achieves effective multi-species detection on a dataset drawn from two experiments conducted in two different locations. Further field trips are planned to train more robust models and to better assess detection performance using field data.
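As an illustration of the macro-averaged AUC mentioned above, the sketch below computes a one-versus-rest AUC per class from its rank-statistic definition and then averages across classes. This is one common way to obtain a macro-average and is an assumption about the procedure; the function names and the toy scores are hypothetical, and the p-value computation is not reproduced here.

```python
def binary_auc(scores, is_positive):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_average_auc(score_rows, labels, classes):
    """Mean of one-versus-rest AUCs; score_rows[i][j] is the score of class j for sample i."""
    aucs = []
    for j, cls in enumerate(classes):
        class_scores = [row[j] for row in score_rows]
        aucs.append(binary_auc(class_scores, [y == cls for y in labels]))
    return sum(aucs) / len(aucs)


# Toy example with made-up scores for two classes.
classes = ["a", "b"]
labels = ["a", "b", "a", "b"]
score_rows = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.4, 0.6]]
macro_auc = macro_average_auc(score_rows, labels, classes)
single_auc = binary_auc([0.1, 0.4, 0.35, 0.8], [False, False, True, True])
```

The pairwise comparison is quadratic in the number of samples, which is acceptable for a balanced test set of the size used here; a sort-based rank computation would be preferable for larger datasets.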
Our acoustic mosquito detection system, despite using low-cost smartphones, provides a promising avenue for live detection – and species classification – of mosquitoes known to vector malaria. Our approach provides an automatic mosquito data acquisition pipeline with little additional cost. We demonstrate that the detection pipeline is efficient and can run smoothly on smartphones costing only $20. We are currently performing more field tests with this system. Furthermore, our crowdsourcing platform provides an attractive solution for large-scale data tagging from the growing database of acoustic recordings as we expand deployment.
This work is part-funded by a Google Impact Challenge award. The authors would like to thank Paul I. Howell at the Centers for Disease Control and Prevention (CDC), BEI Resources in Atlanta, USA, Dustin Miller at the CDC Foundation, Centers for Disease Control and Prevention in Atlanta, Dr. Sheila Ogoma, US Army Military Research Unit, Kisumu, Kenya (USAMRU-K), and Dr. Theeraphap Chareonviriyaphap, Kasetsart University, Thailand for their collaboration on data collection and system deployment.
- Barchiesi et al. (2015) Barchiesi, D., Giannoulis, D., Stowell, D., and Plumbley, M. D. (2015). Acoustic scene classification: Classifying environments from the sounds they produce. IEEE Signal Processing Mag., 32(3):16–34.
- Bhatt et al. (2015) Bhatt, S. et al. (2015). The effect of malaria control on Plasmodium falciparum in Africa between 2000 and 2015. Nature, 526(7572):207–211.
- Cortes and Vapnik (1995) Cortes, C. and Vapnik, V. (1995). Support-vector networks. Mach. Learn., 20(3):273–297.
- Guo et al. (2015) Guo, B. et al. (2015). Mobile crowd sensing and computing: The review of an emerging human-powered sensing paradigm. ACM Comput. Surv., 48(1):7:1–7:31.
- Jr. and Kahn (1949) Offenhauser, W. H., Jr. and Kahn, M. C. (1949). The sounds of disease-carrying mosquitoes. The Journal of the Acoustical Society of America, 21:462–463.
- Lane et al. (2010) Lane, N. D., Miluzzo, E., Lu, H., Peebles, D., Choudhury, T., and Campbell, A. T. (2010). A survey of mobile phone sensing. IEEE Commun. Mag., 48(9):140–150.
- Li et al. (2017) Li, Y., Kiskin, I., Sinka, M., Chan, H., and Roberts, S. (2017). Cost-sensitive detection with variational autoencoders for environmental acoustic sensing. In NIPS Workshop on Machine Learning for Audio Signal Processing, Long Beach, USA.
- Neafsey et al. (2015) Neafsey, D. E. et al. (2015). Highly evolvable malaria vectors: The genomes of 16 Anopheles mosquitoes. Science, 347(6217).
- Organization (2016) World Health Organization (2016). World malaria report 2015. World Health Organization.
- Raman et al. (2007) Raman, D. R., Gerhardt, R. R., and Wilkerson, J. B. (2007). Detecting insect flight sounds in the field: Implications for acoustical counting of mosquitoes. Transactions of the ASABE, 50(4):1481–1485.
- Simpson et al. (2014) Simpson, R., Page, K. R., and De Roure, D. (2014). Zooniverse: Observing the world’s largest citizen science platform. In Proc. Int. Conf. World Wide Web, WWW ’14 Companion, pages 1049–1054.
- Wilkerson et al. (2015) Wilkerson, R. C. et al. (2015). Making mosquito taxonomy useful: A stable classification of tribe Aedini that balances utility with current knowledge of evolutionary relationships. PLoS ONE, 10:e0133602.