On pseudo-absence generation and machine learning for locust breeding ground prediction in Africa

11/06/2021
by Ibrahim Salihu Yusuf, et al.

Desert locust outbreaks threaten the food security of a large part of Africa and have affected the livelihoods of millions of people over the years. Machine learning (ML) has been demonstrated as an effective approach to locust distribution modelling, which could assist in early warning. ML models require a significant amount of labelled data for training. Most publicly available labelled data on locusts are presence-only data, where only sightings of locusts at a location are recorded and absences are not. Prior work using ML has therefore resorted to pseudo-absence generation methods to compensate for the missing absence labels. The most commonly used approach is to randomly sample points in a region of interest while ensuring that these sampled pseudo-absence points are at least a specific distance away from true presence points. In this paper, we compare this random sampling approach to more advanced pseudo-absence generation methods, such as environmental profiling and optimal background extent limitation, specifically for predicting desert locust breeding grounds in Africa. Interestingly, we find that among the algorithms we tested, namely logistic regression (LR), gradient boosting, random forests and maximum entropy, all popular in prior work, LR performed significantly better than the more sophisticated ensemble methods, in terms of both prediction accuracy and F1 score. Although background extent limitation combined with random sampling boosted performance for the ensemble methods, this was not the case for LR, which instead improved significantly when environmental profiling was used. In light of this, we conclude that a simpler ML approach such as logistic regression, combined with more advanced pseudo-absence generation, specifically environmental profiling, can be a sensible and effective approach to predicting locust breeding grounds across Africa.
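
The random sampling baseline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the bounding-box representation of the region of interest, the haversine distance and the rejection-sampling loop are all assumptions made for illustration, and the abstract does not state the distance threshold used.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def sample_pseudo_absences(presence, bbox, n_points, min_dist_km,
                           seed=0, max_draws=100_000):
    """Hypothetical helper: rejection-sample pseudo-absence points.

    presence    : array of shape (n, 2) holding (lat, lon) of true presences
    bbox        : (lat_min, lat_max, lon_min, lon_max), an assumed stand-in
                  for the region of interest
    min_dist_km : assumed exclusion radius around every presence point
    """
    rng = np.random.default_rng(seed)
    lat_min, lat_max, lon_min, lon_max = bbox
    accepted = []
    for _ in range(max_draws):
        if len(accepted) == n_points:
            break
        # Uniform in lat/lon for simplicity; this is not uniform in area.
        lat = rng.uniform(lat_min, lat_max)
        lon = rng.uniform(lon_min, lon_max)
        dists = haversine_km(lat, lon, presence[:, 0], presence[:, 1])
        # Keep the candidate only if it is far enough from all presences.
        if dists.min() >= min_dist_km:
            accepted.append((lat, lon))
    return np.array(accepted, dtype=float).reshape(-1, 2)
```

Calling `sample_pseudo_absences(presence, (0, 30, 0, 40), 50, 100.0)` with an `(n, 2)` array of presence coordinates would return up to 50 points, each at least 100 km from every presence. The returned points would then be labelled as absences and combined with the presence records to train a classifier.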


