Deep Hurdle Networks for Zero-Inflated Multi-Target Regression: Application to Multiple Species Abundance Estimation

10/30/2020
by Shufeng Kong, et al.

A key problem in computational sustainability is to understand the distribution of species across landscapes over time. This question gives rise to challenging large-scale prediction problems since (i) hundreds of species have to be modeled simultaneously and (ii) the survey data are usually inflated with zeros because species are absent at a large number of sites. The problem of tackling both issues simultaneously, which we refer to as the zero-inflated multi-target regression problem, has not been addressed by previous methods in statistics and machine learning. In this paper, we propose a novel deep model for the zero-inflated multi-target regression problem. To this end, we first model the joint distribution of multiple response variables as a multivariate probit model and then couple the positive outcomes with a multivariate log-normal distribution. A link between the two distributions is established by penalizing the difference between their covariance matrices. The whole model is cast as an end-to-end learning framework, and we provide an efficient learning algorithm that can be fully implemented on GPUs. We show that our model outperforms the existing state-of-the-art baselines on two challenging real-world species distribution datasets concerning bird and fish populations.
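
The hurdle structure described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the toy means and covariances, and the squared Frobenius form of the covariance penalty are all assumptions made for illustration, since the abstract does not specify the exact penalty or network architecture.

```python
import numpy as np


def sample_hurdle(mu_presence, cov_presence, mu_log_abund, cov_log_abund, n_sites, rng):
    """Draw zero-inflated abundances for several species at n_sites sites."""
    # Multivariate probit part: a species is present when its latent
    # Gaussian variable is positive.
    latent = rng.multivariate_normal(mu_presence, cov_presence, size=n_sites)
    present = latent > 0
    # Multivariate log-normal part: exponentiate a correlated Gaussian draw.
    log_abund = rng.multivariate_normal(mu_log_abund, cov_log_abund, size=n_sites)
    positive = np.exp(log_abund)
    # Hurdle: abundance is zero unless the presence hurdle is cleared.
    return np.where(present, positive, 0.0)


def covariance_penalty(cov_presence, cov_log_abund):
    """Squared Frobenius distance between the two covariance matrices.

    Assumed form of the penalty; the abstract only says the difference
    between the covariance matrices is penalized.
    """
    diff = cov_presence - cov_log_abund
    return float(np.sum(diff ** 2))


# Toy parameters for three hypothetical species.
rng = np.random.default_rng(0)
mu_presence = np.array([-1.0, 0.0, 0.5])
mu_log_abund = np.array([1.0, 0.5, 2.0])
cov_presence = np.array([[1.0, 0.6, 0.0],
                         [0.6, 1.0, 0.0],
                         [0.0, 0.0, 1.0]])
cov_log_abund = np.array([[1.0, 0.4, 0.0],
                          [0.4, 1.0, 0.0],
                          [0.0, 0.0, 1.0]])

counts = sample_hurdle(mu_presence, cov_presence, mu_log_abund, cov_log_abund,
                       n_sites=1000, rng=rng)
penalty = covariance_penalty(cov_presence, cov_log_abund)
```

In a deep version of this idea, the means and covariances would presumably be produced by a neural network from site covariates, and a penalty of this kind would be added to the likelihood-based training loss so that both parts share a similar species correlation structure.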

08/26/2019

Clarifying species dependence under joint species distribution modeling

Joint species distribution modeling is attracting increasing attention t...
09/28/2016

Deep Multi-Species Embedding

Understanding how species are distributed across landscapes over time is...
09/30/2021

Bayesian Multi-Species N-Mixture Models for Unmarked Animal Communities

We propose an extension of the N-mixture model which allows for the esti...
09/07/2018

Joint species distribution modeling with additive multivariate Gaussian process priors and heteregenous data

In this work, we propose JSDMs where the responses to environmental cova...
09/17/2017

Multi-Entity Dependence Learning with Rich Context via Conditional Variational Auto-encoder

Multi-Entity Dependence Learning (MEDL) explores conditional correlation...
03/09/2021

HOT-VAE: Learning High-Order Label Correlation for Multi-Label Classification via Attention-Based Variational Autoencoders

Understanding how environmental characteristics affect bio-diversity pat...
12/14/2021

Zero-inflated Beta distribution regression modeling

A frequent challenge encountered with ecological data is how to interpre...