A path in regression Random Forest looking for spatial dependence: a taxonomy and a systematic review
Random Forest (RF) is a well-known data-driven algorithm applied in several fields thanks to its flexibility in modeling the relationship between the response variable and the predictors, also in case of strong non-linearities. In environmental applications, it often occurs that the phenomenon of interest may present spatial and/or temporal dependence that is not taken explicitly into account by RF in its standard version. In this work, we propose a taxonomy to classify strategies according to when (Pre-, In- and/or Post-processing) they try to include the spatial information into regression RF. Moreover, we provide a systematic review and classify the most recent strategies adopted to "adjust" regression RF to spatially dependent data, based on the criteria provided by the Preferred Reporting Items for Systematic reviews and Meta-Analysis (PRISMA). The latter consists of a reproducible methodology for collecting and processing existing literature on a specified topic from different sources. PRISMA starts with a query and ends with a set of scientific documents to review: we performed an online query on the 25^th October 2022 and, in the end, 32 documents were considered for review. The employed methodological strategies and the application fields considered in the 32 scientific documents are described and discussed.
READ FULL TEXT