Correcting sampling biases via importance reweighting for spatial modeling

09/09/2023
by   Boris Prokhorov, et al.

In machine learning, estimating a model's error is often complicated by distribution bias, particularly for spatial data such as those found in environmental studies. We introduce an approach based on importance sampling to obtain an unbiased estimate of the target error. By accounting for the difference between the distribution over which the error is desired and the distribution of the available data, our method reweights the error at each sample point and neutralizes the shift. Importance sampling and kernel density estimation are used for the reweighting. We validate the approach on artificial data that resemble real-world spatial datasets. Our findings demonstrate the advantages of the proposed approach for estimating the target error, offering a solution to the distribution shift problem. Overall error of predictions dropped from 7
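The abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea, assuming a standard self-normalized importance sampling scheme: each per-sample error is weighted by the ratio of a kernel density estimate of the target distribution to one of the sampling distribution. The function names (`kde`, `importance_weights`, `reweighted_error`), the fixed isotropic Gaussian bandwidth, and the toy one-dimensional data are all assumptions made for illustration and are not taken from the paper.

```python
import math


def kde(points, bandwidth):
    """Return an isotropic Gaussian kernel density estimate built on `points`.

    `points` is a list of equal-length coordinate tuples; the fixed
    `bandwidth` is an illustrative assumption, not a tuned value.
    """
    n = len(points)
    dim = len(points[0])
    norm = n * (bandwidth * math.sqrt(2.0 * math.pi)) ** dim

    def density(x):
        total = 0.0
        for p in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
            total += math.exp(-0.5 * sq_dist / bandwidth ** 2)
        return total / norm

    return density


def importance_weights(sample_points, target_points, bandwidth):
    """Weight each sample point by p_target(x) / p_sample(x), both from KDE."""
    p_sample = kde(sample_points, bandwidth)
    p_target = kde(target_points, bandwidth)
    return [p_target(x) / p_sample(x) for x in sample_points]


def reweighted_error(errors, weights):
    """Self-normalized importance-weighted mean of the per-sample errors."""
    return sum(w * e for w, e in zip(weights, errors)) / sum(weights)


# Toy data (hypothetical): sampling is dense on [0, 0.5) and sparse on [0.5, 1),
# while the target region of interest is covered uniformly.
sample_points = [(0.5 * i / 80,) for i in range(80)]
sample_points += [(0.5 + 0.5 * i / 20,) for i in range(20)]
target_points = [(i / 99,) for i in range(100)]

# Assume the model error grows with the coordinate, so the true mean error
# over the uniform target region is about 0.5.
errors = [x[0] for x in sample_points]

weights = importance_weights(sample_points, target_points, bandwidth=0.1)
naive_estimate = sum(errors) / len(errors)
corrected_estimate = reweighted_error(errors, weights)
print(naive_estimate, corrected_estimate)
```

In this toy setup the unweighted mean is pulled toward the oversampled region, while the reweighted mean moves back toward the value expected under the uniform target. In practice the bandwidth would need to be selected for the data at hand, and very small sampling densities can make the weights unstable.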


research
09/13/2017

Visualization of Big Spatial Data using Coresets for Kernel Density Estimates

The size of large, geo-located datasets has reached scales where visuali...
research
10/25/2021

Kernel density estimation-based sampling for neural network classification

Imbalanced data occurs in a wide range of scenarios. The skewed distribu...
research
09/09/2022

Fast and Accurate Importance Weighting for Correcting Sample Bias

Bias in datasets can be very detrimental for appropriate statistical est...
research
02/09/2023

Importance Sampling Deterministic Annealing for Clustering

A current assumption of most clustering methods is that the training dat...
research
02/21/2021

Adaptive Importance Sampling for Efficient Stochastic Root Finding and Quantile Estimation

In solving simulation-based stochastic root-finding or optimization prob...
research
07/01/2021

Mandoline: Model Evaluation under Distribution Shift

Machine learning models are often deployed in different settings than th...
research
06/14/2020

Support Estimation with Sampling Artifacts and Errors

The problem of estimating the support of a distribution is of great impo...
