Scale invariant proper scoring rules
Scale dependence: Why the average CRPS often is inappropriate for ranking probabilistic forecasts
Averages of proper scoring rules are often used to rank probabilistic forecasts. In many cases, the individual observations and their predictive distributions in these averages have variable scale (variance). We show that some of the most popular proper scoring rules, such as the continuous ranked probability score (CRPS), up-weight observations with large uncertainty, which can lead to unintuitive rankings. If a scoring rule has this property, we say that it is scale dependent. To address this problem, a scaled CRPS (SCRPS) is proposed. This new proper scoring rule is scale invariant, and therefore works in the case of varying uncertainty, and it shares many of the appealing properties of the CRPS. We also define robustness of scoring rules and introduce a new class of scoring rules that, besides the CRPS and the SCRPS, contains scoring rules that are robust against outliers as special cases. In three applications, from spatial statistics, stochastic volatility models, and regression for count data, we illustrate why scale invariance and robustness are important properties, and we show why the SCRPS should be used instead of the CRPS.
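To make the scale-dependence point concrete, the sketch below estimates both scores by Monte Carlo on a toy Gaussian example. The CRPS formula E|X - y| - 0.5 E|X - X'| is standard; the particular scaled form used in the hypothetical helper scrps_mc (the expected error normalised by the expected spread, plus a logarithmic spread penalty, written in negative orientation so that lower is better) is an assumption about the SCRPS rather than a quotation of the paper's definition, and the example is only meant to illustrate how the average CRPS grows with the scale of the data while a scaled score does not.

```python
import numpy as np

rng = np.random.default_rng(0)


def crps_mc(samples, y):
    """Monte Carlo estimate of the (negatively oriented) CRPS,
    CRPS(F, y) = E|X - y| - 0.5 * E|X - X'|,
    where X and X' are independent draws from the forecast F."""
    x = np.asarray(samples, dtype=float)
    half = len(x) // 2
    e_abs_err = np.mean(np.abs(x - y))                       # estimates E|X - y|
    e_spread = np.mean(np.abs(x[:half] - x[half:2 * half]))  # estimates E|X - X'|
    return e_abs_err - 0.5 * e_spread


def scrps_mc(samples, y):
    """Sketch of a scaled CRPS of the assumed form
    E|X - y| / E|X - X'| + 0.5 * log E|X - X'|
    (the exact form and sign convention of the SCRPS in the paper
    are not reproduced here)."""
    x = np.asarray(samples, dtype=float)
    half = len(x) // 2
    e_abs_err = np.mean(np.abs(x - y))
    e_spread = np.mean(np.abs(x[:half] - x[half:2 * half]))
    return e_abs_err / e_spread + 0.5 * np.log(e_spread)


# Toy illustration of scale dependence: the same forecaster applied to data
# with small and large scale. The average CRPS grows roughly linearly with
# the scale, so large-scale observations dominate an averaged ranking,
# whereas the scaled score stays on a comparable level for both cases.
for sigma in (1.0, 100.0):
    y_obs = rng.normal(0.0, sigma)                 # observation with scale sigma
    forecast = rng.normal(0.0, sigma, size=10_000)  # forecast samples, same scale
    print(f"sigma={sigma:>6}: CRPS={crps_mc(forecast, y_obs):9.3f}, "
          f"SCRPS={scrps_mc(forecast, y_obs):9.3f}")
```

Running the loop shows the CRPS for the large-scale case being roughly a hundred times that of the small-scale case, even though the forecasts are equally well calibrated, which is the behaviour the abstract refers to as scale dependence.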