Grouping Shapley Value Feature Importances of Random Forests for explainable Yield Prediction

04/14/2023
by   Florian Huber, et al.

Explainability in yield prediction helps us fully exploit the potential of machine learning models that already achieve high accuracy in a variety of yield prediction scenarios. The data used to predict yields are intricate, and the models are often difficult to understand. However, understanding the models can be simplified by using natural groupings of the input features, for example by the time at which the features are captured or by the sensor used to capture them. The current state of the art for interpreting machine learning models is the game-theoretic approach of Shapley values. To handle groups of features, the Shapley values calculated for individual features are typically added together, ignoring the theoretical limitations of this approach. We explain the concept of Shapley values computed directly for predefined groups of features and introduce an algorithm to compute them efficiently on tree structures. We provide a blueprint for designing swarm plots that combine many local explanations for global understanding. An extensive evaluation on two different yield prediction problems shows the value of our approach and demonstrates how it can enable a better understanding of yield prediction models in the future, ultimately leading to mutual enrichment of research and application.
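
The difference between adding up per-feature Shapley values and computing Shapley values directly for groups can be illustrated with a small brute-force sketch. This is not the tree-based algorithm the abstract refers to; it enumerates every coalition and is only feasible for a handful of players. The feature names, the grouping, and the toy `value` function (a stand-in for the expected model output when only a subset of features is known) are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values by enumerating all coalitions (exponential cost).

    players: list of hashable player names.
    value:   function mapping a frozenset of players to a float.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude p
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def grouped_shapley_values(groups, value):
    """Shapley values where each predefined group acts as a single player."""
    def group_value(coalition):
        # A coalition of groups is worth what the union of its features is worth
        features = frozenset().union(*(groups[g] for g in coalition))
        return value(features)
    return shapley_values(list(groups), group_value)


# Hypothetical features and grouping (assumed for illustration only)
FEATURES = ["rain_may", "rain_june", "ndvi_july"]
GROUPS = {"weather": {"rain_may", "rain_june"}, "sensor": {"ndvi_july"}}


def value(features):
    # Toy game: a payoff is realized only when all three features are known
    return 1.0 if set(FEATURES) <= features else 0.0


per_feature = shapley_values(FEATURES, value)
summed_weather = per_feature["rain_may"] + per_feature["rain_june"]
grouped = grouped_shapley_values(GROUPS, value)

print("summed per-feature values for 'weather':", round(summed_weather, 3))
print("grouped Shapley value for 'weather':", round(grouped["weather"], 3))
```

In this toy game each feature receives one third of the payoff, so summing assigns about 0.67 to the weather group, whereas treating the two groups as players splits the payoff evenly at 0.5 each. Both attributions still sum to the full payoff; they simply distribute it differently, which is why summation cannot be assumed to reproduce group-level Shapley values.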

