Prediction of the FIFA World Cup 2018 - A random forest approach with an emphasis on estimated team ability parameters

06/08/2018
by Andreas Groll, et al.

In this work, we compare three modeling approaches for the scores of soccer matches with regard to their predictive performance, based on all matches from the four previous FIFA World Cups (2002-2014): Poisson regression models, random forests and ranking methods. While the first two are based on the teams' covariate information, the ranking methods estimate ability parameters that best reflect the teams' current strength. In this comparison, the best-performing prediction methods on the training data turn out to be the ranking methods and the random forests. However, we show that incorporating the team ability parameters from the ranking methods into the random forest as an additional covariate improves the predictive power substantially. This combination is chosen as the final model and, based on its estimates, the FIFA World Cup 2018 is simulated repeatedly to obtain winning probabilities for all teams. The model slightly favors Spain over the defending champion Germany. Additionally, we provide survival probabilities for all teams at all tournament stages as well as the most probable tournament outcome.
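The repeated-simulation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four placeholder teams, their ability values, the log-linear `expected_goals` function (standing in for a fitted model's predicted mean score), and the coin-flip tie-break are all assumptions made for illustration, and only a small knockout bracket is simulated rather than the full World Cup format.

```python
import math
import random
from collections import Counter

# Hypothetical ability values for four placeholder teams (not from the paper).
ABILITY = {"A": 1.6, "B": 1.4, "C": 1.0, "D": 0.8}


def expected_goals(team, opponent):
    """Assumed stand-in for a fitted model's predicted mean number of goals."""
    return math.exp(0.2 + 0.5 * (ABILITY[team] - ABILITY[opponent]))


def poisson(rng, lam):
    """Draw a Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def play_knockout(rng, t1, t2):
    """Simulate one knockout match and return the winner."""
    g1 = poisson(rng, expected_goals(t1, t2))
    g2 = poisson(rng, expected_goals(t2, t1))
    if g1 == g2:
        # Assumption: a fair coin flip stands in for extra time and penalties.
        return rng.choice([t1, t2])
    return t1 if g1 > g2 else t2


def simulate_tournament(rng, bracket):
    """Play rounds of adjacent pairings until one team remains."""
    teams = list(bracket)
    while len(teams) > 1:
        teams = [play_knockout(rng, teams[i], teams[i + 1])
                 for i in range(0, len(teams), 2)]
    return teams[0]


def winning_probabilities(n_sims=10000, seed=1):
    """Estimate each team's title probability as its share of simulated wins."""
    rng = random.Random(seed)
    bracket = ["A", "D", "B", "C"]  # semi-finals: A vs D, B vs C
    wins = Counter(simulate_tournament(rng, bracket) for _ in range(n_sims))
    return {team: wins[team] / n_sims for team in ABILITY}


if __name__ == "__main__":
    for team, prob in sorted(winning_probabilities().items()):
        print(f"{team}: {prob:.3f}")
```

Counting how often each team reaches a given round, instead of only the final winner, would give stage-wise survival probabilities in the same way.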

