VAT tax gap prediction: a 2-steps Gradient Boosting approach

12/08/2019
by Giovanna Tagliaferri, et al.

Tax evasion is the illegal non-payment of taxes by individuals, corporations, and trusts. It results in a loss of state revenue that can undermine the effectiveness of government policies. One measure of tax evasion is the so-called tax gap: the difference between the income that should be reported to the tax authorities and the amount actually reported. However, economists lack a robust method for estimating the tax gap through a bottom-up approach based on fiscal audits. The difficulty is that the declared tax base is available for the whole population, whereas the income that should have been reported is generally known only for a small, non-random sample of audited units. This induces a selection bias that invalidates standard statistical methods. Here, we use machine learning based on a 2-step Gradient Boosting model to correct for the selection bias without requiring any strong distributional assumptions. We apply our method to estimate the Italian VAT gap for individual firms, using information gathered from administrative sources. Our algorithm estimates the potential VAT turnover of Italian individual firms for the fiscal year 2011 and suggests that the tax gap is about 30% of the potential tax base. Comparisons with other methods show that our technique offers a significant improvement in predictive performance.
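To make the idea of a two-step correction concrete, the sketch below shows one common way to combine two boosted models: a first model estimates each unit's probability of being audited, and a second model predicts the potential tax base from audited units only, weighted by the inverse of that probability. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the two steps are linked, so inverse-probability weighting is a swapped-in technique here. The data are synthetic, the single covariate and the built-in 30% under-declaration are hypothetical, and the helpers `fit_stump` and `boost` are written for this example.

```python
import math
import random

def fit_stump(xs, residuals, weights):
    """Weighted least-squares regression stump on one feature."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    total_w = sum(weights)
    total_wr = sum(w * r for w, r in zip(weights, residuals))
    mean = total_wr / total_w
    best = (float("inf"), None, mean, mean)  # fallback: constant prediction
    left_w = left_wr = 0.0
    for k in range(len(order) - 1):
        i = order[k]
        left_w += weights[i]
        left_wr += weights[i] * residuals[i]
        if xs[order[k]] == xs[order[k + 1]]:
            continue  # cannot split between identical feature values
        right_w, right_wr = total_w - left_w, total_wr - left_wr
        # Lower score means lower weighted squared error for this split.
        score = -(left_wr ** 2 / left_w + right_wr ** 2 / right_w)
        if score < best[0]:
            threshold = (xs[order[k]] + xs[order[k + 1]]) / 2
            best = (score, threshold, left_wr / left_w, right_wr / right_w)
    return best[1:]

def boost(xs, ys, weights, loss, rounds=100, rate=0.1):
    """Gradient boosting with stumps; loss is 'squared' or 'logistic'."""
    mean = sum(w * y for w, y in zip(weights, ys)) / sum(weights)
    base = math.log(mean / (1 - mean)) if loss == "logistic" else mean
    scores = [base] * len(xs)
    stumps = []
    for _ in range(rounds):
        # Negative gradient of the loss with respect to the current score.
        if loss == "logistic":
            grads = [y - 1 / (1 + math.exp(-s)) for y, s in zip(ys, scores)]
        else:
            grads = [y - s for y, s in zip(ys, scores)]
        t, left, right = fit_stump(xs, grads, weights)
        stumps.append((t, left, right))
        scores = [s + rate * (left if t is not None and x <= t else right)
                  for s, x in zip(scores, xs)]

    def predict(x):
        s = base + rate * sum(l if t is not None and x <= t else r
                              for t, l, r in stumps)
        return 1 / (1 + math.exp(-s)) if loss == "logistic" else s
    return predict

# Synthetic population (all numbers are hypothetical).
random.seed(0)
n = 2000
declared = [random.uniform(10, 100) for _ in range(n)]
# Assumption: units declare roughly 70% of their true base.
true_base = [d / 0.7 + random.gauss(0, 5) for d in declared]
# Non-random audits: low declarers are audited more often.
audited = [1 if random.random() < 0.05 + 0.3 * (100 - d) / 90 else 0
           for d in declared]

# Step 1: boosted classifier for the probability of being audited.
propensity = boost(declared, audited, [1.0] * n, "logistic")

# Step 2: boosted regression on audited units, inverse-probability weighted.
idx = [i for i in range(n) if audited[i]]
ipw = [1 / max(propensity(declared[i]), 0.01) for i in idx]
potential = boost([declared[i] for i in idx],
                  [true_base[i] for i in idx], ipw, "squared")

# Predict the potential base for everyone and express the gap as a share.
estimated = [potential(d) for d in declared]
gap = 1 - sum(declared) / sum(estimated)
print(f"estimated gap: {gap:.1%}")
```

The true base is observed only for audited units in step 2, mirroring the audit setting described above. Because selection in this toy example depends only on the observed covariate, it shows the mechanics of the weighting rather than evidence that the weighting improves accuracy.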


