To tree or not to tree? Assessing the impact of smoothing the decision boundaries

10/07/2022
by Anthea Mérida, et al.

When analyzing a dataset, it can be useful to assess how smooth the decision boundaries need to be for a model to better fit the data. This paper addresses this question by quantifying how much the 'rigid' decision boundaries, produced by an algorithm that naturally finds such solutions, should be relaxed to obtain a performance improvement. Our approach starts with the rigid decision boundaries of a seed Decision Tree (seed DT), which is used to initialize a Neural DT (NDT). The initial boundaries are challenged by relaxing them progressively through training the NDT. During this process, we measure the NDT's performance and its decision agreement with the seed DT. We show how these two measures can help users decide how expressive their model should be before exploring it further via model selection. The validity of our approach is demonstrated with experiments on simulated and benchmark datasets.
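
The abstract gives no implementation details, so the following is only a toy sketch of the general idea and not the paper's NDT. It assumes a one-dimensional problem, uses a depth-1 tree (a single threshold) as a stand-in for the seed DT, and uses a single sigmoid unit initialized as a near-step function at that threshold as a stand-in for the relaxed model. All names (`fit_seed_stump`, `relax`, `sharpness`) and all parameter values are hypothetical. At each training step it records the two quantities the abstract mentions: the relaxed model's accuracy and its decision agreement with the seed.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_data(n, rng):
    """Toy 1-D data whose true class boundary is smooth, not a hard step."""
    xs = [rng.random() for _ in range(n)]
    ys = [1 if rng.random() < sigmoid(8.0 * (x - 0.5)) else 0 for x in xs]
    return xs, ys


def fit_seed_stump(xs, ys):
    """Seed DT stand-in: depth-1 tree predicting 1 iff x > threshold."""
    pts = sorted(xs)
    candidates = [(a + b) / 2.0 for a, b in zip(pts, pts[1:])]

    def accuracy(t):
        return sum((x > t) == (y == 1) for x, y in zip(xs, ys)) / len(xs)

    return max(candidates, key=accuracy)


def relax(xs, ys, xs_test, ys_test, threshold,
          sharpness=50.0, lr=1.0, epochs=300):
    """Start from a near-step sigmoid at the seed threshold, then train it."""
    w, b = sharpness, -sharpness * threshold
    seed = [1 if x > threshold else 0 for x in xs_test]
    history = []
    for epoch in range(epochs + 1):
        # Hard decisions of the relaxed model on held-out points.
        soft = [1 if w * x + b > 0 else 0 for x in xs_test]
        acc = sum(s == y for s, y in zip(soft, ys_test)) / len(ys_test)
        agree = sum(s == d for s, d in zip(soft, seed)) / len(seed)
        history.append((epoch, acc, agree))
        # One full-batch gradient step on the log-loss.
        errs = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errs, xs)) / len(xs)
        b -= lr * sum(errs) / len(xs)
    return history


rng = random.Random(0)
x_train, y_train = make_data(200, rng)
x_test, y_test = make_data(200, rng)
t = fit_seed_stump(x_train, y_train)
history = relax(x_train, y_train, x_test, y_test, t)
for epoch, acc, agree in history[::100]:
    print(f"epoch {epoch:3d}  accuracy {acc:.3f}  agreement with seed {agree:.3f}")
```

In this toy setting the relaxed model agrees fully with the seed before any training, and agreement can only drop if training moves the boundary away from the seed threshold. Reading the accuracy and agreement curves together is the kind of diagnostic the abstract describes, though the paper's own models and measures may differ.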

research
06/20/2020

Model family selection for classification using Neural Decision Trees

Model selection consists in comparing several candidate models according...
research
01/10/2022

A Study on Mitigating Hard Boundaries of Decision-Tree-based Uncertainty Estimates for AI Models

Outcomes of data-driven AI models cannot be assumed to be always correct...
research
05/25/2018

Topological Data Analysis of Decision Boundaries with Application to Model Selection

We propose the labeled Čech complex, the plain labeled Vietoris-Rips com...
research
06/22/2020

How fair can we go in machine learning? Assessing the boundaries of fairness in decision trees

Fair machine learning works have been focusing on the development of equ...
research
10/24/2022

We need to talk about random seeds

Modern neural network libraries all take as a hyperparameter a random se...
research
03/21/2019

Empirical Evaluations of Seed Set Selection Strategies for Predictive Coding

Training documents have a significant impact on the performance of predi...
research
11/24/2022

Seeds Don't Lie: An Adaptive Watermarking Framework for Computer Vision Models

In recent years, various watermarking methods were suggested to detect c...