Fully-Dynamic Decision Trees

12/01/2022
by   Marco Bressan, et al.
0

We develop the first fully dynamic algorithm that maintains a decision tree over an arbitrary sequence of insertions and deletions of labeled examples. Given ϵ > 0 our algorithm guarantees that, at every point in time, every node of the decision tree uses a split with Gini gain within an additive ϵ of the optimum. For real-valued features the algorithm has an amortized running time per insertion/deletion of O(d log^3 n/ϵ^2), which improves to O(d log^2 n/ϵ) for binary or categorical features, while it uses space O(n d), where n is the maximum number of examples at any point in time and d is the number of features. Our algorithm is nearly optimal, as we show that any algorithm with similar guarantees uses amortized running time Ω(d) and space Ω̃ (n d). We complement our theoretical results with an extensive experimental evaluation on real-world data, showing the effectiveness of our algorithm.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/08/2023

Fully-Dynamic Approximate Decision Trees With Worst-Case Update Time Guarantees

We give the first algorithm that maintains an approximate decision tree ...
research
10/18/2022

Generalization Properties of Decision Trees on Real-valued and Categorical Features

We revisit binary decision trees from the perspective of partitions of t...
research
07/16/2019

The Quantum Version Of Classification Decision Tree Constructing Algorithm C5.0

In the paper, we focus on complexity of C5.0 algorithm for constructing ...
research
11/30/2020

Using dynamical quantization to perform split attempts in online tree regressors

A central aspect of online decision tree solutions is evaluating the inc...
research
11/18/2021

Interactive Set Discovery

We study the problem of set discovery where given a few example tuples o...
research
08/22/2023

Minwise-Independent Permutations with Insertion and Deletion of Features

In their seminal work, Broder et. al. <cit.> introduces the minHash algo...
research
01/23/2013

Fast Learning from Sparse Data

We describe two techniques that significantly improve the running time o...

Please sign up or login with your details

Forgot password? Click here to reset