Universal guarantees for decision tree induction via a higher-order splitting criterion

10/16/2020
by Guy Blanc, et al.

We propose a simple extension of top-down decision tree learning heuristics such as ID3, C4.5, and CART. Our algorithm achieves provable guarantees for all target functions f : {-1,1}^n → {-1,1} with respect to the uniform distribution, circumventing impossibility results showing that existing heuristics fare poorly even for simple target functions. The crux of our extension is a new splitting criterion that takes into account the correlations between f and small subsets of its attributes. The splitting criteria of existing heuristics (e.g. Gini impurity and information gain), in contrast, are based solely on the correlations between f and its individual attributes. Our algorithm satisfies the following guarantee: for all target functions f : {-1,1}^n → {-1,1}, sizes s ∈ ℕ, and error parameters ϵ, it constructs a decision tree of size s^Õ((log s)^2/ϵ^2) that achieves error ≤ O(𝗈𝗉𝗍_s) + ϵ, where 𝗈𝗉𝗍_s denotes the error of the optimal size-s decision tree. A key technical notion that drives our analysis is the noise stability of f, a well-studied smoothness measure.
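To make the contrast concrete, here is a minimal sketch of the two kinds of splitting scores for Boolean data in {-1,1}. The first-order score mirrors what Gini impurity and information gain measure: the correlation between f and each individual attribute. The higher-order score additionally credits each attribute with its (empirical) degree-2 Fourier correlations, i.e. correlations between f and pairs of attributes. The function names and the restriction to pairs are illustrative assumptions for this sketch; the paper's exact criterion differs.

```python
import numpy as np

def single_attr_scores(X, y):
    # First-order criterion (Gini/information-gain style):
    # |E[f(x) * x_i]| estimated per attribute i.
    # X has shape (m, n) with entries in {-1, 1}; y has shape (m,) in {-1, 1}.
    return np.abs((X * y[:, None]).mean(axis=0))

def pairwise_scores(X, y):
    # Hedged sketch of a "higher-order" criterion: credit attribute i
    # with its correlations to f via small subsets of attributes,
    # here just singletons {i} and pairs {i, j}.
    # (Illustrative only -- not the paper's exact criterion.)
    n = X.shape[1]
    first = (X * y[:, None]).mean(axis=0)       # degree-1 Fourier estimates
    score = first ** 2
    for i in range(n):
        for j in range(i + 1, n):
            c = (X[:, i] * X[:, j] * y).mean()  # degree-2 estimate for {i, j}
            score[i] += c ** 2
            score[j] += c ** 2
    return score

# Why this matters: for f(x) = x_1 * x_2 (XOR in the {-1,1} encoding),
# every single-attribute correlation is 0, so first-order criteria see
# nothing to split on, while the pairwise score singles out x_1 and x_2.
X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])  # full truth table on 2 bits
y = X[:, 0] * X[:, 1]
print(single_attr_scores(X, y))  # both scores are 0
print(pairwise_scores(X, y))     # both scores are 1
```

This is exactly the kind of target on which the impossibility results mentioned above bite: a first-order heuristic has no signal until a lucky split is made, whereas a criterion that inspects small subsets of attributes detects the dependence immediately.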

Related research

- 06/01/2020 · Provable guarantees for decision tree induction: the agnostic setting — We give strengthened provable guarantees on the performance of widely em...
- 11/18/2019 · Top-down induction of decision trees: rigorous guarantees and inherent limitations — Consider the following heuristic for building a decision tree for a func...
- 11/03/2020 · Estimating decision tree learnability with polylogarithmic sample complexity — We show that top-down decision tree learning heuristics are amenable to ...
- 06/17/2022 · Popular decision tree algorithms are provably noise tolerant — Using the framework of boosting, we prove that all impurity-based decisi...
- 04/12/2016 · Confidence Decision Trees via Online and Active Learning for Streaming (BIG) Data — Decision tree classifiers are a widely used tool in data stream mining. ...
- 07/03/2023 · Systematic Bias in Sample Inference and its Effect on Machine Learning — A commonly observed pattern in machine learning models is an underpredic...
- 08/23/2022 · Regularized impurity reduction: Accurate decision trees with complexity guarantees — Decision trees are popular classification models, providing high accurac...
