Big Data Classification Using Augmented Decision Trees

10/26/2017
by Rajiv Sambasivan, et al.

We present an algorithm for classification tasks on big data. Experiments conducted as part of this study indicate that the algorithm can be as accurate as ensemble methods such as random forests or gradient-boosted trees. Unlike ensemble methods, the models it produces are easy to interpret. The algorithm follows a divide-and-conquer strategy with two steps. In the first step, a decision tree segments the large dataset. By construction, decision trees attempt to create homogeneous class distributions in their leaf nodes, but in practice some leaf nodes usually remain non-homogeneous. In the second step, a suitable classifier is fitted within each non-homogeneous leaf node to determine the class labels there. The decision tree provides a coarse profile of each segment, while the leaf-level classifier provides information about the attributes that affect the label within that segment.
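The two-step idea can be illustrated with a minimal sketch. This is not the authors' implementation: the depth-limited Gini tree, the nearest-centroid leaf model (standing in for "a suitable classifier"), and all names (`best_split`, `centroid_classifier`, `fit`, `predict`) and the toy data are assumptions made for illustration, written with the Python standard library only.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score, n = None, gini(y), len(y)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def centroid_classifier(X, y):
    """Hypothetical leaf-level model: predict the class with the nearest centroid."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
    cents = {c: [s / counts[c] for s in sums[c]] for c in sums}
    def predict_row(row):
        return min(cents, key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(row, cents[c])))
    return predict_row

def fit(X, y, depth=1):
    """Step 1: segment with a shallow tree. Step 2: model each impure leaf."""
    split = best_split(X, y) if depth > 0 else None
    if split is None:
        if len(set(y)) == 1:            # homogeneous leaf: constant label
            label = y[0]
            return ("leaf", lambda row: label)
        return ("leaf", centroid_classifier(X, y))  # non-homogeneous leaf
    f, t = split
    li = [i for i in range(len(y)) if X[i][f] <= t]
    ri = [i for i in range(len(y)) if X[i][f] > t]
    return ("node", f, t,
            fit([X[i] for i in li], [y[i] for i in li], depth - 1),
            fit([X[i] for i in ri], [y[i] for i in ri], depth - 1))

def predict(model, row):
    """Route a row down the tree, then apply the leaf's own predictor."""
    while model[0] == "node":
        _, f, t, left, right = model
        model = left if row[f] <= t else right
    return model[1](row)

# Toy data (made up): one segment is pure "a", the other mixes "b" and "c".
X = [[0, 0], [0, 1], [1, 0], [1, 1], [5, 0], [5, 1], [6, 4], [6, 5]]
y = ["a", "a", "a", "a", "b", "b", "c", "c"]
model = fit(X, y, depth=1)
```

With `depth=1` the tree splits once on the first feature, leaving one homogeneous leaf and one mixed leaf; only the mixed leaf gets its own classifier. In this sketch the split gives the coarse segment, and the leaf model's centroids show which attributes separate the labels inside that segment.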


