DPPred: An Effective Prediction Framework with Concise Discriminative Patterns

10/31/2016
by Jingbo Shang, et al.

Two families of models have been proposed in the literature to address prediction problems, including classification and regression. Simple models, such as generalized linear models, offer only ordinary predictive performance but strong interpretability over a set of simple features. The other family, which includes tree-based models, organizes numerical, categorical, and high-dimensional features into a comprehensive structure that carries rich interpretable information about the data. In this paper, we propose a novel Discriminative Pattern-based Prediction framework (DPPred) that accomplishes prediction tasks by combining the advantages of both families: effectiveness and interpretability. Specifically, DPPred adopts concise discriminative patterns that lie on the prefix paths from the root to the leaf nodes of tree-based models. It then selects a limited number of useful discriminative patterns by searching for the most effective pattern combination to fit generalized linear models. Extensive experiments show that, in many scenarios, DPPred achieves accuracy competitive with the state of the art while also providing valuable interpretability for developers and experts. In particular, in a case study on a clinical application dataset, DPPred outperforms the baselines using only 40 concise discriminative patterns out of a potentially exponentially large set of patterns.
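The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a hand-written toy tree (`TREE`) in place of a trained tree-based model, toy data, logistic regression as the generalized linear model, and greedy forward selection as one possible way to search for a pattern combination. All names are hypothetical.

```python
import math

# Hypothetical toy tree standing in for one tree of a trained tree-based model.
# Internal nodes split on (feature index, threshold); leaves are None.
TREE = {"feat": 0, "thr": 50.0,
        "left": {"feat": 1, "thr": 2.5, "left": None, "right": None},
        "right": None}

def prefix_paths(node, prefix=()):
    """Yield every prefix path from the root as a tuple of conditions."""
    if node is None:
        return
    for op, child in (("<=", node["left"]), (">", node["right"])):
        path = prefix + ((node["feat"], op, node["thr"]),)
        yield path                      # each prefix is a candidate pattern
        yield from prefix_paths(child, path)

def matches(pattern, x):
    """True if instance x satisfies every condition in the pattern."""
    return all((x[f] <= t) if op == "<=" else (x[f] > t) for f, op, t in pattern)

def to_pattern_space(patterns, X):
    """Map each instance to a binary vector: one entry per pattern."""
    return [[1.0 if matches(p, x) else 0.0 for p in patterns] for x in X]

def predict(w, z):
    """Logistic model; the last weight is the bias."""
    s = w[-1] + sum(wi * zi for wi, zi in zip(w, z))
    return 1.0 / (1.0 + math.exp(-s))

def fit_logistic(Z, y, steps=500, lr=0.5):
    """Fit logistic regression by plain batch gradient descent."""
    w = [0.0] * (len(Z[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for z, t in zip(Z, y):
            err = predict(w, z) - t
            for j, zj in enumerate(z):
                grad[j] += err * zj
            grad[-1] += err
        w = [wi - lr * g / len(Z) for wi, g in zip(w, grad)]
    return w

def log_loss(w, Z, y):
    """Mean negative log-likelihood, with probabilities clipped for safety."""
    total = 0.0
    for z, t in zip(Z, y):
        p = min(max(predict(w, z), 1e-12), 1 - 1e-12)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(Z)

def select_patterns(patterns, X, y, k):
    """Greedy forward selection (an assumed search strategy): repeatedly add
    the pattern whose inclusion gives the lowest training log loss."""
    Z = to_pattern_space(patterns, X)
    chosen = []
    for _ in range(min(k, len(patterns))):
        best = None
        for j in range(len(patterns)):
            if j in chosen:
                continue
            cols = chosen + [j]
            Zs = [[row[c] for c in cols] for row in Z]
            loss = log_loss(fit_logistic(Zs, y), Zs, y)
            if best is None or loss < best[0]:
                best = (loss, j)
        chosen.append(best[1])
    return [patterns[j] for j in chosen]

# Toy data: the label is 1 exactly when x[0] <= 50 and x[1] > 2.5.
X = [[30, 1.0], [40, 2.0], [35, 3.0], [45, 4.0], [60, 1.0], [70, 3.0]]
y = [0, 0, 1, 1, 0, 0]

patterns = list(prefix_paths(TREE))
top = select_patterns(patterns, X, y, k=1)
print(top)
```

Because each selected pattern is a short conjunction of threshold conditions, the fitted linear model stays readable: every weight attaches to a rule that a developer or domain expert can inspect directly.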

