Differentially-Private Decision Trees with Probabilistic Robustness to Data Poisoning

05/24/2023
by Daniël Vos, et al.

Decision trees are interpretable models that are well-suited to non-linear learning problems. Much work has been done on extending decision tree learning algorithms with differential privacy, a framework that guarantees the privacy of samples within the training data. However, current state-of-the-art algorithms for this purpose sacrifice substantial utility for a small privacy benefit. These solutions either create random decision nodes, which reduces decision tree accuracy, or spend an excessive share of the privacy budget on labeling leaves. Moreover, many works either do not support continuous data or leak information about feature values when handling it. We propose a new method called PrivaTree, based on private histograms, that chooses good splits while consuming a small privacy budget. The resulting trees provide a significantly better privacy-utility trade-off and accept mixed numerical and categorical data without leaking additional information. Finally, while it is notoriously hard to give robustness guarantees against data poisoning attacks, we prove bounds for the expected success rates of backdoor attacks against differentially-private learners. Our experimental results show that PrivaTree consistently outperforms previous works on predictive accuracy and significantly improves robustness against backdoor attacks compared to regular decision trees.
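
As a rough illustration of what a private histogram can look like, the sketch below adds Laplace noise to bin counts. It is not the PrivaTree algorithm itself. The function name `private_histogram`, the fixed, data-independent bin edges, and the choice of the Laplace mechanism are all assumptions made for this example.

```python
import numpy as np


def private_histogram(values, bin_edges, epsilon, rng):
    """Return noisy bin counts for `values` (hypothetical helper).

    Assumes `bin_edges` were chosen without looking at the data, so the
    edges themselves reveal nothing about individual feature values.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    counts, _ = np.histogram(values, bins=bin_edges)
    # Adding or removing one sample changes at most one bin count by 1,
    # so the L1 sensitivity is 1 and Laplace noise with scale 1/epsilon
    # gives epsilon-differential privacy for the released counts.
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon, size=counts.shape)
    return counts + noise


# Example usage with made-up data and a made-up privacy budget.
rng = np.random.default_rng(0)
edges = np.linspace(0.0, 1.0, 5)
feature = np.array([0.1, 0.2, 0.6, 0.9, 0.95])
noisy_counts = private_histogram(feature, edges, epsilon=0.5, rng=rng)
```

A smaller `epsilon` means more noise and stronger privacy. A split-selection routine would then work from `noisy_counts` rather than from the raw feature values.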

