Random Similarity Forests

04/11/2022
by Maciej Piernik, et al.

The wealth of data being gathered about humans and their surroundings drives new machine learning applications in various fields. Consequently, classifiers are increasingly trained not only on numerical data but also on complex data objects. For example, multi-omics analyses attempt to combine numerical descriptions with distributions, time series data, discrete sequences, and graphs. Integrating data from such different domains requires omitting some of the data, creating separate models for different formats, or simplifying some of the data to fit a shared scale and format, and each of these options can hinder predictive performance. In this paper, we propose a classification method capable of handling datasets with features of arbitrary data types while retaining each feature's characteristics. The proposed algorithm, called Random Similarity Forest, uses multiple domain-specific distance measures to combine the predictive performance of Random Forests with the flexibility of Similarity Forests. We show that Random Similarity Forests are on par with Random Forests on numerical data and outperform them on datasets from complex or mixed data domains. Our results highlight the applicability of Random Similarity Forests to noisy, multi-source datasets that are becoming ubiquitous in high-impact life science projects.
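The abstract does not spell out how a tree node is built, so the following is only a minimal sketch of one plausible reading: at each node a feature is drawn at random, two reference objects from different classes are drawn from that feature, every object is projected onto the difference of its distances to the two references, and a threshold on that projection is chosen by Gini impurity. The function names (`abs_diff`, `hamming`, `similarity_split`, `random_similarity_node`), the projection formula, and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
import random


def abs_diff(a, b):
    """Distance for a numerical feature (assumed choice)."""
    return abs(a - b)


def hamming(a, b):
    """Distance for equal-length discrete sequences (assumed choice)."""
    return sum(x != y for x, y in zip(a, b))


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def similarity_split(column, labels, dist, rng):
    """Find a threshold on a distance-based projection of one feature.

    Assumes `labels` contains at least two classes. Returns
    (weighted_gini, threshold, p, q), or None if all projections coincide.
    """
    # Draw two reference objects that belong to different classes.
    i = rng.randrange(len(column))
    j = rng.choice([k for k in range(len(column)) if labels[k] != labels[i]])
    p, q = column[i], column[j]

    # Project every object onto the "direction" from p to q.
    proj = [dist(x, p) - dist(x, q) for x in column]

    # Try thresholds midway between consecutive distinct projection values.
    values = sorted(set(proj))
    best = None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for v, y in zip(proj, labels) if v <= t]
        right = [y for v, y in zip(proj, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, t, p, q)
    return best


def random_similarity_node(features, distances, labels, rng):
    """Pick one feature at random and split on it with its own distance."""
    name = rng.choice(sorted(features))
    return name, similarity_split(features[name], labels, distances[name], rng)


# Hypothetical mixed-type dataset: one numerical and one sequence feature.
features = {
    "expression": [0.0, 0.1, 5.0, 5.2],
    "sequence": ["AAAA", "AAAT", "GGGG", "GGGC"],
}
distances = {"expression": abs_diff, "sequence": hamming}
labels = [0, 0, 1, 1]

print(random_similarity_node(features, distances, labels, random.Random(0)))
```

Because each feature carries its own distance function, no feature has to be converted to a shared numerical scale before splitting, which is the property the abstract emphasizes. A full forest would repeat this node construction recursively on bootstrap samples and aggregate the trees' votes.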

research
04/06/2016

Comments on: "A Random Forest Guided Tour" by G. Biau and E. Scornet

This paper is a comment on the survey paper by Biau and Scornet (2016) a...
research
02/12/2018

Random Hinge Forest for Differentiable Learning

We propose random hinge forests, a simple, efficient, and novel variant ...
research
03/05/2022

Fuzzy Forests For Feature Selection in High-Dimensional Survey Data: An Application to the 2020 U.S. Presidential Election

An increasingly common methodological issue in the field of social scien...
research
10/29/2020

Analyzing the tree-layer structure of Deep Forests

Random forests on the one hand, and neural networks on the other hand, h...
research
08/06/2020

Modeling of time series using random forests: theoretical developments

In this paper we study asymptotic properties of random forests within th...
research
05/15/2023

Fast Inference of Tree Ensembles on ARM Devices

With the ongoing integration of Machine Learning models into everyday li...
research
01/05/2023

Random forests, sound symbolism and Pokemon evolution

This study constructs machine learning algorithms that are trained to cl...