Do we still need fuzzy classifiers for Small Data in the Era of Big Data?

by Mikel Elkano, et al.

The Era of Big Data has forced researchers to explore new distributed solutions for building fuzzy classifiers, which often introduce approximation errors or make strong assumptions to reduce computational and memory requirements. As a result, Big Data classifiers might be expected to be inferior to those designed for standard classification tasks (Small Data) in terms of accuracy and model complexity. To our knowledge, however, there is not yet any empirical evidence to confirm this conjecture. Here, we investigate the extent to which state-of-the-art fuzzy classifiers for Big Data sacrifice performance in favor of scalability. To this end, we carry out an empirical study that compares these classifiers with some of the best-performing algorithms for Small Data. Assuming the latter were generally designed to maximize performance without considering scalability, the results of this study provide some intuition about the tradeoff between performance and scalability achieved by current Big Data solutions. Our findings show that, although slightly inferior, Big Data classifiers are gradually catching up with state-of-the-art classifiers for Small Data, suggesting that a unified learning algorithm for Big and Small Data might be possible.




