High dimensionality: The latest challenge to data analysis

02/12/2019
by A. M. Pires, et al.

Modern technology, which permits the measurement of thousands of characteristics simultaneously, has produced floods of data in the form of many large or even huge datasets. This new paradigm presents extraordinary challenges to data analysis, and the question arises: how can conventional data analysis methods, devised for moderate or small datasets, cope with the complexities of modern data? The case of high dimensional data is particularly revealing of some of the drawbacks. We look at the case where the number of characteristics measured on each object is at least the number of observed objects, and conclude that this configuration leads to geometrical and mathematical oddities and is an insurmountable barrier to the direct application of traditional methodologies. If scientists ignore the fundamental mathematical results arrived at in this paper and blindly use software to analyze data, the results of their analyses may not be trustworthy, and the findings of their experiments may never be validated. That is why new methods, together with the wise use of traditional approaches, are essential to progress safely through the present reality.
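One well-known oddity of this configuration can be checked numerically. The sketch below does not reproduce the paper's results. It illustrates a standard linear algebra fact: when the number of variables p is at least the number of observations n, the sample covariance matrix has rank at most n - 1 and so cannot be inverted. The function name `sample_covariance_rank`, the simulated Gaussian data, and the sizes used are assumptions made for illustration only.

```python
import numpy as np


def sample_covariance_rank(n, p, seed=0):
    """Return the numerical rank of the p x p sample covariance matrix
    computed from n simulated observations of p variables.

    Hypothetical helper for illustration; the data are i.i.d. standard
    normal draws, not data from the paper.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))   # rows = objects, columns = variables
    S = np.cov(X, rowvar=False)       # p x p sample covariance matrix
    return int(np.linalg.matrix_rank(S))


# High dimensional case (p >= n): centering removes one degree of freedom,
# so the rank cannot exceed n - 1 and S is singular.
print("n=10, p=50 ->", sample_covariance_rank(10, 50), "of", 50)

# Conventional case (p < n): S is typically full rank and invertible.
print("n=50, p=10 ->", sample_covariance_rank(50, 10), "of", 10)
```

Methods that need the inverse of the sample covariance matrix, such as Mahalanobis distances or classical linear discriminant analysis, therefore cannot be applied directly when p is at least n.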

research
08/22/2023

Tensor Regression

Regression analysis is a key area of interest in the field of data analy...
research
08/07/2013

Challenges of Big Data Analysis

Big Data bring new opportunities to modern society and challenges to dat...
research
01/30/2020

NCVis: Noise Contrastive Approach for Scalable Visualization

Modern methods for data visualization via dimensionality reduction, such...
research
03/01/2018

Model-Based Clustering and Classification of Functional Data

The problem of complex data analysis is a central topic of modern statis...
research
08/14/2016

Julia Implementation of the Dynamic Distributed Dimensional Data Model

Julia is a new language for writing data analysis programs that are easy...
research
04/13/2020

Connecting the Dots: Discovering the "Shape" of Data

Scientists use a mathematical subject called 'topology' to study the sha...
research
06/27/2023

A new classification framework for high-dimensional data

Classification is a classic problem but encounters lots of challenges wh...
