Classification of Big Data with Application to Imaging Genetics

05/16/2016
by Magnus O. Ulfarsson, et al.

Big data applications, such as medical imaging and genetics, typically generate datasets that consist of few observations n on many more variables p, a scenario that we denote as p>>n. Traditional data processing methods are often insufficient for extracting information from such data. This calls for new algorithms that can deal with the size, complexity, and special structure of these datasets. In this paper, we consider the problem of classifying p>>n data and propose a classification method based on linear discriminant analysis (LDA). Traditional LDA depends on an estimate of the covariance of the data, but when p>>n the sample covariance estimate is singular and therefore cannot be inverted. The proposed method estimates the covariance by using a sparse version of noisy principal component analysis (nPCA). Sparsity is used in this setting to automatically select the variables that are relevant for classification. In experiments, the new method is compared to state-of-the-art methods for big data problems using both simulated datasets and imaging genetics datasets.
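The singularity issue and the low-rank-plus-noise remedy can be illustrated with a minimal sketch. It is not the paper's method: it assumes a closed-form, non-sparse probabilistic PCA covariance estimate (r leading components plus isotropic noise) in place of the sparse nPCA estimator, two classes with equal priors, and simulated Gaussian data. The names fit_lda_ppca, predict, and simulate, and the parameter r, are hypothetical.

```python
import numpy as np

def fit_lda_ppca(X, y, r=2):
    """Two-class LDA with a low-rank-plus-noise covariance estimate.

    Assumption: closed-form, NON-sparse probabilistic PCA stands in for the
    sparse nPCA estimator mentioned in the abstract. Equal priors assumed.
    """
    classes = np.unique(y)
    assert len(classes) == 2, "sketch handles two classes only"
    mu0 = X[y == classes[0]].mean(axis=0)
    mu1 = X[y == classes[1]].mean(axis=0)
    # Center every observation by its own class mean (pooled within-class data).
    Xc = np.where((y == classes[1])[:, None], X - mu1, X - mu0)
    n, p = X.shape
    # Thin SVD: nonzero eigenvalues of the pooled covariance are s**2 / (n - 2).
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    eig = s**2 / (n - 2)
    # Noise variance: average of the p - r discarded eigenvalues (most are zero).
    sigma2 = eig[r:].sum() / (p - r)
    V = Vt[:r].T  # p x r leading eigenvectors
    # Sigma = V diag(eig[:r] - sigma2) V^T + sigma2 * I is invertible, with
    # Sigma^{-1} = (I - V diag(1 - sigma2 / eig[:r]) V^T) / sigma2.
    shrink = 1.0 - sigma2 / eig[:r]
    diff = mu1 - mu0
    w = (diff - V @ (shrink * (V.T @ diff))) / sigma2
    b = -(w @ (mu0 + mu1)) / 2.0
    return classes, w, b

def predict(model, X):
    """Assign the second class when the linear discriminant is positive."""
    classes, w, b = model
    return classes[(X @ w + b > 0).astype(int)]

# Simulated p >> n data: only the first 10 of 500 variables carry signal.
rng = np.random.default_rng(0)
n, p = 40, 500
delta = np.zeros(p)
delta[:10] = 3.0

def simulate(n_per_class):
    y = np.repeat([0, 1], n_per_class)
    X = rng.standard_normal((2 * n_per_class, p)) + np.outer(y, delta)
    return X, y

X_train, y_train = simulate(n // 2)
X_test, y_test = simulate(200)

# The p x p sample covariance has rank at most n - 1, so it is singular.
sample_cov = np.cov(X_train, rowvar=False)
rank = np.linalg.matrix_rank(sample_cov)

model = fit_lda_ppca(X_train, y_train, r=2)
accuracy = np.mean(predict(model, X_test) == y_test)
print(f"rank of sample covariance: {rank} (p = {p})")
print(f"test accuracy: {accuracy:.3f}")
```

Because this stand-in estimate is not sparse, the discriminant vector w places weight on all p variables. The sparse nPCA estimate described above is meant to avoid this by selecting only the relevant variables.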


