Stochastic functional analysis with applications to robust machine learning

It is well known that machine learning protocols typically under-utilize information on the probability distributions of feature vectors and related data, and instead directly compute regression or classification functions of feature vectors. In this paper we introduce a set of novel features for identifying the underlying stochastic behavior of input data using the Karhunen-Loève (KL) expansion, where classification is treated as detection of anomalies from a (nominal) signal class. These features are constructed from the recent Functional Data Analysis (FDA) theory for anomaly detection. The related signal decomposition is an exact hierarchical tensor product expansion with known optimality properties for approximating stochastic processes (random fields) with finite-dimensional function spaces. In principle these primary low-dimensional spaces can capture most of the stochastic behavior of the "underlying signals" in a given nominal class, and can reject signals in alternative classes as stochastic anomalies. Using a hierarchical finite-dimensional KL expansion of the nominal class, a series of orthogonal nested subspaces is constructed for detecting anomalous signal components. Projection coefficients of input data in these subspaces are then used to train an ML classifier. Because the signal is split into nominal and anomalous projection components, clearer separation surfaces between the classes arise. In fact we show that with a sufficiently accurate estimate of the covariance structure of the nominal class, a sharp classification can be obtained. We carefully formulate this concept and demonstrate it on a number of high-dimensional datasets in cancer diagnostics. This method leads to a significant increase in precision and accuracy over the current top benchmarks for the Global Cancer Map (GCM) gene expression network dataset.
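The split into nominal and anomalous projection components can be illustrated with a minimal sketch. This is not the paper's hierarchical multilevel construction: it assumes a single-level truncated KL expansion (equivalent to PCA on the empirical covariance of the nominal class), and the function names `fit_nominal_kl` and `kl_features`, the synthetic data, and the truncation level `k` are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_nominal_kl(X_nominal, k):
    """Estimate the mean and a k-term truncated KL basis from nominal samples (rows).

    Assumption: single-level truncation of the empirical covariance eigenbasis,
    not the hierarchical nested subspaces described in the abstract.
    """
    mean = X_nominal.mean(axis=0)
    centered = X_nominal - mean
    cov = centered.T @ centered / (X_nominal.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]       # indices of the k largest
    return mean, eigvecs[:, order], eigvals[order]

def kl_features(X, mean, basis):
    """Return nominal projection coefficients and the norm of the anomalous residual."""
    centered = X - mean
    coeffs = centered @ basis                   # component in the nominal subspace
    residual = centered - coeffs @ basis.T      # component in the orthogonal complement
    return coeffs, np.linalg.norm(residual, axis=1)

# Synthetic illustration (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
d, k = 20, 3
mixing = rng.normal(size=(k, d))
# Nominal signals lie close to a k-dimensional subspace.
nominal = rng.normal(size=(200, k)) @ mixing + 0.01 * rng.normal(size=(200, d))
# Alternative-class signals are spread over all d dimensions.
anomalous = 2.0 * rng.normal(size=(50, d))

mean, basis, eigvals = fit_nominal_kl(nominal, k)
coef_nom, resid_nom = kl_features(nominal, mean, basis)
coef_anom, resid_anom = kl_features(anomalous, mean, basis)

print("mean residual norm, nominal:  ", resid_nom.mean())
print("mean residual norm, anomalous:", resid_anom.mean())
```

In this sketch the coefficients and the residual norm could be concatenated into a feature vector for any standard classifier; the residual norm alone already separates the two synthetic classes because the alternative-class samples carry most of their energy outside the nominal subspace.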


