A One-Sided Classification Toolkit with Applications in the Analysis of Spectroscopy Data

06/12/2018
by Frank G. Glavin, et al.
This dissertation investigates the use of one-sided classification algorithms for separating hazardous chlorinated solvents from other materials, based on their Raman spectra. The experimentation is carried out using a new one-sided classification toolkit that was designed and developed from the ground up. In the one-sided classification paradigm, the objective is to separate elements of the target class from all outliers. In practice, one-sided classifiers are generally chosen when the training examples are deficient in some way: outlier examples can be rare, expensive to label, or even entirely absent. However, one-sided classifiers are equally applicable when outlier examples are plentiful but nonetheless not statistically representative of the complete outlier concept. It is this scenario that is explicitly dealt with in this research work. In these circumstances, one-sided classifiers have been found to be more robust than conventional multi-class classifiers. The term "unexpected" outliers is introduced to represent outlier examples, encountered in the test set, that have been taken from a different distribution to the training set examples. Such examples arise when the training set does not adequately represent all possible outliers. It can often be impossible to fully characterise outlier examples, given that they can represent the immeasurable quantity of "everything else" that is not a target. The findings from this research show the potential drawbacks of using conventional multi-class classification algorithms when the test data come from a completely different distribution to that of the training samples.
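To make the one-sided idea concrete, the following is a minimal sketch and not one of the toolkit's algorithms. It assumes a deliberately simple stand-in technique: a centroid-and-radius model fitted on target examples only, with synthetic two-dimensional points in place of Raman spectra. The names `fit_one_sided`, `is_target`, and the `quantile` parameter are hypothetical and introduced purely for illustration.

```python
import math
import random


def fit_one_sided(targets, quantile=0.95):
    """Fit a centroid-and-radius model from target-class examples only.

    Illustrative stand-in, not the dissertation's method: no outlier
    examples are used at training time.
    """
    dim = len(targets[0])
    centroid = [sum(x[i] for x in targets) / len(targets) for i in range(dim)]
    dists = sorted(math.dist(x, centroid) for x in targets)
    # Choose the radius so that roughly `quantile` of training targets fall inside.
    idx = min(len(dists) - 1, int(quantile * len(dists)))
    return centroid, dists[idx]


def is_target(model, x):
    """Accept x as a target if it lies within the fitted radius."""
    centroid, radius = model
    return math.dist(x, centroid) <= radius


rng = random.Random(0)

# Synthetic target class (assumption: a 2-D Gaussian blob around the origin).
targets = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
model = fit_one_sided(targets)

# "Unexpected" outliers: drawn from a region never represented in training.
unexpected = [[rng.gauss(8, 1), rng.gauss(-8, 1)] for _ in range(50)]

rejected = sum(not is_target(model, x) for x in unexpected)
accepted_targets = sum(is_target(model, x) for x in targets)
print(f"unexpected outliers rejected: {rejected}/{len(unexpected)}")
print(f"training targets accepted: {accepted_targets}/{len(targets)}")
```

Because the model describes only the target class, anything far from it is rejected regardless of where it comes from. A multi-class classifier trained on a fixed set of outlier examples has no such guarantee for regions of the input space it never saw during training.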
