Assessment of Multiple-Biomarker Classifiers: fundamental principles and a proposed strategy

10/30/2019
by Waleed A. Yousef, et al.

The multiple-biomarker classifier problem and its assessment are reviewed against the background of some fundamental principles from the fields of statistical pattern recognition, machine learning, or what has recently been called "data science". A narrow reading of that literature has led many authors to neglect the contribution of the finite training sample to the total uncertainty of performance assessment. Yet that contribution is a fundamental indicator of the stability of a classifier; its neglect may therefore be contributing to the problematic status of many studies. A three-level strategy is proposed for moving forward in this field. The lowest level is that of construction, where candidate features are selected and the classifier architecture is chosen. At that point, the effective dimensionality of the classifier is estimated and used to size the next level of analysis, a pilot study on previously unseen cases. The total (training and testing) uncertainty resulting from the pilot study is, in turn, used to size the highest level of analysis, a pivotal study with a target level of uncertainty. Some resources available in the literature for implementing this approach are reviewed. Although the concepts explained in the present article may be fundamental and straightforward for many researchers in the machine learning community, they are subtle for many practitioners, for whom we provided general advice on best practice in <cit.> and on which we elaborate in the present paper.
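The point about training uncertainty can be illustrated with a small simulation. The sketch below is not the authors' method; it is a minimal illustration under assumptions chosen here: two Gaussian classes with identity covariance, a Fisher linear discriminant as the classifier, the Mann-Whitney AUC as the performance measure, and hypothetical names and default sizes (`simulate`, `n_train`, `n_test`, `dim`, `shift`). It repeatedly draws a training set, then evaluates the trained classifier on many independent test sets, and separates the variability of the AUC into a part due to the finite training sample and a part due to the finite test sample.

```python
import numpy as np


def auc(scores_neg, scores_pos):
    """Mann-Whitney estimate of the AUC (ties count one half)."""
    diff = scores_pos[:, None] - scores_neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def fit_lda(x_neg, x_pos):
    """Fisher linear discriminant direction using the pooled sample covariance."""
    pooled = 0.5 * (np.cov(x_neg, rowvar=False) + np.cov(x_pos, rowvar=False))
    return np.linalg.solve(pooled, x_pos.mean(axis=0) - x_neg.mean(axis=0))


def simulate(n_train=20, n_test=50, dim=5, shift=0.5,
             n_trainers=40, n_testers=40, seed=0):
    """Assumed toy setting: n_train / n_test cases PER CLASS, `dim` features,
    class means separated by `shift` in every feature."""
    rng = np.random.default_rng(seed)
    mean_pos = np.full(dim, shift)

    def draw(n):
        # One sample of n negative and n positive cases.
        return rng.normal(size=(n, dim)), rng.normal(size=(n, dim)) + mean_pos

    aucs = np.empty((n_trainers, n_testers))
    for i in range(n_trainers):
        w = fit_lda(*draw(n_train))          # new finite training sample
        for j in range(n_testers):
            t_neg, t_pos = draw(n_test)      # independent finite test sample
            aucs[i, j] = auc(t_neg @ w, t_pos @ w)

    # Average over training sets of the variance across test sets:
    # the testing component (training set held fixed).
    test_var = float(aucs.var(axis=1, ddof=1).mean())
    # Variance across training sets of the per-training-set mean AUC:
    # the training component (slightly inflated by roughly test_var / n_testers,
    # because each row mean uses a finite number of test sets).
    train_var = float(aucs.mean(axis=1).var(ddof=1))
    return {"mean_auc": float(aucs.mean()),
            "train_var": train_var,
            "test_var": test_var}


if __name__ == "__main__":
    print(simulate())
```

An assessment that fixes a single trained classifier and reports only the test-set variability would capture `test_var` alone; in this toy setting, `train_var` is the additional component that the abstract argues is often neglected.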


research
01/21/2020

Wrapper Feature Selection Algorithm for the Optimization of an Indicator System of Patent Value Assessment

Effective patent value assessment provides decision support for patent t...
research
03/18/2019

Elements and Principles of Data Analysis

The data revolution has led to an increased interest in the practice of ...
research
12/11/2016

On Choosing Training and Testing Data for Supervised Algorithms in Ground Penetrating Radar Data for Buried Threat Detection

Ground penetrating radar (GPR) is one of the most popular and successful...
research
10/31/2021

Classification of fetal compromise during labour: signal processing and feature engineering of the cardiotocograph

Cardiotocography (CTG) is the main tool used for fetal monitoring during...
research
06/30/2023

Redeeming Data Science by Decision Modelling

With the explosion of applications of Data Science, the field is has com...
research
11/20/2018

Machine Learning Distinguishes Neurosurgical Skill Levels in a Virtual Reality Tumor Resection Task

Background: Virtual reality simulators and machine learning have the pot...
research
08/09/2021

Unified Regularity Measures for Sample-wise Learning and Generalization

Fundamental machine learning theory shows that different samples contrib...
