Evaluation of statistical approaches for association testing in noisy drug screening data

by Petr Smirnov, et al.

Identifying associations among biological variables is a major challenge in modern quantitative biological research, particularly given the systemic and statistical noise endemic to biological systems. Drug sensitivity screening has proven to be a particularly challenging setting in which to identify associations that can inform patient treatment. To address this, we introduce two semi-parametric variations on the commonly used Concordance Index (CI): the robust Concordance Index (rCI) and the kernelized Concordance Index (kCI), both of which incorporate measurements of the noise distribution estimated from the data. We demonstrate that common statistical tests applied to the concordance index and its variations fail to control for false positives, and we introduce efficient implementations that compute p-values using adaptive permutation testing. We then evaluate the statistical power of these coefficients under simulation and compare them with the Pearson and Spearman correlation coefficients. Finally, we evaluate the various statistics on the task of matching drugs across pharmacogenomic datasets. We observe that the rCI and kCI are better powered than the concordance index in simulation and show some improvement on real data. Surprisingly, we observe that the Pearson correlation was the most robust to measurement noise among the metrics compared.
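As an illustration of the quantities named in the abstract, the following is a minimal Python sketch of the plain concordance index together with a permutation p-value that stops early. It is not the authors' implementation and does not implement rCI or kCI. The function names, the tie handling (tied pairs are skipped), the two-sided statistic |CI - 0.5|, and the early-stopping rule (stop once `stop_after` permutations are at least as extreme as the observed value) are all assumptions chosen for illustration.

```python
import random
from itertools import combinations


def concordance_index(x, y):
    """Fraction of comparable pairs that x and y order the same way.

    Assumption: pairs tied in either x or y are skipped entirely.
    """
    concordant = 0
    comparable = 0
    for i, j in combinations(range(len(x)), 2):
        dx = x[i] - x[j]
        dy = y[i] - y[j]
        if dx == 0 or dy == 0:
            continue  # tied pair, not comparable
        comparable += 1
        if dx * dy > 0:
            concordant += 1
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def permutation_pvalue(x, y, max_perms=10000, stop_after=20, seed=0):
    """Two-sided permutation p-value for the concordance index.

    Assumption: a simple sequential stopping rule. Sampling ends as soon as
    `stop_after` permuted statistics are at least as extreme as the observed
    one, so clearly non-significant pairs use few permutations.
    """
    rng = random.Random(seed)
    observed = abs(concordance_index(x, y) - 0.5)
    y_perm = list(y)
    exceed = 0
    for n in range(1, max_perms + 1):
        rng.shuffle(y_perm)  # break any association between x and y
        if abs(concordance_index(x, y_perm) - 0.5) >= observed:
            exceed += 1
            if exceed >= stop_after:
                return exceed / n  # early stop
    # Full budget used: add-one estimate avoids reporting p = 0.
    return (exceed + 1) / (max_perms + 1)
```

With this rule, the number of permutations spent on a pair grows only when its association looks strong, which is the general motivation for adaptive permutation schemes; the specific thresholds here are placeholders.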




