Supervised Learning and Anti-learning of Colorectal Cancer Classes and Survival Rates from Cellular Biology Parameters
In this paper, we describe a dataset relating to cellular and physical conditions of patients who are operated upon to remove colorectal tumours. This data provides a unique insight into immunological status at the point of tumour removal, tumour classification and post-operative survival. Attempts are made to learn relationships between attributes (physical and immunological) and the resulting tumour stage and survival. Results for conventional machine learning approaches can be considered poor, especially for predicting tumour stages for the most important types of cancer. This poor performance is further investigated and compared with a synthetic, dataset based on the logical exclusive-OR function and it is shown that there is a significant level of 'anti-learning' present in all supervised methods used and this can be explained by the highly dimensional, complex and sparsely representative dataset. For predicting the stage of cancer from the immunological attributes, anti-learning approaches outperform a range of popular algorithms.
READ FULL TEXT