The Future AI in Healthcare: A Tsunami of False Alarms or a Product of Experts?
Recent significant increases in affordable and accessible computational power and data storage have enabled machine learning to deliver almost unbelievable classification and prediction performance compared to well-trained humans. There have been some promising (but limited) results in the complex healthcare landscape, particularly in imaging. This promise has led some to leap to the conclusion that we will solve an ever-increasing number of problems in human health and medicine by applying 'artificial intelligence' to 'big (medical) data'. The scientific literature has been inundated with algorithms, outstripping our ability to review them effectively. Unfortunately, I argue that most, if not all, of these publications and commercial algorithms make several fundamental errors. I argue that, because everyone (and therefore every algorithm) has blind spots, there are multiple 'best' algorithms, each of which excels on different types of patients or in different contexts. Consequently, we should vote many algorithms together, weighted by their overall performance, their independence from each other, and a set of features that define the context (i.e., the features that maximally discriminate between the situations in which one algorithm outperforms another). This approach not only yields a better-performing classifier or predictor but also provides confidence intervals, so that a clinician can judge how to respond to an alert. Moreover, I argue that a sufficiently large number of (mostly) independent algorithms addressing the same problem can be generated through a large international competition or challenge lasting many months, and I define the conditions for a successful event. Finally, I propose introducing a requirement that major grantees run challenges in the final year of funding, both to maximize the value of the research and to select a new generation of grantees.
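The weighted-voting idea can be made concrete with a short sketch. The code below is a minimal illustration, not the paper's method: it assumes each algorithm outputs a probability, uses held-out accuracy as a stand-in for "overall performance" and mean pairwise error correlation as a stand-in for (lack of) "independence", and omits the context-dependent weighting entirely. All function and variable names (e.g., `ensemble_weights`, `weighted_vote`) are hypothetical.

```python
import numpy as np

def ensemble_weights(val_preds, val_labels):
    """Weight each algorithm by held-out accuracy, discounted by how
    correlated its errors are with the other algorithms' errors.
    (Illustrative proxy only; not the paper's weighting scheme.)"""
    val_preds = np.asarray(val_preds)                  # (n_algorithms, n_samples)
    errors = ((val_preds > 0.5).astype(int) != val_labels).astype(float)
    accuracy = 1.0 - errors.mean(axis=1)
    corr = np.corrcoef(errors)                         # pairwise error correlation
    np.fill_diagonal(corr, 0.0)
    redundancy = np.abs(corr).mean(axis=1)             # high => less independent
    weights = accuracy * (1.0 - redundancy)
    return weights / weights.sum()

def weighted_vote(test_preds, weights):
    """Weighted vote plus a crude spread-based confidence band."""
    test_preds = np.asarray(test_preds)                # (n_algorithms, n_samples)
    mean = weights @ test_preds
    spread = np.sqrt(weights @ (test_preds - mean) ** 2)  # weighted std across voters
    return mean, mean - 1.96 * spread, mean + 1.96 * spread

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = rng.integers(0, 2, size=200)              # held-out validation labels
    # Three simulated algorithms: noisy versions of the true labels.
    val_preds = np.clip(labels + rng.normal(0, 0.4, size=(3, 200)), 0, 1)
    w = ensemble_weights(val_preds, labels)
    test_preds = rng.uniform(0, 1, size=(3, 5))        # five new patients
    prob, low, high = weighted_vote(test_preds, w)
    print("weights:", np.round(w, 3))
    print("risk:", np.round(prob, 2), "band:", np.round(low, 2), "to", np.round(high, 2))
```

A context-aware variant, closer to what is argued above, would make the weights a function of the discriminating context features (for example, via a gating model learned on held-out data) rather than fixed global constants.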