Mitigating the Effect of Dataset Bias on Training Deep Models for Chest X-rays
Deep learning has attracted considerable attention in computer-aided diagnosis (CAD) applications, particularly biomedical image analysis. We analyze three large-scale, publicly available chest X-ray (CXR) datasets and find that vanilla training of deep models for diagnosing common thorax diseases is subject to dataset bias, leading to severe performance degradation when the models are evaluated on an unseen test set. In this work, we frame the problem as a multi-source domain generalization task and make two contributions to handle dataset bias: (1) we improve the classical max-margin loss function by making it more general and smooth; (2) we propose a new training framework named MCT (Multi-layer Cross-gradient Training) for unseen data augmentation. Empirical studies show that our methods significantly improve model generalization and robustness to dataset bias.
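
To make the idea of a "smooth" max-margin loss concrete, the sketch below compares the classical hinge loss with one common smoothing, the softplus relaxation. This is an illustrative assumption only: the abstract does not give the paper's actual loss, and the function names `hinge` and `smooth_hinge` and the parameters `margin` and `beta` are hypothetical.

```python
import math


def hinge(score, label, margin=1.0):
    """Classical max-margin (hinge) loss for a label in {-1, +1}."""
    return max(0.0, margin - label * score)


def smooth_hinge(score, label, margin=1.0, beta=5.0):
    """Softplus relaxation of the hinge loss (an assumed smoothing,
    not necessarily the one used in the paper):

        (1 / beta) * log(1 + exp(beta * (margin - label * score)))

    Larger beta tracks the hinge more closely; smaller beta is smoother.
    """
    z = beta * (margin - label * score)
    # Numerically stable softplus: max(z, 0) + log(1 + exp(-|z|))
    return (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / beta


if __name__ == "__main__":
    # Compare the two losses for a positive label at a few scores.
    for s in (-1.0, 0.0, 1.0, 2.0):
        print(s, hinge(s, +1), smooth_hinge(s, +1))
```

Unlike the hinge, the softplus form is differentiable everywhere, including at the margin boundary, and it upper-bounds the hinge by at most log(2)/beta.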