An Interpretable End-to-end Fine-tuning Approach for Long Clinical Text
Unstructured clinical text in EHRs contains crucial information for applications including decision support, trial matching, and retrospective research. Recent work has applied BERT-based models to clinical information extraction and text classification, given these models' state-of-the-art performance in other NLP domains. However, BERT is difficult to apply to clinical notes because it doesn't scale well to long sequences of text. In this work, we propose a novel fine-tuning approach called SnipBERT. Instead of using entire notes, SnipBERT identifies crucial snippets and then feeds them into a truncated BERT-based model in a hierarchical manner. Empirically, SnipBERT not only has significant predictive performance gain across three tasks but also provides improved interpretability, as the model can identify key pieces of text that led to its prediction.
READ FULL TEXT