Development of deep learning algorithms to categorize free-text notes pertaining to diabetes: convolution neural networks achieve higher accuracy than support vector machines

09/16/2018
by Boyi Yang, et al.

Health professionals can use natural language processing (NLP) technologies when reviewing electronic health records (EHRs). Machine learning free-text classifiers can help them identify problems and make critical decisions. We aim to develop deep learning neural network algorithms that identify EHR progress notes pertaining to diabetes and to validate the algorithms at two institutions. The data comprise 2,000 EHR progress notes retrieved from patients with diabetes; all notes were manually annotated as diabetic or non-diabetic. Several deep learning classifiers were developed, and their performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC). The convolutional neural network (CNN) model with a separable convolution layer accurately identified diabetes-related notes in the Brigham and Women's Hospital testing set, with the highest AUC of 0.975. Deep learning classifiers can be used to identify EHR progress notes pertaining to diabetes. In particular, the CNN-based classifier can achieve a higher AUC than a support vector machine (SVM)-based classifier.
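Because the abstract compares classifiers by AUC, a small illustration of what that metric measures may be useful. The sketch below is not the authors' evaluation code. It assumes binary labels (1 for a diabetes-related note, 0 otherwise) and one real-valued score per note, and it computes AUC as the fraction of positive/negative pairs in which the positive note receives the higher score, counting ties as half. The function name `roc_auc` and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Return the AUC as the probability that a randomly chosen positive
    example is scored higher than a randomly chosen negative one.

    labels: iterable of 0/1 values (1 = positive class)
    scores: iterable of classifier scores, same length as labels
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    # Count correctly ordered (positive, negative) pairs; a tie counts as half.
    ordered = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                ordered += 1.0
            elif p == n:
                ordered += 0.5
    return ordered / (len(positives) * len(negatives))


# Hypothetical toy data: six notes with made-up classifier scores.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 1.0 means every diabetes-related note is ranked above every unrelated note, while 0.5 corresponds to random ordering. The pairwise loop is quadratic in the number of notes, which is acceptable for an illustration. For larger sets, a rank-based implementation or a library routine would be the more practical choice.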
