
Automated Strabismus Detection based on Deep neural networks for Telemedicine Applications

by   Jiewei Lu, et al.

Strabismus is one of the most influential ophthalmologic diseases in human life. Timely detection of strabismus improves its prognosis and treatment. Telemedicine, which has great potential to alleviate the growing demand for the diagnosis of ophthalmologic diseases, is an effective way to achieve timely strabismus detection. In addition, deep neural networks make fully automated strabismus detection possible. In this paper, a tele strabismus dataset is established by ophthalmologists. Then a new algorithm based on deep neural networks is proposed to achieve automated strabismus detection on the established tele strabismus dataset. The proposed algorithm consists of two stages. In the first stage, R-FCN is applied to perform eye region segmentation. In the second stage, a deep convolutional neural network is built and trained to classify the segmented eye regions as strabismus or normal. The experimental results on the established tele strabismus dataset show that the proposed method performs well on automated strabismus detection for telemedicine applications. Code is made publicly available at:



1 Introduction

Strabismus is an ophthalmologic disease in which the eyes cannot be aligned on the same location burian1985burian; van2004amblyopia, and it often occurs in childhood. Strabismus is caused by problems in the optic nerve, brain or extraocular muscles rutstein2011optometric. Risk factors include premature birth and familial inheritance lorenz2002genetics. Strabismus has a serious impact on human life. It can prevent the brain from merging the two images received from the two eyes, which leads to amblyopia kiorpes1998neuronal. Under-treated amblyopic eyes may degenerate, resulting in blindness tommila1981incidence. Strabismus patients may also suffer from double vision and have poorer depth perception than people without strabismus. As a result, the prognosis and treatment of strabismus become more and more important, and strabismus detection is the first and one of the most essential steps.

Traditional strabismus detection is performed in hospitals. Doctors use the Hirschberg test eskridge1988hirschberg to determine whether the patient has strabismus: a thin beam of light is shone into the patient's eyes to verify whether its reflection is located at the same place on both corneas. In addition, strabismus detection can be performed with digital tools. In abrahamsson1986photorefraction, Abrahamsson et al. use photorefraction to achieve small angle strabismus detection. In loudon2011rapid, Loudon et al. utilize the pediatric vision scanner to perform strabismus detection. In de2012computational, De Almeida et al. apply a digital camera and the Hirschberg test to identify strabismus. In valente2017automatic, Valente et al. achieve strabismus detection in digital videos through the cover test. In chen2018strabismus, Chen et al. use an eye tracking system and convolutional neural networks to detect strabismus. These strabismus detection methods have the following disadvantages: (1) They require the on-site assistance of specialists, which increases the burden on specialists; (2) People in remote districts or isolated communities cannot receive timely strabismus diagnosis. In order to overcome the above problems, telemedicine is applied to alleviate the growing demand for strabismus detection. Telemedicine means making use of telecommunication and computer technology to offer clinical health care from a distance. In helveston2001telemedicine, Helveston et al. use telemedicine to diagnose strabismus in places where specialists are unavailable. In such situations, patients' images were captured with digital cameras and then sent by computer to specialists, so that the specialists could analyze and diagnose strabismus from a distance. However, this method still requires the specialists to spend time on diagnosis.

Figure 1: Some exemplary images in the established tele strabismus dataset. (a) The normal image in the dataset. (b) The strabismus image (with a complete human face) in the dataset. (c) The strabismus image (with parts of a human face) in the dataset.

To the best of our knowledge, this is the first study to achieve automated strabismus detection for telemedicine applications. The major reason for this gap is that no tele strabismus datasets have been published. Establishing a tele strabismus dataset is not a trivial task since it needs the collaboration of ophthalmologists and patients. Moreover, the images in a tele strabismus dataset have different sizes, resolutions and backgrounds, and some images only contain parts of the human face. These factors make it difficult to achieve automated strabismus detection for telemedicine applications.

In order to achieve automated strabismus detection for telemedicine applications, in this paper, a tele strabismus dataset is first established, which has been carefully collected and labeled by ophthalmologists. Some exemplary images in the established strabismus dataset are shown in Figure 1 (parts of the human faces are blocked in order to protect the privacy of patients). Then an end-to-end framework RF-CNN is proposed to perform automated strabismus detection. The proposed RF-CNN comprises two stages: eye region segmentation and strabismus diagnosis (as shown in Figure 2). In the eye region segmentation stage, R-FCN dai2016r is used to segment the eye regions in the images of the tele strabismus dataset, which aims at reducing the influence of the background and focusing on the eye regions. In the strabismus diagnosis stage, automated strabismus detection is achieved from the segmented eye regions with deep convolutional neural networks (CNN) lecun1989backpropagation. The experimental results show that the proposed algorithm obtains a good detection performance on the tele strabismus dataset.

The rest of this paper is structured as follows. Section 2 introduces some knowledge about telemedicine. Section 3 presents the details of the proposed method. Section 4 provides the experimental details. In Section 5, the conclusions of this paper are provided.

Figure 2: The proposed RF-CNN framework. The dashed arrows mean that event annotations are only used in the training stage.

Figure 3: The overall architecture of R-FCN for automated strabismus detection. In our work, the last layer of R-FCN generates $k^2(C+1)$ position-sensitive score maps. $C+1$ means two categories: eye region and background. Each RoI is divided into $k \times k$ bins, and for each bin pooling is only operated on one of the $k^2$ score maps, which are represented by different colors.

2 Telemedicine

Telemedicine means providing interactive health care from a distance by using telecommunication and modern technology, which allows clinical data to be transferred and remote clinical diagnosis to be provided independent of physical proximity to the patient. Telemedicine has an important influence on patients in remote districts and isolated communities, since they can receive health care from doctors (or specialists) far away and do not have to travel to visit them berman2005technology. Telemedicine can alleviate the growing demand for diagnosing various diseases, such as diabetic retinopathy and other ophthalmologic diseases. It can give geographically remote districts access to specialists. Telemedicine consists of three main categories: store-and-forward, remote monitoring and real-time interactive services. Store-and-forward means obtaining medical data and then sending these data to doctors or specialists for offline examination american2012telemedicine. It is not necessary for doctors and patients to be present at the same time. Remote monitoring lets specialists monitor patients remotely with various technological equipment. Real-time interactive services provide real-time interactions between patients and specialists. In this paper, automated strabismus detection belongs to the store-and-forward category.

3 Methodology

The proposed framework RF-CNN can be regarded as a two-stage strabismus detection method. As shown in Figure 2, the images in the tele strabismus dataset are first fed into R-FCN, which outputs the locations of the eye regions. Then the eye region images are cropped and fed into the CNN. RF-CNN is learned directly from image data, and the event annotations are used for training in both stages.
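The two-stage flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: `detect_eye_region` and `classify_eye_region` are hypothetical stand-ins for the trained R-FCN and CNN, and the image is a plain nested list rather than a real photograph.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]        # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max exclusive


def crop(image: Image, box: Box) -> Image:
    """Cut the rectangle described by `box` out of `image`."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max] for row in image[y_min:y_max]]


def rf_cnn_predict(
    image: Image,
    detect_eye_region: Callable[[Image], Box],    # hypothetical stage-1 model
    classify_eye_region: Callable[[Image], str],  # hypothetical stage-2 model
) -> str:
    """Stage 1 locates the eye region, stage 2 classifies the cropped region."""
    box = detect_eye_region(image)
    eye_region = crop(image, box)
    return classify_eye_region(eye_region)


# Toy stand-ins so the sketch runs end to end; they are not trained models.
toy_image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
toy_detector = lambda img: (1, 1, 3, 3)
toy_classifier = lambda eye: "strabismus" if sum(map(sum, eye)) > 2 else "normal"
label = rf_cnn_predict(toy_image, toy_detector, toy_classifier)
```

Keeping the two stages behind plain callables mirrors the sequential training described later: the detector can be trained and frozen before the classifier sees any cropped eye regions.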

3.1 Eye Region Segmentation

Figure 4: The images with different backgrounds in the tele strabismus dataset: (1) car, (2) wall, (3) home.

There are two reasons for performing eye region segmentation: (1) According to ophthalmologists, the diagnosis of strabismus is based on the eye regions; (2) It helps reduce the influence of the background. Figure 4 shows some images with different backgrounds (parts of the human faces are blocked in order to protect the privacy of patients). R-FCN is built and used for this purpose in our work. The R-FCN used here is composed of two stages: region proposal and region classification. The overall architecture of R-FCN for automated strabismus detection is shown in Figure 3. In the region proposal stage, candidate eye regions are obtained by the Region Proposal Network (RPN) ren2015faster. After the candidate eye regions (RoIs) are generated, R-FCN classifies each RoI into the eye region category or the background category. The last layer of R-FCN generates $k^2$ position-sensitive score maps for each of the eye region and background categories, which results in a $k^2(C+1)$-channel layer with $C+1=2$. In our paper, $k$ is set as $3$ dai2016r, and the 9 score maps encode the (top-left, top-center, top-right, ..., bottom-center, bottom-right) cases for eye region or background. Then each RoI is voted by averaging the scores:

$$r_c(\Theta) = \frac{1}{k^2} \sum_{i,j} r_c(i,j \mid \Theta),$$

where $r_c(\Theta)$ is the average response for the $c$-th category; $\Theta$ means all learnable parameters in R-FCN; $r_c(i,j \mid \Theta)$ is the response in the $(i,j)$-th bin for the $c$-th category, which is defined as:

$$r_c(i,j \mid \Theta) = \frac{1}{n} \sum_{(x,y) \in \mathrm{bin}(i,j)} z_{i,j,c}(x + x_0, y + y_0 \mid \Theta),$$

where $z_{i,j,c}$ is one score map out of the $k^2(C+1)$ maps, $(x_0, y_0)$ is the top-left corner of an RoI, and $n$ means the number of pixels in the bin.
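The position-sensitive voting can be made concrete with a small pure-Python sketch. It assumes that each bin response is the mean of its dedicated score map over the bin and that the per-category response averages the bin responses, as the text describes; the bin boundaries and the final softmax follow the R-FCN formulation in dai2016r and are assumptions here, as are the function names and the toy score maps.

```python
import math


def bin_response(score_map, roi, i, j, k):
    """Average of one score map over bin (i, j) of the RoI."""
    x0, y0, w, h = roi  # top-left corner plus width and height, in pixels
    xs = range(math.floor(i * w / k), math.ceil((i + 1) * w / k))
    ys = range(math.floor(j * h / k), math.ceil((j + 1) * h / k))
    values = [score_map[y0 + y][x0 + x] for y in ys for x in xs]
    return sum(values) / len(values)  # divide by n, the pixel count of the bin


def vote(score_maps, roi, k):
    """Average the k*k bin responses per category, then apply a softmax.

    score_maps[c][i][j] is the 2-D map for category c and bin (i, j).
    """
    responses = []
    for maps_c in score_maps:
        total = sum(
            bin_response(maps_c[i][j], roi, i, j, k)
            for i in range(k)
            for j in range(k)
        )
        responses.append(total / (k * k))
    peak = max(responses)
    exps = [math.exp(r - peak) for r in responses]
    return responses, [e / sum(exps) for e in exps]


# Toy input: category 0 (background) maps are all 0, category 1 (eye) all 1.
k = 3
flat = lambda value: [[value] * 6 for _ in range(6)]
score_maps = [
    [[flat(0.0) for _ in range(k)] for _ in range(k)],
    [[flat(1.0) for _ in range(k)] for _ in range(k)],
]
responses, probs = vote(score_maps, (0, 0, 6, 6), k)
```

Because every bin reads from its own score map, a map only contributes where its spatial role (for example top-left) lines up with the RoI, which is what makes the scores position-sensitive.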

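Finally, the R-FCN architecture is trained by optimizing the loss function

$$L(s, t_{x,y,w,h}) = L_{cls}(s_{c^*}) + \lambda [c^* > 0] L_{reg}(t, t^*).$$

Here $L_{reg}$ is the bounding box regression loss for the eye region as defined in girshick2015fast, with $t$ the predicted box and $t^*$ the ground-truth box, $\lambda$ is a balance weight, $[c^* > 0]$ equals 1 if the argument is true and 0 otherwise, and $L_{cls}$ is the cross-entropy loss for classification:

$$L_{cls}(s_{c^*}) = -\log(s_{c^*}),$$

where $s_{c^*}$ is the softmax response of the RoI for category $c^*$, $c^*$ is the RoI's ground-truth label, and $c^* = 0$ and $c^* = 1$ represent the background and eye region, respectively.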
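As an illustration of how the two loss terms combine for a single RoI, the sketch below adds a cross-entropy term to a box regression term that is switched off for background RoIs. The smooth L1 form of the regression loss follows girshick2015fast; the balance weight `lam` and its default of 1.0 are assumptions, since the value used in this work is not given.

```python
import math


def smooth_l1(x):
    """Smooth L1 penalty used for box regression in Fast R-CNN."""
    magnitude = abs(x)
    return 0.5 * x * x if magnitude < 1.0 else magnitude - 0.5


def roi_loss(probs, true_class, box_pred, box_target, lam=1.0):
    """Classification loss plus box regression for non-background RoIs.

    probs: softmax responses per category (index 0 is background).
    lam:   hypothetical balance weight between the two terms.
    """
    cls_loss = -math.log(probs[true_class])
    if true_class > 0:  # regression only applies to eye-region RoIs
        reg_loss = sum(smooth_l1(p - t) for p, t in zip(box_pred, box_target))
    else:
        reg_loss = 0.0
    return cls_loss + lam * reg_loss


background = roi_loss([0.5, 0.5], 0, (0, 0, 0, 0), (1, 1, 1, 1))
eye_region = roi_loss([0.5, 0.5], 1, (0, 0, 0, 0), (0.5, 0, 0, 2))
```

In this toy case the background RoI pays only the classification term, while the eye-region RoI also pays for the two box coordinates that are off.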

A well trained R-FCN model is used to perform eye region segmentation, as shown in Figure 3.

Figure 5:  Some exemplary instances of segmented eye regions with normal labels.
Figure 6:  Some exemplary instances of segmented eye regions with strabismus labels.

3.2 Strabismus Diagnosis

Strabismus diagnosis is performed by applying CNNs to the segmented eye regions. Figures 5 and 6 show some exemplary instances of segmented eye regions with normal and strabismus labels, respectively. CNNs are commonly used deep neural network architectures. They were initially applied to challenging problems such as handwritten character recognition lecun1990handwritten. Nowadays CNNs have developed rapidly and are used for a large spectrum of vision problems, such as remote sensing fan2018automatic and medical applications liskowski2016segmenting; zhou2017cell.

Figure 7: The network architecture composed of five convolutional layers, three pooling layers and three fully connected layers. Layer names are followed by the number of feature maps. Square brackets specify the kernel size and stride. Note that 'conv', 'maxpool' and 'fc' are short for convolutional layer, max pooling layer and fully connected layer, respectively.

Generally, CNNs consist of convolutional layers, pooling layers and fully connected layers (as shown in Figure 8). Convolutional layers extract meaningful and effective features. If the input to the convolutional layer is an $N \times N$ image and the kernel size is $m \times m$, then, with unit stride and no padding, the convolutional layer obtains feature maps of size $(N-m+1) \times (N-m+1)$. A pooling layer aggregates the neurons' outputs within a rectangular neighborhood, which reduces the number of parameters of the CNN. Max-pooling graham2014fractional is the most commonly used pooling function. A fully connected layer maps the activations to output neurons, each corresponding to one decision class.
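The size bookkeeping for convolution and pooling is easy to check with a short sketch. The general formula with stride and padding is the standard one rather than something stated in this paper, and the function names and example sizes are illustrative only.

```python
def conv_output_size(n, m, stride=1, padding=0):
    """Side length of the feature map for an n x n input and m x m kernel."""
    return (n + 2 * padding - m) // stride + 1


def max_pool2d(feature_map, size, stride):
    """Max over each size x size window, moving by `stride` pixels."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r + dr][c + dc]
                for dr in range(size)
                for dc in range(size)
            )
            for c in range(0, cols - size + 1, stride)
        ]
        for r in range(0, rows - size + 1, stride)
    ]


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pooled = max_pool2d(feature_map, size=2, stride=2)
```

A 2 x 2 window with stride 2 halves each side of the feature map, so the fully connected layers that follow need a quarter as many input weights.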

Figure 8: A CNN architecture composed of one convolutional layer, one pooling layer and one fully connected layer. The convolutional layer and the pooling layer each use their own kernel size and stride. Flatten means converting the feature matrices into vectors.

After the CNN architecture is fixed, the parameters of the CNN are learned by optimizing an objective function over the training set. The objective depends on the number of training examples, on each individual training example, and on the activation function.
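One common objective of this kind is the mean softmax cross-entropy over the training examples. The sketch below is an assumption about the general shape of such an objective, not the exact function used in this work; the names and the toy logits are illustrative.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities that sum to one."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_objective(logits_batch, labels):
    """Mean negative log-probability of the correct class."""
    losses = [
        -math.log(softmax(logits)[label])
        for logits, label in zip(logits_batch, labels)
    ]
    return sum(losses) / len(losses)


# Class 0 = normal, class 1 = strabismus (toy scores, not model outputs).
uncertain = cross_entropy_objective([[0.0, 0.0]], [1])
confident = cross_entropy_objective([[0.0, 5.0]], [1])
```

The objective falls as the network assigns more probability to the correct class, which is what gradient-based training exploits.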

4 Experiments

The proposed RF-CNN framework is evaluated on an established tele strabismus dataset. In this section, after introducing the tele strabismus dataset, the training and evaluation details of RF-CNN are presented, which include the training setup, evaluation metrics and experimental results.

4.1 Datasets

The tele strabismus dataset contains 5685 images. Each image contains a single human face: 5310 images contain complete human faces, while 375 images contain only parts of a human face. The tele strabismus dataset is divided into a training dataset containing 3409 images and a testing dataset containing 2276 images. The training dataset consists of 701 strabismus images and 2708 normal images, while the testing dataset is composed of 470 strabismus images and 1806 normal images. These images were captured by various devices, such as mobile phones and vidicons. In addition, these images have various resolutions. This dataset has been carefully annotated by ophthalmologists.
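The split sizes quoted above can be checked with a few lines of arithmetic. The dictionary layout and function name are illustrative; the counts are the ones given in the text.

```python
counts = {
    "train": {"strabismus": 701, "normal": 2708},
    "test": {"strabismus": 470, "normal": 1806},
}


def split_summary(counts):
    """Size and strabismus fraction of each split."""
    summary = {}
    for name, per_class in counts.items():
        size = sum(per_class.values())
        summary[name] = {
            "size": size,
            "strabismus_fraction": per_class["strabismus"] / size,
        }
    return summary


summary = split_summary(counts)
total_images = sum(split["size"] for split in summary.values())
```

Both splits contain roughly one strabismus image for every four normal images, so the class ratio is preserved between training and testing.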

4.2 Training Setup

The proposed RF-CNN is trained sequentially. First R-FCN is trained to segment the eye regions. Then a CNN is learned to achieve strabismus diagnosis based on the segmented eye regions. Training setup of RF-CNN is introduced as follows.

R-FCN: ResNet-101 he2016deep is adopted as the backbone of R-FCN, and online hard example mining (OHEM) shrivastava2016training is used to train R-FCN. Training uses a learning rate together with a momentum term. The eye regions segmented by R-FCN are then resized to a common size.

CNN: The network architecture consists of five convolutional layers and three pooling layers, followed by three fully connected layers, as shown in Figure 7. Each convolutional layer is followed by a ReLU layer nair2010rectified, an effective activation function for improving the performance of CNNs. In addition, the dropout strategy krizhevsky2012imagenet is used in the first two fully connected layers in order to prevent overfitting. The network is trained with the stochastic gradient descent method bottou2012stochastic. $L_2$ regularization (weight decay) is used in the network training. The dropout ratio is set as 0.5. The batch size is set as 32. The learning rate is initially set as 0.01 and the training is stopped after 5000 iterations.
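The ingredients of this training setup can be sketched without any framework. The learning rate of 0.01 and the dropout ratio of 0.5 come from the text; the weight decay coefficient is not recoverable from the source, so the value used below is a hypothetical placeholder, and the helper names are illustrative.

```python
import random


def sgd_step(weights, grads, lr=0.01, weight_decay=0.0):
    """One SGD update where L2 regularization adds weight_decay * w to the gradient."""
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]


def dropout(activations, ratio=0.5, rng=random):
    """Inverted dropout: zero a unit with probability `ratio`, rescale the rest."""
    keep = 1.0 - ratio
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


weights = [1.0, -2.0]
grads = [0.5, 0.5]
# 1e-4 is a made-up weight decay value used only to exercise the code path.
updated = sgd_step(weights, grads, lr=0.01, weight_decay=1e-4)
dropped = dropout([1.0] * 8, ratio=0.5, rng=random.Random(0))
```

Dropout is applied only during training; at evaluation time the activations pass through unchanged, which is why the surviving units are rescaled by 1 / (1 - ratio) here.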

The implementation of RF-CNN was based on TensorFlow abadi2016tensorflow, an effective toolbox for training deep neural networks. The training was conducted on an Intel Xeon E5-2690 CPU with a TITAN Xp GPU.

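| Metrics | TP  | TN   | FP  | FN | Se     | Sp     | Acc    | AUC    |
|---------|-----|------|-----|----|--------|--------|--------|--------|
| RF-CNN  | 452 | 1685 | 121 | 18 | 0.9330 | 0.9617 | 0.9389 | 0.9865 |

Table 1: The detection performance of RF-CNN on the established tele strabismus dataset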

4.3 Evaluation Metrics

Four commonly used evaluation metrics, Sensitivity, Specificity, Accuracy and AUC, are used in this experiment in order to evaluate the performance of RF-CNN. The first three are computed from TP (true positive), TN (true negative), FP (false positive) and FN (false negative), which are the numbers of correctly identified strabismus images, correctly identified normal images, normal images incorrectly identified as strabismus and strabismus images incorrectly identified as normal, respectively. Sensitivity (Se) and Specificity (Sp) indicate the ability of RF-CNN to identify normal images and strabismus images. Accuracy (Acc) is used for evaluating the overall detection performance. In addition, the area under the receiver operating characteristic (ROC) curve fawcett2006introduction (AUC) is also applied to evaluate the overall detection performance of RF-CNN.
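The ratios behind these metrics can be recomputed from the counts in Table 1. The sketch uses the usual convention with strabismus as the positive class, which is an assumption since the defining formulas are not reproduced here; the key names are illustrative.

```python
def confusion_ratios(tp, tn, fp, fn):
    """Per-class hit rates and overall accuracy from confusion counts."""
    return {
        "tp_rate": tp / (tp + fn),  # strabismus images correctly identified
        "tn_rate": tn / (tn + fp),  # normal images correctly identified
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


ratios = confusion_ratios(tp=452, tn=1685, fp=121, fn=18)
rounded = {name: round(value, 4) for name, value in ratios.items()}
```

With these counts, TP / (TP + FN) evaluates to about 0.9617 and TN / (TN + FP) to about 0.9330, while the accuracy is about 0.9389. The 0.9330 value reported as Se therefore corresponds to the normal-image rate, which agrees with the text's pairing of Se with normal images, and the 0.9617 value reported as Sp corresponds to the strabismus-image rate.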

Figure 9: The ROC curve of RF-CNN.
Figure 10: Variations of Se, Sp, Acc and AUC of RF-CNN with the increase of the number of training examples
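The AUC summarised by an ROC curve equals the probability that a randomly chosen positive image receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pair counting; the scores are made-up illustrations, not outputs of RF-CNN, and the function name is an assumption.

```python
def auc_from_scores(positive_scores, negative_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical classifier scores for strabismus (positive) and normal images.
perfect = auc_from_scores([0.9, 0.8], [0.1, 0.2])
partial = auc_from_scores([0.9, 0.3], [0.5, 0.1])
```

Pair counting is quadratic in the number of images, which is fine for a sketch; a rank-based computation gives the same value more efficiently on a test set of a few thousand images.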

4.4 Results

The detection results of RF-CNN on the established dataset are shown in Table 1 and Figure 9. RF-CNN achieves high scores of Sensitivity=0.9330 and Specificity=0.9617, which indicates that RF-CNN performs well at identifying both strabismus images and normal images. RF-CNN also obtains high scores of Accuracy=0.9389 and AUC=0.9865, which means that it achieves good overall detection results on the established strabismus dataset.

Moreover, the sensitivity of RF-CNN to the number of training examples is shown in Figure 10. It can be observed that the evaluation metrics Se, Sp, Acc and AUC improve as the number of training examples increases. With fewer than 1500 training examples, the detection results improve significantly as more training examples are added. With more than 1500 training examples, the detection results vary only slightly. From the above observation, 1500 training examples can be chosen to train RF-CNN in our work.

5 Conclusion

Nowadays strabismus has become an influential ophthalmologic disease in human life. Strabismus detection plays an important role in the prognosis and treatment of strabismus. Telemedicine is an effective way to achieve timely detection of strabismus. Concretely, store-and-forward, a category of telemedicine, is applied to achieve timely strabismus detection in this work, which means collecting the medical data and then sending the data to doctors for diagnosis and examination.

In this paper, in order to achieve automated strabismus detection for telemedicine applications, a tele strabismus dataset is first established, in which the image data are collected and labeled by specialists. Then an end-to-end framework RF-CNN is proposed to achieve automated strabismus detection. The proposed RF-CNN first uses R-FCN to perform eye region segmentation, and then classifies the segmented eye regions as strabismus or normal with a deep convolutional neural network. The detection results on the established tele strabismus dataset demonstrate that the proposed RF-CNN performs well on automated strabismus detection for telemedicine applications. Moreover, to the best of our knowledge, this is the first study that achieves automated strabismus detection for telemedicine applications.

In the future, we will try to release the established tele strabismus dataset by negotiating with doctors and patients. Moreover, we will continue to work closely with doctors, and try to apply this research to clinical applications.

6 Acknowledgement

The authors would like to thank Doctor Ce Zheng for providing the tele strabismus dataset.

This research work was supported by the Guangdong Key Laboratory of Digital Signal and Image Processing, the National Natural Science Foundation of China under grants 61175073, 61300159, 61332002 and 51375287, and the Natural Science Foundation of Jiangsu Province of China under grant SBK2018022017.


  • (1) H. M. Burian, G. K. Von Noorden, Burian-von Noorden’s Binocular vision and ocular motility: theory and management of strabismus, CV Mosby, 1985.
  • (2) E. Van de Graaf, G. Van der Sterre, J. R. Polling, H. Van Kempen, B. Simonsz, H. Simonsz, Amblyopia and strabismus questionnaire: design and initial validation, Strabismus 12 (3) (2004) 181–193.
  • (3) R. P. Rutstein, M. S. Cogen, S. A. Cotter, O. K. M. Daum, J. F. Amos, O. Barry Barresi, K. L. Beebe, O. J. Cavallerano, Optometric clinical practice guideline care of the patient with strabismus: Esotropia and exotropia, Lindbergh Blvd. St. Louis: American Optometric Association.
  • (4) B. Lorenz, Genetics of isolated and syndromic strabismus: facts and perspectives, Strabismus 10 (2) (2002) 147–156.
  • (5) L. Kiorpes, D. C. Kiper, L. P. O’keefe, J. R. Cavanaugh, J. A. Movshon, Neuronal correlates of amblyopia in the visual cortex of macaque monkeys with experimental strabismus and anisometropia, Journal of Neuroscience 18 (16) (1998) 6411–6424.
  • (6) V. Tommila, A. Tarkkanen, Incidence of loss of vision in the healthy eye in amblyopia., British Journal of Ophthalmology 65 (8) (1981) 575–577.
  • (7) J. B. Eskridge, B. Wick, D. Perrigin, The hirschberg test: a double-masked clinical evaluation., American journal of optometry and physiological optics 65 (9) (1988) 745–750.
  • (8) M. Abrahamsson, G. Fabian, J. Sjöstrand, Photorefraction: a useful tool to detect small angle strabismus, Acta ophthalmologica 64 (1) (1986) 101–104.
  • (9) S. E. Loudon, C. A. Rook, D. S. Nassif, N. V. Piskun, D. G. Hunter, Rapid, high-accuracy detection of strabismus and amblyopia using the pediatric vision scanner, Investigative ophthalmology & visual science 52 (8) (2011) 5043–5048.
  • (10) J. D. S. De Almeida, A. C. Silva, A. C. De Paiva, J. A. M. Teixeira, Computational methodology for automatic detection of strabismus in digital images through hirschberg test, Computers in biology and medicine 42 (1) (2012) 135–146.
  • (11) T. L. A. Valente, J. D. S. de Almeida, A. C. Silva, J. A. M. Teixeira, M. Gattass, Automatic diagnosis of strabismus in digital videos through cover test, Computer methods and programs in biomedicine 140 (2017) 295–305.
  • (12) Z. Chen, H. Fu, W.-L. Lo, Z. Chi, Strabismus recognition using eye-tracking data and convolutional neural networks, Journal of healthcare engineering 2018.
  • (13) E. M. Helveston, F. H. Orge, R. Naranjo, L. Hernandez, Telemedicine: Strabismus e-consultation, Journal of American Association for Pediatric Ophthalmology and Strabismus 5 (5) (2001) 291–296.
  • (14) J. Dai, Y. Li, K. He, J. Sun, R-fcn: Object detection via region-based fully convolutional networks, in: Advances in neural information processing systems, 2016, pp. 379–387.
  • (15) Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel, Backpropagation applied to handwritten zip code recognition, Neural computation 1 (4) (1989) 541–551.
  • (16) M. Berman, A. Fenaughty, Technology and managed care: patient benefits of telemedicine in a rural health care network, Health economics 14 (6) (2005) 559–573.
  • (17) American Telemedicine Association, et al., What is telemedicine, Retrieved from http://www.americantelemed.org/learn.
  • (18) S. Ren, K. He, R. Girshick, J. Sun, Faster r-cnn: Towards real-time object detection with region proposal networks, in: Advances in neural information processing systems, 2015, pp. 91–99.
  • (19) R. Girshick, Fast r-cnn, in: Proceedings of the IEEE international conference on computer vision, 2015, pp. 1440–1448.
  • (20) Y. LeCun, B. E. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. E. Hubbard, L. D. Jackel, Handwritten digit recognition with a back-propagation network, in: Advances in neural information processing systems, 1990, pp. 396–404.
  • (21) Z. Fan, J. Lu, M. Gong, H. Xie, E. D. Goodman, Automatic tobacco plant detection in uav images via deep neural networks, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 11 (3) (2018) 876–887.
  • (22) P. Liskowski, K. Krawiec, Segmenting retinal blood vessels with deep neural networks, IEEE transactions on medical imaging 35 (11) (2016) 2369–2380.
  • (23) Y. Zhou, H. Mao, Z. Yi, Cell mitosis detection using deep neural networks, Knowledge-Based Systems 137 (2017) 19–28.
  • (24) B. Graham, Fractional max-pooling, arXiv preprint arXiv:1412.6071.
  • (25) K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778.
  • (26) A. Shrivastava, A. Gupta, R. Girshick, Training region-based object detectors with online hard example mining, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 761–769.
  • (27) V. Nair, G. E. Hinton, Rectified linear units improve restricted boltzmann machines, in: Proceedings of the 27th international conference on machine learning (ICML-10), 2010, pp. 807–814.
  • (28) A. Krizhevsky, I. Sutskever, G. E. Hinton, Imagenet classification with deep convolutional neural networks, in: Advances in neural information processing systems, 2012, pp. 1097–1105.
  • (29) L. Bottou, Stochastic gradient descent tricks, in: Neural networks: Tricks of the trade, Springer, 2012, pp. 421–436.
  • (30) M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al., Tensorflow: a system for large-scale machine learning., in: OSDI, Vol. 16, 2016, pp. 265–283.
  • (31) T. Fawcett, An introduction to roc analysis, Pattern recognition letters 27 (8) (2006) 861–874.