“Learning disabilities” (or learning disorders) is an umbrella term describing a number of specific disorders, such as dyslexia, dyspraxia and dysgraphia. A detailed description can be found in the Diagnostic and Statistical Manual of Mental Disorders (also known as DSM-5) from the American Psychiatric Association. Diverse studies suggest that learning disabilities are characterized by subtle and spatially distributed variations in brain anatomy. As such, they should not be confused with learning problems that result from visual, hearing, or motor handicaps, or from social issues.
Clarifying the neurological underpinnings of learning disabilities has been a major research goal over the past twenty years. Although much progress has been made across diverse research fields, the causes of learning disabilities are still not well understood. These learning disorders can also be investigated from many other viewpoints.
Nevertheless, we can agree with the 2014 Annals of Dyslexia study on identifying dyslexia in adults, which holds that it is possible to identify dyslexia with high reliability, although the exact nature of dyslexia is still unknown. We consider that this also holds for other learning disorders. Despite the lack of understanding of the causes, the symptoms are generally clear and are described comprehensively in the DSM-5 under the term Developmental Coordination Disorders. In short:
Dyslexia is a learning disorder which affects an individual's ability to read.
Motor dysgraphia is a learning disorder which affects an individual's ability to write.
As suggested in a 2018 study published in Neurophysiologie Clinique, motor dysgraphia might also be a marker of a Developmental Coordination Disorder (DCD) such as dyspraxia. These disorders persist with age but can be mitigated with appropriate training sessions. The diagnosis is not a simple yes or no: the symptoms range from mild to severe. It is widely accepted that roughly 10% of the world population has a learning disorder to some extent; see for instance the 2016 Dyslexia International report from Duke University or, again, the DSM-5. It is well known that any combination of the disorders described in the DSM-5 often leads to academic failure. Nevertheless, provided with an appropriate education strategy, a child with learning difficulties can acquire the same skills as a standard child. Very often, these children can also get government support in diverse ways (specific lessons, extra tuition, extra time for exams, dedicated staff helping in the classroom, etc.). To get such support, the requirement is to provide a certificate from an accredited specialist who is in charge of assessing the child. The assessment may be lengthy, costly and emotionally painful. Moreover, the limited number of accredited pathologists may make this process time-consuming. This situation prevents many people, often among the population most in need, from undertaking an assessment. This makes the search for fast, effective and widely available assessments (or pre-assessments) of primary interest. Our approach is a candidate solution to this issue.
In fact, there has been increasing interest in carrying over the success stories of machine learning to diverse domains, from natural language understanding for chatbots to cancer detection. We feel it makes sense to investigate machine learning techniques to assess dyslexia and dysgraphia from:
Writing: a picture of handwritten text for motor dysgraphia
Reading: a relatively small number of audio recordings for dyslexia
Our working hypothesis is that a properly trained machine learning algorithm might be able to distinguish between standard children and children with dyslexia and/or dysgraphia from a set of simple features. In the case of dyslexia, we consider features extracted from 32 audio files of word reading (known words and nonsense words) per person. For dysgraphia, we use only features based on a picture of handwritten text (one per individual). We have tested this approach on a restricted set of data coming from partner speech pathologists in Australia. Our preliminary results are promising. This suggests that a screening could be implemented, providing a simple and accurate solution available to a large population at limited cost.
This paper is organized as follows. In Section 2, we provide a very brief review of the machine learning process, explaining what we need in order to build a successful predictor. In Sections 3 and 4, we describe the main principles underlying our method for dyslexia and dysgraphia. Section 5 is dedicated to our experiments: we describe the dataset, the protocol and the results we obtain. In Section 6, we review some related work, essentially for dyslexia, as dysgraphia has not been widely investigated so far. We outline future work in our conclusion (Section 7).
2 Machine learning: (very) brief summary
Machine Learning (ML) is a sub-field of Artificial Intelligence which has been very successful over the past 10 years. The main idea is to start from a set of data (the bigger the better) and to automatically extract patterns or features that allow a conclusion to be drawn for new, unseen data (see “Neural networks and deep learning”, 2015, for a good introduction). With the progress in the field of neural networks, it is now possible to build a predictor (for instance, car versus bicycle) with an accuracy of 98%, i.e. its prediction is right in 98% of cases.
The 'car' example seems basic, but it is exactly the same as taking a picture of a small protuberance on your skin and asking 'Is it a skin cancer?' Today, we are able to build a predictor that performs at the level of expert dermatologists in skin cancer detection (see for instance “Dermatologist-level classification of skin cancer with deep neural networks”, Nature, 2017, from Stanford University). Obviously, there is a huge mathematical background behind the scenes, from statistics to complexity theory, through convex optimization. Building upon these theoretical achievements, powerful libraries have been developed that abstract away the mathematical details and make it possible to design and implement clever algorithms leading to very accurate predictors. This is the core theoretical and technical framework underpinning our work in this paper.
3 Dyslexia prediction principle
One of the main symptoms of dyslexia is difficulty in reading. Our idea is then to gather audio recordings of reading, from both dyslexic and non-dyslexic readers, and then to apply a machine learning algorithm. We expect the algorithm to learn the hidden characteristics that distinguish dyslexic from non-dyslexic people. Let us start with what a user is supposed to produce.
3.1 Words selection and generation
Our process is to have every child read 32 words (no sentences, only words). It is well known that dyslexic children struggle when it comes to reading words they have never seen or heard. They also have difficulties with some letters, combinations of letters, and syllables. Our initial corpus comes from a set of 82 children's books from Project Gutenberg (https://www.gutenberg.org/). We clean the texts and remove proper nouns, which yields a list of around 100,000 words. Then we produce two lists: one with words from 4 to 6 letters, one with words from 7 to 9 letters. In each list, we keep only words with a high frequency of occurrence, to guarantee that the words are known to children. After filtering, each of the two lists contains around 2000 words.
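The frequency-based filtering described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `build_word_lists` and the threshold `min_count` are hypothetical, the sample text is made up, and the removal of proper nouns mentioned in the text is not shown.

```python
import re
from collections import Counter


def build_word_lists(text, min_count=2):
    """Split frequent words into a short list (4-6 letters) and a long list (7-9 letters).

    `min_count` is a hypothetical frequency threshold; the paper does not
    state the value that was actually used.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    # Keep only words seen at least `min_count` times, most frequent first.
    frequent = [word for word, count in counts.most_common() if count >= min_count]
    short_words = [w for w in frequent if 4 <= len(w) <= 6]
    long_words = [w for w in frequent if 7 <= len(w) <= 9]
    return short_words, long_words


# Made-up sample standing in for the cleaned children's-book corpus.
sample = "The little rabbit found another rabbit. Another little garden appeared."
short_words, long_words = build_word_lists(sample)
print(short_words, long_words)
```

On a real corpus the threshold would have to be tuned so that each list ends up with the desired number of well-known words.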
In a second step, we create two lists of pseudo-words which do not belong to the English dictionary but which look like English words. We also need the guarantee that each pseudo-word is pronounceable. In order to achieve that, we build a Long Short-Term Memory (LSTM) neural network that learns to generate such pseudo-words. We are then able to generate an arbitrarily long list of pseudo-words. As for the real words, we build two lists of pseudo-words of different lengths and we keep only pseudo-words that satisfy the following constraints:
The word is not in the English dictionary
Every subset of 4 consecutive letters exists in an English word (to guarantee the word is pronounceable)
It contains difficult letters or difficult combination of letters for dyslexic people.
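The three constraints above can be checked with a few lines of code. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, the tiny dictionary is made up, and `difficult_patterns` is a placeholder, since the exact letters and combinations the authors consider difficult are not recoverable from the text.

```python
def four_grams(word):
    """Return the set of all runs of 4 consecutive letters in `word`."""
    return {word[i:i + 4] for i in range(len(word) - 3)}


def build_known_four_grams(dictionary):
    """Collect every 4-letter run that occurs in at least one dictionary word."""
    known = set()
    for word in dictionary:
        known |= four_grams(word)
    return known


def is_valid_pseudo_word(candidate, dictionary, known_grams, difficult_patterns):
    """Apply the three constraints to one generated candidate."""
    if candidate in dictionary:
        return False  # constraint 1: must not be a real English word
    if not four_grams(candidate) <= known_grams:
        return False  # constraint 2: every 4-letter run must exist in a real word
    # constraint 3: must contain at least one difficult letter or combination
    return any(pattern in candidate for pattern in difficult_patterns)


# Made-up toy dictionary; a real run would use a full English word list.
dictionary = {"garden", "border", "render"}
known_grams = build_known_four_grams(dictionary)
difficult_patterns = ("b", "d")  # placeholder values, not taken from the paper

print(is_valid_pseudo_word("garder", dictionary, known_grams, difficult_patterns))
print(is_valid_pseudo_word("gorder", dictionary, known_grams, difficult_patterns))
```

Precomputing the set of known 4-letter runs once keeps the per-candidate check cheap, which matters when filtering a long stream of generated candidates.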
The final list of 32 words to be read by a child is obtained by choosing 16 words in the list of real words and 16 words in the list of pseudo-words. We change the length of the words with respect to the age of the user that performs the assessment :
List 1: From 6 to 8 years old (included) the list is ordered this way:
2 easy real words
of words from 4 to 6 letters (50% real, 50% pseudo):
of words from 7 to 9 letters (50% real, 50% pseudo):
List 2: From 9 to 13 years old (included) the list is ordered this way:
2 easy real words
of words from 4 to 6 letters (50% real, 50% pseudo):
of words from 7 to 9 letters (50% real, 50% pseudo):
List 3: 14 years old and over the list is ordered this way:
2 easy real words
100% of words from 7 to 9 letters (50% real, 50% pseudo):
These constrained lists of words are randomly generated and are age-related: short words with simple syllables for children from 7 to 8, more difficult for children from 9 to 13, then difficult for children over 14.
3.2 Dyslexia input parameters
For every word audio record, we consider 3 parameters:
The fact that the word (real or nonsense) has been properly pronounced according to the English pronunciation rules. This is a binary parameter (Yes or No).
Whether the word has been read in one go or whether the user, facing a difficulty, went back to the beginning. This is also a binary parameter (Yes or No).
The reading time, measuring the interval between the display of the word and the start of reading. The time unit is the millisecond (ms). It is a real-valued parameter.
Each record is manually tagged (with dedicated in-house software), except for the reading time, which is measured by the computer.
4 Dysgraphia prediction principle
As pointed out in our introduction, dysgraphia can be observed from a handwritten document and interferes at diverse levels of the handwriting process:
formation of the letters,
letter size regularity,
ability to follow a straight line,
gap between the average size of x-height letters and ascending/descending letters,
4.1 Dysgraphia input parameters
In order to train our algorithm, we need a set of labelled data. We consider only pictures with at least 4 lines of handwritten text. The picture must also contain only written text and no other signs. For the network to be as robust as possible, the analysis has to work in various situations: the text is free, the paper may or may not have lines, the handwritten text does not have to be centered, there may be scribbles, etc. The text has to be written in the Roman alphabet but, apart from this limitation, the language is free. As usual in machine learning, the more diverse the data, the better the prediction. The output of the algorithm is not only a likelihood of dysgraphia, but also an analysis of the handwriting features. These features are useful in at least two ways. First, they help to explain the prediction, which is a major issue in machine learning: by providing these features, we can check whether they are realistic (most of them are easily understandable) and consistent with the likelihood estimate. Second, as we will see in the next section, these features also help with the training process. The features we consider are listed below:
Slant : The slant feature corresponds to the direction of the handwriting. We normalize by converting the values from 0 (left slant) to 1 (right slant).
Pressure: The estimated pressure of the handwriting from 0 (low) to 1 (high).
Amplitude : This is the average gap size between x-height and ascending/descending letters, from 0 (low) to 1 (high).
Letter Spacing: This estimates the average spacing between letters in a word, from 0 (small spacing) to 1 (large spacing). Typically, a cursive writing style will lead to 0.
Word spacing: This estimates the average spacing between words in a sentence, from 0 (small spacing) to 1 (large spacing).
Slant Regularity: from 0 (not regular) to 1 (highly regular).
Size Regularity: from 0 (not regular) to 1 (highly regular). This measures whether instances of the same letter vary in size.
Horizontal Regularity: from 0 (the text doesn’t follow a horizontal line) to 1 (the text follows a horizontal line).
Currently, all features are manually labelled with dedicated in-house software. In the medium term, this process will be done automatically, again via machine learning.
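Since every feature listed above is normalized to the interval from 0 to 1, a labelled picture can be represented as a small fixed-length record. The sketch below is one possible representation, not the authors' in-house format: the class name, field names and sample values are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HandwritingFeatures:
    """One labelled picture; every value is expected to lie in [0, 1]."""
    slant: float                  # 0 = left slant, 1 = right slant
    pressure: float               # 0 = low, 1 = high
    amplitude: float              # gap between x-height and ascending/descending letters
    letter_spacing: float         # 0 = small spacing, 1 = large spacing
    word_spacing: float           # 0 = small spacing, 1 = large spacing
    slant_regularity: float       # 0 = not regular, 1 = highly regular
    size_regularity: float        # 0 = not regular, 1 = highly regular
    horizontal_regularity: float  # 0 = text drifts, 1 = text follows a horizontal line

    def __post_init__(self):
        # Reject labels outside the normalized range early.
        for field in fields(self):
            value = getattr(self, field.name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{field.name} must be in [0, 1], got {value}")

    def as_vector(self):
        """Return the features in declaration order, ready for a classifier."""
        return [getattr(self, field.name) for field in fields(self)]


# Made-up values for illustration only.
sample = HandwritingFeatures(0.7, 0.5, 0.4, 0.0, 0.3, 0.8, 0.6, 0.9)
print(sample.as_vector())
```

Validating the range at construction time is a cheap way to catch labelling mistakes before they reach the training set.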
5.1 Context and metrics
Machine learning is a data-driven technology. Apart from designing an algorithm, we have to gather proper data with their labels. Not only the quality but also the quantity matters, as training neural networks requires a lot of data in order to be accurate. In “Why is it so difficult to diagnose dyslexia and how can we do it better?” (2018), the author points out that the difficulty of diagnosing dyslexia mainly comes from the unbalanced population. Only a relatively low percentage of the population has dyslexia or dysgraphia, so training a mathematical model on such an unbalanced population is always a challenge. When it comes to measuring the performance of the algorithm, standard accuracy can then be a misleading metric. Assuming that 10% of the population has dyslexia or dysgraphia, a baseline algorithm declaring everybody as non-dyslexic/dysgraphic achieves 90% accuracy. As a consequence, other metrics are needed, such as precision, recall and F1-score. Let us recall their classical definitions. For a binary classifier, we have at our disposal a set of positive examples (dyslexic children) and a set of negative examples (non-dyslexic children). We denote by $TP$ the number of positive examples predicted as positive, $TN$ the number of negative examples predicted as negative, $FP$ the number of negative examples predicted as positive (false positives), and $FN$ the number of positive examples predicted as negative (false negatives). The metrics are defined as follows:
\[ \text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \text{precision} = \frac{TP}{TP + FP}, \]
\[ \text{recall} = \frac{TP}{TP + FN}, \qquad \text{F1-score} = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}. \]
The accuracy measures the probability that the class predicted by the model is the right one. In what follows, we report accuracy as a percentage.
The precision is the probability that an example is positive given that it is predicted as positive. In some sense, this measures the correctness of the predictor when it predicts an example as positive. The bigger this number, the better the predictor.
The recall is the probability that a positive example is predicted as positive. In some sense, this measures the ability of the predictor to predict all positive examples as positive. Again, the bigger this number, the better the predictor.
The F1-score is a balance between precision and recall (their harmonic mean). Thus, accuracy focuses on the performance of the model in general, whereas precision, recall and F1-score focus on the performance of the model on positive examples only.
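To make the argument about misleading accuracy concrete, the four metrics can be computed directly from the confusion counts. The sketch below uses the classical definitions; the function name is hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here, not something stated in the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1-score from confusion counts.

    Undefined ratios (zero denominator) are reported as 0.0 by convention.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Baseline from the text: 10% positives, everybody predicted as negative.
baseline = classification_metrics(tp=0, tn=90, fp=0, fn=10)
print(baseline)
```

The baseline reaches 90% accuracy while its recall is zero, since it never detects a single positive example. This is exactly why precision, recall and F1-score are reported alongside accuracy.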
We compare the performance of four classifiers: a dummy classifier that chooses classes randomly according to the a priori class probabilities computed on the training set, a Naive Bayes classifier, a logistic regression classifier and a random forest. In order to obtain average estimates of the metrics described above, we use a k-fold cross-validation scheme for each experiment (5 folds for dyslexia and 10 folds for dysgraphia).
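The evaluation protocol can be illustrated with a small, dependency-free sketch of k-fold cross-validation applied to the dummy classifier. Everything here is an assumption made for illustration: the function names, the random seed, the unstratified fold assignment and the synthetic labels (13% positives) are not taken from the authors' code.

```python
import random
from statistics import mean, stdev


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def dummy_predict(train_labels, n_predictions, rng):
    """Draw classes at random with the positive-class prior of the training set."""
    prior = sum(train_labels) / len(train_labels)
    return [1 if rng.random() < prior else 0 for _ in range(n_predictions)]


def cross_validate_dummy(labels, k, seed=0):
    """Return mean and standard deviation of the accuracy (in %) over k folds."""
    rng = random.Random(seed)
    accuracies = []
    for test_idx in k_fold_indices(len(labels), k, seed):
        held_out = set(test_idx)
        train_labels = [y for i, y in enumerate(labels) if i not in held_out]
        predictions = dummy_predict(train_labels, len(test_idx), rng)
        correct = sum(p == labels[i] for p, i in zip(predictions, test_idx))
        accuracies.append(100.0 * correct / len(test_idx))
    return mean(accuracies), stdev(accuracies)


# Synthetic, unbalanced labels: 1 = positive class, 0 = negative class.
labels = [1] * 13 + [0] * 87
avg, sd = cross_validate_dummy(labels, k=10)
print(f"{avg:.1f} [{sd:.1f}]")
```

The printed value follows the "mean [standard deviation]" convention used in the result tables. A real comparison would train each of the four classifiers on the training folds in the same loop.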
5.2 Dyslexia results
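Table 1: Dyslexia prediction results, reported as mean [standard deviation] over the cross-validation folds.
|Classifier|Accuracy (%)|Precision|Recall|F1-score|
|---|---|---|---|---|
|Majority class|47.9 [11.0]|0.58 [0.12]|0.48 [0.11]|0.52 [0.11]|
|Naïve Bayes|89.6 [7.3]|0.94 [0.06]|0.88 [0.07]|0.91 [0.06]|
|Logistic reg.|87.1 [9.5]|0.92 [0.12]|0.88 [0.11]|0.89 [0.08]|
|Random Forest|90.0 [5.7]|0.95 [0.06]|0.88 [0.07]|0.91 [0.05]|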
As pointed out in Section 3, we have developed a mobile app (Android and iOS) enabling professionals, specialist clinics and schools to participate in our research by helping us collect data. The experiment is carried out on a group of people that includes people with an official dyslexia diagnosis. For each individual, we get audio files that are manually tagged. Thus, for each session we obtain the following 10 features:
age in years
average reading time
average error for English words
average backtracking for English words
average reading time for English words
average error for generated words
average backtracking for generated words
average reading time for generated words
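The session-level features listed above are averages of the three per-word parameters from Section 3.2. A minimal sketch of that aggregation is given below; the record type, field names and sample values are hypothetical, and only the features that appear in the list are computed.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class WordRecord:
    """Tags for one audio recording (hypothetical field names)."""
    is_generated: bool       # True for a pseudo-word, False for an English word
    mispronounced: bool      # pronunciation error
    backtracked: bool        # reader went back to the beginning of the word
    reading_time_ms: float   # delay between display and start of reading


def group_averages(records):
    """Average error rate, backtracking rate and reading time of a group."""
    return (
        mean([float(r.mispronounced) for r in records]),
        mean([float(r.backtracked) for r in records]),
        mean([r.reading_time_ms for r in records]),
    )


def session_features(age_years, records):
    """Aggregate per-word tags into one feature dictionary per session."""
    english = [r for r in records if not r.is_generated]
    generated = [r for r in records if r.is_generated]
    eng_err, eng_back, eng_time = group_averages(english)
    gen_err, gen_back, gen_time = group_averages(generated)
    return {
        "age_years": age_years,
        "avg_reading_time": mean([r.reading_time_ms for r in records]),
        "avg_error_english": eng_err,
        "avg_backtracking_english": eng_back,
        "avg_reading_time_english": eng_time,
        "avg_error_generated": gen_err,
        "avg_backtracking_generated": gen_back,
        "avg_reading_time_generated": gen_time,
    }


# Made-up session with two English words and two generated words.
records = [
    WordRecord(False, False, False, 400.0),
    WordRecord(False, True, False, 600.0),
    WordRecord(True, True, True, 900.0),
    WordRecord(True, False, True, 1100.0),
]
features = session_features(9, records)
print(features)
```

Keeping English and generated words in separate groups preserves the contrast between the two word types, which the results section relies on.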
The average values of the metrics (with standard deviation in brackets) are reported in Table 1. We can observe that the machine learning algorithms clearly outperform random guessing. The best results are obtained with Random Forest, which achieves 90.0% accuracy. Its precision of 0.95 indicates that when the model predicts the class dyslexic, it is right 95% of the time. Its recall of 0.88 indicates that it detects 88% of dyslexic people on average. These results could be close to optimal for two reasons: i) most of the non-dyslexic individuals have never taken a dyslexia test with a clinician, and ii) some of the dyslexic individuals in the dataset attend training sessions to reduce the symptoms of dyslexia. Thus, there is some noise in the dataset, which could be partially overcome by considering more data. In this case, we can reasonably expect a higher accuracy.
The results suggest that the simple test we propose is effective for detecting dyslexia. In particular, Figure 5 shows that, as expected, the error rate is significantly higher for dyslexic people. It also demonstrates that the generated words are far more difficult to read for dyslexic people than for non-dyslexic ones. This difference also appears clearly when considering the reading time (Figure 5).
5.3 Dysgraphia results
Our dataset is composed of 1481 pictures of free handwritten text. Each text has at least five lines. The dataset contains 198 pictures from people diagnosed with dysgraphia, about 13% of the dataset. For each picture in the dataset, the 9 features have been reported manually by the same person. In order to avoid bias, this person was not aware of the diagnosis associated with the picture. This also puts us in a position to predict the feature values.
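Table 2: Dysgraphia prediction results, reported as mean [standard deviation] over the cross-validation folds.
|Classifier|Accuracy (%)|Precision|Recall|F1-score|
|---|---|---|---|---|
|Majority class|74.7 [1.2]|0.10 [0.04]|0.11 [0.06]|0.10 [0.05]|
|Naïve Bayes|90.8 [2.9]|0.62 [0.08]|0.93 [0.11]|0.73 [0.06]|
|Logistic reg.|95.6 [2.9]|0.87 [0.12]|0.82 [0.19]|0.82 [0.14]|
|Random Forest|96.2 [2.7]|0.92 [0.08]|0.78 [0.19]|0.83 [0.14]|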
The metrics for dysgraphia prediction with the four algorithms are presented in Table 2. As for dyslexia, the best model is Random Forest, with an accuracy of 96.2% (random guessing achieves 74.7%). The precision and recall are quite high (0.92 and 0.78 respectively, against 0.10 and 0.11 for random guessing). Since the classes are unbalanced, these last scores are more meaningful than accuracy. The confusion matrix for Random Forest on a test set is presented in Figure 6. This shows that dysgraphia can be predicted with high accuracy from a simple analysis of a handwritten text. Once again, we can assume that the classes are noisy and that the accuracy could be improved by collecting more data with official diagnoses.
6 Related works
So far, much more effort has been devoted to dyslexia detection than to dysgraphia. In “Features and machine learning for correlating and classifying between brain areas and dyslexia” (2018), the authors start from the hypothesis (Asynchrony Theory) that dyslexia could come from a gap in processing speed between the different brain entities activated in the word decoding process. This gap may prevent the synchronization of information necessary for an accurate reading process. Starting from this, they monitor a population of 32 children, split roughly 50/50 between dyslexic and standard readers. At this stage, we do not know how the authors account for the fact that their dataset does not match the usual 10/90 split of the population between dyslexic and standard children.
Having the children read 96 real words and 96 nonsense words, they record brain activity via electroencephalogram (EEG) and implement a binary classification algorithm (namely a Support Vector Machine, SVM) to distinguish between dyslexic and standard readers. They report the accuracy, precision, recall and F1-score of this classifier. Apart from the fact that our population is smaller than in that study, it is clear that our technology is far less invasive than EEG and obtains better results. As such, our approach seems more realistic when it comes to implementing real applications.
Following the same idea of investigating brain activity, but in this case using neuro-imaging scans of the brain, the authors of “Machine learning and dyslexia: classification of individual structural neuro-imaging scans of students with and without dyslexia” (2016) start from a population of 49 students with a 50/50 dyslexic/standard split. They again implement an SVM classifier and report its accuracy, precision, recall and F1-score.
After training the SVM on these 49 subjects, they tested it on another dataset made of 876 subjects, of whom only 60 (7%) were dyslexic. On this dataset the accuracy goes down, and the authors again report precision, recall and F1-score.
In view of the numerical results, it seems that their study provides more support for the use of machine learning in anatomical brain imaging than for a realistic non-invasive dyslexia screening application. We can also cite “Detecting readers with dyslexia using machine learning with eye tracking measures” (2015), where, for the first time, eye tracking technology combined with an SVM classifier was used to predict dyslexia, starting from a dataset of 97 subjects, 48 of them with diagnosed dyslexia. Eye tracking makes it possible to extract information such as the number of visits (total number of visits to the area of interest in the text), the mean of visit (duration of each individual visit within the area of interest in the text), etc. The resulting accuracy is in the range of 80%, which is quite good with regard to the 50/50 split of the dataset. Nevertheless, this could be far from the performance in the real world, where far fewer than 50% of people are dyslexic. More recently, we can cite DysLexML (2019), which still uses eye tracking combined with an SVM algorithm and reaches a very good 97% accuracy over a set of 69 children, 32 of them dyslexic. DysLexML could ultimately be the basis of a screening tool once the cost of eye trackers allows a larger population to be reached.
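The concern about balanced datasets can be illustrated numerically: with the same classifier behaviour, the precision observed on a 50/50 dataset drops sharply when the positive class is rare. The sketch below is a worked example under assumed values; the sensitivity and specificity of 0.8 are hypothetical and are not results from any of the cited studies.

```python
def precision_at_prevalence(sensitivity, specificity, prevalence):
    """Expected precision of a classifier applied to a population
    in which a fraction `prevalence` of people is positive."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Hypothetical classifier: detects 80% of positives, clears 80% of negatives.
balanced = precision_at_prevalence(0.8, 0.8, prevalence=0.50)
realistic = precision_at_prevalence(0.8, 0.8, prevalence=0.10)
print(f"precision at 50% prevalence: {balanced:.2f}")
print(f"precision at 10% prevalence: {realistic:.2f}")
```

With these assumed values, most positive predictions made at 10% prevalence are false alarms, which is why results obtained on a 50/50 split should be interpreted with care.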
The authors of “Diagnosis of dyslexia with low quality data with genetic fuzzy systems” (2010) start from a different hypothesis. Considering that the available datasets coming from speech pathologists contain uncertain information, they rely on a mix of fuzzy set theory and genetic algorithms to rigorously aggregate the data and deal with uncertainty. Their dataset (available on the KEEL project web site, http://www.keel.es) comes from 65 children. It is not easy to compare with the other approaches, as they use more than 2 classes in their results and the metrics used in fuzzy logic are not the standard ones. Nevertheless, their algorithm is part of a web-based, automated pre-screening application that can be used by parents.
7 Future works and conclusion
In this paper, we show that it is possible to predict dyslexia and dysgraphia from very simple tasks by using machine learning technologies. It has recently been shown that properly trained ML-based predictors can be more accurate than human experts on specific tasks. Based on these facts and on our encouraging results, we think there is a huge potential in using machine learning to help people with learning disorders. As usual with machine learning, accuracy can still be improved by gathering more data. In the same way, machine learning algorithms, and particularly deep learning, could be used to process pictures and audio data with minimal pre-processing. This would avoid the need for manual analysis, and the overall performance may also improve. From another viewpoint, and regarding dysgraphia, it is likely that it can be detected from a drawing alone. If we are able to do that, this will make screening for dysgraphia available for children who have not yet reached writing age. In the future, a better understanding of the correlation between the different disorders could also help in providing accurate predictions. This knowledge could come from the cognitive science community.
- American Psychiatric Association (2013) Diagnostic and statistical manual of mental disorders: DSM-5. 5th edition, Author, Washington, DC. Cited by: §1.
-  (2019) DysLexML: screening tool for dyslexia using machine learning. CoRR abs/1903.06274. Cited by: §6.
-  (2017-01) Dermatologist-level classification of skin cancer with deep neural networks. Nature 542, pp. 115 – 118. Cited by: §2.
-  (2018) Features and machine learning for correlating and classifying between brain areas and dyslexia. CoRR abs/1812.10622. Cited by: §6, §6.
-  (2004-11) Identifying reading disabilities by responsiveness-to-instruction: specifying measures and criteria. Learning Disability Quarterly 27. Cited by: 3rd item.
-  (1971) Project Gutenberg. Note: https://www.gutenberg.org/ Cited by: §3.1.
-  (1997) Long Short-Term Memory. Neural Computation 9 (8), pp. 1735–1780. Cited by: §3.1.
-  (2018) Developmental dysgraphia is often associated with minor neurological dysfunction in children with developmental coordination disorder (DCD). Neurophysiologie Clinique 48 (4), pp. 207–217. Cited by: §1.
-  (2015) Neural networks and deep learning. Determination Press. Cited by: §2.
-  (2010) Diagnosis of dyslexia with low quality data with genetic fuzzy systems. International Journal of Approximate Reasoning 51 (8), pp. 993 – 1009. Note: North American Fuzzy Information Processing Society Annual Conference NAFIPS ’2007 Cited by: §6.
-  (2013-10) Gender differences in reading impairment and in the identification of impaired readers: results from a large-scale study of at-risk readers. Journal of learning disabilities 48. Cited by: 1st item.
-  (2015-05) Detecting readers with dyslexia using machine learning with eye tracking measures. Proceedings of the 12th Web for All Conference W4A ’15, pp. 1–8. Cited by: §6.
-  (2016) Oral language deficits in familial dyslexia: a meta-analysis and review. Psychological Bulletin 142 (5), pp. 498–545. Cited by: 2nd item.
-  (2016-03-29) Machine learning and dyslexia: classification of individual structural neuro-imaging scans of students with and without dyslexia. NeuroImage: Clinical 11, pp. 508–514. Cited by: §6.
-  (2014) Identifying dyslexia in adults: an iterative method using the predictive value of item scores and self-report questions. Annals of Dyslexia 64 (1), pp. 34–56. Cited by: §1.
-  (2016) Dyslexia international: better training, better teaching. Note: https://www.dyslexia-international.org/wp-content/uploads/2016/04/DI-Duke-Report-final-4-29-14.pdf Cited by: §1.
-  (1994-07) Meta-analytic confirmation of the nonword reading deficit in developmental dyslexia. Reading Research Quarterly 29, pp. 266. Cited by: 3rd item.
-  (2011-04) Examining agreement and longitudinal stability among traditional and rti-based definitions of reading disability using the affected-status agreement statistic. Journal of learning disabilities 44, pp. 296–307. Cited by: 3rd item.
-  (2018) Why is it so difficult to diagnose dyslexia and how can we do it better?. Note: https://dyslexiaida.org/why-is-it-so-difficult-to-diagnose-dyslexia-and-how-can-we-do-it-better Cited by: §5.1.