Stroke-Based Cursive Character Recognition

04/01/2013
by   K. C. Santosh, et al.

The human eye can see and read what is written or displayed, either in natural handwriting or in printed format. When a machine does the same work, it is called handwriting recognition. Handwriting recognition can be broken down into two categories: off-line and on-line. ...

1 Introduction

The human eye can see and read what is written or displayed, either in natural handwriting or in printed format. When a machine does the same work, it is called handwriting recognition. Handwriting recognition can be broken down into two categories: off-line and on-line.

Off-line character recognition

– Off-line character recognition takes a raster image from a scanner (scanned images of paper documents), a digital camera or another digital input source. The image is binarised, based for instance on its colour pattern (colour or grey scale), so that every image pixel is either 1 or 0.

On-line character recognition

– In on-line recognition, the current information is presented to the system and recognition (of a character or word) is carried out at the same time. Basically, the system accepts a string of coordinate pairs from an electronic pen touching a pressure-sensitive digital tablet.

In this chapter, we focus on an on-line, writer-independent cursive character recognition engine. In what follows, we explain the advantages of on-line over off-line handwriting recognition, the necessity of a writer-independent system, and the importance as well as the scope of cursive scripts like Devanagari. Devanagari is considered one of the well-known cursive scripts palC04PR; jayadevanKPP11SMC. However, we also aim to include other scripts related to the current study.

1.1 Why On-line?

Handwriting recognition technology has been developing for a few decades plamondon00PAMI; arica01, and its applications remain challenging. For example, OCR is becoming an integral part of document scanners, and is used in many applications such as postal processing, script recognition, banking, security (signature verification, for instance) and language identification. In handwriting recognition, feature selection has been an important issue duetrier96PR. Both structural and statistical features, as well as their combination, have been widely used heutteO98PRL; foggia99IVC. These features tend to vary since characters’ shapes vary widely. As a consequence, local structural properties like intersections of lines, number of holes, concave arcs, end points and junctions change from time to time. These changes are mainly due to

  • deformations, which cover any range of shape variations including geometric transformations such as translation, rotation, scaling and even stretching; and

  • defects, which yield imperfections due to printing, optics, scanning, binarisation as well as poor segmentation.

In the state of the art of handwritten character recognition, several studies have shown that off-line handwriting recognition offers lower classification rates than on-line recognition tappert90; plamondon00PAMI. Furthermore, on-line data offers a significant reduction in memory and therefore in space complexity. Another advantage is that the digital pen or a digital form on a tablet device immediately transforms handwriting into a digital representation that can be reused later without the risk of degradation usually associated with ancient handwriting. For all these reasons, several works boccignone93PR; doermannR95IJCV; viard05PRL; qiaoNY06PAMI focus mainly on recovering temporal information and writing order from static handwriting images. On-line handwriting recognition systems provide interesting results.

On-line character recognition involves the automatic conversion of strokes as they are written on a special digitizer or PDA, where a sensor picks up the pen-tip movements as well as pen-up/pen-down switching. Such data is known as digital ink and can be regarded as a dynamic representation of handwriting. The obtained signal is converted into letter codes which are usable within computer and character-processing applications.

Fig. 1: On-line stroke sequences in the form of 2D coordinates. In this illustration, initial pen-tip position is coloured with red and pen-up (final point) is coloured with blue.
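As a minimal sketch of how digital ink can be organised, the following Python snippet groups raw pen samples into strokes using the pen-down/pen-up switching described above. The sample format `(x, y, pen_down)` and the function name `split_into_strokes` are assumptions made for illustration; real digitizers expose their own formats.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]


def split_into_strokes(events) -> List[Stroke]:
    """Group (x, y, pen_down) samples into strokes.

    A stroke is the run of points recorded between a pen-down
    and the next pen-up event (hypothetical sample format).
    """
    strokes: List[Stroke] = []
    current: Stroke = []
    for x, y, pen_down in events:
        if pen_down:
            current.append((x, y))
        elif current:
            # Pen lifted: close the stroke collected so far.
            strokes.append(current)
            current = []
    if current:
        strokes.append(current)
    return strokes


samples = [
    (0, 0, True), (1, 0, True), (2, 0, True),  # first stroke
    (2, 0, False),                             # pen-up
    (1, 1, True), (1, 2, True),                # second stroke
]
strokes = split_into_strokes(samples)
```

Each resulting stroke is an ordered list of 2D coordinates, so the first and last points correspond to the initial pen-tip position and the pen-up point of Fig. 1.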

The elements of an on-line handwriting recognition interface typically include:

  1. a pen or stylus for the user to write with, and a touch sensitive surface, which may be integrated with, or adjacent to, an output display.

  2. a software application, i.e., a recogniser, which interprets the movements of the stylus across the writing surface and translates the resulting strokes into digital characters.

Globally, it resembles one of the applications of pen computing i.e., computer user-interface using a pen (or stylus) and tablet, rather than devices such as a keyboard, joysticks or a mouse. Pen computing can be extended to the usage of mobile devices such as wireless tablet personal computers, PDAs and GPS receivers.

Historically, pen computing (defined as a computer system employing a user-interface using a pointing device plus handwriting recognition as the primary means for interactive user input) predates the use of a mouse and graphical display by at least two decades, starting with the Stylator dimond57 and RAND tablet groner66 systems of the 1950s and early 1960s.

1.2 Why Writer Independent?

As mentioned before, on-line handwriting recognition systems provide interesting results over almost all types of scripts. Recognition systems vary widely, owing to the nature of the scripts employed, their particular difficulties and the intended applications. The performance of an application-based (commercial) recogniser is determined by its speed in addition to its accuracy.

Among many approaches, template-based ones have a long-standing record hu96hmm; schenkel95; connell99; bahlmann04; kc_pricai06. In many cases, a writer-independent recogniser is built so that new users do not need to train it, which is widely acceptable. In such a context, the recognition system is expected either to adapt automatically to new users once they provide input, or to discriminate the writing of new users with a previously trained recogniser.

1.3 Why Devanagari?

The scope of interest is summarised in a few points.

  1. Pencil and paper can be preferable for anyone during a first draft preparation instead of using keyboard and other computer input interfaces, especially when writing in languages and scripts for which keyboards are cumbersome. Devanagari keyboards for instance, are quite difficult to use. Devanagari characters follow a complex structure and may count up to more than 500 symbols palC04PR; jayadevanKPP11SMC.

  2. Devanagari is a script used to write several Indian languages, including Nepali, Sanskrit, Hindi, Marathi, Pali, Kashmiri, Sindhi, and sometimes Punjabi. According to the 2001 Indian census, 258 million people in India used Devanagari.

  3. Writing in one’s own style brings unevenness in the writing units, which is the most difficult part to recognise. Variations in basic writing units such as the number of strokes, their order, shapes and sizes, tilting angles and similarities among classes of characters are considered important issues. Compared to Roman script, this happens more in cursive scripts like Devanagari.

    Devanagari is written from left to right with a horizontal line on the top, called the shirorekha. Every character requires one shirorekha from which the text is suspended. The way of writing Devanagari has its own particularities. In what follows, we briefly explain a few major difficulties.

    • Many of the characters are similar to each other in structure. Visually very similar symbols, even from the same writer, may represent different characters. While it might seem easy to distinguish the first from the second in the following printed examples, confusion is likely to occur for their handwritten counterparts: (k, P), (y, p), (Y, d), etc. Fig. 2 shows a few examples.

    • The number of strokes, their order, shapes and sizes, directions, skew angle etc. are writing units that are important for symbol recognition and classification. However, these writing units most often vary from one user to another, and there is no guarantee that the same user always writes in the same way. Proposed methods should take this into account.

    Fig. 2: A few samples of several different similar classes from Devanagari script.

Based on those major aforementioned reasons, there exists clear motivation to pursue research on Devanagari handwritten character recognition.

1.4 Structure of the Chapter

The remainder of the chapter is organised as follows. In Section 2, we detail the basic concept of the character recognition framework and highlight the important issues: feature selection, matching and recognition. Section 3 gives a complete outline of how we can efficiently achieve optimal recognition performance over cursive scripts like Devanagari. In that section, we first present the complete process and then validate it step by step, with reasoning and a series of experimental tests over our own, publicly available, dataset. We conclude the chapter in Section 4.

2 Character Recognition Framework

Basically, we can divide a character recognition system into two modules: learning and testing. In the learning (or training) module, shown in Fig. 3, handwritten strokes are learnt, i.e., stored. The testing module follows the former one. The performance of the recognition system depends on how well the handwritten strokes are learnt, which eventually depends on the techniques we employ.

[Figure: Handwritten Symbol (input) → Stroke Pre-processing → Feature Selection → Template Formation & Mgmt. using clustering (cf. Section 3.2), for instance.]
Fig. 3: Learning strokes from the handwritten symbols. In this illustration, we present a basic concept to form templates via clustering of the features of the strokes immediately after they are pre-processed.

Basically, the learning module employs stroke pre-processing, feature selection and clustering to form the templates to be stored. Pre-processing and feature selection techniques can vary from one application to another. For example, noisy stroke elimination or deletion in Roman script cannot be directly extended to cursive scripts like Urdu and Devanagari. In other words, these techniques are application dependent because of the different writing styles. In practice, they are adapted from one another, and mostly ad-hoc techniques are built so that optimal recognition performance is possible. In the framework of stroke-based feature extraction and recognition, one can refer to ChiuT99PR; Zhou:2007:ICDAR, for example. It is important to notice that feature selection usually drives the way we match features. As an example, fixed-size feature vectors can be matched straightforwardly, while for non-linear feature vector sequences, dynamic programming (elastic matching) has generally been used sakoe78ASSP; myers81; kruskall83; keogh99. The concept dates back to bellman59a. Once we have a way to measure the similarity between the strokes’ features, we cluster the strokes based on their similarity values. The clustering technique generates templates as representatives of the similar strokes provided. These stored templates are used in the testing module, illustrated in Fig. 4. More specifically, in this module, every test stroke is matched with the templates (learnt in the training module) so that we can find the most similar one. This procedure is repeated for all available test strokes. At the end, aggregating all matching scores indicates which template character the test character is closest to.

[Figure: Handwritten Symbol (input) → Stroke Pre-processing → Feature Selection → Feature Matching (via similarity measure) against the Template from the training module → character’s label (output).]
Fig. 4: An illustration of the testing module. As in the learning module, test characters are pre-processed and their features are selected; the features are then matched with the templates formed in the training module.

2.1 Preprocessing

Strokes directly collected from users are often incomplete and noisy. Different systems use a variety of pre-processing techniques before feature extraction blumenstein03; verma04nn; yasser10CR. The techniques used in one system may not fit another exactly because of different writing styles and the nature of the scripts. Very common steps are the deletion of repeated coordinates bahlmann04, noise elimination and normalisation guerfali93; chun05p.
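Two of the pre-processing steps named above can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, only consecutive duplicate points are removed, and the normalisation (translation to the origin and scaling of the longer bounding-box side to 1) is one common choice among many, not necessarily the one used in the cited works.

```python
def remove_repeated_points(stroke):
    """Drop consecutive duplicate coordinates from a stroke."""
    cleaned = []
    for point in stroke:
        if not cleaned or point != cleaned[-1]:
            cleaned.append(point)
    return cleaned


def normalise(stroke):
    """Translate the stroke to the origin and scale its longer side to 1."""
    xs = [x for x, _ in stroke]
    ys = [y for _, y in stroke]
    min_x, min_y = min(xs), min(ys)
    # Fall back to 1.0 for a degenerate stroke made of a single point.
    scale = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    return [((x - min_x) / scale, (y - min_y) / scale) for x, y in stroke]


raw = [(10, 10), (10, 10), (20, 10), (30, 20), (30, 20)]
clean = normalise(remove_repeated_points(raw))
```

Scaling both axes by the same factor keeps the aspect ratio of the stroke, which matters when the shape itself is the discriminating information.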

Besides pre-processing, in this chapter, we mainly focus on feature selection and matching techniques.

2.2 Feature Selection

If you have the complete address of a friend, you can easily find him/her without additional help from other people on the way. The situation is similar in character recognition, where the address plays the role of the selected features. Therefore, selecting complete or sufficient features from the provided input is the crucial point. In other words, appropriate feature selection can greatly decrease the workload and simplify the subsequent design of the classifier.

In what follows, we discuss a few but major issues associated with feature selection.

  • Pen-flow, i.e., speed while writing, determines how well the coordinates along the pen trajectory are captured. Fast writing and writing with shivering hands do not provide complete shape information of the strokes.

  • Ratios of the relative height, width and size of letters are not always consistent - which is obvious in natural handwriting.

  • Pen-down and pen-up events provide stroke segmentation. But, we do not know which and where the strokes are rewritten or overwritten.

  • Slant writing style or writing with some angles to the left or right makes feature selection difficult. For example, in those cases, zoning information using orthogonal projection does not carry consistent information. This means that the zoning features will vary widely as soon as we have different writing styles.

We repeat, features should contain sufficient information to distinguish between classes, be insensitive to irrelevant variability of the input, allow efficient computation of discriminant functions and be able to limit the amount of training data required lippmann89. However, they vary from one script to another blumenstein03; namboodiri04; verma04nn; okumura05.

Fig. 5: An illustration of feature selection: pen-tip position and tangent at every pen-tip position along the pen trajectory.

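Feature selection is always application dependent, i.e., it relies on the type of scripts used (their characteristics and difficulties). In our case, the feature vector sequence of any stroke is expressed as in okumura05; kc_pricai06; kc_IJIG12:

$$F = \{(p_i, \theta_i)\}_{i=1,\dots,n-1}, \quad p_i = (x_i, y_i), \qquad (1)$$

where $\theta_i = \arctan\left(\frac{y_{i+1} - y_i}{x_{i+1} - x_i}\right)$ is the tangent angle at the pen-tip position $p_i$ of a stroke sampled at $n$ points. Fig. 5 shows a complete illustration.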

Our feature includes a sequence of both pen-tip positions and tangent angles sampled from the trajectory of the pen-tip, preserving the directional property of the trajectory path. It is important to note that the stroke direction (either left to right or right to left) leads to very different features for strokes that are geometrically similar. To handle this efficiently, we need both kinds of strokes or samples for training and testing. This does not mean that they must come from the same writer.
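The following sketch illustrates a feature sequence of pen-tip positions and tangent angles, and shows why the writing direction matters. It is an illustration under assumptions: the function name is hypothetical, the angle is computed with `math.atan2` (so that the full direction is kept), and the last point of the stroke is dropped because it has no successor.

```python
import math


def stroke_features(stroke):
    """Return [(x, y, theta)] for every point that has a successor.

    theta is the tangent angle (in radians) of the segment leading
    from the current pen-tip position to the next one.
    """
    features = []
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        features.append((x0, y0, math.atan2(y1 - y0, x1 - x0)))
    return features


left_to_right = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
right_to_left = list(reversed(left_to_right))

f_lr = stroke_features(left_to_right)   # angles of 0 radians
f_rl = stroke_features(right_to_left)   # angles of pi radians
```

Both strokes trace the same horizontal line, yet their angle features differ by $\pi$, which is why samples of both writing directions are needed for training.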

The idea is somewhat similar to directional arrows composed of eight types, where every line segment is assigned one of eight direction codes.

However, these directional arrows provide only the directional feature of the strokes or line segments. Therefore, more information can be integrated if the relative length of the standard strokes is taken into account ChaSS99ICDAR.
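A possible reading of eight-type directional coding combined with relative segment length is sketched below. The numbering convention (code 0 pointing east, codes increasing counter-clockwise up to 7) and both function names are assumptions made for this example; the cited work may use a different convention.

```python
import math


def direction_code(p, q):
    """Quantise the segment p -> q into one of eight codes (0..7).

    Assumed convention: 0 = east, codes increase counter-clockwise
    in steps of 45 degrees.
    """
    angle = math.atan2(q[1] - p[1], q[0] - p[0]) % (2 * math.pi)
    return int(round(angle / (math.pi / 4))) % 8


def coded_segments(stroke):
    """Return [(direction code, relative length)] for each segment."""
    pairs = list(zip(stroke, stroke[1:]))
    lengths = [math.dist(p, q) for p, q in pairs]
    total = sum(lengths) or 1.0
    return [(direction_code(p, q), length / total)
            for (p, q), length in zip(pairs, lengths)]


coded = coded_segments([(0, 0), (3, 0), (3, 1)])
```

The direction code alone would describe this stroke as "east, then north"; the relative lengths add that the horizontal part is three times longer than the vertical one.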

2.3 Feature Matching

Rather than discussing classifiers, we explain how features can be matched to obtain similarity or dissimilarity values between them.

Matching techniques are often induced by how features are taken or strokes are represented. For instance, normalising the feature vector sequence into a fixed size vector provides an immediate matching. On the other hand, features having different lengths or non-linear features need dynamic programming for approximate matching, for instance. Considering the latter situation, we explain how dynamic programming is employed.

Dynamic time warping (DTW) allows us to find the dissimilarity between two non-linear sequences potentially having different lengths sakoe78ASSP; myers81; kruskall83; keogh99. It is an algorithm particularly suited to matching sequences with missing information, provided there are long enough segments for matching to occur.

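Let us consider two feature sequences

$$X = \{x_i\}_{i=1,\dots,M} \quad \text{and} \quad Y = \{y_j\}_{j=1,\dots,N}$$

of size $M$ and $N$, respectively. The aim of the algorithm is to provide the optimal alignment between both sequences. At first, a matrix of size $M \times N$ is constructed. Then, for each element $(i, j)$ of the matrix, a local distance $\delta(i, j)$ between the events $x_i$ and $y_j$ is computed, e.g., $\delta(i, j) = (x_i - y_j)^2$. Let $D(i, j)$ be the global distance up to $(i, j)$,

$$D(i, j) = \delta(i, j) + \min\{D(i-1, j-1),\; D(i-1, j),\; D(i, j-1)\},$$

with an initial condition $D(1, 1) = \delta(1, 1)$ such that it allows the warping path to go diagonally from the starting node $(1, 1)$ to the end node $(M, N)$. The main aim is to find the path with which the least cost is associated. The warping path therefore provides the difference cost between the compared signatures. Formally, the warping path is,

$$W = \{w_t\}_{t=1,\dots,T},$$

where $\max(M, N) \le T < M + N$ and the $t$-th element of $W$ is $w_t = (i_t, j_t) \in [1:M] \times [1:N]$ for $t \in [1:T]$. The optimised warping path satisfies the following three conditions.

  • c1, boundary condition: $w_1 = (1, 1)$ and $w_T = (M, N)$.

  • c2, monotonicity condition: $i_1 \le i_2 \le \dots \le i_T$ and $j_1 \le j_2 \le \dots \le j_T$.

  • c3, continuity condition: $w_{t+1} - w_t \in \{(1, 1), (1, 0), (0, 1)\}$ for $t \in [1:T-1]$.

c1 conveys that the path goes from $(1, 1)$ to $(M, N)$, aligning all elements to each other. c2 forces the path to advance monotonically, never going back. c3 restricts the allowable steps in the warping path to adjacent cells, one step at a time. Note that c3 implies c2.

We then define the global distance between $X$ and $Y$ as,

$$\Delta(X, Y) = \frac{D(M, N)}{T}.$$

The last element of the $M \times N$ matrix gives the DTW-distance between $X$ and $Y$, which is normalised by $T$, i.e., the number of discrete warping steps along the diagonal DTW-matrix. The overall process is illustrated in Fig. 6.

Fig. 6: Classical DTW algorithm: an alignment illustration between two non-linear sequences $X$ and $Y$. In this illustration, the diagonal DTW-matrix is shown, including how back-tracking has been employed.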
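The classical DTW procedure described above (cost matrix, three allowed steps, back-tracking, normalisation by the path length) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes non-empty sequences of equal-dimension feature vectors, a squared Euclidean local distance, and 0-based indices in the returned path.

```python
import math


def local_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def dtw(x, y):
    """Return (normalised DTW distance, warping path) for sequences x, y."""
    m, n = len(x), len(y)
    # D is padded with one extra row and column of infinities so that
    # the recurrence needs no special cases at the borders.
    D = [[math.inf] * (n + 1) for _ in range(m + 1)]
    D[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            D[i][j] = local_distance(x[i - 1], y[j - 1]) + min(
                D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    # Back-track from the last cell to recover the warping path.
    path, i, j = [], m, n
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((D[i - 1][j - 1], i - 1, j - 1),
                      (D[i - 1][j], i - 1, j),
                      (D[i][j - 1], i, j - 1))
    path.reverse()
    return D[m][n] / len(path), path


x = [(0.0,), (1.0,), (2.0,)]
y = [(0.0,), (1.0,), (1.0,), (2.0,)]
distance, path = dtw(x, y)
```

Here `y` repeats one event of `x`; the warping path absorbs the repetition by aligning the second element of `x` with two elements of `y`, so the sequences match at zero cost despite their different lengths.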

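So far, we have provided a global view of using the DTW distance for the alignment of non-linear sequences. In order to provide faster matching, we have used the local constraint on time warping proposed in keogh02. For two sequences $X = \{x_i\}$ and $Y = \{y_i\}$ (assumed, as in keogh02, to have the same length $M$), every element $(i, j)$ of the warping path is constrained such that $j - r \le i \le j + r$, where $r$ is a term defining a reach, i.e., the allowed range of warping for a given event in a sequence. With $r$, upper and lower bounding measures can be expressed as,

$$U_i = \max(x_{i-r}, \dots, x_{i+r}), \qquad L_i = \min(x_{i-r}, \dots, x_{i+r}).$$

Therefore, for all $i$, an obvious property of $U$ and $L$ is $U_i \ge x_i \ge L_i$. With this, we can define a lower bounding measure for DTW:

$$LB\_Keogh(X, Y) = \sqrt{\sum_{i=1}^{M} \begin{cases} (y_i - U_i)^2 & \text{if } y_i > U_i \\ (y_i - L_i)^2 & \text{if } y_i < L_i \\ 0 & \text{otherwise} \end{cases}}$$

Since this is only a quick introduction to the local constraint and the lower bounding measure, we refer to keogh02 for more details.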
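To make the lower bounding idea concrete, the sketch below computes an envelope with reach `r`, the resulting bound, and a band-constrained DTW cost for comparison. It is a simplified illustration under assumptions: sequences are one-dimensional and of equal length, the DTW cost is the square root of the summed squared differences (unnormalised), and all function names are hypothetical.

```python
import math


def envelope(q, r):
    """Upper and lower envelopes of q within a reach of r samples."""
    upper, lower = [], []
    for i in range(len(q)):
        window = q[max(0, i - r): i + r + 1]
        upper.append(max(window))
        lower.append(min(window))
    return upper, lower


def lb_keogh(q, c, r):
    """Lower bound on the band-constrained DTW cost between q and c."""
    upper, lower = envelope(q, r)
    total = 0.0
    for value, u, l in zip(c, upper, lower):
        if value > u:
            total += (value - u) ** 2
        elif value < l:
            total += (value - l) ** 2
    return math.sqrt(total)


def banded_dtw(q, c, r):
    """DTW cost restricted to cells with |i - j| <= r."""
    n = len(q)
    D = [[math.inf] * (n + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - r), min(n, i + r) + 1):
            D[i][j] = (q[i - 1] - c[j - 1]) ** 2 + min(
                D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    return math.sqrt(D[n][n])


q = [0, 1, 2, 1, 0]
c = [0, 0, 3, 3, 0]
bound = lb_keogh(q, c, r=1)
exact = banded_dtw(q, c, r=1)
```

Because the bound is much cheaper to compute than the full matrix, a template whose bound already exceeds the best distance found so far can be skipped without running DTW on it.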

2.4 Recognition

From a purely combinatorial point of view, measuring the similarity or dissimilarity between two symbols

$$S^1 = \{s^1_i\}_{i=1,\dots,m} \quad \text{and} \quad S^2 = \{s^2_j\}_{j=1,\dots,n},$$

composed, respectively, of $m$ and $n$ strokes, requires a one-by-one matching score computation of all strokes $s^1_i$ with all $s^2_j$. This means that we align the individual test strokes of an unknown symbol with the learnt strokes. As soon as we determine the test strokes associated with a known class, the complete symbol can be compared by the fusion of matching information from all test strokes. Such a concept is fundamental under the purview of stroke-based character recognition.
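The one-by-one stroke matching and the fusion of matching information can be sketched as follows. The fusion rule used here (averaging, per class, the best distance of every test stroke) is an assumption for illustration, as are the class labels and the toy distance; in practice a DTW distance would be passed as `distance`.

```python
def classify(test_strokes, templates, distance):
    """Match every test stroke with the learnt strokes of each class.

    templates maps a class label to its list of learnt strokes.
    Returns the best label and the fused score of every class.
    """
    scores = {}
    for label, learnt in templates.items():
        # For every test stroke, keep its smallest distance to this class.
        best_per_stroke = [min(distance(t, s) for s in learnt)
                           for t in test_strokes]
        # Fuse the stroke-wise scores (assumed rule: plain average).
        scores[label] = sum(best_per_stroke) / len(best_per_stroke)
    return min(scores, key=scores.get), scores


def mean_abs_diff(a, b):
    """Toy distance for equal-length 1-D strokes (stand-in for DTW)."""
    return sum(abs(u - v) for u, v in zip(a, b)) / len(a)


# Hypothetical class labels and one-dimensional strokes.
templates = {
    "ka": [[0, 1, 2], [5, 5, 5]],
    "pa": [[2, 1, 0], [9, 9, 9]],
}
test_symbol = [[0, 1, 2.5], [5, 5, 6]]
label, scores = classify(test_symbol, templates, mean_abs_diff)
```

Note that this plain one-to-many matching ignores where each stroke lies within the symbol, which is exactly the limitation addressed by the spatial description discussed later in the chapter.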

Overall, this concept may not always be sufficient, and such approaches generally need a final, global coherence check to avoid matching strokes that show visual similarity but do not respect the overall geometric coherence within the complete handwritten character. In other words, the matching strategy between test strokes and templates should be more intelligent than a straightforward one-to-many matching. However, this in fact depends on how template management is done, which is one of the primary concerns of this chapter. We highlight the use of the relative positioning of the strokes within the handwritten symbol and its direct impact on performance kc_IJIG12.

3 Recognition Engine

To keep the chapter coherent as well as consistent (with respect to Devanagari character recognition), we refer to a recognition engine that is entirely based on previous studies kc_CIS06; kc_pricai06; kc_tujournal07; kc_icfhr10; kc_IJIG12. Especially because of the structure of Devanagari, it is necessary to pay attention to the appropriate structuring of the strokes to ease and speed up the comparison between symbols, rather than just relying on global recognition techniques based on a collection of strokes kc_pricai06. Therefore, kc_icfhr10; kc_IJIG12 develop a method for analysing handwritten characters based on both the number of strokes and their spatial information. It consists of four main steps.

step 1. Organise the symbols representing the same character into different groups based on the number of strokes.

step 2. Find the spatial relation between strokes.

step 3. Agglomerate similar strokes from a specific location in a group.

step 4. Stroke-wise matching for recognition.

For a clearer understanding, we explain the aforementioned steps as follows. For a specific class of character, it is interesting to notice that symbols written with an equal number of strokes generally have a visually similar structure and are easier to compare.

In every group within a particular class of character, a representative symbol is synthetically generated by merging pairwise similar strokes that are positioned identically with respect to the shirorekha. The merging uses the DTW algorithm. The learnt strokes are then stored accordingly. This step mainly concerns stroke clustering and the management of the learnt strokes.

We align the individual test strokes of an unknown symbol with the learnt strokes having both the same number of strokes and the same spatial properties. Overall, symbols can be compared by the fusion of matching information from all test strokes. This eventually builds a complete recognition process.

3.1 Stroke Spatial Description and its Need

The importance of the location of the strokes is best observed by taking a few pairs of characters that often lead to confusion:

(B m), (D G), (T y) etc.

The first character in every pair has two visually distinguishing features: the particular location of its shirorekha (more to the right) and a small curve in the text. There is no doubt that either of the two features is sufficient to distinguish both characters automatically. However, since small curves are usually not a robust feature in natural handwriting, finding the location of the shirorekha alone can avoid possible confusion. Our stroke-based spatial relation technique is explained in the following.

To handle relative positioning of strokes, we use six spatial predicates i.e., relational regions:

For easier understanding, iconic representation of the aforementioned relational matrix can be expressed as,

where black-dot represents the presence i.e., stroke is found to be in the provided bottom-right region.

To confirm the location of the stroke, we use the projection theory: minimum boundary rectangle (MBR) papadias94t model combined with the stroke’s centroid.

Based on egenhofer91, we start with checking fundamental topological relations such as disconnected (DC), externally connected (EC) and overlap/intersect (O/I) by considering two strokes and :

as follows,

We then use the border condition from the geometry of the MBR. This is straightforward for disconnected strokes, but not for externally connected and overlap/intersect configurations. In the latter cases, we check the level of the centroid with respect to the boundary of the MBR. For example, if a boundary of the shirorekha is above the centroid level of the text stroke, then it is confirmed that the shirorekha is on the top. This procedure is applied to all six previously mentioned spatial predicates. Note that angle-based models like bi-centre miyajima94a and angle histogram wang99a are not an appropriate choice due to the cursive nature of the writing.

[Figure panels: (a) Two-stroke k; (b) MBR + Centroid model; (c) Model realisation.]
Fig. 7: Pairwise spatial relation for a two-stroke k.

On the whole, assuming that the shirorekha is on the top, the locations of the text strokes are estimated. Once the locations of the text strokes are determined, this eventually allows us to cross-validate the location of the shirorekha along with its size. Fig. 7 shows a real example demonstrating the relative positioning between the strokes of a two-stroke symbol k. Besides, symbols with two shirorekhas can also be treated. In such a situation, the first shirorekha according to the order of strokes is taken as reference.
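The MBR plus centroid check for the "top" relation can be sketched as below. This is an illustrative reading rather than the authors' procedure: it assumes that the y axis grows upwards, that the lower border of the candidate top stroke is the boundary compared against the centroid, and all function names are hypothetical. Only one of the six spatial predicates is shown.

```python
def mbr(stroke):
    """Minimum boundary rectangle as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in stroke]
    ys = [y for _, y in stroke]
    return min(xs), min(ys), max(xs), max(ys)


def centroid(stroke):
    """Mean position of the sampled pen-tip points."""
    n = len(stroke)
    return (sum(x for x, _ in stroke) / n, sum(y for _, y in stroke) / n)


def is_on_top(a, b):
    """True if stroke a lies above stroke b (y axis assumed to grow upwards)."""
    _, a_bottom, _, a_top = mbr(a)
    _, b_bottom, _, b_top = mbr(b)
    if a_bottom > b_top:
        return True   # disconnected: the border condition is enough
    if a_top < b_bottom:
        return False  # disconnected, with a entirely below b
    # Externally connected or overlapping rectangles: compare the
    # lower border of a with the centroid level of b.
    return a_bottom > centroid(b)[1]


shirorekha = [(0, 10), (5, 10), (10, 10)]
text = [(3, 10), (3, 5), (6, 2)]
top = is_on_top(shirorekha, text)
```

In this example the text stroke touches the shirorekha, so the two rectangles are not disconnected and the decision falls back to the centroid test.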

3.2 Spatial Similarity based Clustering

Basically, clustering is a technique for collecting items which are similar in some way. Items of one group are dissimilar to items belonging to other groups. Consequently, clustering makes the recognition system compact. To this end, we present spatial similarity based stroke clustering.

[Figure panels: (a) Two-stroke a; (b) Three-stroke a.]
Fig. 8: Relative positions of strokes for a class a in two different groups, i.e., two-stroke and three-stroke symbols.

As mentioned in previous work kc_icfhr10; kc_IJIG12, the clustering scheme is a two-step process.

  • The first step is to organise the symbols representing the same character into different groups, based on the number of strokes used to complete the symbol. Fig. 8 shows an example for a class of character a.

  • In the second step, strokes from a specific location are agglomerated hierarchically within the particular group. Once the relative position of every stroke is determined as shown in Fig. 8, single-linkage agglomerative hierarchical clustering is used (cf. Fig. 10). This means that only strokes which are at a specific location are taken for clustering, as illustrated in Fig. 9. This applies to all groups within a class.

Fig. 9: Clustering technique for each class. Stroke clustering is based on the relative positioning. As a consequence, we have three clustering blocks for text strokes and remaining three for shirorekha.
Fig. 10: Hierarchical stroke clustering concept. At every step, features are merged according to their similarity up to the provided threshold level.

In agglomerative hierarchical clustering (cf. Fig. 10), we merge two similar strokes and find a new cluster. The distance computation between two strokes follows Section 2.3. The new cluster is computed by averaging both strokes via the use of the discrete warping path along the diagonal DTW-matrix. This process is repeated until it reaches the cluster threshold. The threshold value yields the number of cluster representatives i.e., learnt templates.
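The merge-and-average loop described above can be sketched as follows. This is a simplification under stated assumptions: strokes are one-dimensional sequences, every cluster is represented by its merged template (so distances are computed between representatives rather than by a strict single-linkage rule), and the function names and threshold value are hypothetical.

```python
import math


def dtw_path(x, y):
    """Return (normalised DTW distance, warping path) for 1-D sequences."""
    m, n = len(x), len(y)
    D = [[math.inf] * (n + 1) for _ in range(m + 1)]
    D[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            D[i][j] = (x[i - 1] - y[j - 1]) ** 2 + min(
                D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    path, i, j = [], m, n
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((D[i - 1][j - 1], i - 1, j - 1),
                      (D[i - 1][j], i - 1, j),
                      (D[i][j - 1], i, j - 1))
    path.reverse()
    return D[m][n] / len(path), path


def merge(x, y):
    """Average two strokes along their discrete warping path."""
    _, path = dtw_path(x, y)
    return [(x[i] + y[j]) / 2.0 for i, j in path]


def cluster(strokes, threshold):
    """Merge the closest pair of templates until the threshold is exceeded."""
    templates = [list(s) for s in strokes]
    while len(templates) > 1:
        best = None
        for a in range(len(templates)):
            for b in range(a + 1, len(templates)):
                d, _ = dtw_path(templates[a], templates[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > threshold:
            break
        merged = merge(templates[a], templates[b])
        templates = [t for k, t in enumerate(templates) if k not in (a, b)]
        templates.append(merged)
    return templates


strokes = [[0, 1, 2], [0, 1, 1, 2], [9, 8, 7]]
templates = cluster(strokes, threshold=0.5)
```

A lower threshold keeps more templates (finer clusters), while a higher one merges more strokes into fewer representatives, which is the trade-off between compactness and discrimination mentioned in the text.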

3.3 Stroke Number and Order Free Recognition

In natural handwriting, the number of strokes as well as their order vary widely. This happens from one writing to another, even for the same user, and of course between different users. Fig. 11 shows the large variation in stroke numbers as well as orders.

Once we have organised the symbols (from a particular class) into groups based on the number of strokes used, stroke clustering is done according to the relative positioning. As a consequence, at recognition time, one can write a symbol with any number and order of strokes: stroke matching is based on the relative positioning of the strokes within the corresponding group and does not depend on the stroke order.

[Figure panels: (a) two-stroke k; (b) two-stroke k; (c) three-stroke k; (d) three-stroke k; (e) four-stroke k; (f) three-stroke k.]
Fig. 11: Different numbers and orders of strokes for a class k. In this illustration, the red dot marks the initial pen-tip position, which makes it easy to see how many strokes are used to complete a symbol. In addition, the stroke ordering differs from one sample to another.

3.4 Dataset

In this work, as before, a publicly available dataset has been employed (cf. Table 1). A Graphire tablet (WACOM Co. Ltd.), model ET0405A-U, was used to capture the pen-tip position in the form of coordinates at a sampling rate of 20 Hz. The dataset is composed of 1800 symbols representing 36 characters, coming from 25 native speakers. Each writer was given the opportunity to write each character twice. No other directions, constraints, or instructions were given to the users.

Item                   Description
Classes of character   36
Users                  25
Dataset size           1800
Visibility             IAPR tc–11 (http://www.iapr-tc11.org)
Table 1: Dataset formation and its availability.

3.5 Recognition Performance Evaluation

During the experiments, every test sample is matched with the training candidates and the closest one is reported. The label of the closest candidate gives the recognised class, which we call ‘character recognition’. Formally, the recognition rate is defined as the ratio of the number of correctly recognised candidates to the total number of test candidates.

To evaluate the recognition performance, two different protocols can be employed:

  1. dichotomous classification and

  2. $K$-fold cross-validation (CV).

In the case of dichotomous classification, 15 writers are used for training and the remaining 10 for testing. On the other hand, $K$-fold CV has been implemented. Since we have 25 users for data collection, we employ $K = 5$ in order to make the recognition engine writer independent.

In $K$-fold CV, the original sample for every class is randomly partitioned into $K$ sub-samples. Of the $K$ sub-samples, a single sub-sample is used for validation, and the remaining $K - 1$ sub-samples are used for training. This process is then repeated for $K$ folds, with each of the $K$ sub-samples used exactly once for validation. Finally, a single value results from averaging all folds. The aim of such a series of rigorous tests is to avoid the sample bias that is possible in conventional dichotomous classification. In contrast to the previous study kc_IJIG12, this provides an interesting evaluation protocol.
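The $K$-fold protocol can be sketched as below. Partitioning by writer (so that no writer appears in both training and validation) is an assumption of this sketch, chosen because it matches the writer-independence goal; the function names and the `evaluate` callback are hypothetical placeholders for the actual training and testing routine.

```python
import random


def k_fold_splits(writers, k, seed=0):
    """Yield (train, test) writer lists; each writer is tested exactly once."""
    shuffled = list(writers)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [w for j, fold in enumerate(folds) if j != i for w in fold]
        yield train, test


def cross_validate(writers, k, evaluate, seed=0):
    """Average the rate returned by evaluate(train, test) over the k folds."""
    rates = [evaluate(train, test)
             for train, test in k_fold_splits(writers, k, seed)]
    return sum(rates) / len(rates)


# 25 writers and K = 5 give folds of 20 training and 5 test writers.
splits = list(k_fold_splits(range(25), 5))
```

With 25 writers and $K = 5$, every fold trains on 20 writers and validates on the remaining 5, and the final figure is the mean over the five folds.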

3.6 Results and Discussions

Following the evaluation protocols mentioned before, Table 2 provides the average recognition error rates. In the tests, the average error rate decreases from 5.0% to 3.5%.

Based on the results (cf. Table 2), we investigate the recognition performance through the observed errors. We categorise the origin of the errors that occurred in our experiments. As said in Section 1.3, these are mainly due to

  1. structure similarity,

  2. reduced and/or very long ascender and/or descender stroke, and

  3. others such as re-writing strokes and mis-writing.

Compared to previous work kc_IJIG12, the number of rejections does not change, while confusions due to structure similarity have been reduced. This is mainly because of the 5-fold CV evaluation protocol. Besides, the running time has been halved, from 4 to 2 seconds per character (cf. Table 2), thanks to the LB_Keogh tool keogh02.

Method   # of Mis-recognition   # of Rejection   Avg. Error (%)   Time (sec.)
M1.      33                     08               05.0             04
M2.      24                     08               03.5             02
Index:
M1. kc_IJIG12.
M2. kc_IJIG12 + keogh02 and 5-fold CV.
Table 2: Error rates (in %) and running time (in sec. per character). The methods can be differentiated by the additional use of the LB_Keogh tool keogh02 and the evaluation protocol employed.

4 Conclusions

In this chapter, an established as well as validated approach (based on previous studies kc_CIS06; kc_pricai06; kc_tujournal07; kc_icfhr10; kc_IJIG12) has been presented for on-line natural handwritten Devanagari character recognition. It uses the number of strokes used to complete a symbol and their spatial relations (footnote 1: a comprehensive work based on the relative positioning of the handwritten strokes is presented in kc_IJIG12; once again, to avoid contradictions, this chapter aims to provide a coherent as well as consistent study on Devanagari character recognition). Besides, we have made the dataset publicly available for research purposes. On this dataset, the average error rate is 3.5% (cf. Table 2), in less than 2 seconds per character on average. Note that, in this chapter, the new evaluation protocol reduces the errors (mainly due to multi-class similarity) and the optimised DTW reduces the processing delay, which is new in comparison to the previous studies.

The proposed approach is able to handle handwritten symbols with any number and order of strokes. Moreover, the stroke-matching technique is interesting and completely controllable, primarily due to our symbol categorisation and the use of stroke spatial information in template management. To handle spatial relations more efficiently (rather than relying only on orthogonal projection, i.e., MBR), a more elaborate spatial relation model can be used, kc_11PRL for instance. In addition, machine learning techniques like inductive logic programming (ILP) kc_LR09ICDAR; Amin00IJIS can be used to exploit the complete structural properties in terms of first order logic (FOL) descriptions.

Acknowledgements

Since the chapter is based on previous studies, thanks go to the researchers Cholwich Nattee, School of ICT, SIIT, Thammasat University, Thailand and Bart Lamiroy, Université de Lorraine – Loria Campus Scientifique, France for their efforts. Besides, the dataset is partially based on the master thesis TC-MS-2006-01, conducted in the Knowledge Information & Data Management Laboratory, School of ICT, SIIT, Thammasat University under the Asian Development Bank – Japan Scholarship Program (ADB-JSP).

References