Offline Writer Identification based on the Path Signature Feature

05/03/2019 ∙ by Songxuan Lai, et al. ∙ South China University of Technology International Student Union 0

In this paper, we propose a novel set of features for offline writer identification based on the path signature approach, which provides a principled way to express information contained in a path. By extracting local pathlets from handwriting contours, the path signature can also characterize the offline handwriting style. A codebook method based on the log path signature---a more compact way to express the path signature---is used in this work and shows competitive results on several benchmark offline writer identification datasets, namely the IAM, Firemaker, CVL and ICDAR2013 writer identification contest dataset.

I Introduction

Offline writer identification aims to determine the writer of a text among a number of known writers based on their handwriting images. With practical applications in forensic analysis, offline writer identification is an important research field in pattern recognition and has made steady progress in the past decades [1][2][3]. Most existing approaches are text-independent and can be roughly divided into three categories: texture-based, structure-based and CNN-based approaches.

Texture-based approaches regard handwritten text as a special texture image and extract texture features for writer identification. Bertolini et al. [4] used local binary patterns (LBP) and local phase quantization (LPQ) in the dissimilarity framework. He et al. [5] designed the LBPruns method by carrying out binary tests on parallel scanning lines and counting the run lengths of the resulting LBP patterns. By considering the joint feature distribution (JFD) principle [3], more powerful texture features, such as CoLBP, can be designed. Filter-based approaches, such as Gabor and XGabor filters [6], derivative-of-Gaussian filters [7], and wavelets [8], are also commonly used in writer identification.

Structure-based approaches exploit the structural information of local handwriting traces, and can be divided into two main categories: contour-based and grapheme-based methods. The contour-based methods extract features from handwriting contours and capture the local geometric properties. Bulacu et al. [2] proposed several contour-based directional features, of which the Hinge feature is considered to provide a strong baseline for subsequent studies. Siddiqi et al. [9] used global and local chain code-based features to capture orientation and curvature. Brink et al. [10] proposed the Quill feature, which is a joint probability distribution of the ink direction and the ink width. He et al. [11] generalized the Hinge feature and introduced the rotation-invariant Hinge feature. A series of new contour-based features was introduced in [3] based on the JFD principle. In the grapheme-based methods, handwriting graphemes [2] (obtained via segmentation) or patches [9] are used to generate a codebook that can capture the structural details of the allographs emitted by the writers. SIFT [12][13] and RootSIFT [14] descriptors, which can be viewed as scale-invariant graphemes in the Gaussian scale space, are also very effective in offline writer identification.

CNN-based approaches train convolutional neural networks to extract discriminant features from handwriting patches [15], words [16], or pages [17]. Although CNNs achieve very promising results, they require heavy computation, which limits their appeal in some scenarios.

In this paper, we propose a novel set of contour-based features based on the path signature (PS) approach. The PS was initially introduced in the rough paths theory as a branch of stochastic analysis and has recently been successfully applied to pattern recognition as a principled method for time series description, such as online handwriting recognition [18][19], online signature verification [20][21], and skeleton-based action recognition [22][23]. Although offline handwritten text is not a time series, by extracting the handwriting contours and segmenting them into local fragments, we can describe the analytic and geometric properties of such fragments with the PS. We denote such fragments as pathlets. Compared with previous contour-based approaches, the PS method encodes rich information, such as orientation and curvature, in a principled way, and outperforms previous approaches on several benchmark datasets.

The rest of the paper is organized as follows. Section II introduces the PS theory in detail and shows how it can be used to characterize and identify offline handwritings. Section III reports the experimental results and analysis. Finally, Section IV concludes the paper.

II Methodology

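Fig. 1: Geometric interpretations of $S(P)^{1,2}_{0,T}$, $S(P)^{2,1}_{0,T}$ and the Lévy area. $S(P)^{1,2}_{0,T}$ and $S(P)^{2,1}_{0,T}$ are elements from the level-2 PS, while the Lévy area is an element from the level-2 LPS.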

To make this paper more self-contained, in this section we first concisely introduce the PS and log path signature (LPS), and then show how we construct features based on the LPS for offline writer identification.

II-A Path Signature and Log Path Signature

II-A1 Path Signature

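The information contained in a path can be expressed in the form of iterated integrals [24]. Assume a path $P:[0,T]\to\mathbb{R}^d$, where $[0,T]$ is a time interval. The coordinate paths are denoted by $(P^1,\dots,P^d)$, where each $P^i:[0,T]\to\mathbb{R}$ is a real-valued path. The k-fold iterated integral of path $P$ along the indexes $i_1,\dots,i_k$ is defined as

$$S(P)^{i_1,\dots,i_k}_{0,T}=\int_{0<t_1<\cdots<t_k<T} dP^{i_1}_{t_1}\cdots dP^{i_k}_{t_k}, \tag{1}$$

where $i_1,\dots,i_k\in\{1,\dots,d\}$, and $P^{i}_{t}$ denotes the value of path $P^{i}$ at time $t$ ($t\in[0,T]$). For example, for a 2D path such as the handwriting trajectory, the 1-fold integrals have two terms:

$$S(P)^{1}_{0,T}=P^{1}_{T}-P^{1}_{0},\qquad S(P)^{2}_{0,T}=P^{2}_{T}-P^{2}_{0}, \tag{2}$$

which correspond to increments in the x, y directions, respectively. The 2-fold iterated integrals have four terms:

$$\begin{aligned} S(P)^{1,1}_{0,T}&=\tfrac{1}{2}\left(P^{1}_{T}-P^{1}_{0}\right)^2, & S(P)^{2,2}_{0,T}&=\tfrac{1}{2}\left(P^{2}_{T}-P^{2}_{0}\right)^2,\\ S(P)^{1,2}_{0,T}&=\int_{0<t_1<t_2<T} dP^{1}_{t_1}\,dP^{2}_{t_2}, & S(P)^{2,1}_{0,T}&=\int_{0<t_1<t_2<T} dP^{2}_{t_1}\,dP^{1}_{t_2}, \end{aligned} \tag{3}$$

where the first two terms are proportional to the square of the corresponding increments, and the last two terms have an intuitive geometric interpretation as shown in Fig. 1.

The signature of the path is defined as the collection of all the iterated integrals of $P$:

$$S(P)_{0,T}=\left(1,\;S(P)^{1}_{0,T},\dots,S(P)^{d}_{0,T},\;S(P)^{1,1}_{0,T},\;S(P)^{1,2}_{0,T},\dots\right), \tag{4}$$

where the “zeroth” term is 1 by convention, and the superscripts run along the set of all multi-indexes

$$W=\left\{(i_1,\dots,i_k)\;\middle|\;k\ge 1,\; i_1,\dots,i_k\in\{1,\dots,d\}\right\}. \tag{5}$$

The level $k$ signature, a subset of $S(P)_{0,T}$, is defined to be the collection of all $k$-fold iterated integrals. Since the full $S(P)_{0,T}$ is an infinite series, in practice we usually truncate it at some level $N$. For example, in online handwriting recognition [18][19] and writer identification [25][26], $N$ is usually set between 2 and 5.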
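As an illustration of Eqs. (1) to (3), the following minimal Python sketch computes the level-1 and level-2 iterated integrals of a piecewise-linear path, which is the form a polygonized contour takes. It is not the authors' implementation: the function name `signature_level2` is hypothetical, and the update rule assumes Chen's identity for appending one straight segment to a path.

```python
def signature_level2(points):
    """Level-1 and level-2 iterated integrals of a piecewise-linear path.

    points: sequence of d-dimensional points (tuples or lists).
    Returns (s1, s2) where s1[i] is the 1-fold integral along index i
    and s2[i][j] is the 2-fold integral along indexes (i, j).
    """
    d = len(points[0])
    s1 = [0.0] * d
    s2 = [[0.0] * d for _ in range(d)]
    for p, q in zip(points, points[1:]):
        delta = [b - a for a, b in zip(p, q)]
        # Appending a straight segment with increment `delta`:
        # new S2 = old S2 + (old S1 outer delta) + 0.5 * (delta outer delta)
        for i in range(d):
            for j in range(d):
                s2[i][j] += s1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            s1[i] += delta[i]
    return s1, s2


if __name__ == "__main__":
    # L-shaped path: one unit to the right, then one unit up.
    s1, s2 = signature_level2([(0, 0), (1, 0), (1, 1)])
    print("level 1:", s1)
    print("level 2:", s2)
```

For the L-shaped path in the usage example, the two increments are both 1, the diagonal level-2 terms are both 0.5 (half the squared increments), and the two cross terms are 1 and 0, whose sum equals the product of the increments.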

II-A2 Log Path Signature

The LPS can be obtained by considering the tensor logarithm of the PS:

$$\log S(P)_{0,T}=\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n}\left(S(P)_{0,T}-1\right)^{\otimes n}, \tag{6}$$

where $\otimes$ is the tensor product. Chen’s theorem [27] shows that the LPS can always be expressed using the Hall basis [28]; therefore, the LPS can be viewed as a more compact way to represent the PS or a dimensionality reduction.
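To make the compactness concrete: for a 2D path truncated at level 2, the LPS reduces to three numbers, namely the two increments and the Lévy area (half the difference of the two cross iterated integrals), whereas the PS truncated at the same level has six non-trivial terms. The sketch below computes these three numbers directly for a piecewise-linear path. The function name `log_signature_level2` is hypothetical and the code is an illustrative assumption, not the authors' implementation.

```python
def log_signature_level2(points):
    """Level-2 log signature of a 2D piecewise-linear path.

    Returns (dx, dy, area): the total increments in x and y and the
    Levy area, i.e. the signed area between the path and the chord
    joining its end points.
    """
    x0, y0 = points[0]
    area = 0.0
    for (xa, ya), (xb, yb) in zip(points, points[1:]):
        # Contribution of one straight segment to
        # 0.5 * integral of (x - x0) dy - (y - y0) dx
        area += 0.5 * ((xa - x0) * (yb - ya) - (ya - y0) * (xb - xa))
    xn, yn = points[-1]
    return (xn - x0, yn - y0, area)


if __name__ == "__main__":
    print(log_signature_level2([(0, 0), (1, 0), (1, 1)]))
```

For the L-shaped path (0,0), (1,0), (1,1), the Lévy area is 0.5: the triangle enclosed by the path and its chord, traversed counter-clockwise.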

II-B Offline Writer Identification Based on the LPS

II-B1 Pathlet Extraction

Although the offline handwritten text is not a time series, we can extract local pathlets from the polygonized handwriting contours as illustrated in Fig. 2 (a). First, the handwriting image is binarized using Otsu’s method, and the handwriting contours are extracted. Subsequently, a polygonization step, also called line approximation, is used to approximate each contour with a reduced number of points, within a tolerated approximation error $\epsilon$. The reason behind this is twofold. On the one hand, the raw contours are believed to contain significant redundancy and noise, such as long straight lines and jagged edges owing to quantization, and can be well approximated while the structural information is maintained. On the other hand, the computation can be much reduced. For example, with a suitable choice of $\epsilon$, the number of points can be reduced by about 90%.

After the above steps, we can simply trace a polygonized contour, starting from any point, and thus define the path. A pathlet is defined as a consecutive segment on the polygonized contour; using a sliding-window method, we can extract a large number of pathlets and use the PS or LPS to describe their geometric properties. The pathlet size, i.e., the number of points in a pathlet, is an important parameter and should be appropriately chosen in order to cover the expected local structures. Generally, the pathlet size should be inversely proportional to $\epsilon$.
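The polygonization and sliding-window steps can be sketched as follows. The text does not name the polygonization algorithm, so this sketch assumes the Ramer-Douglas-Peucker algorithm as a stand-in; the function names `polygonize` and `extract_pathlets`, and the choice to wrap around on closed contours, are likewise assumptions made for illustration.

```python
import math


def _point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def polygonize(points, eps):
    """Ramer-Douglas-Peucker simplification of an open polyline.

    Keeps the end points and recursively keeps the farthest interior
    point whenever it deviates from the chord by more than eps.
    """
    if len(points) < 3:
        return list(points)
    dists = [_point_segment_distance(p, points[0], points[-1])
             for p in points[1:-1]]
    k = max(range(len(dists)), key=dists.__getitem__)
    if dists[k] <= eps:
        return [points[0], points[-1]]
    split = k + 1  # index of the farthest point in `points`
    left = polygonize(points[:split + 1], eps)
    right = polygonize(points[split:], eps)
    return left[:-1] + right  # drop the duplicated split point


def extract_pathlets(polygon, size, closed=True):
    """Slide a window of `size` consecutive points along the polygon."""
    n = len(polygon)
    if n < size:
        return []
    if closed:  # wrap around, so every vertex starts one pathlet
        return [[polygon[(i + k) % n] for k in range(size)]
                for i in range(n)]
    return [polygon[i:i + size] for i in range(n - size + 1)]


if __name__ == "__main__":
    contour = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
    poly = polygonize(contour, eps=0.5)
    print(poly)
    print(extract_pathlets(poly, size=2, closed=False))
```

In the usage example, the collinear points along each arm of the L-shaped contour are removed and only the corner and the two end points remain.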

II-B2 Feature Representation for Offline Handwriting

An advantage of the LPS over the PS is that it has a lower dimensionality and is well distributed in the feature space, and hence, it is more suitable for the following clustering step to generate a feature codebook. Therefore we use the LPS in this paper. Nevertheless, the PS is also worthy of investigation, and we leave it for future work. Two feature normalization steps for LPS features are used. First, the length normalization method in [20] is applied to deal with the length variation of the pathlets. Second, each feature dimension is rescaled to .

Fig. 2: Construction of feature representation for offline handwriting. (a) Pathlets are extracted from the polygonized handwriting contours. (b) A pair of pathlets attached at a common end point. The joint LPS features of these pathlet pairs are extracted. (c) A feature matrix (i.e., 2D histogram) is computed from the joint LPS features based on a LPS feature codebook.

After LPS feature extraction, the bag-of-words method is used to construct the feature representation for offline handwriting owing to its simplicity. Specifically, the LPS features from training images are clustered using the k-means algorithm to obtain a feature codebook with $K$ elements. Given any new image, inspired by the Hinge method [2], we consider the joint LPS feature of a pair of pathlets attached at a common end point, as shown in Fig. 2 (b). Let $C=\{c_1,\dots,c_K\}$ denote the codebook and $F=\{(f_i^1,f_i^2)\}$ denote the extracted feature set, where $(f_i^1,f_i^2)$ is the joint LPS feature from the $i$-th pathlet pair. Based on $C$ and $F$, a $K\times K$ feature matrix FM can be obtained as follows.

  1. Initialize FM with zeros.

  2. For each $(f_i^1,f_i^2)\in F$, find the nearest codes $c_p$ and $c_q$ to $f_i^1$ and $f_i^2$, respectively.

  3. $FM(p,q)\leftarrow FM(p,q)+1$.

  4. Repeat steps 2 and 3 until all elements in $F$ are visited.

  5. Normalize FM to sum to 1.
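The five steps above can be sketched in a few lines of Python. The sketch assumes that the codebook has already been obtained (for example by k-means) and that "nearest" means nearest in Euclidean distance, which the text does not state explicitly; the function names `nearest_code` and `feature_matrix` are hypothetical.

```python
def nearest_code(feature, codebook):
    """Index of the codebook entry closest to `feature` (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((f - c) ** 2 for f, c in zip(feature, codebook[k])),
    )


def feature_matrix(pairs, codebook):
    """2D histogram of code index pairs, normalized to sum to 1.

    pairs: iterable of (f1, f2), the LPS features of two pathlets that
           share a common end point.
    codebook: list of K code vectors.
    """
    size = len(codebook)
    fm = [[0.0] * size for _ in range(size)]   # step 1
    for f1, f2 in pairs:                       # steps 2 to 4
        p = nearest_code(f1, codebook)
        q = nearest_code(f2, codebook)
        fm[p][q] += 1.0
    total = sum(sum(row) for row in fm)
    if total > 0:                              # step 5
        fm = [[v / total for v in row] for row in fm]
    return fm


if __name__ == "__main__":
    codebook = [(0.0, 0.0), (1.0, 1.0)]
    pairs = [((0.1, 0.0), (0.9, 1.0)), ((1.0, 1.0), (0.0, 0.0))]
    print(feature_matrix(pairs, codebook))
```

Because the matrix is indexed by an ordered pair of codes, its size grows quadratically with the codebook size, which is one reason a small codebook is attractive here.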

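To measure the dissimilarity between two feature matrices U and V, the Manhattan distance

$$d_{1}(U,V)=\sum_{i,j}\left|U_{ij}-V_{ij}\right| \tag{7}$$

and the $\chi^2$ distance

$$d_{\chi^2}(U,V)=\sum_{i,j}\frac{\left(U_{ij}-V_{ij}\right)^2}{U_{ij}+V_{ij}} \tag{8}$$

are used.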
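A minimal sketch of the two dissimilarity measures is given below. It assumes that the second measure is the chi-square distance commonly used to compare normalized histograms, and that bins which are empty in both matrices are skipped to avoid division by zero; both points are assumptions, as are the function names.

```python
def manhattan_distance(u, v):
    """Sum of absolute differences between two equally sized matrices."""
    return sum(abs(a - b)
               for row_u, row_v in zip(u, v)
               for a, b in zip(row_u, row_v))


def chi_square_distance(u, v):
    """Sum of (a - b)^2 / (a + b) over all bins with a + b > 0."""
    total = 0.0
    for row_u, row_v in zip(u, v):
        for a, b in zip(row_u, row_v):
            if a + b > 0:
                total += (a - b) ** 2 / (a + b)
    return total


if __name__ == "__main__":
    U = [[0.5, 0.5], [0.0, 0.0]]
    V = [[0.25, 0.25], [0.5, 0.0]]
    print(manhattan_distance(U, V), chi_square_distance(U, V))
```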

II-B3 Rotation Invariant Features

Fig. 3: Based on the PS approach, rotation invariant features can be constructed. (a) A rotation invariant path, leading to rotation invariant PS and LPS features. (b) The level-2 term from the LPS, which is rotation invariant.

Based on the PS approach, rotation invariant features can also be constructed. One possible solution is to construct rotation invariant paths, e.g., paths in polar coordinates. Another possible solution is to construct rotation invariant features directly [29] from the PS or LPS. For example, the level-2 term of the LPS is the Lévy area enclosed by the path and the straight line connecting the end points. Fig. 3 illustrates the above two ideas. We leave these features for future work, as we do not focus on rotation invariant identification in the present study.
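The rotation invariance of the Lévy area is easy to check numerically. The following sketch computes the area for a piecewise-linear pathlet before and after rotating it by an arbitrary angle; the helper names `levy_area` and `rotate` and the example coordinates are hypothetical, chosen only for illustration.

```python
import math


def levy_area(points):
    """Signed area between a 2D piecewise-linear path and its chord."""
    x0, y0 = points[0]
    area = 0.0
    for (xa, ya), (xb, yb) in zip(points, points[1:]):
        area += 0.5 * ((xa - x0) * (yb - ya) - (ya - y0) * (xb - xa))
    return area


def rotate(points, theta):
    """Rotate all points about the origin by angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


if __name__ == "__main__":
    pathlet = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.5, 1.5)]
    before = levy_area(pathlet)
    after = levy_area(rotate(pathlet, 0.7))
    print(before, after)  # equal up to floating-point rounding
```

The increments (the level-1 terms) do change under rotation, so only the area term is invariant on its own.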

III Experiment Results and Analysis

III-A Datasets and Performance Evaluation

Fig. 4: Effect of polygonization and codebook size. (a) Effect of the polygonization tolerance $\epsilon$ on different pathlet parameter settings, with the codebook size fixed at 16. (b) Effect of codebook size on different pathlet parameter settings, with $\epsilon$ = 1.0.

Our experiments are conducted on four benchmark datasets, namely IAM [30], Firemaker [31], CVL [32], and ICDAR2013 writer identification contest dataset [33].

The IAM dataset is collected from 657 writers, and is modified in this work in a similar manner to that in [2]. The modified IAM dataset used in our experiments has 650 writers, each having two handwritten documents.

The Firemaker dataset has 250 writers, each providing four pages of handwritten text. Following [3], pages 1 and 4 are used in the experiments.

The CVL dataset consists of 1604 handwritten documents from 310 writers. There are 27 writers who provide seven samples (one German and six English) and 283 writers who provide five samples (one German and four English). Because writer 431 has empty samples, the first four English samples from the remaining 309 writers are used in the experiments.

The ICDAR2013 dataset contains 1000 handwritten documents from 250 writers. Each writer is asked to copy four pages of text in two languages (two in English and two in Greek). The entire ICDAR2013 dataset is used in our experiments, regardless of the language used.

To evaluate the identification performance, the “leave-one-out” strategy is used: for each query document, we compute its distances to all other documents and sort the results in ascending order. A correct hit is counted when at least one document of the same writer is included in the top N nearest neighbours. The ratio of the number of correct hits to the number of queries is the Top-N accuracy. In this paper we consider the Top-1 and Top-10 accuracies.
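The leave-one-out Top-N evaluation described above can be sketched as follows, assuming a precomputed pairwise distance matrix and one writer label per document. The function name `top_n_accuracy` and the toy data are hypothetical.

```python
def top_n_accuracy(dist, writers, n):
    """Leave-one-out Top-N accuracy.

    dist: square matrix, dist[q][j] is the distance between documents q and j.
    writers: writer label of each document.
    n: number of nearest neighbours inspected for each query.
    """
    hits = 0
    for q in range(len(writers)):
        # Rank every other document by ascending distance to the query.
        ranked = sorted((j for j in range(len(writers)) if j != q),
                        key=lambda j: dist[q][j])
        if any(writers[j] == writers[q] for j in ranked[:n]):
            hits += 1
    return hits / len(writers)


if __name__ == "__main__":
    # Toy example: four documents described by a single number each.
    feats = [0.0, 0.1, 1.0, 0.3]
    writers = ["a", "a", "b", "b"]
    dist = [[abs(x - y) for y in feats] for x in feats]
    print(top_n_accuracy(dist, writers, 1))
    print(top_n_accuracy(dist, writers, 3))
```

In the toy example the last document lies closer to both documents of writer "a" than to the other document of its own writer, so it is a miss at Top-1 and Top-2 and a hit at Top-3.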

III-B Effect of Polygonization and Codebook Size

The polygonization parameter $\epsilon$ controls the number of points that are removed from the raw contours, and has a direct effect on the performance. Therefore we first experiment with $\epsilon$ = 0.2, 1.0, and 2.0 on the IAM dataset. The pathlet size is chosen to be 3, 4, or 5, and the LPS truncation level is chosen to be . The codebook size is fixed at 16. The distance in Eq. (8) is used for $\epsilon$ = 0.2; the Manhattan distance is used for $\epsilon$ = 1.0 and 2.0 and in all the following experiments.

Experiment results are shown in Fig. 4 (a). The best results are achieved at $\epsilon$ = 1.0. When $\epsilon$ = 0.2, a larger pathlet size can capture more meaningful local structures and therefore leads to improved identification accuracy. When $\epsilon$ = 1.0, different pathlet sizes perform similarly. When $\epsilon$ = 2.0, a large pathlet size leads to degraded performance. The reason may be twofold. First, a small codebook size is insufficient to represent the complex patterns of long pathlets. Second, the number of pathlets is much reduced when $\epsilon$ = 2.0, and is therefore insufficient for stable representations of the documents. Therefore, the parameter $\epsilon$, the codebook size and the amount of ink should be considered at the same time. In practice, we observe that $\epsilon$ = 1.0 is a good choice across different datasets.

To see the effect of codebook size, we fix $\epsilon$ = 1.0 and vary the codebook size from 16 to 32, 48, and 64. The results are presented in Fig. 4 (b). We can see that the performance improves as the codebook size increases. Surprisingly, the pathlet setting steadily performs well. The setting with pathlet size 4, LPS truncation level 3, and codebook size 48 achieves the best Top-1 accuracy of 94.23%, which is among the best reported results of individual state-of-the-art features (i.e., no feature combination is used).

III-C Results on Benchmark Datasets

Two parameter settings, (pathlet size 3, LPS truncation level 2, codebook size 32) and (pathlet size 4, LPS truncation level 3, codebook size 48), are applied to the four benchmark datasets, and the best results are reported in Table I. The first setting gives the better result on the Firemaker dataset, whereas the second gives the better results on the other three datasets. The reason is that, in the Firemaker dataset, page 4 is naturally written and contains much less ink than page 1. Therefore page 4 has insufficient pathlets to cover the feature matrix, as analyzed in the above experiments.

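| Dataset | Top-1 (%) | Top-10 (%) | Pathlet size | LPS truncation level | Codebook size |
|---|---|---|---|---|---|
| IAM | 94.24 | 97.77 | 4 | 3 | 48 |
| Firemaker | 91 | 97 | 3 | 2 | 32 |
| CVL | 99.27 | 99.51 | 4 | 3 | 48 |
| ICDAR2013 | 96.6 | 99.2 | 4 | 3 | 48 |

TABLE I: Offline writer identification results on four benchmark datasets using the LPS feature.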
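
| Database | Method | Feature | Top-1 (%) | Top-10 (%) |
|---|---|---|---|---|
| IAM | Bulacu et al. [2] | Hybrid | 89 | 97 |
| IAM | Wu et al. [12] | SDS+SOH | 98.5 | 99.5 |
| IAM | He et al. [34] | Junclets | 83.3 | 94.4 |
| IAM | He et al. [3] | QuadHinge | 93.2 | 96.5 |
| IAM | Khan et al. [35] | DCT features | 97.2 | - |
| IAM | Our method | LPS | 94.24 | 97.77 |
| Firemaker | Bulacu et al. [2] | Hybrid | 83 | 95 |
| Firemaker | Wu et al. [12] | SDS+SOH | 92.4 | 98.8 |
| Firemaker | He et al. [34] | Junclets | 80.6 | 94.0 |
| Firemaker | He et al. [3] | QuadHinge | 92.2 | 97.2 |
| Firemaker | Khan et al. [35] | DCT features | 89.47 | - |
| Firemaker | Our method | LPS | 91 | 97 |
| CVL | CS-UMD [32] | Graphemes | 97.9 | 99.4 |
| CVL | Wu et al. [12] | SDS+SOH | 99.2 | 99.6 |
| CVL | Nicolaou et al. [36] | SRS-LBP | 99.0 | 99.5 |
| CVL | Our method | LPS | 99.27 | 99.51 |
| ICDAR2013 | CS-UMD-a [33] | Graphemes | 95.1 | 99.1 |
| ICDAR2013 | Wu et al. [12] | SDS+SOH | 95.6 | 99.1 |
| ICDAR2013 | Nicolaou et al. [36] | SRS-LBP | 97.2 | 99.2 |
| ICDAR2013 | Our method | LPS | 96.6 | 99.2 |

The “leave-one-out” strategy is not used.

TABLE II: Comparison with previous results on four benchmark datasets.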

Fig. 5: The fourth pages from the CVL dataset are cropped into text lines for line-level identification.

We compare our results with previous results in Table II. Note that, owing to different experimental protocols in the literature (such as whether the “leave-one-out” strategy is used and how the dataset is used), some reported results are not directly comparable. Our method achieves very promising performance without any feature fusion or complex preprocessing such as segmentation. Moreover, thanks to the polygonization step and a small codebook size, our method is as fast as other contour-based methods, such as Hinge [2]. For example, the computation of the LPS feature is an order of magnitude faster than that of SIFT.

III-D Effect of Amount of Ink

The amount of ink is important for the offline writer identification systems. To test how the performance of our method varies with amount of ink in query documents, we conduct the following line-level identification experiment. We crop the fourth pages from the CVL dataset into text lines using the line projection method, with some necessary manual adjustments. Some examples are shown in Fig. 5. As there are at least four text lines within each document, we take the first four text lines from each document. When using a single text line as a query, the number of queries is ; when using two text lines and three text lines, the numbers of queries are and , respectively. The first three pages are used as templates, while the text lines are used as the test set. The “leave-one-out” strategy is not applied here, and the setting , , is used. Experiment results are shown in Table III. Using only one single text line as a query, our method can achieve a top-1 accuracy of 95.31%. Therefore our method is rather data efficient.

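| Query sample | 1 line | 2 lines | 3 lines | whole page |
|---|---|---|---|---|
| Top-1 accuracy (%) | 95.31 | 98.44 | 99.03 | 99.35 |

TABLE III: Line-level writer identification on the CVL dataset.

| Template sample | Top-1 (%) | Top-10 (%) | Parameter setting (pathlet size, LPS truncation level, codebook size) |
|---|---|---|---|
| Page 1 | 95.6 | 97.6 | 3, 2, 32 |
| Page 4 | 86.4 | 96.4 | 3, 2, 32 |
| Page 1 | 96.8 | 98.8 | 4, 3, 48 |
| Page 4 | 82.8 | 96.8 | 4, 3, 48 |

TABLE IV: Performance of two parameter settings on the Firemaker dataset to see the effect of amount of ink.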

We should point out that the templates should be representative of the writers. For example, three pages of text are used as templates to achieve the results in Table III. If the amount of ink in the templates is insufficient, statistical methods, including ours, suffer degraded performance. Indeed, this is why the Firemaker dataset prefers the setting (pathlet size 3, LPS truncation level 2, codebook size 32) rather than (pathlet size 4, LPS truncation level 3, codebook size 48). In Table IV we give the results of the two settings on the Firemaker dataset, depending on whether pages 1 or pages 4 are used as templates. The setting (3, 2, 32) is better when pages 4 are used as templates, whereas the setting (4, 3, 48) is better when pages 1 are used as templates. This is because the amount of ink in pages 4 is much less than that in pages 1. Improving the performance when the amount of ink in the templates is insufficient is an important research direction.

IV Conclusion

In this paper, we propose a novel set of features for offline writer identification based on the PS approach. The PS provides a principled way to express information contained in a path, such as orientation and curvature, and has achieved significant success in time series description. By extracting local pathlets from offline handwriting images, the PS can also characterize the offline handwriting style. A codebook method based on the LPS, a more compact way to express the PS, is used in this work and shows competitive results on several benchmark datasets. Investigations of the PS, rotation invariant features and feature fusion are left for future work. Furthermore, how to improve the performance when the available amount of ink is insufficient is an important research direction.

References

  • [1] R. Plamondon and G. Lorette, “Automatic signature verification and writer identification—the state of the art,” Pattern recognition, vol. 22, no. 2, pp. 107–131, 1989.
  • [2] M. Bulacu and L. Schomaker, “Text-independent writer identification and verification using textural and allographic features,” IEEE transactions on pattern analysis and machine intelligence, vol. 29, no. 4, pp. 701–717, 2007.
  • [3] S. He and L. Schomaker, “Beyond OCR: Multi-faceted understanding of handwritten document characteristics,” Pattern Recognition, vol. 63, pp. 321–333, 2017.
  • [4] D. Bertolini, L. S. Oliveira, E. Justino, and R. Sabourin, “Texture-based descriptors for writer identification and verification,” Expert Systems with Applications, vol. 40, no. 6, pp. 2069–2080, 2013.
  • [5] S. He and L. Schomaker, “Writer identification using curvature-free features,” Pattern Recognition, vol. 63, pp. 451–464, 2017.
  • [6] B. Helli and M. E. Moghaddam, “A text-independent Persian writer identification based on feature relation graph (frg),” Pattern Recognition, vol. 43, no. 6, pp. 2199–2209, 2010.
  • [7] A. J. Newell and L. D. Griffin, “Writer identification using oriented basic image features and the delta encoding,” Pattern Recognition, vol. 47, no. 6, pp. 2255–2265, 2014.
  • [8] Z. He, X. You, and Y. Y. Tang, “Writer identification using global wavelet-based features,” Neurocomputing, vol. 71, no. 10-12, pp. 1832–1841, 2008.
  • [9] I. Siddiqi and N. Vincent, “Text independent writer recognition using redundant writing patterns with contour-based orientation and curvature features,” Pattern Recognition, vol. 43, no. 11, pp. 3853–3865, 2010.
  • [10] A. Brink, J. Smit, M. Bulacu, and L. Schomaker, “Writer identification using directional ink-trace width measurements,” Pattern Recognition, vol. 45, no. 1, pp. 162–171, 2012.
  • [11] S. He and L. Schomaker, “Delta-n hinge: rotation-invariant features for writer identification,” in 2014 22nd International Conference on Pattern Recognition (ICPR).   IEEE, 2014, pp. 2023–2028.
  • [12] X. Wu, Y. Tang, and W. Bu, “Offline text-independent writer identification based on scale invariant feature transform,” IEEE Transactions on Information Forensics and Security, vol. 9, no. 3, pp. 526–536, 2014.
  • [13] Y.-J. Xiong, Y. Wen, P. S. Wang, and Y. Lu, “Text-independent writer identification using SIFT descriptor and contour-directional feature,” in 2015 13th International Conference on Document Analysis and Recognition (ICDAR).   IEEE, 2015, pp. 91–95.
  • [14] F. A. Khan, F. Khelifi, M. A. Tahir, and A. Bouridane, “Dissimilarity gaussian mixture models for efficient offline handwritten text-independent identification using SIFT and RootSIFT descriptors,” IEEE Transactions on Information Forensics and Security, vol. 14, no. 2, pp. 289–303, 2019.
  • [15] V. Christlein, D. Bernecker, F. Honig, and E. Angelopoulou, “Writer identification and verification using gmm supervectors,” in 2014 IEEE Winter Conference on Applications of Computer Vision (WACV).   IEEE, 2014, pp. 998–1005.
  • [16] S. He and L. Schomaker, “Deep adaptive learning for writer identification based on single handwritten word images,” Pattern Recognition, vol. 88, pp. 64–74, 2019.
  • [17] Y. Tang and X. Wu, “Text-independent writer identification via cnn features and joint bayesian,” in 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR).   IEEE, 2016, pp. 566–571.
  • [18] S. Lai, L. Jin, and W. Yang, “Toward high-performance online HCCR: A CNN approach with DropDistortion, path signature and spatial stochastic max-pooling,” Pattern Recognition Letters, vol. 89, pp. 60–66, 2017.
  • [19] Z. Xie, Z. Sun, L. Jin, H. Ni, and T. Lyons, “Learning spatial-semantic context with fully convolutional recurrent network for online handwritten Chinese text recognition,” IEEE transactions on pattern analysis and machine intelligence, vol. 40, no. 8, pp. 1903–1917, 2018.
  • [20] S. Lai, L. Jin, and W. Yang, “Online signature verification using recurrent neural network and length-normalized path signature descriptor,” in 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1.   IEEE, 2017, pp. 400–405.
  • [21] S. Lai and L. Jin, “Recurrent adaptation networks for online signature verification,” IEEE Transactions on Information Forensics and Security, 2019.
  • [22] W. Yang, T. Lyons, H. Ni, C. Schmid, L. Jin, and J. Chang, “Leveraging the path signature for skeleton-based human action recognition,” arXiv preprint arXiv:1707.03993, 2017.
  • [23] C. Li, X. Zhang, and L. Jin, “LPSNet: A novel log path signature feature based hand gesture recognition framework,” in 2017 IEEE International Conference on Computer Vision Workshop (ICCVW).   IEEE, 2017, pp. 631–639.
  • [24] I. Chevyrev and A. Kormilitzin, “A primer on the signature method in machine learning,” arXiv preprint arXiv:1603.03788, 2016.
  • [25] W. Yang, L. Jin, and M. Liu, “Chinese character-level writer identification using path signature feature, DropStroke and deep CNN,” in 2015 13th International Conference on Document Analysis and Recognition (ICDAR).   IEEE, 2015, pp. 546–550.
  • [26] ——, “DeepWriterID: An end-to-end online text-independent writer identification system.” IEEE Intelligent Systems, vol. 31, no. 2, pp. 45–53, 2016.
  • [27] K.-T. Chen, “Integration of paths, geometric invariants and a generalized Baker-Hausdorff formula,” Annals of Mathematics, pp. 163–178, 1957.
  • [28] C. Reutenauer, “Free lie algebras,” in Handbook of algebra.   Elsevier, 2003, vol. 3, pp. 887–903.
  • [29] J. Diehl, “Rotation invariants of two dimensional curves based on iterated integrals,” arXiv preprint arXiv:1305.6883, 2013.
  • [30] U.-V. Marti and H. Bunke, “The IAM-database: an English sentence database for offline handwriting recognition,” International Journal on Document Analysis and Recognition, vol. 5, no. 1, pp. 39–46, 2002.
  • [31] M. Bulacu, L. Schomaker, and L. Vuurpijl, “Writer identification using edge-based directional features,” in 2003 7th International Conference on Document Analysis and Recognition (ICDAR).   IEEE, 2003, pp. 937–941.
  • [32] F. Kleber, S. Fiel, M. Diem, and R. Sablatnig, “CVL-database: An off-line database for writer retrieval, writer identification and word spotting,” in 2013 12th International Conference on Document Analysis and Recognition (ICDAR).   IEEE, 2013, pp. 560–564.
  • [33] G. Louloudis, B. Gatos, N. Stamatopoulos, and A. Papandreou, “ICDAR2013 competition on writer identification,” in 2013 12th International Conference on Document Analysis and Recognition (ICDAR).   IEEE, 2013, pp. 1397–1401.
  • [34] S. He, M. Wiering, and L. Schomaker, “Junction detection in handwritten documents and its application to writer identification,” Pattern Recognition, vol. 48, no. 12, pp. 4036–4048, 2015.
  • [35] F. A. Khan, M. A. Tahir, F. Khelifi, A. Bouridane, and R. Almotaeryi, “Robust off-line text independent writer identification using bagged discrete cosine transform features,” Expert Systems with Applications, vol. 71, pp. 404–415, 2017.
  • [36] A. Nicolaou, A. D. Bagdanov, M. Liwicki, and D. Karatzas, “Sparse radial sampling LBP for writer identification,” in 2015 13th International Conference on Document Analysis and Recognition (ICDAR).   IEEE, 2015, pp. 716–720.