I Introduction
This paper addresses the problem of automatic writer identification from off-line handwritten images. Handwriting is a behavioural biometric: a writer can be recognized by capturing the specific characteristics of his or her handwriting habits, which differ from those of other writers [1]. Writer identification has been applied in criminal investigation and historical document analysis, fields that require a high level of domain expertise and heavy manual work.
Automatic writer identification aims to recognize a person based on his or her handwritten text. Research on writer identification can be divided into two categories: off-line and on-line identification. On-line writer identification requires recording the whole writing procedure with special devices, so the input is a time series of pen-tip positions, pressures, angles and other information about the writing. Off-line identification, on the other hand, takes only scanned images of handwritten text as input, which usually makes it more difficult [3].
Methods for off-line writer identification can be further categorized into two groups: text-dependent and text-independent. Text-dependent methods [18, 19, 20, 21] require input images with fixed text content and usually compare the input with registered templates for identification. In contrast, text-independent methods [1, 2, 4] do not make assumptions about the input content and have broader applications. However, text-independent writer identification has to deal with images of arbitrary text, which exhibit huge intra-category variation, and is therefore much more challenging than the text-dependent case. Figure 1 and Figure 2 show several examples of handwritten English and Chinese by different writers. As can be seen, the main difference between two handwritten images is dominated by the text content. For writer identification, one needs to extract abstract writing-style features and fine details that reflect personal writing habits. This poses a great challenge for current handcrafted features, which usually capture local shape and gradient information. Such features may mix information about the written content (text) with information about the writing style (person), which may limit their performance on this task.
To address this challenging problem, this paper leverages deep CNNs (Convolutional Neural Networks) as a powerful model to learn effective representations for off-line text-independent writer identification. Deep CNNs have demonstrated their effectiveness in various computer vision problems by improving state-of-the-art results by a large margin, including image classification [5, 6, 7], object detection [8, 9], face recognition [10, 11] and handwriting recognition [12]. We propose DeepWriter, a multi-stream CNN for extracting writer-sensitive features. DeepWriter takes multiple local regions as input and is trained with a softmax loss on identification. Our contributions are threefold. First, we design a multi-stream structure and optimize its configuration for the writer identification task. Second, we introduce data augmentation to enhance the performance of DeepWriter. Finally, we introduce a patch scanning strategy to handle handwritten images of various lengths. We evaluate the proposed methods on the IAM dataset [15] and the HWDB1.1 dataset [14]. Our method achieves a high identification accuracy of 99.01% on 301 writers and 97.3% on 657 writers from the IAM dataset at the English sentence level, and 93.85% on 300 writers from the HWDB1.1 dataset at the Chinese character level, which outperforms the previous state of the art. Interestingly, our results also show that handwritten texts of different languages, such as English and Chinese, may share common features for writer identification, and that pretraining CNNs on another language can lead to better performance.

II Related Works
Writer verification is similar to writer identification. A writer verification system [22, 23, 24, 1] performs a one-to-one comparison and determines whether or not two input examples were written by the same writer. A writer identification system [1, 2] performs a one-to-many search in a large database of handwriting samples of known authorship and returns a list of likely candidates. Writer verification is thus a two-class classification problem, while writer identification is a multi-class classification problem. Brink et al. [25] investigate how much handwritten text is needed for text-independent writer verification and identification. Their experimental results demonstrate that, given the same number of handwritten characters, verification systems achieve a lower error rate than identification systems using identical features. Writer identification is therefore the more ambiguous and difficult task.
Previously proposed methods generally follow the pipeline of pre-processing, feature extraction and feature matching or classification, and focus mainly on feature extraction. In [1], Bulacu et al. combined multiple features (directional, grapheme, and run-length) and used probability distribution functions (PDFs) extracted from the handwriting images to characterize writer individuality, achieving an identification accuracy of 89% on 650 writers from the IAM dataset at page level. In [2], Jain et al. used K-adjacent segments (KAS) features to model character contours, achieving an identification accuracy of 93.3% on 300 writers from the IAM dataset at page level. These methods depend on features defined by humans, whereas deep CNNs have been shown to learn such features automatically. We believe that with integrated training and overall optimization, a deep CNN can learn to extract features appropriate to this task and outperform traditional methods.

Yang et al. [3] also use CNNs to identify writers, but address the problem of on-line text-independent writer identification. By leveraging on-line writing information and deep CNNs, they obtain an accuracy of 95.72% on 187 writers with Chinese page input and 98.51% on 134 writers with English page input on the CASIA Handwriting Database [16]. In contrast, this paper addresses the problem of off-line text-independent writer identification, which is more general and difficult. We feed the model with only scanned gray-scale handwritten images and learn an effective representation with a carefully designed deep CNN model, leading to a simpler and more elegant method.
III DeepWriter


Figure 4: The network structure of DeepWriter. The notation on each convolutional layer specifies its kernels together with the stride and the padding, both in pixels. The boxes with MP denote max-pooling layers; their notation specifies the size of the neighbourhood in which max-pooling is performed and the stride in pixels. The boxes with FCX denote fully-connected layers, and the number that follows specifies the number of neurons. The Sum box denotes an element-wise sum operation, and Softmax denotes the softmax classifier. All convolutional and fully-connected layers are followed by a Rectified Linear Unit (ReLU) layer. FC6 and FC7 are followed by dropout layers with ratio 0.5 to prevent overfitting.
This section first introduces the design of the multi-stream structure of DeepWriter and discusses how to preprocess input images of various lengths as input for DeepWriter. We then describe the training and testing processes with implementation details.
III-A Multi-Stream
Our basic network structure is similar to the AlexNet structure [5], as depicted in Figure 5. In this paper, we denote this basic network structure as Half DeepWriter. Half DeepWriter takes a single image patch as input. Input handwritten text images used for identifying the author vary in height and width. In particular, handwritten English sentence images usually have a high aspect ratio, with a width much larger than the height. Resizing the input image to a fixed size distorts the shape of the handwriting, leading to serious information loss. We therefore employ a patch scanning strategy, detailed below, to address this problem. However, scanning ignores the spatial relationships between image patches, which contain important information for determining the writer. On the other hand, it is expensive to keep the complete spatial relationships between all image patches of a scanned handwritten image. As a trade-off, we leverage the relationship between two adjacent image patches, leading to the DeepWriter structure. The network structure of DeepWriter is depicted in Figure 4. DeepWriter takes a pair of image patches as input; Patch 2 is adjacent to Patch 1, as depicted in Figure 6. Out1 and Out2, the output vectors of the FC7 layers of DeepWriter, are merged by an element-wise sum operation. The detailed configuration of DeepWriter is specified in the caption of Figure 4. The number of model parameters in DeepWriter is the same as that in Half DeepWriter. Therefore, DeepWriter does not increase the risk of overfitting and requires the same amount of training data as Half DeepWriter. We experimentally demonstrate that considering the spatial relationship between image patches benefits writer identification. The comparison between DeepWriter and Half DeepWriter on 301 writers from the IAM dataset with handwritten English sentences as input is shown in Table 1.

Model | Accuracy |
---|---|
Half DeepWriter | 98.23% |
DeepWriter | 99.01% |
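
The two-stream forward pass can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' Caffe model: the `stream` function is a hypothetical single linear layer with ReLU standing in for the Conv1 to FC7 stack, the toy dimensions are arbitrary, and sharing one parameter set between the two streams is inferred from the statement that DeepWriter has the same number of parameters as Half DeepWriter.

```python
import math
import random


def relu(values):
    return [max(0.0, x) for x in values]


def linear(weights, bias, values):
    # weights is a list of rows; each row has len(values) entries.
    return [sum(w * x for w, x in zip(row, values)) + b
            for row, b in zip(weights, bias)]


def softmax(values):
    peak = max(values)
    exps = [math.exp(x - peak) for x in values]
    total = sum(exps)
    return [e / total for e in exps]


def stream(patch, params):
    # Hypothetical stand-in for the Conv1..FC7 stack of one stream.
    return relu(linear(params["w"], params["b"], patch))


def deepwriter_forward(patch1, patch2, shared, classifier):
    out1 = stream(patch1, shared)  # both streams use the same parameters
    out2 = stream(patch2, shared)
    merged = [a + b for a, b in zip(out1, out2)]  # element-wise sum
    return softmax(linear(classifier["w"], classifier["b"], merged))


def half_deepwriter_forward(patch, shared, classifier):
    features = stream(patch, shared)
    return softmax(linear(classifier["w"], classifier["b"], features))


def init_layer(rows, cols, rng):
    return {"w": [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                  for _ in range(rows)],
            "b": [0.0] * rows}


rng = random.Random(0)
PATCH_DIM, FEATURE_DIM, NUM_WRITERS = 16, 8, 5  # toy sizes, not the paper's
shared = init_layer(FEATURE_DIM, PATCH_DIM, rng)
classifier = init_layer(NUM_WRITERS, FEATURE_DIM, rng)
patch1 = [rng.random() for _ in range(PATCH_DIM)]
patch2 = [rng.random() for _ in range(PATCH_DIM)]
scores = deepwriter_forward(patch1, patch2, shared, classifier)
```

Because both streams reuse `shared`, the pair model and the single-patch model in this sketch hold exactly the same parameters; only the merge step differs.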
III-B Patch Scanning Strategy
First, we resize the image so that min(w,h)=113 while maintaining its aspect ratio. Second, image patches are cropped from the resized image. Finally, the image patches used for testing are uniformly sampled from these cropped patches at a specific ratio. In this paper, the sampling ratio is set to 20% for Chinese character input and 10% for English sentence input.
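
The three steps above can be sketched as follows. The sketch only computes patch coordinates and makes several assumptions that the text does not state: patches are square with side 113, the scanning stride is 56 pixels (a hypothetical value), and "uniformly sampled" is read as uniform random sampling without replacement.

```python
import random


def resized_size(width, height, short_edge=113):
    """Scale (width, height) so the shorter edge equals short_edge."""
    scale = short_edge / min(width, height)
    return (max(short_edge, round(width * scale)),
            max(short_edge, round(height * scale)))


def scan_patch_boxes(width, height, patch=113, stride=56):
    """Slide a square window over the resized image.

    Returns (left, top, right, bottom) boxes; patch size and stride
    are assumptions of this sketch.
    """
    boxes = []
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            boxes.append((left, top, left + patch, top + patch))
    return boxes


def sample_boxes(boxes, ratio, seed=0):
    """Uniformly sample a fraction of the boxes, keeping at least one."""
    count = max(1, round(len(boxes) * ratio))
    return sorted(random.Random(seed).sample(boxes, count))


# A wide English sentence image of 1800 x 120 pixels (made-up size).
w, h = resized_size(1800, 120)
boxes = scan_patch_boxes(w, h)
test_boxes = sample_boxes(boxes, ratio=0.10)  # 10% for English sentences
```

With a real image library, each box would then be cropped from the resized image before being fed to the network.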
III-C Kernel Size
The Conv1 and Conv2 layers of DeepWriter and Half DeepWriter filter their input with smaller kernels and a smaller stride than those of AlexNet. This structural adjustment is inspired by the observation that identification accuracy degrades when the original AlexNet is fed with image patches. We therefore decrease the kernel size and stride of the Conv1 and Conv2 layers so that the network can handle more image detail. This adjustment also decreases the number of parameters, and thus the risk of overfitting. The comparison between AlexNet and its variants on 301 writers from the IAM dataset with English handwritten image patches as input is shown in Table 2.
Patch size | Configuration | Accuracy |
---|---|---|
 | | |
 | | |
 | | 91.35% |
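
The effect of kernel size and stride on the resolution of the first feature map can be illustrated with the standard convolution output-size formula. In this sketch the 11x11 kernel with stride 4 is the well-known AlexNet Conv1 setting, the input side of 113 is assumed from the resizing rule min(w,h)=113, and the 5x5 kernel with stride 2 is a purely hypothetical smaller configuration, not the one used in DeepWriter.

```python
def conv_output_size(input_size, kernel, stride, padding=0):
    """Spatial size of a convolution output along one dimension."""
    return (input_size + 2 * padding - kernel) // stride + 1


# AlexNet Conv1 on its usual 227-pixel input.
alexnet_on_227 = conv_output_size(227, kernel=11, stride=4)

# The same layer applied to a 113-pixel patch (assumed patch side).
alexnet_on_113 = conv_output_size(113, kernel=11, stride=4)

# A hypothetical smaller kernel and stride on the same patch.
smaller_on_113 = conv_output_size(113, kernel=5, stride=2)

print(alexnet_on_227, alexnet_on_113, smaller_on_113)
```

Under these assumptions, the large kernel and stride shrink a small patch to a much coarser feature map, whereas a smaller kernel and stride keep roughly the resolution that AlexNet has on its full-size input.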
III-D Neuron Number
Compared to AlexNet, the FC6 and FC7 layers of DeepWriter and Half DeepWriter have fewer neurons. The size of the training data and the number of classes in this task are smaller than those of ILSVRC [13]; we therefore believe that an appropriately reduced number of neurons lowers the risk of overfitting. We chose the number of neurons in FC6 and FC7 through a comparison experiment on the validation set, varying the neuron number of Half DeepWriter on 301 writers from the IAM dataset with English handwritten image patches as input. The results are shown in Table 3. We finally set the neuron number of the FC6 and FC7 layers of DeepWriter and Half DeepWriter to 1024.
Neuron number | Accuracy |
---|---|
4096 | |
1024 | 92.15% |
512 | |
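
To give a sense of how strongly the neuron number affects model size, the following sketch counts the weights and biases of the FC7 layer alone, assuming FC6 and FC7 are always given the same width so that FC7 maps n inputs to n outputs. FC6 is left out because its input size depends on the convolutional configuration, which is not recoverable here.

```python
def fc_parameter_count(inputs, outputs):
    """Weights plus biases of one fully-connected layer."""
    return inputs * outputs + outputs


# FC7 parameter counts for the three candidate widths in Table 3.
fc7_params = {n: fc_parameter_count(n, n) for n in (4096, 1024, 512)}

for width, count in fc7_params.items():
    print(f"FC7 with {width} neurons: {count:,} parameters")
```

Going from 4096 to 1024 neurons reduces this single layer from about 16.8 million to about 1.0 million parameters, which is consistent with the overfitting argument above.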
III-E Feature Sharing
We also observe that handwritten images of different languages share some common features for identifying writers. On the IAM dataset, we finetune DeepWriter from a Half DeepWriter model pretrained on HWDB1.1, which is much larger than the IAM dataset. On the HWDB1.1 dataset, we finetune Half DeepWriter from the above DeepWriter model. Table 4 compares the accuracy obtained with and without pretraining on the other language.
Dataset | Train | Accuracy |
---|---|---|
IAM | Pretrained on HWDB | 99.01% |
IAM | Trained directly on IAM | 98.80% |
HWDB1.1 | Pretrained on IAM | 93.85% |
HWDB1.1 | Trained directly on HWDB1.1 | 93.45% |
III-F Training Details
We augment the training data by resizing the shorter edge of each input image to 113 while keeping the original aspect ratio, and then randomly cropping image patches from the resized image. It is important to keep the original aspect ratio, which carries important information about handwriting habits for identifying the writer. The identification accuracy degrades seriously when the input image is distorted.
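
The random cropping step can be sketched as below. The function only draws crop coordinates on an image that has already been resized; the square patch side of 113 and the number of crops per image are assumptions of this sketch, since the text does not state them.

```python
import random


def random_crop_boxes(width, height, count, patch=113, seed=None):
    """Draw `count` random square crop boxes inside a resized image.

    Returns (left, top, right, bottom) tuples. The image is assumed to
    have been resized so that its shorter edge equals `patch`.
    """
    if width < patch or height < patch:
        raise ValueError("image is smaller than the crop size")
    rng = random.Random(seed)
    boxes = []
    for _ in range(count):
        left = rng.randint(0, width - patch)   # inclusive bounds
        top = rng.randint(0, height - patch)
        boxes.append((left, top, left + patch, top + patch))
    return boxes


# Four random crops from a resized sentence image (made-up size).
crops = random_crop_boxes(width=1695, height=113, count=4, seed=0)
```

Cropping rather than rescaling to a fixed size leaves the stroke proportions of the handwriting untouched, which is the property the paragraph above emphasizes.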
First, Half DeepWriter was trained on the HWDB1.1 dataset using mini-batch gradient descent. The batch size was set to 256 and the momentum to 0.9, and weight decay was applied. The learning rate was decreased by a factor of 10 at fixed iteration intervals, and training was stopped after 400K iterations.
Second, the DeepWriter model for the IAM dataset was finetuned from the Half DeepWriter model pretrained on the HWDB1.1 dataset. The batch size was set to 256 and the momentum to 0.9, and weight decay was applied. The base learning rate was decreased by a factor of 10 every 20K iterations, and training was stopped after 40K iterations. The learning rate of the dataset-specific softmax layer was set to ten times the base learning rate.
Finally, Half DeepWriter was finetuned from the above DeepWriter model, using the same settings as when it was trained directly.
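
The finetuning schedule for the IAM model can be written as a small step-decay function. The step size of 20K iterations, the factor of 10 and the tenfold multiplier for the softmax layer come from the text; the base learning rate `BASE_LR` is a hypothetical placeholder, because its value is not recoverable here.

```python
def step_learning_rate(base_lr, iteration, step_size=20_000, gamma=0.1):
    """Base learning rate after step decay at the given iteration."""
    return base_lr * gamma ** (iteration // step_size)


def layer_learning_rate(base_lr, iteration, is_new_softmax, multiplier=10.0):
    """Per-layer rate: the dataset-specific softmax layer learns 10x faster."""
    rate = step_learning_rate(base_lr, iteration)
    return rate * multiplier if is_new_softmax else rate


BASE_LR = 0.001  # hypothetical value, not taken from the paper
MAX_ITERATIONS = 40_000

for it in (0, 19_999, 20_000, 39_999):
    shared = layer_learning_rate(BASE_LR, it, is_new_softmax=False)
    softmax = layer_learning_rate(BASE_LR, it, is_new_softmax=True)
    print(f"iter {it}: pretrained layers {shared:.6f}, softmax {softmax:.6f}")
```

A larger rate on the freshly initialized classifier lets it adapt to the new set of writers while the pretrained feature layers change more slowly.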
III-G Testing Details
Given a scanned handwritten image, the testing procedure follows this pipeline: scan the image to generate image patches following the strategy presented above; feed each image patch pair (or image patch) into DeepWriter (or Half DeepWriter) to compute a score vector $f_i = (f_{i1}, f_{i2}, \ldots, f_{in})$ over the $n$ writers; compute the final score $f_j = \frac{1}{N}\sum_{i=1}^{N} f_{ij}$ of writer $j$, where $N$ denotes the number of image patches; return the writer with the highest score. Because the score vector output by DeepWriter can be treated as a probability distribution over all writers, we average the score vectors of all image patch pairs (or image patches) to construct the final prediction for the input image. The testing pipeline is depicted in Figure 6.
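
The score-averaging step at test time can be sketched as follows. The per-patch score vectors are made-up numbers standing in for softmax outputs of the network; the function name `identify_writer` is hypothetical.

```python
def identify_writer(score_vectors):
    """Average per-patch score vectors and return the best writer index.

    Each element of score_vectors is one softmax output, i.e. a list of
    probabilities with one entry per writer.
    """
    if not score_vectors:
        raise ValueError("at least one score vector is required")
    n_patches = len(score_vectors)
    n_writers = len(score_vectors[0])
    final = [sum(vec[j] for vec in score_vectors) / n_patches
             for j in range(n_writers)]
    best = max(range(n_writers), key=final.__getitem__)
    return best, final


# Three patches (or patch pairs) scored against three writers.
patch_scores = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]
writer, final_scores = identify_writer(patch_scores)
```

In this toy example the second patch favours another writer, but the average over all patches still selects the first one, which is how the averaging smooths out unreliable individual patches.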

IV Experiments
IV-A Data Sets
The IAM dataset (version 3.0) [15] contains unconstrained handwritten English text from 657 different writers, written with different pens. Handwritten pages in the IAM dataset were scanned at a resolution of 300 dpi and saved as PNG images with 256 gray levels. The dataset contains 1,539 pages of scanned text, comprising 5,685 isolated sentences. 301 writers contributed more than one page of scanned text. In this paper, we train, validate and test on sentence images. The sentence images contributed by each writer are divided into a training set, a validation set and a testing set in the ratio 4 : 1 : 1.
The HWDB1.1 dataset [14] contains handwritten Chinese text from 300 different writers, scanned at a resolution of 300 dpi and saved with 256 gray levels. HWDB1.1 contains 1,172,907 Chinese character images, and each writer contributes about 3,755 different Chinese characters. The Chinese character images contributed by each writer are divided into a training set, a validation set and a testing set in the ratio 4 : 1 : 1.
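
A per-writer 4 : 1 : 1 split of this kind can be sketched as below. The shuffling, the fixed seed and the rule that rounding leftovers go to the test set are assumptions of this sketch; the text only specifies the ratio and that the split is made within each writer's samples.

```python
import random


def split_per_writer(samples_by_writer, ratios=(4, 1, 1), seed=0):
    """Split every writer's samples into train/val/test by the given ratio."""
    rng = random.Random(seed)
    total = sum(ratios)
    splits = {"train": [], "val": [], "test": []}
    for writer, samples in sorted(samples_by_writer.items()):
        shuffled = list(samples)
        rng.shuffle(shuffled)
        n_train = len(shuffled) * ratios[0] // total
        n_val = len(shuffled) * ratios[1] // total
        splits["train"] += [(writer, s) for s in shuffled[:n_train]]
        splits["val"] += [(writer, s) for s in shuffled[n_train:n_train + n_val]]
        # Any remainder from integer division ends up in the test set.
        splits["test"] += [(writer, s) for s in shuffled[n_train + n_val:]]
    return splits


# Toy data: two writers with 12 and 6 sample identifiers.
toy = {
    "w1": [f"w1_{i}" for i in range(12)],
    "w2": [f"w2_{i}" for i in range(6)],
}
splits = split_per_writer(toy)
```

Splitting within each writer guarantees that every writer appears in all three sets, which an identification task needs because the classes at test time are the same writers seen in training.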
IV-B Experimental Results
We use the off-the-shelf Caffe framework [17] to train Half DeepWriter and DeepWriter. Half DeepWriter achieves an identification accuracy of 93.85% on 300 writers with merely one Chinese character as input. DeepWriter achieves an identification accuracy of 99.01% on 301 writers and 97.3% on 657 writers from the IAM dataset at the English sentence level. In addition, DeepWriter achieves an identification accuracy of 96.92% when given two adjacent English handwritten image patches, which usually cover 2 to 3 English letters. When DeepWriter takes three adjacent image patches as input, which usually cover 3 to 4 English letters, it achieves an identification accuracy of 98.01%. These results demonstrate that our models can obtain high identification accuracy with little handwritten text as input.

We summarize the experimental results of our method and several published writer identification methods in Table 5. The methods in [1, 2, 27, 26, 28, 29] follow the classic pipeline for off-line writer identification: propose and combine multiple handcrafted features; employ Euclidean distance, cosine similarity or a trained SVM (Support Vector Machine) as the similarity metric; and perform a nearest-neighbour search to determine the writer of the input handwritten image. Yang et al. [3] employ deep CNNs to address on-line writer identification, as summarized in the Related Works section. Our method outperforms previous state-of-the-art methods by a large margin, and DeepWriter achieves similar identification accuracy with much less input text. In addition, DeepWriter only needs to store the trained model for testing, without storing a large reference data set. Because DeepWriter does not need to perform a heavy search computation, the test procedure is fast.

Method | Year | Input type | Dataset | Language | Number of writers | Input text for test | Accuracy |
---|---|---|---|---|---|---|---|
DeepWriter | 2016 | off-line | IAM | English | 301 | 1 sentence | 99.01% |
DeepWriter | 2016 | off-line | IAM | English | 657 | 1 sentence | 97.3% |
DeepWriter | 2016 | off-line | IAM | English | 301 | about 3 letters | 96.92% |
DeepWriter | 2016 | off-line | IAM | English | 301 | about 4 letters | 98.01% |
Half DeepWriter | 2016 | off-line | HWDB1.1 | Chinese | 300 | 1 character | 93.85% |
Bulacu et al. [1] | 2007 | off-line | IAM | English | 650 | 1 page | 89% |
Jain et al. [2] | 2011 | off-line | IAM | English | 300 | 1 page | 93.3% |
Jain et al. [2] | 2011 | off-line | IAM | English | 650 | 1 page | 92.1% |
Brink et al. [27] | 2012 | off-line | IAM | English | 657 | 1 page | 97% |
Bertolini et al. [26] | 2013 | off-line | IAM | English | 650 | 1 page | 96.7% |
He et al. [28] | 2015 | off-line | IAM | English | 650 | 1 page | 91.1% |
Hannad et al. [29] | 2016 | off-line | IAM | English | 657 | 6 text lines at most | 89.54% |
Yang et al. [3] | 2015 | on-line | CASIA Handwriting Database | English | 134 | 1 page | 98.51% |
Yang et al. [3] | 2015 | on-line | CASIA Handwriting Database | Chinese | 187 | 1 page | 95.72% |
V Conclusion and Future Work
In this paper, we introduce a novel data-driven text-independent model for identifying the writer of off-line scanned handwritten images. We train a carefully designed deep Convolutional Neural Network to extract discriminative features from handwritten image patches. We investigate how the network structure affects identification accuracy and introduce a multi-stream structure to leverage the spatial relationship between handwritten image patches. We also investigate an appropriate method for augmenting training data for writer identification. We achieve high identification accuracy even when taking as input only one Chinese character or about 4 English letters. In the future, we will investigate the off-line text-independent writer verification task with discriminative features extracted by DeepWriter. We will also investigate multi-task learning of identification and verification.
References
- [1] Bulacu M, Schomaker L. Text-independent writer identification and verification using textural and allographic features[J]. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 2007, 29(4): 701-717.
- [2] Jain R, Doermann D. Offline writer identification using k-adjacent segments[C]. Document Analysis and Recognition (ICDAR), 2011 International Conference on. IEEE, 2011: 769-773.
- [3] Yang W, Jin L, Liu M. DeepWriterID: An End-to-end Online Text-independent Writer Identification System[J]. arXiv preprint arXiv:1508.04945, 2015.
- [4] Li B, Sun Z, Tan T. Online text-independent writer identification based on stroke’s probability distribution function[M]. Advances in Biometrics. Springer Berlin Heidelberg, 2007: 201-210.
- [5] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]. Advances in Neural Information Processing Systems. 2012: 1097-1105.
- [6] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J]. arXiv preprint arXiv:1409.1556, 2014.
- [7] He K, Zhang X, Ren S, et al. Deep Residual Learning for Image Recognition[J]. arXiv preprint arXiv:1512.03385, 2015.
- [8] Girshick R. Fast r-cnn[C]. Proceedings of the IEEE International Conference on Computer Vision. 2015: 1440-1448.
- [9] Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[C]. Advances in Neural Information Processing Systems. 2015: 91-99.
- [10] Sun Y, Wang X, Tang X. Deep learning face representation from predicting 10,000 classes[C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014: 1891-1898.
- [11] Sun Y, Liang D, Wang X, et al. Deepid3: Face recognition with very deep neural networks[J]. arXiv preprint arXiv:1502.00873, 2015.
- [12] Ciresan D, Meier U, Schmidhuber J. Multi-column deep neural networks for image classification[C]. Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012: 3642-3649.
- [13] J. Deng, A. Berg, S. Satheesh, H. Su, A. Khosla, and L. Fei-Fei. ILSVRC-2012, 2012. URL http://www.image-net.org/challenges/LSVRC/2012/.
- [14] C.-L. Liu, F. Yin, D.-H. Wang, Q.-F. Wang, CASIA online and offline Chinese handwriting databases, Proc. 11th International Conference on Document Analysis and Recognition (ICDAR), Beijing, China, 2011, pp.37-41.
- [15] U. Marti and H. Bunke. The IAM-database: An English Sentence Database for Off-line Handwriting Recognition. Int. Journal on Document Analysis and Recognition, Volume 5, pages 39 - 46, 2002.
- [16] CASIA Handwriting Database, http://biometrics.idealtest.org/dbDetailForUser.do?id=10
- [17] Jia Y, Shelhamer E, Donahue J, et al. Caffe: Convolutional architecture for fast feature embedding[C]. Proceedings of the ACM International Conference on Multimedia. ACM, 2014: 675-678.
- [18] H. Said, T. Tan, and K. Baker, “Personal Identification Based on Handwriting,” Pattern Recognition, vol. 33, no. 1, pp. 149-160, 2000.
- [19] H. Said, G. Peake, T. Tan, and K. Baker, “Writer Identification from Non-Uniformly Skewed Handwriting Images,” Proc. Ninth British Machine Vision Conf., pp. 478-487, 1998.
- [20] T. Tan, “Rotation Invariant Texture Features and Their Use in Automatic Script Identification,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 7, pp. 751-756, July 1998.
- [21] Y. Zhu, T. Tan, and Y. Wang, “Font Recognition Based on Global Texture Analysis,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, no. 10, pp. 1192-1200, Oct. 2001.
- [22] Leclerc F, Plamondon R. Automatic signature verification: The state of the art—1989–1993[J]. International Journal of Pattern Recognition and Artificial Intelligence, 1994, 8(03): 643-660.
- [23] Srihari S N, Beal M J, Bandi K, et al. A statistical model for writer verification[C]. Document Analysis and Recognition, 2005. Proceedings. Eighth International Conference on. IEEE, 2005: 1105-1109.
- [24] Hafemann L G, Sabourin R, Oliveira L S. Writer-independent Feature Learning for Offline Signature Verification using Deep Convolutional Neural Networks[J]. arXiv preprint arXiv:1604.00974, 2016.
- [25] Brink A, Bulacu M, Schomaker L. How much handwritten text is needed for text-independent writer verification and identification[C]. Pattern Recognition, 2008. ICPR 2008. 19th International Conference on. IEEE, 2008: 1-4.
- [26] Bertolini D, Oliveira L S, Justino E, et al. Texture-based descriptors for writer identification and verification[J]. Expert Systems with Applications, 2013, 40(6): 2069-2080.
- [27] Brink A A, Smit J, Bulacu M L, et al. Writer identification using directional ink-trace width measurements[J]. Pattern Recognition, 2012, 45(1): 162-171.
- [28] He S, Wiering M, Schomaker L. Junction detection in handwritten documents and its application to writer identification[J]. Pattern Recognition, 2015, 48(12): 4036-4048.
- [29] Hannad Y, Siddiqi I, El Kettani M E Y. Writer identification using texture descriptors of handwritten fragments[J]. Expert Systems with Applications, 2016, 47: 14-22.