Ancient Coin Classification Using Graph Transduction Games

10/02/2018 ∙ by Sinem Aslan, et al. ∙ Università Ca' Foscari Venezia

Recognizing the type of an ancient coin requires theoretical expertise and years of experience in the field of numismatics. Our goal in this work is to automate this time-consuming and demanding task with a visual classification framework. Specifically, we propose to model ancient coin image classification using Graph Transduction Games (GTG). GTG casts the classification problem as a non-cooperative game in which the players (the coin images) decide their strategies (class labels) according to the choices made by the others, which results in a global consensus on the final labeling. Experiments are conducted on the only publicly available dataset, which is composed of 180 images of 60 types of Roman coins. We demonstrate that our approach outperforms prior work on the same dataset, with classification accuracies of 73.6% and 87.2% when one and two images per class are used as the training set, respectively.




I Introduction

Ancient coins, which depict cultural, political and military events, natural phenomena, ideologies and portraits of gods and emperors, are an important source of information for historians and archaeologists. Recognizing the type of an ancient coin requires theoretical expertise and years of experience in the field of numismatics. A common way to determine the period of a discovered coin is to search through reference books in which ancient coins are indexed [4], which is highly time-consuming labor. Our goal in this paper is to automate the recognition of Roman coins by employing computer vision and pattern recognition techniques. Automating such a manual procedure not only speeds up processing but can also support historians and archaeologists in making more accurate decisions. A visual classification framework for ancient coin recognition can also be used at museums or by individual collectors to organize large collections of coins.

From the computer vision point of view, classification of ancient coin images is a highly challenging task. One difficulty arises from the large number of types (i.e. classes) of ancient coins (e.g. Portuguese coins from the medieval period and coins from the Roman Republic comprise over 1500 [17] and 550 [4] different classes, respectively), while most classes include few known specimens, as mentioned in [17, 23]. Moreover, intra-class variation is large due to local spatial variations arising from missing parts and degradation of the coins, and from the manual manufacturing of coins by different engravers. Another cause of large intra-class variation is that the metallic structure of these coins leads to strong reflection and shading variations, so the appearance of the same coin changes significantly under different lighting conditions. A further challenge in ancient coin classification is the typically low inter-class variation due to high global similarity between classes [22]. Images from two coin classes are presented in Fig. 1 to demonstrate the challenges of large intra-class and low inter-class variation.

Fig. 1: Example images of two classes from the Roman coin dataset [20] that is used in this work. First row: Images of class 387/1; Second row: Images of class 300/1 (listed with Crawford [4] reference number).

Ancient coin classification can be accomplished by adopting one of the following approaches for classifiers [3]: (i) learning-based classifiers, where the parameters of the classifier (e.g. Deep Neural Networks, SVM, Random Forests, etc.) are learned from data in an intensive training phase; (ii) non-parametric classifiers, where the classification decision is based directly on the data without any training phase (e.g. a Nearest Neighbor based classifier). Although the first group has proved to be superior to the second one, it requires extraction of highly discriminative features (possibly from abundant training data) for robust classification. Moreover, such a time-consuming training phase can be impractical for handling dynamic databases where new classes are included continuously.

In this paper, we adopt a non-parametric classifier for ancient coin classification, which is preferable in the presence of the aforementioned challenges, i.e. large intra-class and low inter-class variation and lack of abundant training data. We follow the same approach as [22, 20], i.e. our non-parametric classifier uses a dissimilarity measure derived from the costs of dense matching of SIFT features. Similar to [22, 20], for dense feature matching we use SIFT flow [11], a flow estimation technique developed for image alignment. SIFT flow preserves discontinuities and so allows matching of objects located at different parts of the image. This property makes SIFT flow well suited for coin images [22], i.e. it helps to deal with large intra-class variation since images from the same class have a similar spatial arrangement of features. Additionally, defining similarity between two coin images based on local matches between them helps to deal with low inter-class variation, since two classes mostly differ from each other in local regions.

Differently from [22, 20], in this work we do not use a greedy Nearest Neighbor (NN) based classifier, where a query image is labeled with the class of its nearest (most similar) image in the dataset. Instead, we use a semi-supervised learning approach, namely the Graph Transduction Game (GTG) [6], for ancient coin classification. The GTG casts the classification problem as a non-cooperative game in which the players (the coin images) decide their strategies (class labels) according to the choices made by the others, which results in a global consensus on the final labeling. Experimenting on a small-scale ancient coin dataset that exhibits the aforementioned classification challenges, we show that the notion of label consistency [8] provided by GTG brings a significant performance gain over the conventional NN-based classifier for this challenging problem.

II Previous works

One of the main problems of ancient coin image analysis addressed in the literature is coin identification, where the goal is to recognize a specific coin instance instead of a coin type [7, 9]. This type of application is used for the identification of stolen coins. Most of the other works have focused on coin type recognition (or coin classification), which has a wider range of practical uses. A number of works [10, 22, 20, 23] employed an NN-based classifier, where a query coin image is assigned the label of its most similar image in the training set. Among these, [10] defined coin similarity by the number of matched SIFT features that were detected sparsely on the images, while [22, 20] employed the dense matching costs of SIFT flow as a dissimilarity metric. In [23], the authors used densely computed illumination-invariant LIDRIC features and fused several similarity scores reflecting the matching quality into an overall similarity score. High performance is reported in these works, although the employed datasets were quite small-scale, i.e. classification accuracies of 90% [10] and 82% [20] are obtained for datasets with 390 images of 3 classes and 180 images of 60 classes, respectively.

Other works employed learning-based classifiers. Earlier attempts [1, 2] relied on a Bag of Visual Words representation of local image features, where a visual dictionary is learned from a training set and classification is achieved with an SVM in [1] and a GMM in [2]. Recently, Schlag and Arandjelovic proposed to use a deep convolutional neural network for Roman coin classification in [18]. They trained on a large set of images, i.e. around 20K images of 83 classes, and reported around 83% accuracy on 10K images.

A significant obstacle to employing learning-based classifiers for this particular research problem is the lack of publicly available datasets. A number of works employed datasets of Sassanian dynasty coins [15], some others focused on medieval coins [17], and most have worked on coins of the Roman Republic [1, 2, 18, 22, 20, 23]. However, the only publicly available ancient coin dataset is the one published by Zambanini and Kampel [20], composed of 180 images of 60 Roman coin classes, on which we experiment in this work.

III Graph Transduction Game

The Graph Transduction Game (GTG) [6] is a semi-supervised learning method which has recently attracted renewed interest and has been successfully applied in different contexts, e.g. bioinformatics [19] and the label augmentation problem [5]. The GTG casts the problem in terms of a non-cooperative multiplayer game, in which the objects (or images of a dataset) are the players while the possible strategies are the class labels. The idea is that two randomly chosen players each choose a strategy with a certain probability and receive a payoff which is proportional to the agreement of the chosen strategies (labels). Since the game is non-cooperative, it is in each player's own interest to maximize its payoff, hence to choose the labels with the higher agreement. The game is played until all the objects have chosen a strategy (label) and none of them would like to change its membership hypothesis. This particular condition is known as a Nash equilibrium [14]. Once the game reaches an equilibrium, every player plays its best strategy, which corresponds to a consistent labeling [8, 13]. A peculiarity of GTG is that the consistency is a global property, which is not related to a single player but is achieved for all of the players.

For the sake of completeness, we recap some basic game-theoretic concepts here. Given a set of players $I = \{1, \dots, n\}$ (i.e. images of our dataset) and a set of possible pure strategies $S = \{1, \dots, m\}$ (the set of labels):

  1. mixed strategy: a mixed strategy $x_i$ is a probability distribution over the possible strategies for player $i$. Then $x_i \in \Delta^m$, where $\Delta^m$ is the standard $m$-dimensional simplex and $x_{ih}$ is the probability of player $i$ to choose the pure strategy $h$.

  2. strategy space: it corresponds to the set of all mixed strategies of the players $x = \{x_1, \dots, x_n\}$, which is represented as a stochastic matrix of size $n \times m$. The starting point of the game is defined by a proper setting of $x(0)$.

  3. utility function: it is responsible for computing the gain obtained by the $i$-th player when it chooses a mixed strategy $x_i$. In particular, $u_i : \Delta^m \to \mathbb{R}_{\geq 0}$.

In this context, the players are separated into labeled ($L$) and unlabeled ($U$) sets (in terms of a standard learning algorithm, the labeled players correspond to the training set and the unlabeled ones to the test set). The strategy space is initialized in two different ways, depending on whether an object is labeled or unlabeled. A one-hot vector is assigned to each of the labeled objects, since their labels are known:

$$x_{ih}(0) = \begin{cases} 1 & \text{if } i \text{ has label } h \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$

whereas, since no prior knowledge is available for the unlabeled objects, the same probability is assigned to all of their labels:

$$x_{ih}(0) = \frac{1}{m} \qquad (2)$$
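The two initialization rules can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `init_strategy_space` and the convention of marking unlabeled players with `-1` are assumptions made here, not part of the original method description.

```python
import numpy as np

def init_strategy_space(labels, n_classes):
    """Build the initial n x m strategy matrix x(0).

    labels: one integer per player; a class index in [0, n_classes) for
            labeled players and -1 for unlabeled ones (hypothetical convention).
    """
    labels = np.asarray(labels)
    n = labels.shape[0]
    # Unlabeled players: uniform distribution over all labels.
    X = np.full((n, n_classes), 1.0 / n_classes)
    # Labeled players: one-hot vector on the known label.
    idx = np.flatnonzero(labels >= 0)
    X[idx] = 0.0
    X[idx, labels[idx]] = 1.0
    return X

# Two labeled players (classes 0 and 2) and one unlabeled player.
X0 = init_strategy_space([0, 2, -1], n_classes=3)
```

Every row of the resulting matrix sums to one, so it is a valid stochastic matrix to start the game from.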


Payoff function

The payoff function reflects the likelihood for a player (object) to choose a particular strategy (label), considering the similarities between labeled and unlabeled players. It ensures that more similar players are more likely to influence each other in choosing one of the possible strategies (labels).

Formally, the utility function is defined for a player $i$ and a strategy $h$ over the neighbors of $i$, where $\mathcal{N}_i^{L}$ and $\mathcal{N}_i^{U}$ denote the labeled and unlabeled nearest neighbors of $i$, respectively. Here, $u_i(e_h)$ and $u_i(x)$ are the payoffs received by player $i$ when it uses the pure strategy $h$ and when it plays the mixed strategy $x$, respectively. The matrix $A_{ij}$ is the partial payoff matrix between players $i$ and $j$, which is computed as $A_{ij} = \omega_{ij} I_m$ [6], where $\omega_{ij}$ is the similarity between players $i$ and $j$ and $I_m$ is the identity matrix of size $m \times m$.
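As a sketch of how these quantities fit together, the two payoffs can be written as below. This assumes the standard formulation of [6] restricted to the neighbor sets; the symbols $\mathcal{N}_i^{L}$ and $\mathcal{N}_i^{U}$ (labeled and unlabeled neighbors of $i$), $e_h$ (the one-hot vector of pure strategy $h$) and $x_j$ (the mixed strategy of player $j$) are notation chosen here, and labeled neighbors are assumed to keep their one-hot strategies.

```latex
\begin{align*}
u_i(e_h) &= \sum_{j \in \mathcal{N}_i^{U}} (A_{ij} x_j)_h
          + \sum_{j \in \mathcal{N}_i^{L}} (A_{ij} x_j)_h , \\
u_i(x)   &= \sum_{j \in \mathcal{N}_i^{U}} x_i^{\top} A_{ij} x_j
          + \sum_{j \in \mathcal{N}_i^{L}} x_i^{\top} A_{ij} x_j .
\end{align*}
% With A_{ij} = \omega_{ij} I_m the matrix products collapse:
\begin{align*}
(A_{ij} x_j)_h = \omega_{ij}\, x_{jh}
\quad\Longrightarrow\quad
u_i(e_h) = \sum_{j \in \mathcal{N}_i^{L} \cup \mathcal{N}_i^{U}} \omega_{ij}\, x_{jh},
\qquad
u_i(x) = \sum_{h=1}^{m} x_{ih}\, u_i(e_h).
\end{align*}
```

Under these assumptions the payoff of label $h$ for player $i$ is simply the similarity-weighted support that its neighbors currently give to $h$, and the expected payoff is the average of these supports under player $i$'s own mixed strategy.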


Players similarity

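Once the features are extracted for players (objects or images) $i$ and $j$, the similarity between them can be computed by Eq. 6, where $d(f_i, f_j)$ denotes the distance between features $f_i$ and $f_j$ and $\sigma_i$ is the distance between $i$ and its $K$-nearest-neighbors [24]:

$$\omega_{ij} = \exp\left(-\frac{d(f_i, f_j)^2}{\sigma_i \sigma_j}\right) \qquad (6)$$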
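A minimal NumPy sketch of this similarity, assuming the self-tuning kernel of [24], $\omega_{ij} = \exp(-d(f_i, f_j)^2 / (\sigma_i \sigma_j))$, with $\sigma_i$ taken as the distance from $i$ to its $K$-th nearest neighbor. The default `K=7` is the value suggested in [24] and not one stated in this paper; zeroing the diagonal (no self-payoff) and the function name are also assumptions made here.

```python
import numpy as np

def self_tuning_similarity(D, K=7):
    """Turn a symmetric n x n distance matrix D into a similarity matrix W."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Ignore self-distances when looking for the K-th nearest neighbor.
    off_diag = D.copy()
    np.fill_diagonal(off_diag, np.inf)
    kth = min(K, n - 1) - 1               # 0-based index of the K-th neighbor
    sigma = np.sort(off_diag, axis=1)[:, kth]
    sigma = np.maximum(sigma, 1e-12)      # guard against duplicate points
    W = np.exp(-(D ** 2) / (sigma[:, None] * sigma[None, :]))
    np.fill_diagonal(W, 0.0)              # assumption: a player gets no payoff from itself
    return W

# Three points on a line at positions 0, 1 and 2.
D = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])
W = self_tuning_similarity(D, K=1)
```

Because each $\sigma_i$ adapts to the local scale around player $i$, no global bandwidth has to be tuned by hand.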


Finding Nash Equilibria

In order to find a Nash equilibrium of the game we used a result from Evolutionary Game Theory [21], known as Replicator Dynamics (RD) [12]. The RD are dynamical systems that mimic a Darwinian selection process on the set of strategies of each player. The underlying idea is that the fittest strategies are favored and survive, while the others become extinct.

More formally, the RD are defined as follows:

$$x_{ih}(t+1) = x_{ih}(t) \frac{u_i(e_h)}{u_i(x(t))} \qquad (7)$$

where $x_{ih}(t)$ is the probability of strategy $h$ at time $t$ for player $i$ (see Eq. 4) and $u_i(x(t))$ is the expected payoff of the entire mixed strategy (see Eq. 5). Eq. 7 is iterated until convergence (convergence criteria: i) the distance between two successive steps falls below a small threshold, or ii) a maximum number of iterations is reached; typically 20 iterations are sufficient. See [16] for a detailed analysis). Once convergence is reached, we simply take the index of the maximum value in the $i$-th row of $x$ in order to label the $i$-th object.
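The iteration can be sketched as follows, assuming the discrete-time replicator dynamics $x_{ih}(t+1) = x_{ih}(t)\, u_i(e_h) / u_i(x(t))$ and partial payoff matrices of the form $A_{ij} = \omega_{ij} I_m$, so that all pure-strategy payoffs reduce to the matrix product `W @ X`. The tolerance value and the function name are hypothetical, and restricting the payoffs to nearest neighbors would amount to sparsifying `W` beforehand.

```python
import numpy as np

def replicator_dynamics(W, X0, max_iter=20, tol=1e-6):
    """Iterate the replicator dynamics from the strategy matrix X0.

    W  : n x n similarity matrix (zero diagonal).
    X0 : n x m stochastic matrix (one-hot rows for labeled players).
    """
    X = np.asarray(X0, dtype=float).copy()
    for _ in range(max_iter):
        payoff = W @ X                             # u_i(e_h) for every player i and label h
        unnorm = X * payoff
        total = unnorm.sum(axis=1, keepdims=True)  # u_i(x), the expected payoff
        safe = np.where(total > 0, total, 1.0)
        # Players with zero expected payoff keep their current strategy.
        X_new = np.where(total > 0, unnorm / safe, X)
        converged = np.linalg.norm(X_new - X) < tol
        X = X_new
        if converged:
            break
    return X

# Players 0 and 1 are labeled (classes 0 and 1); player 2 is unlabeled
# and much more similar to player 0 than to player 1.
W = np.array([[0.0, 0.2, 0.9],
              [0.2, 0.0, 0.1],
              [0.9, 0.1, 0.0]])
X0 = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [0.5, 0.5]])
X = replicator_dynamics(W, X0)
predicted = X.argmax(axis=1)
```

Labeled players stay on their one-hot strategy throughout, because the update multiplies each entry by its payoff and zero entries remain zero; only the unlabeled rows move.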

IV Ancient Coin Classification using Graph Transduction Game (GTG)

By considering the training set of coin images as the labeled players, GTG can be applied to the ancient coin classification problem to estimate the labels of the test set images, i.e. the unlabeled players. The steps that we employ to apply GTG to ancient coin classification are listed below:

Feature extraction

We compute two types of features on the images: (i) in order to analyze local similarities, we compute 128-dimensional SIFT features in the local neighborhood of every image pixel, which results in a tensor named the SIFT-image [11]; (ii) in order to analyze global similarities between images, we compute CNN features. Specifically, since our dataset is quite small, which makes training a CNN infeasible, we apply transfer learning by using a CNN architecture pre-trained on ImageNet. For each input image we take as its feature the output of the last fully-connected layer of the CNN.

Initialization of the strategy space

Since no other knowledge on the problem exists, but only the distinction between labeled and unlabeled sets, the strategy space is initialized using Eq.1 and Eq. 2.

Computation of similarity between objects

A correct choice of similarity computation between images is important to avoid failures in label estimation. We employ different schemes of similarity computation depending on the extracted feature type:

i. Similarity between local features of images: It is demonstrated in [22] that the matching scores of the SIFT flow technique are a powerful dissimilarity metric for ancient coin classification. In SIFT flow, SIFT-images are matched along the flow vectors and optimal correspondences are found by minimizing an energy function (defined in [11]) using dual-layer belief propagation [11]. Since the runtime of such an optimization scales up with the image size, the authors of [11] proposed to employ a coarse-to-fine search, which results in faster computation and better matching performance. Similar to [20], in this work we used the minimum energy value, say $E_{ij}$ (to which the SIFT flow algorithm converges at the finest level of the coarse-to-fine search), as the dissimilarity metric between images $i$ and $j$, i.e. we used $d(f_i, f_j) = E_{ij}$ in Eq. 6.

ii. Similarity between global descriptions of images: Following the general trend [5, 6], we used the Euclidean distance, i.e. $d(f_i, f_j) = \lVert f_i - f_j \rVert_2$ in Eq. 6, to compute similarity between the CNN features.

Execution of transduction game

Given the similarities, the GTG plays the game between the players, i.e. the images, until convergence. At the output we get the final probabilities of the strategies, i.e. labels, for the unlabeled objects, and we assign each object the strategy with the highest probability.

V Experiments


We experimented on the only published ancient coin dataset [20], which was acquired at the Coin Cabinet of the Museum of Fine Arts in Vienna, Austria. The dataset is composed of 180 images (reverse sides of the coins, which include motifs and legends) of 60 classes with 3 images in each class. Images are resized as in [20].

Experimental setup

Since we experiment on the same dataset, we follow the same experimental setting as [20] to make a fair comparison of techniques. In [20], one of the coins is taken as the query image (or test image) and the remaining one or two images per class are used to create the training set. At each classification run, the nearest neighbor of the query image is searched for in the training set. This procedure leads to 180 and 360 classification runs when two and one training images per class are used, respectively (each of the 180 images serves as the query, and with a single training image per class there are two choices of training image for the query's own class). When the training set is created from two images per class, the nearest neighbor search is performed over the dissimilarity values of the training set images accumulated per class.
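The difference between the plain and the class-wise accumulated nearest neighbor decision can be illustrated with a small sketch. It assumes that "accumulated" means summing the dissimilarities of all training images of a class, which is one reading of the procedure in [20]; the function name `nn_classify` is hypothetical.

```python
import numpy as np

def nn_classify(d_query, train_labels, accumulate=True):
    """Label a query from its dissimilarities to the training images.

    d_query      : dissimilarity of the query to each training image.
    train_labels : class index of each training image.
    accumulate   : if True, sum dissimilarities per class and pick the class
                   with the smallest total; otherwise use the single nearest image.
    """
    d_query = np.asarray(d_query, dtype=float)
    train_labels = np.asarray(train_labels)
    if not accumulate:
        return int(train_labels[np.argmin(d_query)])
    classes = np.unique(train_labels)
    totals = np.array([d_query[train_labels == c].sum() for c in classes])
    return int(classes[np.argmin(totals)])

# Two training images per class; the single closest image belongs to class 0,
# but class 1 is closer once both of its images are taken into account.
d = [1.0, 9.0, 4.0, 4.5]
y = [0, 0, 1, 1]
plain = nn_classify(d, y, accumulate=False)
accumulated = nn_classify(d, y, accumulate=True)
```

In this toy case the two rules disagree, which shows why the accumulated variant can behave differently from the conventional NN classifier.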

Adopting the same experimental setting in our approach, we create a dissimilarity matrix with the entries computed as in [20], i.e. as described in Section IV. We then symmetrize it (by taking the maximum of each pair of entries mirrored across the diagonal) before giving it as input to the GTG algorithm. Additionally, at each run we treat the test image as the unlabeled object and the training images as the labeled objects in GTG, and we obtain the class label of the unlabeled object at the output. In all experiments, the size parameter of the neighboring set in Eq. 3 is set to 2.
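The symmetrization and the labeled/unlabeled split of a single run can be sketched as below. The helper names and the use of `-1` to mark the unlabeled query are assumptions for illustration only.

```python
import numpy as np

def symmetrize_max(D):
    """Make a dissimilarity matrix symmetric by keeping the larger of D[i, j] and D[j, i]."""
    D = np.asarray(D, dtype=float)
    return np.maximum(D, D.T)

def leave_one_out_labels(labels, query):
    """Copy of `labels` in which the query image is marked as unlabeled (-1)."""
    out = np.asarray(labels).copy()
    out[query] = -1
    return out

# Matching costs are generally not symmetric: D[i, j] != D[j, i].
D = np.array([[0.0, 1.0, 5.0],
              [3.0, 0.0, 2.0],
              [4.0, 2.5, 0.0]])
D_sym = symmetrize_max(D)
run_labels = leave_one_out_labels([0, 0, 1], query=1)
```

Taking the maximum is a conservative choice: two images count as similar only if the matching cost is low in both directions.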

Performance evaluation

We ran GTG with two feature types and their corresponding dissimilarity metrics, as explained in Section IV. In the first experiment, we compute off-the-shelf CNN features with DenseNet-201, one of the state-of-the-art CNN architectures, and use the Euclidean distance to measure the dissimilarity between the features. In the second experiment, we employ densely computed local SIFT features and use the matching costs of SIFT flow as the dissimilarity measure. The results of these experiments and a comparison with the state-of-the-art work on the same dataset [20] are given in Table I.

It can be seen in Table I that the lowest performance for both training set sizes is obtained when we use the CNN features. This is an expected outcome, because CNN features provide a global description of images and a high global similarity exists between different classes in this coin dataset. Using the GTG for ancient coin classification, we outperform [20], which employs an NN-based classifier, achieving 73.6% and 87.2% classification accuracy when the training set is constructed from one and two images per class, respectively. We additionally checked the performance of a conventional NN-based classifier that does not adopt the accumulation of class-wise dissimilarities (adopted in [20]) when there are two images per class in the training set. In that case, we obtained 81.67% accuracy, which is slightly lower than the performance (83.3%) reported in [20].

| Technique | 1 training image per class: correct classifications | 1 training image per class: classification accuracy | 2 training images per class: correct classifications | 2 training images per class: classification accuracy |
|---|---|---|---|---|
| CNN features + Euclidean distance + GTG | 188 / 360 | 52.2% | 113 / 180 | 62.8% |
| Dense SIFT + Matching cost + NN [20] | 257 / 360 | 71.4% | 150 / 180 | 83.3% |
| Dense SIFT + Matching cost + GTG | 265 / 360 | 73.6% | 157 / 180 | 87.2% |

TABLE I: Classification results
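The accuracies in Table I follow directly from the counts of correct classifications. A short check, using only the counts listed in the table (the dictionary layout is just an illustrative choice):

```python
# (correct, total) for the 1-image and 2-image training settings, from Table I.
results = {
    "CNN features + Euclidean distance + GTG": [(188, 360), (113, 180)],
    "Dense SIFT + Matching cost + NN [20]": [(257, 360), (150, 180)],
    "Dense SIFT + Matching cost + GTG": [(265, 360), (157, 180)],
}

# Accuracy in percent, rounded to one decimal place as in the table.
accuracy = {
    name: [round(100.0 * correct / total, 1) for correct, total in runs]
    for name, runs in results.items()
}

for name, (acc_one, acc_two) in accuracy.items():
    print(f"{name}: {acc_one}% / {acc_two}%")
```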

In Fig. 2, we present two misclassifications of the proposed approach. It can be seen that the misclassifications are mostly due to low variability between different classes.

Fig. 2: Two selected misclassifications of the proposed approach based on GTG. First column: test image; Second column: another image from the same class; Third column: image of selected class by the proposed scheme.

VI Conclusion

In this paper, we studied the ancient coin classification problem using Graph Transduction Games (GTG), which follow the non-parametric classifier approach. The GTG is a game-theoretic semi-supervised learning algorithm, grounded on the notion of label consistency, in which the final labeling of the objects is achieved by reaching an equilibrium condition between all labeling hypotheses. Our experimental results show that, for ancient coin classification, which is a highly complex problem due to large intra-class and low inter-class variation, GTG works better than conventional nearest neighbor based non-parametric classifiers, which do not consider global agreement among the labeling choices of all dataset images.


Acknowledgment

The authors would like to thank Sebastian Zambanini, Ismail Elezi, Leulseged Tesfaye Alemu and Alessandro Torcinovich for their invaluable advice and their help with various technical issues.


  • [1] H. Anwar, S. Zambanini, and M. Kampel, “Coarse-grained ancient coin classification using image-based reverse side motif recognition,” Machine Vision and Applications, vol. 26, no. 2-3, pp. 295–304, 2015.
  • [2] O. Arandjelovic, “Automatic attribution of ancient roman imperial coins,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, pp. 1728–1734.
  • [3] O. Boiman, E. Shechtman, and M. Irani, “In defense of nearest-neighbor based image classification,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008, pp. 1–8.
  • [4] M. H. Crawford, Roman republican coinage.   Cambridge University Press, 1974, vol. 2.
  • [5] I. Elezi, A. Torcinovich, S. Vascon, and M. Pelillo, “Transductive label augmentation for improved deep network learning,” arXiv preprint arXiv:1805.10546, 2018.
  • [6] A. Erdem and M. Pelillo, “Graph transduction as a noncooperative game,” Neural Computation, vol. 24, no. 3, pp. 700–723, 2012.
  • [7] R. Huber-Mörk, S. Zambanini, M. Zaharieva, and M. Kampel, “Identification of ancient coins based on fusion of shape and local features,” Machine vision and applications, vol. 22, no. 6, pp. 983–994, 2011.
  • [8] R. A. Hummel and S. W. Zucker, “On the foundations of relaxation labeling processes,” IEEE Transactions on Pattern Analysis and Machine Intelligence, no. 3, pp. 267–287, 1983.
  • [9] M. Kampel, R. Huber-Mörk, and M. Zaharieva, “Image-based retrieval and identification of ancient coins,” IEEE Intelligent Systems, vol. 24, no. 2, pp. 26–34, 2009.
  • [10] M. Kampel and M. Zaharieva, “Recognizing ancient coins based on local features,” in International Symposium on Visual Computing, 2008, pp. 11–22.
  • [11] C. Liu, J. Yuen, and A. Torralba, “Sift flow: Dense correspondence across scenes and its applications,” IEEE transactions on pattern analysis and machine intelligence, vol. 33, no. 5, pp. 978–994, 2011.
  • [12] J. Maynard Smith, Evolution and the Theory of Games.   Cambridge University Press, 1982.
  • [13] D. A. Miller and S. W. Zucker, “Copositive-plus Lemke algorithm solves polymatrix games,” Operations Research Letters, vol. 10, no. 5, pp. 285–290, 1991.
  • [14] J. Nash, “Non-cooperative games,” Annals of Mathematics, pp. 286–295, 1951.
  • [15] S.-S. Parsa, M. Sourizaei, M. M. Dehshibi, R. E. Shateri, and M. R. Parsaei, “Coarse-grained correspondence-based ancient sasanian coin classification by fusion of local features and sparse representation-based classifier,” Multimedia Tools and Applications, vol. 76, no. 14, pp. 15 535–15 560, 2017.
  • [16] M. Pelillo, “The dynamics of nonlinear relaxation labeling processes,” Journal of Mathematical Imaging and Vision, vol. 7, no. 4, pp. 309–323, 1997.
  • [17] L. Salgado, “Medieval coin automatic recognition by computer vision,” Ph.D. dissertation, 2016.
  • [18] I. Schlag and O. Arandjelovic, “Ancient roman coin recognition in the wild using deep learning based recognition of artistically depicted face profiles,” in IEEE International Conference on Computer Vision Workshop (ICCVW), 2017, pp. 2898–2906.
  • [19] S. Vascon, M. Frasca, R. Tripodi, G. Valentini, and M. Pelillo, “Protein function prediction as a graph-transduction game,” Pattern Recognition Letters, 2018 (in press).
  • [20] S. Zambanini and M. Kampel, “Coarse-to-fine correspondence search for classifying ancient coins,” in Asian Conference on Computer Vision, 2012, pp. 25–36.
  • [21] J. Weibull, Evolutionary Game Theory.   MIT Press, 1997.
  • [22] S. Zambanini and M. Kampel, “Automatic coin classification by image matching,” in 12th International conference on Virtual Reality, Archaeology and Cultural Heritage, 2011, pp. 65–72.
  • [23] S. Zambanini, A. Kavelar, and M. Kampel, “Classifying ancient coins by local feature matching and pairwise geometric consistency evaluation,” in 22nd International IEEE Conference on Pattern Recognition (ICPR), 2014, pp. 3032–3037.
  • [24] L. Zelnik-Manor and P. Perona, “Self-tuning spectral clustering,” in Advances in Neural Information Processing Systems (NIPS), 2005, pp. 1601–1608.