The remarkable expansion of digital data in recent years has given the general public much easier access to works of art. Considerable effort has recently gone into automatic image processing solutions that facilitate a better understanding of art [Cornelis et al.(2011)Cornelis, Dooms, Cornelis, Leen, and Schelkens]. These solutions may aim at obtaining high-quality and high-fidelity digital versions of paintings [Martinez et al.(2002)Martinez, Cupitt, Saunders, and Pillay] or may address various aspects such as image diagnostics, virtual restoration and color rejuvenation, as discussed in the review by Stork [Stork(2009)]. Another direction, closer to the ultimate goal of computers, is context recognition. One of the broadest possible implementations of context recognition is automatic art movement identification.
According to Artyfactory [Art(Retrieved February 2016)], art movements are “collective titles that are given to artworks which share the same artistic ideals, style, technical approach or timeframe”. While some works clearly belong to a single art movement, others fall in a transition period, as painters loved to experiment with new ideas, which led to the creation of new movements. Moreover, even when the actual characteristics place a work in a given art movement, its author may have refused, for personal reasons, to be categorized in that way, giving rise to disputes.
In this paper, we look into the problem of computational categorization of digitized paintings into artistic genres (or art movements). In contrast to other directions of image classification, such as scene or object recognition, where large databases and evaluation protocols exist, this aspect has received less emphasis for digitized paintings. Typically, a new method is evaluated on a small database with few paintings belonging to few genres. Given the latest advances in machine learning, two aspects should be noted: (1) deep networks, with their many parameters, easily overfit on small databases and (2) to make progress, we need larger databases.
In this paper we start by reviewing the painting collections introduced in prior work and then describe the proposed database. Next, to form a baseline, we report the performance of various popular image descriptors and machine learning systems on the introduced database. The paper ends with discussion and conclusions.
2 Related work
In recent years, multiple solutions have addressed the automatic analysis of visual art, and especially of paintings, using computer vision techniques. However, most of the research is based on small-to-medium databases. A summary of such methods is presented in table 1. One may easily note that the size of the databases (and implicitly the number of art movements investigated) increased with time, while the reported performance decreased until it stabilized in the range of 50-70% for correct art movement recognition. Some of the most representative databases used for art movement identification are:
Artistic genre dataset [Gunsel et al.(2005)Gunsel, Sariel, and Icoglu]. Images, gathered from Web Museum-Paris, were set in the following art movements: Classicism, Cubism, Impressionism, Surrealism, Expressionism.
Artistic genre dataset [Zujovic et al.(2009)Zujovic, Gandy, Friedman, Pardo, and Pappas]. Images from various Internet sources were categorized into 5 genres: Abstract, Impressionism, Cubism, Pop Art and Realism.
Painting genre dataset [Siddiquie et al.(2009)Siddiquie, Vitaladevuni, and Davis]: Images collected from the Internet were grouped into: Abstract expressionist, Baroque, Cubist, Graffiti, Impressionist and Renaissance.
Artistic style dataset [Shamir et al.(2010)Shamir, Macura, Orlov, Eckley, and Goldberg]: Paintings from 9 painters were grouped into three art movements: Impressionism, Abstract expressionism and Surrealism.
Artistic genre dataset [Arora et al.(2012)Arora, , and Elgammal] with images collected from the Artchive fine-art dataset and grouped into the fine-art genres: Renaissance, Baroque, Impressionism, Cubism, Abstract, Expressionism and Pop art.
Paintings-91 dataset [Khan et al.(2014)Khan, Beigpour, van de Weijer, and Felsberg] with images collected from the Internet. While the database is larger than the previous ones, only the paintings of painters who have the majority of their works in one art movement received a genre label. This resulted in a smaller database illustrating Abstract expressionism, Baroque, Constructivism, Cubism, Impressionism, Neo-classical, Pop art, Post-impressionism, Realism, Renaissance, Romanticism, Surrealism and Symbolism. This is probably the most structured database previously proposed.
Artistic genre dataset [Condorovici et al.(2015)Condorovici, Florea, and Vertan] is the basis of the proposed database. We extended that dataset by adding more images to illustrate the existing art movements and by adding 4 new movements.
Artistic genre dataset [Agarwal et al.(2015)Agarwal, Karnick, Pant, and Patel] contains images collected from WikiArt and grouped into: Abstract-expressionism, Baroque, Cubism, Impressionism, Expressionism, Pop Art, Rococo, Realism, Renaissance and Surrealism.
In conclusion, many of the previously used databases are small and rely on non-standard evaluation protocols, which allows overfitting. Thus, a larger-scale database with a fixed evaluation protocol should be beneficial for further development on the topic.
|Method||No. of movements||Database size||Recognition rate||Test ratio|
|Gunsel et al. [Gunsel et al.(2005)Gunsel, Sariel, and Icoglu]||3||107||91.66%||53.5%|
|Zujovic et al. [Zujovic et al.(2009)Zujovic, Gandy, Friedman, Pardo, and Pappas]||5||353||68.3%||10% CV|
|Siddiquie et al. [Siddiquie et al.(2009)Siddiquie, Vitaladevuni, and Davis]||6||498||82.4%||20% CV|
|Shamir et al. [Shamir et al.(2010)Shamir, Macura, Orlov, Eckley, and Goldberg]||3||517||91%||29.8%|
|Arora & Elgammal [Arora et al.(2012)Arora, , and Elgammal]||7||490||65.4%||20% CV|
|Khan et al. [Khan et al.(2014)Khan, Beigpour, van de Weijer, and Felsberg]||13||2338||62.2%||46.53% CV|
|Condorovici et al. [Condorovici et al.(2015)Condorovici, Florea, and Vertan]||8||4119||72.24%||10% CV|
|Agarwal et al. [Agarwal et al.(2015)Agarwal, Karnick, Pant, and Patel]||10||3000||62.37%||10% CV|
3 Pandora database
Our main contribution is the creation of a new and extensive dataset of art images (the up-to-date database, together with the pre-computed features data reported here, is available at http://imag.pub.ro/pandora/pandora_download.html). While we follow the Paintings-91 database [Khan et al.(2014)Khan, Beigpour, van de Weijer, and Felsberg], our dataset is significantly larger, it was built around art movements rather than painters, and we tried to span a wider time period, from antiquity to the present. The latter property should help the automatic study of style evolution, of thematic evolution and of cross-time relationships.
The Pandora (Paintings Dataset for Recognizing the Art movement) dataset consists of 7724 images from 12 movements: old Greek pottery, iconoclasm, high renaissance, baroque, rococo, romanticism, impressionism, realism, cubism, fauvism, abstract-expressionism and surrealism. The precise database structure is shown in table 2 and some representative examples for the art movements are given in figure 1. We kindly ask the reader to note some of the difficulties in distinguishing between genres: the main difference between Abstract and Fauvism is the less natural order in the structure of the Abstract works, while Fauvism tends “to use color to express joy”. Baroque has a darker tone than Romanticism, while the latter depicts "exotism or extraordinary things". The difference between Realism and Surrealism is that the latter illustrates “irrational juxtaposition of images” [Art(Retrieved February 2016)] (e.g. wings attached to the girl). Yet, thinking in computer terms, detecting the irrational or joy in an image is extremely hard. Thus we consider that, to achieve such goals, one first needs an appropriate database of considerable size and variability.
|Art movement||No. of paintings||Historical period|
|Old Greek pottery||350||Antiquity|
|High renaissance||812||1490 - 1527|
|Baroque||960||1590 - 1725|
|Rococo||844||1650 - 1850|
|Romanticism||874||1770 - 1850|
|Impressionism||984||1860 - 1925|
|Realism||307||1848 - present|
|Cubism||920||1900 - present|
|Abstract-expressionism||340||1920 - present|
|Fauvism||426||1900 - 1950|
|Surrealism||242||1900 - present|
4 Art movement recognition performance
4.1 Training and testing
To separate the database into training and testing parts, a 4-fold cross validation scheme was implemented. The division into 4 folds is made at the level of each art movement, so that each image is uniquely allocated to one fold. The same division was used for all further tests and it is part of the database.
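A per-movement division into folds of this kind can be sketched as follows. This is only an illustration under our own assumptions: the function names `assign_folds` and `train_test_split`, the random seed and the toy labels are hypothetical, and the official fold assignment is the one distributed with the database, not the output of this code.

```python
import random
from collections import defaultdict

def assign_folds(labels, n_folds=4, seed=0):
    """Return one fold index per image, spreading every art movement
    as evenly as possible over the folds (hypothetical helper)."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    folds = [None] * len(labels)
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Deal the shuffled images of this movement round-robin into folds.
        for position, idx in enumerate(indices):
            folds[idx] = position % n_folds
    return folds

def train_test_split(folds, test_fold):
    """Indices of the training and testing images when one fold is held out."""
    train = [i for i, f in enumerate(folds) if f != test_fold]
    test = [i for i, f in enumerate(folds) if f == test_fold]
    return train, test

# Toy example with made-up labels (not the real database content).
labels = ["baroque"] * 8 + ["cubism"] * 6 + ["rococo"] * 5
folds = assign_folds(labels)
train, test = train_test_split(folds, test_fold=0)
```

Each of the 4 folds serves once as the test set, and the reported figure is then the average over the 4 runs.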
4.2 Features and classifiers
As “there is no fixed rule that determines what constitutes an art movement” and “the artists associated with one movement may adhere to strict guiding principles, whereas those who belong to another may have little in common” [Art(Retrieved February 2016)], there cannot be a single set of descriptors that is able to separate any two art movements.
Following the observations from prior works [Arora et al.(2012)Arora, , and Elgammal], [Khan et al.(2014)Khan, Beigpour, van de Weijer, and Felsberg], multiple categories of feature descriptors should be used. For instance, one of the main differences between impressionism and previous styles is the brush stroke, and thus the texture. Old Pottery and Orthodox Iconoclasm are older and use a limited color palette. Also, one needs to understand the content of the painting to distinguish, for instance, between realism and surrealism; thus, a global composition descriptor should be used.
To provide a baseline for further evaluation, we have tested various combinations of popular feature extractors and classification algorithms.
The texture feature extractors used are:
Histogram of oriented gradients (HOG) [Dalal and Triggs(2005)] which computes the oriented gradient in each pixel and accumulates the weight of each orientation into a histogram. It has been previously used in painting analysis [Khan et al.(2014)Khan, Beigpour, van de Weijer, and Felsberg], [Agarwal et al.(2015)Agarwal, Karnick, Pant, and Patel].
Pyramidal HOG (pHOG) - the above-mentioned HOG computed on 4 levels of a Gaussian pyramid.
Color HOG - the above-mentioned HOG descriptor applied on each color plane of the RGB color space.
Local Binary Pattern (LBP) [Ojala et al.(2002)Ojala, Pietikäinen, and Mäenpää] is a histogram of quantized binary patterns pooled in a local image neighborhood and restrained to a total of 58 quantized non-uniform patterns. The LBP was used in painting description [Khan et al.(2014)Khan, Beigpour, van de Weijer, and Felsberg], [Agarwal et al.(2015)Agarwal, Karnick, Pant, and Patel].
Pyramidal LBP (pLBP) - the above-mentioned LBP descriptor computed over 4 levels of a Gaussian pyramid.
Local Intensity Order Pattern (LIOP) [Wang et al.(2011)Wang, Fan, and Wu] - encodes the order obtained after sorting the local samples by increasing intensity.
For HOG, LBP and LIOP we have relied on the implementation from the VLFeat library [Vedaldi and Fulkerson(2010)].
Edge Histogram Descriptor (EHD) is part of the MPEG-7 standard. It accounts for the distribution of four basic gradient orientations within regular image parts. The implementation is based on the BilVideo-7 library [Baştan et al.(2009)Baştan, Çam, Güdükbay, and Özgür Ulusoy].
The spatial envelope, GIST [Oliva and Torralba(2001)] describes the spatial character or shape of the painting and was previously used for painting categorization [Agarwal et al.(2015)Agarwal, Karnick, Pant, and Patel].
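To make the pyramidal descriptors above more concrete, the following is a minimal sketch of a pyramidal LBP histogram. It is not the VLFeat implementation used in our experiments: it assumes a plain 8-neighbor LBP with 256 unquantized codes (instead of the 58 quantized patterns), a 5-tap binomial filter as an approximation of the Gaussian smoothing, and a single histogram per pyramid level. All function names are hypothetical.

```python
import numpy as np

# 5-tap binomial kernel, a common approximation of a Gaussian filter.
KERNEL = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0

def blur(img):
    """Separable smoothing with edge replication; keeps the image size."""
    padded = np.pad(img, 2, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, KERNEL, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, KERNEL, mode="valid"), 0, rows)

def lbp_histogram(img):
    """Normalized histogram of the 256 basic 8-neighbor LBP codes."""
    h, w = img.shape
    center = img[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.int64)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set the bit when the neighbor is at least as bright as the center.
        codes |= (neighbour >= center).astype(np.int64) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

def pyramidal_lbp(img, levels=4):
    """Concatenate LBP histograms computed on successive pyramid levels."""
    current = np.asarray(img, dtype=float)
    feats = []
    for _ in range(levels):
        feats.append(lbp_histogram(current))
        current = blur(current)[::2, ::2]  # smooth, then halve the resolution
    return np.concatenate(feats)

# Toy grayscale image with random content (stand-in for a painting).
image = np.random.default_rng(0).random((64, 64))
feature = pyramidal_lbp(image)
```

With 4 levels and 256 codes per level, the toy feature vector has 1024 entries; the quantized variant would give a shorter vector per level.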
The color descriptors tested are:
Discriminative Color Names (DCN) [Khan et al.(2013)Khan, van de Weijer, Shahbaz Khan, Muselet, Ducottet, and Barat] - represents the dominant color retrieved through an information-oriented approach. Here, we have used the code provided by the authors. The baseline form (Color Name) was successfully used to determine the style and the painter [Khan et al.(2014)Khan, Beigpour, van de Weijer, and Felsberg].
Color Structure Descriptor (CSD) [Manjunath et al.(2001)Manjunath, Ohm, Vasudevan, and Yamada], which is based on the color structure histogram, a generalization of the color histogram. The CSD accounts for some spatial coherence in the gross distribution of quantized colors within the image and it has been shown to be able to differentiate between various art movements [Huang and Wang(2014)]. We computed a CSD vector of length 64 using the BilVideo-7 library [Baştan et al.(2009)Baştan, Çam, Güdükbay, and Özgür Ulusoy].
Machine learning classification systems tested are:
Random Forest (RF) [Breiman(2001)] and Support Vector Machine (SVM) [Chang and Lin(2011)]. Let us note that before the development of deep networks, random forests and support vector machines were found to be the most robust families of classifiers [Fernández-Delgado et al.(2014)Fernández-Delgado, Cernadas, Barro, and Amorim]. Also, for small and diverse databases, SVM and RF outperform deep networks.
k-Nearest Neighbor (kNN). We have implemented 1-NN, 3-NN and 7-NN based on Euclidean distance. While we report the results in terms of correct recognition rate, the nearest neighbor results also give an indication of the retrieval performance, as they may be translated in terms of precision-recall.
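The nearest neighbor rule with Euclidean distance can be sketched as below. The sketch is illustrative only: the names `knn_predict` and `recognition_rate` and the toy 2-D features are our assumptions, and ties between classes are broken here in favor of the smallest label, a choice the text above does not specify.

```python
import numpy as np

def knn_predict(train_x, train_y, test_x, k=3):
    """Label each test vector by majority vote among its k nearest
    training vectors (Euclidean distance)."""
    train_x = np.asarray(train_x, dtype=float)
    test_x = np.asarray(test_x, dtype=float)
    train_y = np.asarray(train_y)
    diff = test_x[:, None, :] - train_x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    nearest = np.argsort(dist, axis=1, kind="stable")[:, :k]
    predictions = []
    for row in nearest:
        values, counts = np.unique(train_y[row], return_counts=True)
        # argmax returns the first maximum, i.e. the smallest label on ties.
        predictions.append(values[np.argmax(counts)])
    return np.array(predictions)

def recognition_rate(predicted, truth):
    """Fraction of correctly labeled samples."""
    return float(np.mean(np.asarray(predicted) == np.asarray(truth)))

# Toy 2-D features for two made-up classes.
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = [0, 0, 0, 1, 1, 1]
test_x = [[0.2, 0.1], [5.5, 5.2]]
test_y = [0, 1]
pred_1 = knn_predict(train_x, train_y, test_x, k=1)
pred_3 = knn_predict(train_x, train_y, test_x, k=3)
rate_3 = recognition_rate(pred_3, test_y)
```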
Furthermore, we have tested several systems that were previously used for art movement recognition. Inspired by previous work [Arora et al.(2012)Arora, , and Elgammal], we have run the Bag of Words (BoW) over SIFT keypoint descriptors with a vocabulary of 500 words. We have also tested a combination of color description, texture analysis based on Gabor filters and scene composition based on Gestalt frameworks [Condorovici et al.(2015)Condorovici, Florea, and Vertan].
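The Bag of Words representation mentioned above can be illustrated with the sketch below. It is written under several assumptions of ours: random 128-dimensional vectors stand in for SIFT descriptors, the vocabulary is learned with a few plain k-means (Lloyd) iterations, and a vocabulary of 20 words is used instead of 500 to keep the toy example small. The function names are hypothetical and do not reproduce the tested implementation.

```python
import numpy as np

def nearest_word(descriptors, vocabulary):
    """Index of the closest visual word for every descriptor."""
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def build_vocabulary(descriptors, n_words, n_iter=10, seed=0):
    """Learn visual words with a few k-means (Lloyd) iterations."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(descriptors), size=n_words, replace=False)
    vocabulary = descriptors[start].copy()
    for _ in range(n_iter):
        assignment = nearest_word(descriptors, vocabulary)
        for w in range(n_words):
            members = descriptors[assignment == w]
            if len(members) > 0:  # keep the old word if its cluster is empty
                vocabulary[w] = members.mean(axis=0)
    return vocabulary

def bow_histogram(descriptors, vocabulary):
    """Normalized histogram of visual word occurrences for one image."""
    words = nearest_word(descriptors, vocabulary)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()

# Random stand-ins for local descriptors (real ones would come from SIFT).
rng = np.random.default_rng(1)
all_descriptors = rng.random((400, 128))   # pooled from the training images
vocabulary = build_vocabulary(all_descriptors, n_words=20)
image_descriptors = rng.random((50, 128))  # descriptors of one painting
hist = bow_histogram(image_descriptors, vocabulary)
```

The resulting histogram, one per painting, is the vector passed to the classifier.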
Additionally, although the database is small for such a purpose and thus not really suited to deep learning, we have trained and evaluated a version of a Deep Convolutional Neural Network (CNN) to obtain an indication of baseline performance. Our implementation is based on the MatConvNet [Vedaldi and Lenc(2015)] library and the LeNet architecture [LeCun et al.(1998)LeCun, Bottou, Bengio, and Haffner].
We first report the results achieved when various combinations of features and classifiers are used (see table 3). We also report, in tables 4 and 5, the confusion matrices for the best combination in each category: pLBP+SVM, GIST+RF, CSD+SVM and pLBP+CSD+SVM, respectively.
|Feat. / Class.||Random Forest||SVM||1-NN||3-NN||7-NN|
|pLBP + DCN||0.488||0.521||0.278||0.282||0.297|
|pLBP + CSD||0.540||0.547||0.377||0.282||0.297|
Tables 4, 5: confusion matrices for GIST + RF, CSD + SVM, pLBP + SVM and pLBP + CSD + SVM.
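For reference, a confusion matrix and the correct recognition rate derived from it can be computed as in the sketch below. The integer class labels and the toy predictions are made up for illustration, and the function names are hypothetical; the sketch only shows how the reported quantities relate to each other.

```python
import numpy as np

def confusion_matrix(truth, predicted, n_classes):
    """Rows index the true art movement, columns the predicted one."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(truth, predicted):
        matrix[t, p] += 1
    return matrix

def recognition_rate(matrix):
    """Correct recognition rate: diagonal mass over the total count."""
    return np.trace(matrix) / matrix.sum()

# Made-up labels for 3 classes, only to show the computation.
truth = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 2, 0]
matrix = confusion_matrix(truth, predicted, n_classes=3)
rate = recognition_rate(matrix)
```

Off-diagonal entries show which pairs of movements are confused, which is the information used in the discussion of the individual descriptors.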
Secondly, we compare the best performance of the aggregated systems in table 6. We note that, for this particular database, the best performance is achieved by a standard combination of features (pyramidal LBP + Color Structure Descriptor) with a Support Vector Machine.
While one may find the performance of various established systems disappointing, it is perfectly explainable. For the Bag of Words, there is too much variability between keypoints to find common ground; instead of the baseline version tested here, one should opt for much larger vocabularies with accurate compression to keep memory requirements low. Regarding the performance of the Deep CNN, the reported value should be perceived as a lower bound: the database is too small for directly training nets with tens of thousands of variables, no data augmentation was implemented, and the resized images lost some of their defining characteristics.
|System||Recognition rate|
|pLBP + CSD + SVM||0.547|
|Condorovici et al. [Condorovici et al.(2015)Condorovici, Florea, and Vertan]||0.379|
5 Discussion and conclusions
The best performance was achieved by a combination of pyramidal LBP and Color Structure Descriptor. One may expect the addition of GIST to further increase the performance, but this does not happen, probably due to the curse of dimensionality (the feature dimension reaching 800); in such a case a feature selection method should be used, but we consider it outside the scope of the current paper.
The next important observation is that different descriptors do a good job of separating some movements and a poorer one of identifying others. For instance, the CSD excellently separates the Orthodox Iconoclasm, which has a unique color palette (due to degradation over time and the reduced set of colors available at creation), but it is not able to separate Fauvism from Impressionism, as both use the same colors but distribute them differently. Surrealism is hard to separate with any feature except GIST, as it is the only tested feature able to describe the scene composition. Yet GIST is not able to distinguish Fauvism from Impressionism, as it is the local texture that makes the difference. In contrast, the pLBP confusion between Fauvism and Impressionism is much reduced.
Overall, the confusion between Abstract and Cubism is large. As Cubism is defined by the striking presence of straight lines, one should try, in order to address this confusion, to introduce features appropriate for describing rectilinear objects.
In conclusion, we have proposed a new painting database annotated with art movement labels and divided into 4 folds to prepare it for rigorous evaluation. The database is significantly larger than the ones previously used. We have tested a multitude of popular features and classifiers and we have identified the weak and strong points of each of them. We have also suggested some directions for future research that we anticipate to be beneficial for progress in the field.
This work is supported by a grant of the Romanian National Authority for Scientific Research and Innovation, CNCS UEFISCDI, number PN-II-RU-TE-2014-4-0733
- [Art(Retrieved February 2016)] What is an art movement? http://www.artyfactory.com/art_appreciation/art_movements/art_movements.htm, Retrieved February 2016.
- [Agarwal et al.(2015)Agarwal, Karnick, Pant, and Patel] S. Agarwal, H. Karnick, N. Pant, and U. Patel. Genre and style based painting classification. In Proc. of WACV, pages 588–594, 2015.
- [Arora et al.(2012)Arora, , and Elgammal] R. S. Arora and Ahmed Elgammal. Towards automated classification of fine-art painting style: a comparative study. In Proc. of ICPR, pages 3541–3544, 2012.
- [Baştan et al.(2009)Baştan, Çam, Güdükbay, and Özgür Ulusoy] Muhammet Baştan, Hayati Çam, Uǧur Güdükbay, and Özgür Ulusoy. BilVideo-7: An MPEG-7-Compatible Video Indexing and Retrieval System. IEEE MultiMedia, 17(3):62–73, 2009.
- [Breiman(2001)] Leo Breiman. Random forests. Machine learning, 45(1):5–32, 2001.
- [Chang and Lin(2011)] Chih-Chung Chang and Chih-Jen Lin. Libsvm: A library for support vector machines. ACM Trans. Intell. Syst. Technol., 2(3), May 2011.
- [Condorovici et al.(2015)Condorovici, Florea, and Vertan] Razvan George Condorovici, Corneliu Florea, and Constantin Vertan. Automatically classifying paintings with perceptual inspired descriptors. J. Vis. Commun. Image. Represent., 26:222 – 230, 2015.
- [Cornelis et al.(2011)Cornelis, Dooms, Cornelis, Leen, and Schelkens] B. Cornelis, A. Dooms, J. Cornelis, F. Leen, and P. Schelkens. Digital painting analysis, at the cross section of engineering, mathematics and culture. In Proc. of EUSIPCO, pages 1254–1259, 2011.
- [Dalal and Triggs(2005)] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In Proc. of CVPR, pages 886–893, 2005.
- [Fernández-Delgado et al.(2014)Fernández-Delgado, Cernadas, Barro, and Amorim] Manuel Fernández-Delgado, Eva Cernadas, Senén Barro, and Dinani Amorim. Do we need hundreds of classifiers to solve real world classification problems? JMLR, 15(1):3133–3181, 2014.
- [Gunsel et al.(2005)Gunsel, Sariel, and Icoglu] B. Gunsel, S. Sariel, and O. Icoglu. Content-based access to art paintings. In Proc. of ICIP, pages 558–561, 2005.
- [Huang and Wang(2014)] Yin-Fu Huang and Chang-Tai Wang. Classification of painting genres based on feature selection. In Proc. of Multimedia and Ubiquitous Engineering, LNEE, volume 308, pages 159–164, 2014.
- [Khan et al.(2014)Khan, Beigpour, van de Weijer, and Felsberg] Fahad Shahbaz Khan, Shida Beigpour, Joost van de Weijer, and Michael Felsberg. Painting-91: a large scale database for computational painting categorization. Mach. Vis. App., 25(6):1385–1397, 2014.
- [Khan et al.(2013)Khan, van de Weijer, Shahbaz Khan, Muselet, Ducottet, and Barat] R. Khan, J. van de Weijer, F. Shahbaz Khan, D. Muselet, C. Ducottet, and C. Barat. Discriminative color descriptors. In Proc. of CVPR, pages 2866–2873, 2013.
- [LeCun et al.(1998)LeCun, Bottou, Bengio, and Haffner] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
- [Manjunath et al.(2001)Manjunath, Ohm, Vasudevan, and Yamada] B. S. Manjunath, J. R. Ohm, V. V. Vasudevan, and A. Yamada. Color and texture descriptors. IEEE Trans. Cir. and Sys. for Video Technol., 11(6):703–715, 2001.
- [Martinez et al.(2002)Martinez, Cupitt, Saunders, and Pillay] K. Martinez, J. Cupitt, D. Saunders, and R. Pillay. Ten years of art imaging research. Proceedings of the IEEE, 90(1):28–41, 2002.
- [Ojala et al.(2002)Ojala, Pietikäinen, and Mäenpää] Timo Ojala, Matti Pietikäinen, and Topi Mäenpää. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell., 24(7):971–987, July 2002.
- [Oliva and Torralba(2001)] Aude Oliva and Antonio Torralba. Modeling the shape of the scene: A holistic representation of the spatial envelope. IJCV, 42(3):145–175, 2001.
- [Shamir et al.(2010)Shamir, Macura, Orlov, Eckley, and Goldberg] Lior Shamir, Tomasz Macura, Nikita Orlov, D. Mark Eckley, and Ilya G. Goldberg. Impressionism, expressionism, surrealism: Automated recognition of painters and schools of art. ACM Trans Appl Percept, 7(2):1–17, 2010.
- [Siddiquie et al.(2009)Siddiquie, Vitaladevuni, and Davis] B. Siddiquie, S.N. Vitaladevuni, and L.S. Davis. Combining multiple kernels for efficient image classification. In Proc. of WACV, pages 1–8, 2009.
- [Stork(2009)] D. Stork. Computer vision and computer graphics analysis of paintings and drawings: An introduction to the literature. In Proc. of CAIP, pages 9–24, 2009.
- [Vedaldi and Fulkerson(2010)] Andrea Vedaldi and Brian Fulkerson. Vlfeat: An open and portable library of computer vision algorithms. In Proc. of ACM MM, pages 1469–1472, 2010.
- [Vedaldi and Lenc(2015)] Andrea Vedaldi and Karel Lenc. Matconvnet: Convolutional neural networks for matlab. In Proc. of ACM MM, pages 689–692, 2015.
- [Wang et al.(2011)Wang, Fan, and Wu] Zhenhua Wang, Bin Fan, and Fuchao Wu. Local intensity order pattern for feature description. In Proc. of ICCV, pages 603–610, 2011.
- [Zujovic et al.(2009)Zujovic, Gandy, Friedman, Pardo, and Pappas] J. Zujovic, L. Gandy, S. Friedman, B. Pardo, and T.N. Pappas. Classifying paintings by artistic genre: An analysis of features & classifiers. In Proc. of IEEE MMSP, pages 1–5, 2009.