Illustrations help us understand a message clearly and have been widely used in printed and visual media. Yet the role of illustrations in children’s books goes beyond that of a simple picture accompanying the text. For children who cannot yet read, it is the illustrations that let them understand the story. These images help them identify the characters, scenes, and events in the books, and prepare them for the fascinating world of words when they start to read by themselves (http://www.maaillustrations.com/blog/article/the-role-of-illustration-in-childrens-book/).
This fact inspires many artists to draw illustrations for children’s books. On the other hand, understanding, predicting, and analyzing people’s reading tastes is a challenging problem, since taste can depend on an individual’s philosophical, psychological, and political background. When it comes to children’s books, especially from a child’s perspective, the choice mostly depends on the visual illustrations. Discovering taste requires understanding the stylistic characteristics of the illustrators. Motivated by this observation, in this study we aim to understand the style of artists who illustrate children’s books.
Automatic understanding of artistic images could assist in organizing large collections and could be useful for art recommendation systems. However, it is a difficult task, mostly due to the varying stylistic behavior of different artists. Particularly with the rise of deep architectures, there has been growing interest in this relatively unexplored area.
There have been recent efforts to understand the aesthetic perception of artworks, such as investigating the potential of a computer to make aesthetic judgments (Spratt and Elgammal, 2014), quantifying creativity (Elgammal and Saleh, 2015), aesthetic analysis of images by feature discovery (Campbell et al., 2015), and analyzing artistic influence by comparing artists to each other (Saleh et al., 2014). Even though classifying art is qualitative (DiMaggio, 1987), classification of artworks has also emerged as another line of work. Bar et al. (Bar et al., 2014) worked on the classification of artistic styles, demonstrating the perceptiveness of deep neural network features in identifying artistic styles in paintings. Li and Chen (Li and Chen, 2009) worked on automatically classifying paintings as aesthetic or not. Lyu et al. (Lyu et al., 2004) focused on painter authentication. Identification of painters has also been studied based on wavelet analysis of brush strokes in paintings (Johnson et al., 2008; Li and Wang, 2004). The works in (Tan et al., 2016; Saleh and Elgammal, 2015) aimed to classify fine-art paintings using CNNs on the ”Wikiart paintings” dataset (Karayev et al., 2013). In (Tan et al., 2016), experiments were conducted on a proposed CNN very similar to AlexNet (Krizhevsky et al., 2012). The best result is achieved when the network is first trained on the ImageNet dataset (Russakovsky et al., 2015) and transfer learning is then applied.
Inspired by the capability of humans to recognize objects regardless of whether they appear in art or photography, Cai et al. worked on automatically identifying objects across domains (Cai et al., 2015). In (Crowley and Zisserman, 2014b, a), the authors focus on recognizing objects in paintings using models learned from natural images.
Collecting and labeling a dataset of artistic images is also a challenging task. Mensink and Van Gemert (Mensink and Van Gemert, 2014) introduced a diverse dataset of over 1 million artworks, 700,000 of which are prints, to support and evaluate art classification. Carneiro et al. (Carneiro et al., 2012) presented a database of monochromatic artistic images. Crowley and Zisserman (Crowley and Zisserman, 2014b, a) annotated a subset of the publicly available ’Your Paintings’ (you, 2012) dataset with ten category labels from the PASCAL VOC dataset (Everingham et al., 2011). The work in (Khan et al., 2014) presented a dataset containing 4266 paintings from 91 different painters. Karayev et al. (Karayev et al., 2013) presented two novel datasets: one contains 80K Flickr photographs annotated with 20 style labels such as vintage, romantic, and HDR, and the second consists of 85K art paintings from 25 art styles such as Baroque, Rococo, and Cubism.
Some works concentrate on transferring artistic styles from style images such as paintings to content images such as selfie pictures (Gatys et al., 2016; Johnson et al., 2016; Dumoulin et al., 2017). In (Gatys et al., 2015), the artistic style transfer pipeline minimizes a feature reconstruction loss and a style reconstruction loss simultaneously, using features from a pre-trained CNN model with forward and backward passes. Since the backward computations increase running time, (Johnson et al., 2016) proposed a similar approach that uses only forward passes to minimize both the feature and style reconstruction losses. Kyprianidis et al. (Kyprianidis et al., 2013) presented a survey of state-of-the-art techniques for transforming images and videos into artistically stylized renderings.
The studies that identify the style or genre of art images could be considered similar to ours (Saleh and Elgammal, 2015; Karayev et al., 2013; Matsuo and Yanai, 2016; Chu and Wu, 2016). However, they define style as a more generic term shared by several artists. The work in (Thomas and Kovashka, 2015), which identifies the authorship of photographs, i.e. the photographer, is the most similar to ours. Deep networks are also utilized in that study for qualitative evaluations.
In the illustrator identification domain, to the best of our knowledge the only prior work is (Sener et al., 2012), which attempted to identify only four illustrators on a very small dataset. The authors utilized several low-level descriptors, such as HOG, GIST, and SIFT, and used a bag-of-words model to classify illustrations. In this work, we collected a larger dataset and used their results as our baselines.
In some recent studies, illustrations are considered in the form of clip art. In (Garces et al., 2014), a style similarity metric is designed by combining color, shading, texture, and stroke features with relative comparisons collected via AMT; this work was leveraged in (Garces et al., 2016) to obtain aesthetically coherent clusters for visualizations of clip-art datasets. In (Furuya et al., 2015), an unsupervised approach is proposed for the stylistic comparison of illustrations, again in the form of clip art.
The illustrations we consider are specific to the artistic drawings in children’s books, and they are more challenging than clip-art illustrations.
Our contributions: We have several important contributions that will be described in detail in the following sections: (1) We attack the problem of classifying the styles of illustrators, which is a more challenging task than classifying content. (2) We have constructed a new dataset of illustrations; to the best of our knowledge this is the first comprehensive dataset specific to artistic illustrations from books. (3) We focus on illustrations in children’s books, which have distinct characteristics in the sense that imagination can lead to extreme characters and settings that are difficult to find in most photographs and paintings. (4) We explored different deep networks and compared them with low-level features. (5) We tested three different categorisation strategies: novel instance recognition from seen books as well as unseen books, and book recognition. (6) We exploited the style transfer method and showed qualitative results for transferring styles from illustrators to cartoon images and natural photographs, as well as to the illustrations of other illustrators. (7) Further, we provided quantitative results for illustrator-to-illustrator transfer utilizing style categorisation. (8) We compared different methods and features for choosing representative illustrations and discriminative patches.
We constructed a new dataset consisting of 6468 distinct illustrations from 24 different illustrators. Focusing on popular children’s books, we mostly selected illustrators who created more than a single basic character. The pages were collected either by directly scanning printed books or from publicly available e-books and read-aloud videos on YouTube. Table 1 summarizes our dataset and Figure 2 shows some example illustrations.
In building the dataset, we were inspired by (Sener et al., 2012), in which a dataset consisting of 248 illustrations of Axel Scheffler, 243 illustrations of Debi Gliori, 249 illustrations of Korky Paul, and 234 illustrations of Dr. Seuss was generated. We almost doubled the examples for three of these illustrators and included 20 other illustrators. In its current form the dataset is unique: although large-scale datasets exist for paintings (Karayev et al., 2013; Crowley and Zisserman, 2014b; Khan et al., 2014), to the best of our knowledge this is the first comprehensive dataset for illustrations.
Note that in the painting datasets there are a variety of artists following the same artistic style, so those datasets are deep in the sense that the number of examples per style is large. However, each illustrator has only a limited set of books, and therefore the number of examples per category cannot reach the numbers in painting datasets. Similarly, the number of categories can only be extended within limits when we require each illustrator to have more than a single specific character or book series. We continue to extend the dataset and will make it publicly available within copyright limitations.
3. Discovering style of illustrators
In the following, we first describe the details of our method for categorising the style of illustrators using deep networks. Then, we discuss approaches to transfer style and to discover representative elements.
3.1. Deep Learning For Style Recognition
Instead of creating a model from scratch, we used three well-known CNN models in training: AlexNet (Krizhevsky et al., 2012), VGG-19 (Simonyan and Zisserman, 2014), and GoogLeNet (Szegedy et al., 2015). We used the Caffe (Jia et al., 2014) framework to train deep networks on a Tesla K40 12GB GPU. We employed both end-to-end training and transfer learning. To train an end-to-end model, we enlarged our comparatively small dataset by applying data augmentation.
For small datasets like ours, it is not practical or meaningful to fully train very deep networks. Thus, we fully trained only AlexNet, which is relatively shallow. We first subtracted the mean of the RGB values over our illustration dataset for each pixel to obtain centered raw RGB values. We augmented our training and validation data using only horizontal reflections to reduce overfitting. The batch sizes were chosen as 128 and 40 for training and validation, respectively. The base learning rate was set to 0.01 with a momentum of 0.9, and the learning rate was decreased by a factor of 10 after every 40K iterations.
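The preprocessing and augmentation steps above can be sketched as follows. This is a minimal NumPy illustration of per-pixel mean subtraction and horizontal reflection on a toy batch, not the actual Caffe data layer used in the experiments:

```python
import numpy as np

def center_rgb(images):
    """Subtract the per-pixel RGB mean computed over the whole dataset."""
    return images - images.mean(axis=0)

def augment_with_flips(images):
    """Double the set with horizontal reflections, the only augmentation used."""
    flipped = images[:, :, ::-1, :]  # reverse the width axis of (N, H, W, C)
    return np.concatenate([images, flipped], axis=0)

batch = np.random.rand(4, 8, 8, 3).astype(np.float32)  # toy batch of 4 images
centered = center_rgb(batch)
augmented = augment_with_flips(centered)
print(augmented.shape)  # (8, 8, 8, 3)
```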
Since our dataset is comparatively small, we alternatively applied transfer learning. For this purpose, we used VGG-19, AlexNet, and GoogLeNet models pre-trained on the large-scale ImageNet dataset. Our hyperparameters are nearly the same for fine-tuning AlexNet and VGG-19, except for the learning rates and batch sizes. Due to memory constraints, we trained VGG-19 with a training batch size of 32 and set its learning rate accordingly to 0.0004. The base learning rate for AlexNet is 0.0001, and all other SGD parameters are the same as in the end-to-end training. To train GoogLeNet, we used the quick solver (qui, 2014) properties with an initial learning rate of 0.01.
3.2. Style transfer
Inspired by recent work on transferring the artistic style of paintings (Gatys et al., 2016), we transfer the style from one illustration to another. Besides demonstrating style generation, this task is also important for understanding the capability of deep models to capture style separately from content.
The style transfer model (Gatys et al., 2016) combines the appearance of a style image, e.g. an artwork, with the content of another image, e.g. an arbitrary photograph, by minimizing content and style losses. In our case, style is learned from an illustration by a particular illustrator and transferred to another image. The target image could be a cartoon, a natural photograph, or an illustration by another artist. We expect the resulting image to contain the content of the target image drawn in the style of the source illustration.
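The style term in (Gatys et al., 2016) is built from Gram matrices of CNN feature maps. A minimal sketch of that term is shown below; the actual pipeline computes it over VGG activations at several layers and combines it with a content loss, whereas here random arrays stand in for one layer's activations:

```python
import numpy as np

def gram_matrix(features):
    """Channel-correlation (Gram) matrix of a (C, H, W) feature map."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (h * w)

def style_loss(feats_a, feats_b):
    """Squared Frobenius distance between the two Gram matrices."""
    return float(np.sum((gram_matrix(feats_a) - gram_matrix(feats_b)) ** 2))

rng = np.random.default_rng(0)
f = rng.standard_normal((16, 8, 8))   # stand-in for one layer's activations
print(style_loss(f, f))  # 0.0 — identical activations give zero style loss
```

Because the Gram matrix discards spatial layout and keeps only channel correlations, minimizing this loss matches texture-like statistics of the style image without copying its content.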
However, it is difficult and subjective to judge the quality of the resulting images. In this study, focusing on style transfers from one illustration to another, we propose to compare the style of the resulting illustration with the original style from a categorisation perspective. Our intuition is that if we feed the resulting images as test instances to our deep networks and they are classified correctly, then we can infer that the deep models capture style.
3.3. Discovering representatives
Here we try to understand the style of illustrators in terms of discriminative and representative examples. We utilised two methods for this purpose. The first method (Doersch et al., 2012) was initially proposed for discovering the architectural elements of different cities. It takes a positive set of images, from which we want to extract discriminative patches, and a global negative set, and represents the images with HOG features (Dalal and Triggs, 2005). We used this method both to find representative illustrations for different artists and to discover the discriminative parts of illustrations. However, since this algorithm takes days to complete on a powerful laptop, we were able to run it for only a few illustrators.
The second method we utilised (Golge and Duygulu-Sahin, 2015) focuses on eliminating outliers from a candidate set of positive examples to capture the representative elements in an iterative fashion. The method was originally proposed to recognise faces in noisy, weakly labeled images collected from the web. Being flexible, it allowed us to use HOG (Dalal and Triggs, 2005), color dense SIFT (Lowe, 2004), and VGG (Simonyan and Zisserman, 2014) features.
In this section, we first present detailed experimental evaluations of recognizing the style of illustrators using deep networks. We also provide the results of conventional classification methods as baselines for comparison with the deep architectures. Then, we present our results on style transfer and representative element discovery.
4.1. Style recognition with deep networks
We used two different settings for categorisation. In the first setting, we treated each page as an independent instance and constructed training, validation and test sets by randomly selecting instances from the entire collection. In the second setting, we tested a more challenging case, and removed some of the books entirely from the training set. Results of both settings will be discussed in the following.
To analyze and understand the results further, we exploited the method of (Yosinski et al., 2015). Figure 4 shows per-unit visualizations from different layers of the VGG-19 network. In every image, the first column corresponds to synthetic images that cause high activation under regularized optimization, and the second column shows crops from our training dataset that cause the highest activation for that unit. As shown, our network is able to find parts and objects such as eyes, fish, car/wheel, house, plant, people, and clothes, and even to discriminate poses such as side views of humans and animals, as well as hair, fur, or ears.
Instance categorisation: In this setting, our goal is to classify illustrations on randomly partitioned data. Here, we do not distinguish between books: we pool all the illustrations from all the books of an illustrator and then construct training, validation, and test sets by randomly selecting a fixed percentage of the instances.
For this group of experiments we utilized several deep networks, including both end-to-end training and fine-tuning. Table 2 summarizes the results in terms of the network architecture used, the training type (fully training or fine-tuning the network), and whether data augmentation was used. For all deep network experiments, we used 70% of the data for training, 10% for validation, and the remaining 20% for testing.
[Table 2 excerpt: Color Dense SIFT baseline — 84.35% accuracy]
As expected, fully training a deep network gives lower accuracy than fine-tuning. Thus, in the next group of experiments we focused only on fine-tuning. Also note that using augmented data for fine-tuning does not improve the accuracy much, so we preferred not to use augmented data while fine-tuning a model. GoogLeNet has far fewer parameters and a lower error rate than VGG-19 on the ImageNet dataset. Our results are in line with this observation: GoogLeNet beats VGG-19 by a very small margin. Since GoogLeNet has the best performance, in the following experiments we report only the GoogLeNet results. Figure 3 and Table 3 depict the confusion matrix and the class-based F1 and accuracy results, respectively.
[Table 3: class-based F1 score and accuracy per illustrator id]
Book-based instance categorisation: Since illustrators are likely to exhibit varying styles in different books, in this setup we attack the more challenging problem of recognizing style on novel books. Instead of carving up the illustrations of each illustrator, we split our data by book into training/validation and test sets, so that the training and test sets never share illustrations from the same book. Some illustrators have fewer books than others, but to measure accuracy we make sure that every illustrator has at least one book in the test set. Note that this setting is similar to recognizing unseen categories, and especially to the domain transfer problem: leaving out some books means having unseen characters and content. Therefore, our recognition performance in this setting demonstrates the capability of our method to recognize style rather than specific characters. Notice that the results are lower than the results of instance recognition, as expected (see Table 4).
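The book-level split can be sketched as follows. The page records and book titles here are hypothetical stand-ins for the actual dataset; the point is only that train and test never share a book, while every illustrator keeps at least one book on each side:

```python
def split_by_book(pages, test_books):
    """Split page records so train and test never share a book.
    pages: list of (illustrator, book, page_id) tuples."""
    train, test = [], []
    for rec in pages:
        (test if rec[1] in test_books else train).append(rec)
    return train, test

# toy page records (illustrator, book, page_id)
pages = [
    ("Scheffler", "Book A", 1), ("Scheffler", "Book A", 2),
    ("Scheffler", "Book B", 1),
    ("Paul", "Book C", 1), ("Paul", "Book D", 1),
]
train, test = split_by_book(pages, test_books={"Book B", "Book D"})
# each illustrator keeps at least one book in the test set
```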
Book categorisation: We further used this network to predict the illustrator of each illustrated book. Note that in the previous settings our goal was to predict the illustrator of a single page. To predict the illustrator of a book, we used majority voting and selected the illustrator to whom the largest number of pages was assigned. We evaluated the performance of book categorisation with 60 different illustrated books using the results of the VGG-19 model, and obtained 90% accuracy in predicting the illustrator of a given book. Table 4 presents the performance on book recognition.
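The majority-voting step can be sketched as below; the per-page predictions are hypothetical stand-ins for the classifier's outputs on one book:

```python
from collections import Counter

def predict_book_illustrator(page_predictions):
    """Majority vote: assign the book to the illustrator predicted
    for the largest number of its pages."""
    return Counter(page_predictions).most_common(1)[0][0]

# hypothetical per-page predictions for one book
page_preds = ["Dr. Seuss", "Dr. Seuss", "Korky Paul", "Dr. Seuss"]
print(predict_book_illustrator(page_preds))  # Dr. Seuss
```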
[Table 4 excerpt: book-based instance accuracy for the Color Dense SIFT baseline — 70.00]
4.2. Style recognition with conventional methods
As a baseline, we utilized the conventional feature extraction methods shown to have the highest accuracies in (Sener et al., 2012). We extracted Dense SIFT (Lowe, 2004) and Color Dense SIFT (Lowe, 2004) features from every illustration and then generated a codebook for a bag-of-words representation (Sivic et al., 2005). LIBSVM (Chang and Lin, 2011) is used for SVM classification. We use a one-versus-all approach for training, where the negative samples for a class are drawn from all other classes. A test example is fed into the multiple classifiers and assigned to the class with the highest confidence value. Half of the dataset is used to train the SVMs, and the rest is used for testing. We observe that Hellinger’s kernel boosts the performance by almost 20% over the other kernels. As seen in Table 2 and Table 4, the results are much lower than those of the deep network architectures.
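For reference, the standard formulation of the Hellinger kernel on L1-normalized bag-of-words histograms is sketched below; this is the textbook definition, not necessarily the exact implementation used with LIBSVM in our experiments:

```python
import numpy as np

def l1_normalize(H):
    """Normalize bag-of-words histograms so each row sums to 1."""
    return H / H.sum(axis=1, keepdims=True)

def hellinger_kernel(X, Y):
    """K(x, y) = sum_i sqrt(x_i * y_i): a linear kernel applied to
    element-wise square-rooted histograms."""
    return np.sqrt(X) @ np.sqrt(Y).T

H = l1_normalize(np.random.rand(5, 100))  # 5 toy histograms, 100 visual words
K = hellinger_kernel(H, H)
print(K.shape)  # (5, 5); K(x, x) = 1 for L1-normalized histograms
```

Such a precomputed kernel matrix can be passed directly to an SVM implementation that accepts precomputed kernels, e.g. scikit-learn's `SVC(kernel="precomputed")`.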
4.3. Style Transfer on Illustration Dataset
In the style transfer experiments, we first selected a simple content image (a cartoon image or a natural photograph) gathered from the web with no relation to our dataset. Then, we randomly chose a group of illustrations from different illustrators as style images. In our second experiment, we made the problem harder by selecting an illustration from our dataset as the content image. In this setting, the style image is an illustration from our dataset, and the content image is again an illustration, but one belonging to a different illustrator. We performed style transfer for each style and content image pair, and measured the recognition performance of our deep model on the resulting images. We use the fine-tuned GoogLeNet in all style transfer experiments.
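This classification-based evaluation can be sketched as follows. Here `transfer` and `classify` are hypothetical placeholders for the style-transfer model and the fine-tuned classifier; the toy lambdas below only exercise the evaluation loop:

```python
def evaluate_style_transfer(style_images, content_images, transfer, classify):
    """Fraction of transferred images classified as the style illustrator.
    style_images: list of (style_image, illustrator) pairs."""
    correct = total = 0
    for style_img, illustrator in style_images:
        for content_img in content_images:
            result = transfer(content=content_img, style=style_img)
            correct += (classify(result) == illustrator)
            total += 1
    return correct / total

# toy stand-ins: "transfer" passes the style through, classifier reads it back
acc = evaluate_style_transfer(
    [("sA", "A"), ("sB", "B")], ["c1", "c2"],
    transfer=lambda content, style: style,
    classify=lambda img: img[1].upper(),
)
print(acc)  # 1.0
```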
Figure 5 illustrates the style transfer results for the given style and content images. As can be seen, our model mostly succeeds in capturing the styles, except for ’Debi Gliori’ on both content images, who also had the worst classification performance in the previous experiments due to the large variation in her style.
4.4. Representative and discriminative elements
First, we aimed to find representative illustrations for each illustrator. As depicted in Figure 6, we compared the method of (Doersch et al., 2012) with the method of (Golge and Duygulu-Sahin, 2015), first using HOG features in both. We then also utilised color dense SIFT and fine-tuned VGG-19 features with (Golge and Duygulu-Sahin, 2015). Note that since (Doersch et al., 2012) produces patches while (Golge and Duygulu-Sahin, 2015) returns images, the only way to compare the results of both algorithms was to find images that contain most of the extracted patches. While (Doersch et al., 2012) tends to choose pages with text, considering the font style discriminative, (Golge and Duygulu-Sahin, 2015) is more likely to capture the style emphasized by the chosen feature. VGG-19 was able to capture the dark colors and the strokes better than the others. Since the visual examples are subjective, to quantitatively compare the different methods for selecting representatives we used categorisation performance. For the first 50 images, (Doersch et al., 2012) resulted in one incorrect classification while the others reported 100% accuracy; for a deeper analysis, however, we should examine the full list and find better comparative measures. Figure 7 shows the representatives for some other illustrators using VGG-19 features with (Golge and Duygulu-Sahin, 2015). As a final experiment, we explored the patches extracted by (Doersch et al., 2012) for the Korky Paul images in Figure 8. As seen, we are able to select stylistic elements such as the head of the witch, leafless trees, or furniture, and even the typeface of the fonts, as discriminative elements.
We attacked the problem of recognizing the style of illustrators as a pioneering work in this area. On the newly constructed dataset, we reported qualitative and quantitative results for three different applications: illustrator recognition, style transfer, and representative instance selection. In future work, we plan to expand the dataset with more illustrators. Moreover, better metrics are required to evaluate the quality of style transfer and the selection of representatives.
- WeA (2008) 2008. We Are All Born Free: The Universal Declaration of Human Rights in Pictures. Frances Lincoln. (2008). http://www.goodreads.com/book/show/3082451-we-are-all-born-free
- you (2012) 2012. BBC Your Paintings. (2012). Dataset available at http://www.bbc.co.uk/arts/.
- qui (2014) 2014. Caffe Model Zoo: GoogLeNet Model. (2014). Model available at https://github.com/BVLC/caffe/tree/master/models/bvlc_googlenet.
- Bar et al. (2014) Yaniv Bar, Noga Levy, and Lior Wolf. 2014. Classification of artistic styles using binarized features derived from a deep neural network. In Workshop at the European Conference on Computer Vision. Springer, 71–84.
- Cai et al. (2015) Hongping Cai, Qi Wu, Tadeo Corradi, and Peter Hall. 2015. The Cross-Depiction Problem: Computer Vision Algorithms for Recognising Objects in Artwork and in Photographs. CoRR abs/1505.00110 (2015). http://arxiv.org/abs/1505.00110
- Campbell et al. (2015) Allan Campbell, Vic Ciesielksi, and AK Qin. 2015. Feature discovery by deep learning for aesthetic analysis of evolved abstract images. In International Conference on Evolutionary and Biologically Inspired Music and Art. Springer International Publishing, 27–38.
- Carneiro et al. (2012) Gustavo Carneiro, Nuno Pinho da Silva, Alessio Del Bue, and João Paulo Costeira. 2012. Artistic image classification: An analysis on the printart database. In European Conference on Computer Vision. Springer, 143–157.
- Chang and Lin (2011) Chih-Chung Chang and Chih-Jen Lin. 2011. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2 (2011), 27:1–27:27. Issue 3. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
- Chu and Wu (2016) Wei-Ta Chu and Yi-Ling Wu. 2016. Deep Correlation Features for Image Style Classification. In Proceedings of the 2016 ACM on Multimedia Conference (MM ’16). 402–406.
- Crowley and Zisserman (2014a) E. J. Crowley and A. Zisserman. 2014a. In Search of Art. In Workshop on Computer Vision for Art Analysis, ECCV.
- Crowley and Zisserman (2014b) E. J. Crowley and A. Zisserman. 2014b. The State of the Art: Object Retrieval in Paintings using Discriminative Regions. In British Machine Vision Conference.
- Dalal and Triggs (2005) Navneet Dalal and Bill Triggs. 2005. Histograms of oriented gradients for human detection. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), Vol. 1. IEEE, 886–893.
- DiMaggio (1987) Paul DiMaggio. 1987. Classification in art. American Sociological Review (1987), 440–455.
- Doersch et al. (2012) Carl Doersch, Saurabh Singh, Abhinav Gupta, Josef Sivic, and Alexei A. Efros. 2012. What Makes Paris Look like Paris? ACM Transactions on Graphics (SIGGRAPH) 31, 4 (2012), 101:1–101:9.
- Dumoulin et al. (2017) Vincent Dumoulin, Jonathon Shlens, and Manjunath Kudlur. 2017. A Learned Representation For Artistic Style. ICLR (2017). https://arxiv.org/abs/1610.07629
- Elgammal and Saleh (2015) Ahmed Elgammal and Babak Saleh. 2015. Quantifying Creativity in Art Networks. arXiv preprint arXiv:1506.00711 (2015).
- Everingham et al. (2011) M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. 2011. The PASCAL Visual Object Classes Challenge 2011 (VOC2011) Results. (2011). http://www.pascal-network.org/challenges/VOC/voc2011/workshop/index.html
- Furuya et al. (2015) T. Furuya, S. Kuriyama, and R. Ohbuchi. 2015. An unsupervised approach for comparing styles of illustrations. In 2015 13th International Workshop on Content-Based Multimedia Indexing (CBMI). 1–6. DOI:http://dx.doi.org/10.1109/CBMI.2015.7153615
- Garces et al. (2014) Elena Garces, Aseem Agarwala, Diego Gutierrez, and Aaron Hertzmann. 2014. A Similarity Measure for Illustration Style. ACM Trans. Graph. 33, 4, Article 93 (July 2014), 9 pages. DOI:http://dx.doi.org/10.1145/2601097.2601131
- Garces et al. (2016) Elena Garces, Aseem Agarwala, Aaron Hertzmann, and Diego Gutierrez. 2016. Style-based exploration of illustration datasets. Multimedia Tools and Applications (2016), 1–20. DOI:http://dx.doi.org/10.1007/s11042-016-3702-x
- Gatys et al. (2015) Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge. 2015. A Neural Algorithm of Artistic Style. CoRR abs/1508.06576 (2015). http://arxiv.org/abs/1508.06576
- Gatys et al. (2016) Leon A Gatys, Alexander S Ecker, and Matthias Bethge. 2016. Image style transfer using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2414–2423.
- Golge and Duygulu-Sahin (2015) Eren Golge and Pinar Duygulu-Sahin. 2015. FAME: face association through model evolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 43–49.
- Jia et al. (2014) Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, and Trevor Darrell. 2014. Caffe: Convolutional Architecture for Fast Feature Embedding. arXiv preprint arXiv:1408.5093 (2014).
- Johnson et al. (2008) C Richard Johnson, Ella Hendriks, Igor Berezhnoy, Eugene Brevdo, Shannon Hughes, Ingrid Daubechies, Jia Li, Eric Postma, and James Z Wang. 2008. Image Processing for Artist Identification - Computerized Analysis of Vincent van Gogh’s Painting Brushstrokes. IEEE Signal Processing Magazine (July 2008).
- Johnson et al. (2016) Justin Johnson, Alexandre Alahi, and Li Fei-Fei. 2016. Perceptual losses for real-time style transfer and super-resolution. arXiv preprint arXiv:1603.08155 (2016).
- Karayev et al. (2013) Sergey Karayev, Matthew Trentacoste, Helen Han, Aseem Agarwala, Trevor Darrell, Aaron Hertzmann, and Holger Winnemoeller. 2013. Recognizing image style. arXiv preprint arXiv:1311.3715 (2013).
- Khan et al. (2014) Fahad Shahbaz Khan, Shida Beigpour, Joost Van de Weijer, and Michael Felsberg. 2014. Painting-91: a large scale database for computational painting categorization. Machine vision and applications 25, 6 (2014), 1385–1397.
- Krizhevsky et al. (2012) Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems. 1097–1105.
- Kyprianidis et al. (2013) Jan Eric Kyprianidis, John Collomosse, Tinghuai Wang, and Tobias Isenberg. 2013. State of the “Art”: A Taxonomy of Artistic Stylization Techniques for Images and Video. IEEE Transactions on Visualization and Computer Graphics 19, 5 (2013), 866–885.
- Li and Chen (2009) C. Li and T. Chen. 2009. Aesthetic Visual Quality Assessment of Paintings. IEEE Journal of Selected Topics in Signal Processing 3, 2 (April 2009), 236–252.
- Li and Wang (2004) Jia Li and J. Z. Wang. 2004. Studying Digital Imagery of Ancient Paintings by Mixtures of Stochastic Models. Trans. Img. Proc. 13, 3 (March 2004), 340–353. DOI:http://dx.doi.org/10.1109/TIP.2003.821349
- Lowe (2004) David G Lowe. 2004. Distinctive image features from scale-invariant keypoints. International journal of computer vision 60, 2 (2004), 91–110.
- Lyu et al. (2004) Siwei Lyu, Daniel Rockmore, and Hany Farid. 2004. A digital technique for art authentication. Proceedings of the National Academy of Sciences of the United States of America 101, 49 (2004), 17006–17010.
- Matsuo and Yanai (2016) Shin Matsuo and Keiji Yanai. 2016. CNN-based style vector for style image retrieval. In Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval. ACM, 309–312.
- Mensink and Van Gemert (2014) Thomas Mensink and Jan Van Gemert. 2014. The rijksmuseum challenge: Museum-centered visual recognition. In Proceedings of International Conference on Multimedia Retrieval. ACM, 451.
- Russakovsky et al. (2015) Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. 2015. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV) 115, 3 (2015), 211–252. DOI:http://dx.doi.org/10.1007/s11263-015-0816-y
- Saleh et al. (2014) Babak Saleh, Kanako Abe, Ravneet Singh Arora, and Ahmed Elgammal. 2014. Toward automated discovery of artistic influence. Multimedia Tools and Applications (2014), 1–27.
- Saleh and Elgammal (2015) Babak Saleh and Ahmed Elgammal. 2015. Large-scale Classification of Fine-Art Paintings: Learning The Right Metric on The Right Feature. arXiv preprint arXiv:1505.00855 (2015).
- Sener et al. (2012) Fadime Sener, Nermin Samet, and Pinar Duygulu Sahin. 2012. Identification of illustrators. In European Conference on Computer Vision. Springer Berlin Heidelberg, 589–597.
- Simonyan and Zisserman (2014) K. Simonyan and A. Zisserman. 2014. Very Deep Convolutional Networks for Large-Scale Image Recognition. CoRR abs/1409.1556 (2014).
- Sivic et al. (2005) Josef Sivic, Bryan C Russell, Alexei A Efros, Andrew Zisserman, and William T Freeman. 2005. Discovering object categories in image collections. (2005).
- Spratt and Elgammal (2014) Emily L. Spratt and Ahmed M. Elgammal. 2014. Computational Beauty: Aesthetic Judgment at the Intersection of Art and Science. In Computer Vision - ECCV 2014 Workshops - Zurich, Switzerland, September 6-7 and 12, 2014, Proceedings, Part I. 35–53. DOI:http://dx.doi.org/10.1007/978-3-319-16178-5_3
- Szegedy et al. (2015) Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. 2015. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1–9.
- Tan et al. (2016) W. R. Tan, C. S. Chan, H. E. Aguirre, and K. Tanaka. 2016. Ceci n’est pas une pipe: A deep convolutional network for fine-art paintings classification. In 2016 IEEE International Conference on Image Processing (ICIP). 3703–3707. DOI:http://dx.doi.org/10.1109/ICIP.2016.7533051
- Thomas and Kovashka (2015) Christopher Thomas and Adriana Kovashka. 2015. Who’s Behind the Camera? Identifying the Authorship of a Photograph. arXiv preprint arXiv:1508.05038 (2015).
- Yosinski et al. (2015) Jason Yosinski, Jeff Clune, Anh Nguyen, Thomas Fuchs, and Hod Lipson. 2015. Understanding Neural Networks Through Deep Visualization. In Deep Learning Workshop, International Conference on Machine Learning (ICML).