Understanding and intentionally creating illusions is currently something only humans can do. Accurately recognizing illusory patterns with a computer, and generating novel illusion images, would represent a major advance in computer vision. Current systems are capable of predicting the effect of specific classes of illusions, such as color constancy illusions [1] and length illusions [2, 3]. A reinforcement learning system learned to perceive a color constancy illusion after being trained to predict color values with half of the image covered by a tinted film [4], showing that perception of an illusion can emerge from the demands of seeing in a complicated world. It is also important to consider whether making a perceptual mistake similar to a human's constitutes having a visual experience similar to a human's [5].
Recent work on generative adversarial networks (GANs) [6] has shown that high-resolution images of faces can be created from a large dataset of 30,000 images. Images of that quantity and quality are not available for optical illusions; naively applying the same methods to our dataset does not produce comparable results, as discussed below. The number of static optical illusion images is in the low thousands, and the number of unique kinds of illusions is certainly very low, perhaps only a few dozen (for example, the Scintillating Grid illusion, the Cafe Wall Illusion, and other known categories). Creating a model capable of learning from such a small and limited dataset would represent a huge leap in generative models and in the understanding of human vision.
2 Related Works
Research into biologically plausible models makes it possible to learn about visual phenomena by doing experiments on proxies for the real human visual system. Elsayed et al. found that, by selecting the right models, adversarial examples for these models were also effective on time-limited humans [7]. The Brain-Score metric [8] measures internal and behavioral similarity between computer and primate image recognition. As this metric is developed and models with higher scores are created, those models may become capable of experiencing more of the kinds of optical illusions otherwise experienced only by primates.
To our knowledge, no dataset of this kind has been created before.
3 Data Collection
3.1 Image Sources
Twelve different websites that collect and display optical illusions (such as the one shown in Figure 1) were considered for inclusion in the dataset. Most proved to be too small or lacked suitable content. For instance, the site “Visual Phenomena & Optical Illusions” [9] contains many interesting and visually powerful demonstrations of optical illusions, but very few still images that contain a visual effect by themselves. In the end, “Mighty Optical Illusions” [10] and “ViperLib” [11] proved to be the best sources of illusion images, as both contain labeled and almost exclusively static images.
The “Illusions of the Year” contest [12] also seemed to be a good source of images, but only the winning entries are posted publicly. Emails to the website owner requesting all of the submissions went unanswered.
3.2 Data Collection Results
We created a web scraper that visits each page of Mighty Optical Illusions and downloads the images on that page (source is available at [13]). In total, 6436 images were obtained, along with metadata such as categories and page titles. ViperLib was scraped in a similar manner, yielding 1454 images, also organized into categories and accompanied by page titles.
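The core of such a scraper is straightforward HTML parsing. The sketch below, using only the Python standard library, shows one way to pull image URLs and the page title out of a single page; it is an illustrative simplification, not the actual scraper from the repository, and the real sites' markup will require site-specific handling.

```python
from html.parser import HTMLParser


class IllusionPageParser(HTMLParser):
    """Collect image URLs and the page title from one illusion page."""

    def __init__(self):
        super().__init__()
        self.image_urls = []
        self.title_parts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            # Keep only <img> tags that actually carry a source URL.
            src = dict(attrs).get("src")
            if src:
                self.image_urls.append(src)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)


def scrape_page(html):
    """Return the page title and image URLs found in one HTML page."""
    parser = IllusionPageParser()
    parser.feed(html)
    return {"title": "".join(parser.title_parts).strip(),
            "images": parser.image_urls}
```

In the real scraper, the returned URLs would then be fetched and saved, and the title and category metadata written out alongside each numbered image file.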
4 Machine Learning Results
Two different kinds of models were tested on subsets of the data. A classifier was trained to test how visually distinguishable the given classes are, and a generative model was trained to see whether new instances of known illusions could be created by naively applying existing image generation methods.
4.1 Classifier Results
A pretrained “bottleneck” model [14] was used to classify images from Mighty Optical Illusions. Only the last few layers had to be retrained, making use of transfer learning from a much larger dataset to learn to classify images in general. Each image in the training data may belong to multiple classes, which the model did not account for. For the purpose of early dataset evaluation this flaw can be overlooked, but more complete testing would require a multi-label model. The results of training are shown in Figure 2.
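The idea of the bottleneck approach can be sketched as follows: treat the pretrained network as a frozen feature extractor and train only a small classification head on its “bottleneck” feature vectors. In this minimal numpy sketch, generic feature vectors stand in for the outputs of the (hypothetical) frozen network; a multi-label variant would replace the softmax with per-class sigmoids.

```python
import numpy as np


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def train_head(features, labels, n_classes, lr=0.1, steps=200):
    """Train only a softmax classification head on frozen 'bottleneck'
    features, as in transfer learning; the feature extractor itself is
    never updated."""
    n, d = features.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(steps):
        probs = softmax(features @ W + b)
        grad = (probs - onehot) / n  # softmax cross-entropy gradient
        W -= lr * features.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b


def predict(features, W, b):
    return np.argmax(features @ W + b, axis=1)
```

Because only `W` and `b` are learned, training is fast even on a small dataset, which is exactly why the bottleneck approach suits a few-thousand-image collection like this one.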
The model performed significantly better than random, meaning that the given classes are meaningful in a way that can be detected by a model trained on ordinary classes of images. A more in-depth study could reveal how the neural network is able to distinguish these classes, for example using the methods of [15].
4.2 Generative Adversarial Network Results
A trial run using a generative adversarial network was attempted. Using HyperGAN [16] on a hand-picked subset of the data with no hyperparameter optimization, nothing of value was produced after 7 hours of training on an Nvidia Tesla K80. The training progression is shown in Figure 3. Possible improvements include pretraining on a larger dataset, tuning hyperparameters, and using dataset expansion techniques.
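For readers unfamiliar with the adversarial setup being trained here, the loop below is a deliberately minimal one-dimensional sketch of it (not of HyperGAN's implementation): a linear generator and a logistic discriminator are updated in alternation on a toy Gaussian target. Because this discriminator is linear in its input, it can only detect the mean of the data, so the generator mainly learns to match that mean; even this toy case hints at how easily GAN training collapses to partial solutions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_toy_gan(steps=3000, batch=64, lr=0.05, seed=0):
    """Toy 1-D GAN: generator G(z) = a*z + b, discriminator
    D(x) = sigmoid(w*x + c), real data drawn from N(4, 0.5).
    Returns the final parameters (a, b, w, c)."""
    rng = np.random.default_rng(seed)
    a, b = 1.0, 0.0  # generator parameters
    w, c = 0.0, 0.0  # discriminator parameters
    for _ in range(steps):
        real = rng.normal(4.0, 0.5, batch)
        z = rng.normal(0.0, 1.0, batch)
        fake = a * z + b
        # Discriminator: gradient ascent on log D(real) + log(1 - D(fake)).
        s_real = sigmoid(w * real + c)
        s_fake = sigmoid(w * fake + c)
        w += lr * (np.mean((1 - s_real) * real) - np.mean(s_fake * fake))
        c += lr * (np.mean(1 - s_real) - np.mean(s_fake))
        # Generator: gradient ascent on the non-saturating loss log D(fake).
        s_fake = sigmoid(w * fake + c)
        b += lr * np.mean((1 - s_fake) * w)
        a += lr * np.mean((1 - s_fake) * w * z)
    return a, b, w, c
```

Since E[z] = 0, the mean of the generated samples is just `b`, so watching `b` drift toward the data mean of 4 is a direct, if crude, view of the adversarial dynamics that a full image GAN performs in millions of dimensions.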
5 Future Work
The only optical illusions known to humans were created either by evolution (for instance, eye patterns in butterfly wings) or by human artists. Both artistic designers of illusion images and the glacial process of evolution have access to active vision systems against which to verify their work. An illusion artist can make an attempt at an illusion, observe its effect on their own eyes, and add or remove elements to create a more powerful effect. In an evolutionary process, every agent has a physical appearance and a vision system, so patterns are constantly being verified in their environment. A GAN trained on existing illusions would have none of these advantages, and it seems unlikely that it could learn to trick human vision without understanding the principles behind these illusions. Because of these limitations, a dataset of illusion images alone may not be sufficient to create new illusions; a deeper understanding of human vision would need to be obtained somehow. This could be done by having a human give feedback as the network learns, or by learning an accurate proxy for human vision and trying to deceive that proxy, as in [7].
Appendix A Downloading the Dataset
Images are currently hosted on the machine learning cloud platform “Floydhub” and can be downloaded without an account.
The full download contains all images that were scraped, using the same numbering scheme as the metadata in the linked GitHub repository. A separate folder contains images hand-picked for having obvious visual effects without requiring special viewing instructions.
- (1) A. E. Robinson, P. S. Hammon, V. R. de Sa, Explaining brightness illusions using spatial filtering and local response normalization, Vision Research 47 (12) (2007) 1631–1644.
- (2) O. B. García-Garibay, V. de Lafuente, The Müller-Lyer illusion as seen by an artificial neural network, Front Comput Neurosci 9 (2015) 21.
- (3) A. Bertulis, A. Bulatov, Distortions of length perception in human vision, Biomedicine 1 (1) (2001) 3–23.
- (4) K. Shibata, S. Kurizaki, Emergence of color constancy illusion through reinforcement learning with a neural network, in: 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL), 2012, pp. 1–6. doi:10.1109/DevLrn.2012.6400580.
- (5) R. V. Yampolskiy, Detecting qualia in natural and artificial agents, CoRR abs/1712.04020 (2017).
- (6) T. Karras, T. Aila, S. Laine, J. Lehtinen, Progressive growing of GANs for improved quality, stability, and variation, arXiv preprint arXiv:1710.10196 (2017).
- (7) G. F. Elsayed, S. Shankar, B. Cheung, N. Papernot, A. Kurakin, I. Goodfellow, J. Sohl-Dickstein, Adversarial examples that fool both computer vision and time-limited humans, arXiv:1802.08195 (2018).
- (8) M. Schrimpf, J. Kubilius, H. Hong, N. J. Majaj, R. Rajalingham, E. B. Issa, K. Kar, P. Bashivan, J. Prescott-Roy, K. Schmidt, D. L. K. Yamins, J. J. DiCarlo, Brain-Score: Which artificial neural network for object recognition is most brain-like?, bioRxiv preprint (2018).
- (9) M. Bach, Visual phenomena & optical illusions.
- (10) Mighty Optical Illusions.
- (11) P. Thompson, R. Stone, ViperLib, Department of Psychology, University of York (2018).
- (12) Neural Correlate Society, Illusions of the Year.
- (13) R. M. Williams, Optical illusion dataset, GitHub repository.
- (14) R. M. Williams, Retrain an image classifier for new categories (2018).
- (15) Q. Zhang, Y. N. Wu, S.-C. Zhu, Interpretable convolutional neural networks, arXiv:1710.00935 (2017).
- (16) M. Martyn et al. (255bits), HyperGAN.
- (17) I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, Generative adversarial networks (2014). arXiv:1406.2661.