Humans are sensitive to complexity and regularity in patterns [YKM13, FK97]. The subjective perception of pattern complexity is correlated with algorithmic (Kolmogorov-Chaitin) complexity as defined in computer science [LV08], but also with the frequency of naturally occurring patterns [HGS10]. However, the possible mediational role of natural frequencies in the perception of algorithmic complexity remains unclear. Here we reanalyze the data of [HGS10] through a mediational analysis and complement their results with a new experiment. We conclude that human perception of complexity seems to be partly shaped by natural scene statistics, thereby establishing a link between the perception of complexity and the effect of natural scene statistics.
Keywords: visual complexity; visual perception; algorithmic complexity; randomness
Humans are extremely sensitive to patterns and regularities [YKM13]. Our brains detect slight departures from randomness. Someone throwing three dice and getting three ‘6s’ or the pattern ‘1, 2, 3’ is likely to be struck by the regularity of these combinations and, on this meager evidence, to doubt that the dice are fair [FK97]. This general human feature, the discernment of rules governing the world, may be thought of as the cognitive basis of science, but also as an adaptive ability shaped by natural evolution to avoid predictable dangers. Psychologists have linked our natural perception of randomness to the mathematical theory of algorithmic complexity (also known as Kolmogorov-Chaitin complexity): the more complex the stimulus, the more random it will be perceived to be. Formally, the algorithmic complexity of a sequence is the length of the shortest program that produces the sequence in question and halts [LV08]. In this definition, the program does not run on a specific computer, but rather on a Universal Turing Machine, an abstract general-purpose computer. Algorithmic complexity is related to the probability that such a machine, fed with a random program, will produce a particular sequence and halt. This link is formally proven by the coding theorem [Lev74]. Algorithmic probability was initially conceived to solve the problem of induction, which it does in a very general and powerful way [Sol64], so powerful that the measure is ultimately uncomputable, though approximations are possible. The main intuition behind algorithmic probability is that a sequence that is not random contains some regularity, and this regularity can be encoded in a computer program that is shorter than the sequence and generates it by mechanistic means. If each program instruction is chosen uniformly at random, shorter programs are more likely to occur, and therefore more frequent, than longer ones, which establishes a powerful connection between complexity and frequency. Within the framework of algorithmic complexity theory, “randomness” and “complexity” are interchangeable concepts: the formal definition of randomness relies on complexity, and complexity is a direct measure of randomness.
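The connection between frequency and complexity described above can be illustrated with a minimal sketch. It assumes the usual reading of the coding theorem, in which the complexity of a sequence is approximated by the negative base-2 logarithm of its output probability. The function name `coding_theorem_complexity` and the toy list of outputs are hypothetical: the counts are invented for illustration and do not come from any real enumeration of Turing machines.

```python
import math
from collections import Counter

def coding_theorem_complexity(outputs):
    """Estimate complexity as -log2(frequency) for each observed output.

    `outputs` stands in for the strings produced by many randomly chosen
    halting programs (hypothetical data in this sketch).
    """
    counts = Counter(outputs)
    total = sum(counts.values())
    return {s: -math.log2(c / total) for s, c in counts.items()}

# Invented toy sample: frequently produced strings get low complexity.
outputs = ["0000"] * 8 + ["0101"] * 4 + ["0110"] * 2 + ["1011"] * 2
k = coding_theorem_complexity(outputs)

for s in sorted(k, key=k.get):
    print(s, k[s])
```

In this toy sample the string produced most often receives the lowest estimate, which is the sense in which frequent patterns count as simple.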
The hypothesis that the human perception of randomness is linked to algorithmic complexity could not be tested prior to recent developments in computer science. Although methods such as compression algorithms [ZL78] have long allowed satisfactory estimates of the algorithmic randomness of long sequences, until recently no such methods were available to assess the algorithmic complexity of short sequences [STZDG13, STZDG14, ZSTDG12, GZDST14].
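A small sketch may help show why compression works for long sequences but not for short ones. It uses Python's standard `zlib` module (a DEFLATE implementation in the Lempel-Ziv family) as a stand-in compressor; the helper `compressed_length` and the example strings are assumptions made for illustration, not the procedure used in the cited work.

```python
import random
import zlib

def compressed_length(bits: str) -> int:
    """Length in bytes of the zlib-compressed ASCII encoding of a bit string."""
    return len(zlib.compress(bits.encode("ascii"), 9))

rng = random.Random(0)

# Long sequences: compression separates regular from irregular input.
regular_long = "01" * 5000
random_long = "".join(rng.choice("01") for _ in range(10000))
print(compressed_length(regular_long), compressed_length(random_long))

# Short sequences: fixed header and checksum overhead dominates, so the
# compressed size says little about the structure of an 8-bit string.
regular_short = "01010101"
irregular_short = "01101000"
print(compressed_length(regular_short), compressed_length(irregular_short))
```

For the short strings, the compressed output is longer than the 8 bits being described, so its length cannot serve as a usable complexity estimate at that scale.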
In light of algorithmic complexity, we undertook an investigation into how humans perceive randomness, and how we learn (if we do) to perceive complexity. Hsu, Griffiths and Schreiber [HGS10] advanced an interesting hypothesis: the frequency with which a pattern appears in real-world scenes could explain how we perceive randomness, permitting us to infer complexity from the world we see. Hsu et al. [HGS10] scanned a set of photographs of real-world natural scenes and extracted every possible array from these images. They then computed the resulting probability distribution and derived the randomness of each array, defined in terms of the relative frequency of the array in the natural scene database and the probability that the array appears by chance if every cell in the array is selected at random (either white or black). They chose 100 balanced arrays with probabilities of occurrence in real scenes ranging from low to high. They then had 77 subjects decide whether these arrays looked random or not. This led to a measure of subjective randomness (the proportion of participants declaring the array random) for each array. They found that subjective probability and natural randomness were positively correlated on these particular 100 arrays. We found that Kolmogorov-Chaitin complexity is also significantly linked to the subjective perception of randomness. With the arrays published in Hsu et al. [HGS10], we found a correlation of between two-dimensional algorithmic complexity as defined in Zenil et al. [ZSTDG12] (see also Gauvrit, Zenil, Delahaye and Soler-Toscano [STZDG13]) and subjective probability. This pattern of correlations is not surprising. Because the world can be thought of as a generator of patterns, like a random computer program, the probability that an array will occur in the world is linked to its algorithmic complexity. Indeed, as computed with the 100 arrays of Hsu et al. [HGS10], we also found a positive correlation between algorithmic complexity and natural scene statistics. Could natural scene statistics account for human perception of algorithmic complexity? This would be in line with recent results in neuroscience reported by Berkes, Orbán, Lengyel and Fiser [BOLF11].
They analyzed cortical activity in ferrets, and compiled evidence in favor of the hypothesis that our brain learns an optimal internal probabilistic model of the environment, based on natural world frequencies (see also [TVG11] for examples of children’s rapid adaptation to natural frequencies). But how much of our perception of randomness is attributable to learning through the natural world? To answer this question, we performed a mediation analysis using scaled data. A regression of subjective randomness on both algorithmic complexity and natural scene statistics gives an adjusted R-squared equal to . Figure 1(A) displays the coefficients linking complexity to subjective randomness and natural scene statistics to subjective randomness, controlling for algorithmic complexity. In this figure, “Algorithmic complexity” stands for the Kolmogorov-Chaitin complexity of the arrays, as approximated by the method described in Zenil et al. [ZSTDG12]. “Natural statistics” refers to the random function defined above, in which stands for the frequency of array in the natural scenes dataset. Last, “subjective randomness” is a shorthand for , where designates the proportion of participants who indicate that is seemingly random. A Sobel test confirms the mediational role of natural scene statistics.
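The mediation analysis described above can be sketched as follows, under the assumption of a standard three-variable model in which algorithmic complexity (x) influences subjective randomness (y) partly through natural scene statistics (m). With standardized variables, the path coefficients follow from the pairwise correlations, and the Sobel statistic is z = ab / sqrt(b²·se_a² + a²·se_b²). The function names and the synthetic data are hypothetical; nothing here reproduces the values of the study.

```python
import math
import random

def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def sobel_mediation(x, m, y):
    """Standardized path coefficients and Sobel z for the chain x -> m -> y."""
    n = len(x)
    r_xm, r_xy, r_my = pearson(x, m), pearson(x, y), pearson(m, y)
    a = r_xm                                         # path x -> m
    b = (r_my - r_xy * r_xm) / (1 - r_xm ** 2)       # path m -> y, controlling for x
    direct = (r_xy - r_my * r_xm) / (1 - r_xm ** 2)  # path x -> y, controlling for m
    r2 = b * r_my + direct * r_xy                    # R-squared of y on x and m
    se_a = math.sqrt((1 - r_xm ** 2) / (n - 2))
    se_b = math.sqrt((1 - r2) / ((n - 3) * (1 - r_xm ** 2)))
    z = a * b / math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    return {"a": a, "b": b, "direct": direct, "total": r_xy, "z": z}

# Synthetic data with a built-in mediated effect (illustration only).
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(100)]
m = [0.8 * xi + 0.6 * rng.gauss(0, 1) for xi in x]
y = [0.3 * xi + 0.6 * mi + 0.5 * rng.gauss(0, 1) for xi, mi in zip(x, m)]

result = sobel_mediation(x, m, y)
print(result)
```

The total standardized effect decomposes exactly into the direct effect plus the indirect effect a·b, which is the quantity the Sobel test evaluates.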
The result suggests that our perception of complexity is partially driven by the perception of natural scenes. However, two points may bias the values found here. First, we cannot control the link between “natural scene statistics” (i.e. the “random” function) and the choice of the set of pictures. Second, because the 100 arrays chosen for use here fall within certain parameters (they are all balanced, and have been chosen in such a way that they are evenly distributed on the natural scene statistics scale), the variance of natural scene statistics could be artificially high. In the following experiment, we address these two possible drawbacks in order to get a clearer view of the possible mediational role of natural scene statistics in the perception of complexity.
We performed an experiment similar to the one presented above, but relaxed some constraints that could affect the results. We did not require every pattern to be balanced in terms of white and black cells, and we did not restrict the pictures to still nature shots. Our hypothesis is that even when these constraints are relaxed, natural scene statistics will play a mediational role.
A sample of 100 participants (59 male, 41 female) was recruited via Amazon Mechanical Turk. Mechanical Turk “workers” were required to have a 90% approval rating on previous Mechanical Turk tasks (HITs) and at least 50 previously approved HITs. Ages ranged from 19 to 55 years. Participants were paid 0.30 USD for their participation. The duration of the experiment ranged from 84 s to 289 s. Older participants in this sample showed a slight tendency to need more time.
Hsu et al. [HGS10] used a set of 62 pictures previously used by Doi, Inui, Lee, Wachtler and Sejnowski [ILW03] to compute the natural scene statistics. All pictures were still nature shots, including no faces, urban scenes or artificial objects. Therefore, the random function may vary if computed with other sets of pictures. To test this hypothesis, we applied the method used by Hsu et al. [HGS10] to a new set of 100 random pictures, taken from the Wikimedia Commons database (http://commons.wikimedia.org/wiki/Special:Random/Image). The sample included natural scenes but also animals and non-natural objects such as buildings. We binarized the pictures to black and white pixels using the median as the threshold. We then divided each image into adjacent binary square arrays and calculated (for the whole set of 100 images) the probability of each square. The resulting “random” function is strongly correlated with the data obtained by Hsu et al. [HGS10] when computed on their choice of 100 arrays, which validates the method. The correlation between natural scene statistics (function random) and algorithmic complexity was .50 when computed on the 100 arrays chosen by Hsu et al. [HGS10]. However, because the choice of arrays was not random, the correlation could well be overestimated. When computed on every possible array found while scanning the 100 random pictures, the correlation remains highly significant, although slightly lower, confirming the previous result. We then picked at random a sample of 100 arrays from among all the arrays found in our set of 100 images, using the sample function in R. We did not contrive to obtain balanced arrays, a departure from the design of Hsu et al. [HGS10]. Figure 2 displays the 100 arrays obtained by random selection.
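The image-processing steps described above (median binarization, division into adjacent square arrays, estimation of array probabilities, random selection of arrays) can be sketched as follows. Several choices are assumptions made for illustration: the array side `BLOCK = 4`, the rule that pixels strictly above the median become 1, the use of synthetic grayscale images in place of real pictures, and the use of Python's `random.sample` where the text mentions the sample function in R.

```python
import random
from collections import Counter
from statistics import median

BLOCK = 4  # assumed side length of the square arrays

def binarize(image):
    """Threshold a grayscale image (list of rows) at its median pixel value."""
    threshold = median(p for row in image for p in row)
    return [[1 if p > threshold else 0 for p in row] for row in image]

def blocks(binary, size=BLOCK):
    """Yield adjacent, non-overlapping size x size arrays as tuples of row tuples."""
    height, width = len(binary), len(binary[0])
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            yield tuple(tuple(binary[top + i][left:left + size]) for i in range(size))

def array_probabilities(images, size=BLOCK):
    """Relative frequency of each binary array over a whole set of images."""
    counts = Counter()
    for image in images:
        counts.update(blocks(binarize(image), size))
    total = sum(counts.values())
    return {array: count / total for array, count in counts.items()}

# Synthetic 16 x 16 grayscale images standing in for real photographs.
rng = random.Random(0)
images = [[[rng.randrange(256) for _ in range(16)] for _ in range(16)]
          for _ in range(3)]

probs = array_probabilities(images)
chosen = rng.sample(sorted(probs), k=min(5, len(probs)))  # random pick of arrays
print(len(probs), len(chosen))
```

Because no balance constraint is applied when sampling, the selected arrays can contain any mix of black and white cells, as in the design described above.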
The procedure mirrored the one used in Hsu et al. [HGS10], although our experiment took place online. Participants filled out a questionnaire similar to that used by Hsu et al. [HGS10]. They were informed that a series of arrays would appear on the screen, and that their task was to decide whether each array was produced by a random process or by a nonrandom process. For each array, they were asked to press one of two buttons, “random” or “not random”, according to their perception.
The data were analyzed with the same method as Hsu et al. [HGS10]. Algorithmic complexity is positively correlated with natural statistics and subjective randomness, as are subjective randomness and natural statistics. A multiple regression of subjective randomness on algorithmic complexity and natural scene statistics yields an adjusted R-squared of . Figure 1(B) displays the coefficients linking complexity to subjective randomness and natural scene statistics to subjective randomness, controlling for algorithmic complexity. A Sobel test confirms the mediational role of natural scene statistics.
Perhaps because we dropped the constraints imposed in the Hsu et al. [HGS10] study (balanced arrays scattered along the natural probability range), the coefficients are now smaller. However, the patterns of correlations are remarkably similar. These results suggest that natural scene statistics are indeed an important element in the perception of complexity. The correspondence between the reanalysis and the subsequent experiment also suggests that there is some objective natural probability of arrays, linked both to our perception of complexity and to the formal definition of complexity arising from Kolmogorov-Chaitin theory. Although our perception of complexity may be largely explained by natural scene statistics, this does not preclude the possibility of a complementary means of perception, which could eventually turn out to be innate. Further studies would be needed to confirm this assumption.
- [BOLF11] Pietro Berkes, Gergő Orbán, Máté Lengyel, and József Fiser. Spontaneous cortical activity reveals hallmarks of an optimal internal model of the environment. Science, 331(6013):83–87, 2011.
- [FK97] Ruma Falk and Clifford Konold. Making sense of randomness: Implicit encoding as a basis for judgment. Psychological Review, 104(2):301, 1997.
- [GZDST14] Nicolas Gauvrit, Hector Zenil, Jean-Paul Delahaye, and Fernando Soler-Toscano. Algorithmic complexity for short binary strings applied to psychology: a primer. Behavior Research Methods, 46(3):732–744, 2014.
- [HGS10] Anne S Hsu, Thomas L Griffiths, and Ethan Schreiber. Subjective randomness and natural scene statistics. Psychonomic Bulletin & Review, 17(5):624–629, 2010.
- [ILW03] Toshio Inui, Tai Sing Lee, Thomas Wachtler, Terrence J Sejnowski, et al. Spatiochromatic receptive field properties derived from information-theoretic analyses of cone mosaic responses to natural scenes. Neural Computation, 15(2):397–417, 2003.
- [Lev74] Leonid A. Levin. Laws of information conservation (non-growth) and aspects of the foundation of probability theory. Problems of Information Transmission, 10:206–210, 1974.
- [LV08] M. Li and P. Vitányi. An Introduction to Kolmogorov Complexity and Its Applications. Springer, Heidelberg, 2008.
- [Sol64] R. J. Solomonoff. A formal theory of inductive inference: Parts 1 and 2. Information and Control, 7(1-22):224–254, 1964.
- [STZDG13] Fernando Soler-Toscano, Hector Zenil, Jean-Paul Delahaye, and Nicolas Gauvrit. Correspondence and independence of numerical evaluations of algorithmic information measures. Computability, 2(2):125–140, 2013.
- [STZDG14] Fernando Soler-Toscano, Hector Zenil, Jean-Paul Delahaye, and Nicolas Gauvrit. Calculating Kolmogorov complexity from the output frequency distributions of small Turing machines. PLoS ONE, 9(5), 2014.
- [TVG11] Ernő Téglás, Edward Vul, Vittorio Girotto, Michel Gonzalez, Joshua B Tenenbaum, and Luca L Bonatti. Pure reasoning in 12-month-old infants as probabilistic inference. Science, 332(6033):1054–1059, 2011.
- [YKM13] Yuki Yamada, Takahiro Kawabe, and Makoto Miyazaki. Pattern randomness aftereffect. Scientific Reports, 3, 2013.
- [ZL78] Jacob Ziv and Abraham Lempel. Compression of individual sequences via variable-rate coding. IEEE Transactions on Information Theory, 24(5):530–536, 1978.
- [ZSTDG12] Hector Zenil, Fernando Soler-Toscano, Jean-Paul Delahaye, and Nicolas Gauvrit. Two-dimensional Kolmogorov complexity and validation of the coding theorem method by compressibility. arXiv preprint arXiv:1212.6745, 2012.