Introduction
Fractals are of frequent interest to the mathematical art community, perhaps owing to the ease with which fractals produce aesthetically pleasing designs. Iterated Function System (IFS) fractals are a particular type of fractal, constructed by repeatedly applying a set of transformations to an initial image. In this paper, I demonstrate a system for drawing IFS fractals written in Pytorch, a programming language with automatic differentiation capability. This program takes as input a line drawing, builds an IFS fractal using that drawing as a base, and outputs an image. The advantage of doing this in Pytorch is that it enables us to optimize our original line drawing so that the produced IFS fractal resembles a target image. The main inspiration for this project was ”Differentiable Drawing and Sketching”, by Mihai et al.
[6]. The authors presented a framework for differentiating through the rasterization of lines and curves, using the insight that the distance from a pixel to a line can act as a differentiable proxy for rasterization. Since distance estimators have a long history of use in the fractal rendering community
[1, 2, 5], it seemed natural and exciting to try to combine some of those ideas with the techniques from Mihai et al.SelfSimilar Curves
A common construction for selfsimilar curves is to start with a base figure, and then recursively replace parts of the current figure with a transformed copy of the entire figure. To do this, we apply a set of (usually affine) transformations to the starting figure. Adding together the images produced by each transformation yields a new figure, which can be transformed the same way. For example, the famous von Koch curve (see Figure 1) is generated by beginning with the line segment between and and recursively applying the four affine transformations shown in Equation 1. The attractor of an IFS is the limit figure which results from applying this process infinitely many times.

(1) 
Differentiable Rendering
Automatic Differentiation (“autodiff” or AD) is a programming paradigm that has aided in the explosive growth of deep learning. The basic idea of autodiff is that all basic operations in the programming language are defined in a way that includes their derivative. This means that we can use the chain rule to compute the derivative of our entire program with respect to its inputs, enabling optimization via gradient descent. Recent work has successfully used autodiff to write full rasterization pipelines which are differentiable. Mihai et al.
[6] detail how to (differentiably) turn an implicit line segment into a pixelated picture of a line segment; in the next section we take the ability to do this as a given, and use it to produce fractal images.Rendering Process
The parameters of our differentiable fractal are a set of variable control points , where all , and a pair of endpoints . For stability during optimization, the endpoints are fixed. The basic process in generating a fractal figure from a set of control points is as follows:

Build the list of line segments: Start with the single line segment . Then, up to a set maximum number of recursions , replace the current list of line segments with the result of applying each transformation to each line segment in the current list. The result is a set of segments total.

Compute the distance from each pixel to each line segment .

Color the pixel with the value , where the minimization over simply means we are taking the closest of the line segments to that pixel. is a radius parameter that determines the radius of the line; I fixed it at , so that the line is one pixel wide.
To get an IFS fractal image that looks like our source image, all we need to do is implement the above steps in Pytorch (see attached notebook). An optimization algorithm then adjusts the position of the control points to minimize the loss, i.e. so that our rendered image matches some target image. My original goal for this project was to find IFS fractals that filled the unit square, without selfintersection. Thus, for all of my experiments the target image was a filled square, and I added a term to the loss function that discouraged selfintersection (see below). Certainly the same process could be applied to other images. The maximum recursion depth,
, should be set to be as deep as possible within computation constraints. I will now discuss some of the details of the rendering pipeline; readers who want to recreate the images in this paper should refer to the included Jupyter notebook. See Figure 3 for example IFS fractals generated with this code.Initial Conditions.
Since I wanted fractals which did not selfintersect, I set the initial control points to produce fractals that already had this property, like the von Koch curve or the Sierpinski Arrowhead curve. By starting with the set of control points that produce one of these fractals and adding some random noise to the control points, we get a different fractal each time we perform the optimization.
Calculating Transformations.
An important ingredient in the above process is the calculation of the linear transformation which takes our endpoints to each line segment. To do this for two line segments and , we: a) find the translation that takes to
, b) find the rotation that aligns the two vectors, and then c) scale
to have the same length as . The composition of these three linear transformations is the one we want. We do this for each line segment between subsequent control points. Optionally, we can reflect the new copy of the transformed image by mapping to instead.Image Mapping Trick.
Once the first few levels of the IFS fractal have been generated using line segments, we can easily approximate further levels without needing to explicitly recompute distances. To do this, we simply take the rendered image and apply each of our transformations . We can use builtin Pytorch methods for differentiably applying affine transformations to images. This method is less accurate than the explicit distance calculations, but can approximate the appearance of many () levels of recursion, while only explicitly calculating the distances for levels.
Loss Function.
Following the example of [6], I used Blurred Mean Squared Error (BMSE) as the loss function that my optimizer was trying to minimize. MSE encourages our rendered image to resemble the target image, and blurring both images before comparing them keeps our optimizer from getting stuck in local optima. In all cases, the ‘target image’ I was training against was just the unit square.
Preventing Crossings.
There are two tricks I found which help prevent selfcrossings. The first is to replace the list of control points with a cumulative sum of vectors in  so each parameter vector no longer represents a single point but instead the offset from to . This has the effect of ’unwinding’ loops in the set of points. The second trick I used was a penalty function, added to the loss, which discourages self crossings. This was fairly easy to implement: since we already need to compute the distance from a pixel to the closest line segment, we can keep track of the secondclosest while we do so. The product of the closest distance and the secondclosest distance is a number that will be small when our pixel is simultaneously close to multiple line segments. This makes a ’heatmap’ of crossings. We can add the mean value of this heatmap to our loss to discourage selfintersection. See Figure 2 for an example heatmap.
Adaptive Image Resolution
Optimizing a coarse image is faster, but lowresolution images don’t look very good. To generate higher resolution images I start from the lowresolution image, and refine it. This is done by repeatedly a) resizing the image to be larger, and then b) finding the distance only for pixels that were close to the fractal in the previous resolution.
Summary and Conclusion
This work presents an initial attempt at a differentiable rendering pipeline for IFS fractals. There are several modifications that could be make to make the process more efficient, such as the rendering tricks mentioned in [3]. It may also be possible to use this approach for smooth selfsimilar curves, adapting the approach of [4]. Finally, I would like to repeat these experiments with the ’target’ image being something more interesting than the unit square, like a natural image or an image of a preexisting fractal. This might be a technique for recovering the original parameters for an IFS fractal from an image.
References
 [1] (2011) Distance estimated 3d fractals. External Links: Link Cited by: Introduction.
 [2] (1989) Ray tracing deterministic 3d fractals. In Proceedings of the 16th annual conference on Computer graphics and interactive techniques, pp. 289–296. Cited by: Introduction.
 [3] (1991) Rendering methods for iterated function systems. Cited by: Summary and Conclusion.
 [4] (2011) Smooth selfsimilar curves. In Proceedings of Bridges 2011: Mathematics, Music, Art, Architecture, Culture, R. Sarhangi and C. H. Séquin (Eds.), Phoenix, Arizona, pp. 209–216. External Links: ISBN 9780984604265, ISSN 10996702 Cited by: Summary and Conclusion.
 [5] (2014) Numerical methods for ray tracing implicitly defined surfaces. Williams College. Cited by: Introduction.
 [6] (2021) Differentiable drawing and sketching. arXiv preprint arXiv:2103.16194. Cited by: Introduction, Differentiable Rendering, Loss Function..