
# Space-time tradeoffs of lenses and optics via higher category theory

Optics and lenses are abstract categorical gadgets that model systems with bidirectional data flow. In this paper we observe that the denotational definition of optics - identifying two optics as equivalent by observing their behaviour from the outside - is not suitable for operational, software oriented approaches where optics are not merely observed, but built with their internal setups in mind. We identify operational differences between denotationally isomorphic categories of cartesian optics and lenses: their different composition rule and corresponding space-time tradeoffs, positioning them at two opposite ends of a spectrum. With these motivations we lift the existing categorical constructions and their relationships to the 2-categorical level, showing that the relevant operational concerns become visible. We define the 2-category 2-Optic(𝒞) whose 2-cells explicitly track optics' internal configuration. We show that the 1-category Optic(𝒞) arises by locally quotienting out the connected components of this 2-category. We show that the embedding of lenses into cartesian optics gets weakened from a functor to an oplax functor whose oplaxator now detects the different composition rule. We determine the difficulties in showing this functor forms a part of an adjunction in any of the standard 2-categories. We establish a conjecture that the well-known isomorphism between cartesian lenses and optics arises out of the lax 2-adjunction between their double-categorical counterparts. In addition to presenting new research, this paper is also meant to be an accessible introduction to the topic.

05/08/2022


## 1 Introduction

Lenses and optics have recently received a great deal of attention from the applied category theory community. They are abstract data structures that model systems exhibiting bidirectional data flow. They have been discovered in a number of disparate places: deep learning, game theory ([DiegeticOpenGames, CompositionalGameTheory]), Bayesian learning ([Bayesianlearning, BayesianOG, ValueIterationIsOpticComposition]), database theory ([FunctorialAggregation]), dynamical systems ([OpenDynamicalSystems]), data accessors ([ModularDataAccessors]), trading protocols ([Escrows]), server operations ([LensesForComposableServers]) and more ([LensResources]).

As is evident from the breadth of their applications, lenses and optics don’t assume that the underlying systems are of any particular kind. Instead, they are defined parametrically for some base category $\mathcal{C}$, which is only required to satisfy a minimal set of axioms. By appropriately instantiating this category we can recover various kinds of systems: deterministic, probabilistic, differentiable, and so on. This makes it possible to treat bidirectionality in an abstract way, proving theorems about whole classes of bidirectional processes that satisfy particular properties.

The two constructions we focus on in this paper, cartesian lenses and optics, are related, but have different requirements on the base category. (Footnote 1: We note that there is a whole zoo of bidirectional gadgets, each with their own kind of behavior, and their own abstract interface that the underlying world needs to satisfy. A lot of effort has been put towards representing all of these constructions in a unifying way; see any of [DependentOptics, FibreOptics, ProfunctorOptics, GeneralisedLensCategories].) For optics to be defined, we require the base category to permit parallel composition of processes, i.e. a monoidal structure. For lenses we additionally require this structure to be cartesian, i.e. to provide the ability to coherently copy and delete information. These play well together: defining optics in a cartesian category gives us a category isomorphic to lenses, as is well established in the literature.

In this paper we observe that this isomorphism is denotational in nature and blind to operational concerns relevant to practical implementations. Namely, it treats optics extensionally, describing them as being observed from the outside. This means that any matters of their internal setup, especially ones relevant to making a distinction between an efficient and an inefficient implementation, are ignored. But in a modern, software oriented world we’re not merely observing these optics from the outside in; we’re instead building them from the inside out. We’re choosing their particular internal states, and in most cases we don’t have the luxury of not distinguishing between an efficient and an inefficient representation, as often only the former can compute an answer for us. As the current categorical framework doesn’t have a high-enough resolution to formally capture these distinctions, we seek to provide one. We lift the existing 1-categorical formalism to a 2-categorical one. We show how to track and manipulate the internal state of optics, making a distinction between denotationally equivalent, but operationally different kinds of optics.

We start out by unpacking the category of lenses and emphasizing a seldom talked about perspective: that lenses are cartesian optics with one option removed, namely the option to choose the type of their internal state. We show how the lack of a tangible way to refer to this important notion has significant operational consequences when composing lenses. Namely, we’ll see that lenses implement a particular kind of space-time tradeoff that in the deep learning literature is called gradient checkpointing. We move on to unpacking the category of optics and notice that they have a different composition rule than lenses, implementing a different space-time tradeoff. What motivates the rest of the paper is the observation that this difference is completely invisible to the categorical machinery.

We observe that optics are defined by a particular kind of colimit, suggesting an avenue forward by instead defining them as an oplax colimit. We do so, and thus define the 2-optics: a 2-category whose 2-cells now explicitly track their internal state. We show that they coherently reduce to 1-optics by locally computing their connected components.

We then go on to explore the isomorphism in this 2-categorical setting. We show that the 2-categorical setting now locally hosts an adjunction between the corresponding categories, and that the embedding of lenses into cartesian optics is upgraded from a functor to an oplax functor whose oplaxator now detects the different composition rule. We determine the difficulties in showing that this oplax functor forms part of an adjunction in any of the standard 2-categories whose 1-cells are lax functors. We establish a conjecture that the well-known isomorphism between cartesian lenses and optics arises out of the lax 2-adjunction between their double-categorical counterparts, as an image under the local connected components quotient.

### 1.1 Acknowledgements.

We thank Igor Baković, Fosco Loregian, Mario Román, Matteo Capucci and Jules Hedges for helpful conversations.

### 1.2 Notation.

We write morphisms in diagrammatic order, so the composition of $f : A \to B$ and $g : B \to C$ is written as $f ⨟ g$. We write $\pi_2$ for the projection onto the 2nd component.

## 2 Cartesian Lenses

Lenses come in many shapes and sizes. In this paper, we tell the story from the point of view of cartesian lenses ([BimorphicLenses, CategoriesOfOptics, ProfunctorOptics]). (Footnote 2: As opposed to closed.) This is a category which we denote by $\mathbf{Lens}_{Cart}(\mathcal{C})$. We proceed to unpack its contents and, as most of the content of this paper is motivated by its previously unnoticed operational aspects, we take special care in doing so. We look out for any potentially resource-relevant operations such as data copying or recomputation.

$\mathbf{Lens}_{Cart}(\mathcal{C})$ can be defined for any base category $\mathcal{C}$ which is cartesian monoidal. Its objects are pairs of objects in $\mathcal{C}$, denoted by $\binom{A}{A'}$, where we interpret the value of type $A$ as going forward and the value of type $A'$ as going backward.

A morphism $\binom{A}{A'} \to \binom{B}{B'}$ is a cartesian lens. It consists of two morphisms in $\mathcal{C}$, $\mathsf{get} : A \to B$ and $\mathsf{put} : A \times B' \to A'$, roughly thought of as the forward and the backward part of a lens. (Footnote 3: Other names are sometimes used in place of $\mathsf{get}$ and $\mathsf{put}$.) We can visualise lenses graphically using the formalism of string diagrams [Selinger] (Figure 1), an especially useful visual language for studying the flow of information in a lens.

The flow of information works as follows. Information starts at the input of type $A$. The lens takes in this input and produces two things: a copy of it (sent down the vertical wire, where the operation of copying is drawn as a black dot) and the output $B$ (via the $\mathsf{get}$ map). This is the forward pass of the lens, and is drawn with the gray outline. Then, the environment takes this output and turns it into a response $B'$ (not drawn). This lands us in the backward pass of the lens. Here the lens, via the $\mathsf{put}$ map, consumes two things: the response and the previously saved copy of the input on the vertical wire, turning them into $A'$.

A lens has an inside and an outside. The outside consists of the ports $\binom{A}{A'}$ and $\binom{B}{B'}$. These ports are the interface to which other lenses connect. The inside is the vertical wire whose type is $A$. The vertical wire is the internal state of the lens (sometimes also called the residual), mediating the transition between the forward and the backward pass.

Here we are explicitly referring to the internal state because of what’s to come, but we emphasize that in the lens literature this concept hasn’t been reified. In the lens literature the internal state is not explicit data that can be manipulated, and is instead implicitly threaded through definitions and theorems, always being pegged to the forward part of the domain of a lens. More precisely, the type of the internal state of a lens with domain $\binom{A}{A'}$ is always equal to $A$. In what follows, we will see how this lack of a tangible way to refer to this important notion has significant operational consequences when composing lenses.
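The data of a cartesian lens can be sketched in code. The following is a minimal Python illustration and not part of the original construction: the base category is assumed to be plain Python functions, and the names `Lens`, `run_lens` and the `square` example are hypothetical. Note that the value saved between the two passes is always the input itself.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Lens:
    """A cartesian lens (A, A') -> (B, B'), modelled with plain functions."""
    get: Callable[[Any], Any]       # forward part:  A -> B
    put: Callable[[Any, Any], Any]  # backward part: A x B' -> A'


def run_lens(lens: Lens, a: Any, env: Callable[[Any], Any]) -> Any:
    """Forward pass, environment response, then backward pass."""
    saved = a                    # internal state: always a copy of the input
    b = lens.get(a)              # forward pass produces the output B
    b_response = env(b)          # the environment turns B into a response B'
    return lens.put(saved, b_response)  # backward pass produces A'


# Hypothetical example: squaring, with `put` as its reverse derivative.
square = Lens(get=lambda x: x * x, put=lambda x, db: 2 * x * db)
gradient = run_lens(square, 3.0, lambda b: 1.0)
```

In this sketch there is no field for the residual type: the lens has no way to store anything other than its own input.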

“The simplicity of the presentation of lenses is balanced by the complexity of their composition.”

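Armed with the above motto we proceed to unpack the definition of lens composition. Suppose we have two lenses: $\binom{\mathsf{get}_1}{\mathsf{put}_1} : \binom{A}{A'} \to \binom{B}{B'}$ and $\binom{\mathsf{get}_2}{\mathsf{put}_2} : \binom{B}{B'} \to \binom{C}{C'}$, as drawn below in Figure 3.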

Using the graphical formalism of the figure above, it seems reasonable to define the composite of these two lenses simply by plugging them together along the two matching ports $\binom{B}{B'}$. We draw the result of this in Figure 4. We invite the reader to ponder this definition before moving on. Is this composition well-defined?

The answer is no! What is drawn above is not a lens. It turns out that this elegant and plausible looking solution has an issue. Namely, if we look at the figure, we see that the internal state of this supposed lens is $A \times B$. But we’ve previously established that the type of the internal state of every lens with domain $\binom{A}{A'}$ is pegged to $A$ itself, as the internal state is not data available for manipulation. This means that what we’ve defined above is some kind of a bidirectional process, but not a lens. Another way to see this is to try to write down the $\mathsf{get}$ and $\mathsf{put}$ maps explicitly. Below we do so: we write out the correct definition of lens composition (forgetting the above image for a moment).

###### Definition 1 (Lens composition).

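Consider two lenses:

$\binom{\mathsf{get}_1}{\mathsf{put}_1} : \binom{A}{A'} \to \binom{B}{B'}$ and $\binom{\mathsf{get}_2}{\mathsf{put}_2} : \binom{B}{B'} \to \binom{C}{C'}$.

Their composite is defined as:

$$
\begin{aligned}
\mathsf{get} &\coloneqq A \xrightarrow{\mathsf{get}_1} B \xrightarrow{\mathsf{get}_2} C \\
\mathsf{put} &\coloneqq A \times C' \xrightarrow{\mathsf{graph}(\mathsf{get}_1) \times C'} A \times B \times C' \xrightarrow{A \times \mathsf{put}_2} A \times B' \xrightarrow{\mathsf{put}_1} A'
\end{aligned}
$$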

While the definition of the composite $\mathsf{get}$ is simple, the composite $\mathsf{put}$ is more complex. (Footnote 4: We observe that $\mathsf{put}$ defines a generalised form of the chain rule [GradientBasedLearning, p. 11].) If we look at $\mathsf{put}$, we first see $\mathsf{graph}(\mathsf{get}_1)$ applied to $A$, which copies the input and applies $\mathsf{get}_1$ to one of the copies. (Footnote 5: We write out the formal definition of $\mathsf{graph}$ in Def. 7.) That copy results in a $B$, which is used in $\mathsf{put}_2$. The $\mathsf{put}_2$ map gives us a $B'$, which is used together with the other copy of $A$ to obtain an $A'$ using $\mathsf{put}_1$. Observe that this is the only way lens composition can be defined. (Footnote 6: This can be seen by reasoning backwards: “the only thing that we can use to produce $A'$ is $\mathsf{put}_1$, and the only way to produce its inputs is by…”.) Lens composition is shown graphically in Figure 5.
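The composition rule of Definition 1 can be sketched as follows. This is an illustrative Python rendering under the assumption that lenses are pairs of plain functions `(get, put)`; the names `compose_lenses`, `square` and `triple` are hypothetical. The line marked `graph(get1)` is where the intermediate value is recomputed from the saved input.

```python
from typing import Any, Callable, Tuple

# A lens is modelled as a pair (get, put) of plain functions.
Lens = Tuple[Callable[[Any], Any], Callable[[Any, Any], Any]]


def compose_lenses(l1: Lens, l2: Lens) -> Lens:
    get1, put1 = l1
    get2, put2 = l2

    def get(a):
        return get2(get1(a))      # get1, then get2

    def put(a, dc):
        b = get1(a)               # graph(get1): keep a, recompute b from it
        db = put2(b, dc)          # A x put2
        return put1(a, db)        # put1

    return get, put


# Hypothetical lenses: x -> x^2 and y -> 3y, with reverse derivatives as put.
square = (lambda x: x * x, lambda x, db: 2 * x * db)
triple = (lambda y: 3 * y, lambda y, dc: 3 * dc)

get, put = compose_lenses(square, triple)
forward = get(2.0)         # 3 * 2^2
backward = put(2.0, 1.0)   # derivative of 3x^2 at x = 2
```

In this example the composite `put` computes the derivative of the composite `get`, which is the chain rule instance mentioned in Footnote 4.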

We can immediately observe that this is different from our original guess: 1) there are two $\mathsf{get}_1$ maps, and 2) the input is copied twice, not once. With the hindsight that we’re interested in implementing these lenses in software, the fact that some functions are computed twice raises some suspicions about the feasibility of such an implementation. To get a better sense of what’s going on, we up the stakes and depict a composition of three lenses in Figure 6.

At this point things start to look crowded. There are now 6 $\mathsf{get}$ maps. We are also copying the input three times in total. In general, it seems that composing more lenses only exacerbates the problem. What is going on?

If we look closely, we see that the backward pass of this composite computes, independently and from scratch, what each $\mathsf{put}$ map needs. For instance, $\mathsf{put}_3$ uses $\mathsf{get}_1 ⨟ \mathsf{get}_2$ to compute its internal state, and $\mathsf{put}_2$ uses $\mathsf{get}_1$, while $\mathsf{put}_1$ uses just the already available $A$, but none of these computations share results with each other.

This strategy of recomputing every intermediate result from scratch might certainly seem disadvantageous, but we observe that it’s a part of a tradeoff: this strategy uses less memory. Only the initial state $A$ needs to be preserved in memory, and everything needed for the backward pass can be computed from it. It is also never the case that two of the recomputations in the backward pass of Figure 6 need to be carried out in parallel at the same time (which would require more memory): the output of one has to be consumed before the output of the next can be used. (Footnote 7: To help with intuition, we invite the reader to have a look at the animation of this process, available at the following link.)

This means that lens composition picks a particular space-time tradeoff when solving the issue of backpropagating information. It uses less space (as it doesn’t need to save intermediate states of computation in memory), but more time (as it needs to recompute data).

###### Remark 1.

This kind of space-time tradeoff has a name in the deep learning and automatic differentiation community: it’s called gradient checkpointing [Checkpointing, SublinearCost]. It is often used with very large neural networks where storing all the intermediate results is prohibitive memory-wise, or when the available computational resources are memory-constrained. While it is understood that lenses are intricately tied to the chain rule, to the best of our knowledge this is the first time the connection between lenses and gradient checkpointing has been established.

The explanation of why the structure of lenses ended up implementing this particular tradeoff can be seen in Figure 7, showing a composite of two lenses. Here we see the residual circled in blue mediating the passage from the forward pass to the backward pass. Observe that all the data communicated between the forward and the backward pass has to be squeezed through this $A$-shaped hole.

This means that the intermediate state of type $B$ that is computed in the forward pass can’t be communicated to the backward pass. Instead, there is no other way but for the backward pass to separately recompute this information from $A$. (Footnote 8: If we have a composite of 3, 4, or in general $n$ lenses, then there are $n$ levels of separate computation, each consisting of a composed sequence of maps. The memory required to compute gradients in our graph is constant in the number of layers $n$, but the number of node evaluations scales with $n^2$.) This itself arises precisely because, when defining a lens, we have no freedom to choose the type of data that will be communicated from the forward pass to the backward pass.
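The growth in recomputation can be observed by counting how often forward maps run. The sketch below is illustrative only: it assumes lenses are pairs of Python functions composed left to right, and the names `make_lens`, `compose_lenses` and `count_get_calls` are hypothetical. One forward and one backward pass through $n$ composed lenses is executed while only the initial input is kept.

```python
from functools import reduce

counter = {"get": 0}  # number of forward-map evaluations


def make_lens(k):
    """Hypothetical lens a -> a + k that counts how often `get` runs."""
    def get(a):
        counter["get"] += 1
        return a + k

    def put(a, db):
        return db

    return get, put


def compose_lenses(l1, l2):
    get1, put1 = l1
    get2, put2 = l2

    def get(a):
        return get2(get1(a))

    def put(a, dc):
        b = get1(a)                   # recomputed from the saved input
        return put1(a, put2(b, dc))

    return get, put


def count_get_calls(n):
    """Run one forward and one backward pass through n composed lenses."""
    counter["get"] = 0
    get, put = reduce(compose_lenses, [make_lens(k) for k in range(n)])
    a = 0        # the only value kept between the two passes
    get(a)       # forward pass
    put(a, 1)    # backward pass
    return counter["get"]


counts = [count_get_calls(n) for n in (1, 2, 3, 10)]
```

For this particular nesting the counts are $n(n+1)/2$: $n$ evaluations in the forward pass and $n(n-1)/2$ recomputations in the backward pass, so time grows quadratically while the saved state stays a single value.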

###### Remark 2.

This phenomenon seems to have first been observed in [SimpleAD, Section 3.1.] where the author described their initial attempts to efficiently compute reverse-mode derivatives with cartesian lenses, only to identify the aforementioned redundancy problems. He went on to propose a solution using closed lenses, something we touch upon in Remark 8. Interestingly, the author never used the term lens in the entire paper.

### 2.1 Where to?

We’ve established that lens composition implements a generalised form of the chain rule in a manner that uses less space but more time. This is a result of the absence of an explicit way to refer to the internal state of lenses. Two questions now become natural to ask: a) How can we recover other space-time tradeoffs? and b) How can we explicitly refer to and manipulate this internal state? In the next section we answer both of these questions with optics.

## 3 Optics

The category of optics is a generalisation of the category of lenses, and has been thoroughly studied in the literature [ProfunctorOptics, CategoriesOfOptics, StringDiagramsForOptics, ModularDataAccessors]. Unlike lenses, optics do not require a cartesian structure and can instead be defined for any base category that is merely monoidal. (Footnote 9: This is not the most general definition of optics, see [DependentOptics, FibreOptics, ProfunctorOptics].) We denote this category by $\mathbf{Optic}(\mathcal{C})$ and proceed to unpack its contents. Its objects are pairs of objects in $\mathcal{C}$, just like with lenses. However, differences start appearing once we start looking at the morphisms.

###### Definition 2 ([CategoriesOfOptics, Def. 2.0.1.]).

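The set of optics $\binom{A}{A'} \to \binom{B}{B'}$ is defined as the following coend

$$
\mathbf{Optic}(\mathcal{C})\left(\binom{A}{A'}, \binom{B}{B'}\right) \coloneqq \int^{M : \mathcal{C}} \mathcal{C}(A, M \otimes B) \times \mathcal{C}(M \otimes B', A')
$$

Its elements are equivalence classes of triples $(M, f, f')$, where $M : \mathcal{C}$, $f : A \to M \otimes B$ and $f' : M \otimes B' \to A'$. They’re quotiented out by the equivalence relation where $(M, f, f') \sim (N, g, g')$ if there is a residual morphism $r : M \to N$ in $\mathcal{C}$ such that the following diagrams commute (written here as equations):

$$
f ⨟ (r \otimes B) = g \qquad\qquad (r \otimes B') ⨟ g' = f' \tag{1}
$$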

This definition might look daunting, but it is the result of the dualisation of Motto 2, whose consequence will be a more straightforward definition of composition. Nonetheless, we will see that each part of the definition has intuitive meaning. An optic has three components: the object $M$, the type of the internal state; the forward map $f : A \to M \otimes B$; and the backward map $f' : M \otimes B' \to A'$. The shape of an optic is drawn in Figure 8, and it has a similar data flow to a lens. It takes in some $A$ in the forward pass, and using the forward map it produces the product $M \otimes B$, for the chosen type $M$. The environment then takes in the $B$ and responds with a $B'$, allowing the backward part to use $M \otimes B'$ and produce $A'$.

The last component of the definition is the equivalence relation. It’s a formal description of the idea that we think of optics as being observed from the outside. This means that the type of the internal state of an optic isn’t externally available information. This in turn means that there is no way to distinguish between two optics that have the same extensional behavior, but different types of internal states. (Footnote 10: In other words, the only way it’s possible to distinguish between two optics is if there is some input $A$ and some environment response $B'$ such that the two optics produce a different $A'$.)
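To make the role of the residual concrete, here is a small Python sketch of two optics that behave identically from the outside but store different internal states. It assumes optics are pairs of plain functions `(fw, bw)`; the names `run_optic`, `stores_input` and `stores_slope` are hypothetical and only serve the illustration.

```python
from typing import Any, Callable, Tuple

# An optic is modelled as a pair (fw, bw):
#   fw : A -> (M, B)      forward map, returns residual and output
#   bw : (M, B') -> A'    backward map, consumes residual and response
Optic = Tuple[Callable[[Any], Tuple[Any, Any]], Callable[[Any, Any], Any]]


def run_optic(optic: Optic, a: Any, env: Callable[[Any], Any]) -> Any:
    fw, bw = optic
    m, b = fw(a)            # forward pass: residual M and output B
    return bw(m, env(b))    # backward pass: residual M and response B'


# Two hypothetical optics for squaring, with different residuals.
stores_input = (lambda x: (x, x * x), lambda m, db: 2 * m * db)   # M holds x
stores_slope = (lambda x: (2 * x, x * x), lambda m, db: m * db)   # M holds 2x

env = lambda b: 1.0
outputs = [(run_optic(stores_input, x, env), run_optic(stores_slope, x, env))
           for x in (0.0, 1.5, 3.0)]
```

No choice of input and environment response tells these two apart, even though one saves `x` and the other saves `2 * x`. The map `m -> 2 * m` plays the role of a residual morphism relating them.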

###### Remark 3.

The directionality of the residual morphism in the equivalence relation does not matter. Because equivalence relations are symmetric, given $(M, f, f')$ and $(N, g, g')$, both morphisms $r : M \to N$ and $s : N \to M$ (making the appropriate diagrams commute) induce an equivalence $(M, f, f') \sim (N, g, g')$.

Now we move on to describing the relationship between lenses and optics.

### 3.1 Cartesian Lens - Cartesian Optic isomorphism

If the base of optics is cartesian monoidal, the resulting cartesian optics are isomorphic to cartesian lenses.

###### Proposition 1.

When $\mathcal{C}$ is cartesian monoidal, we have $\mathbf{Lens}_{Cart}(\mathcal{C}) \cong \mathbf{Optic}(\mathcal{C})$.

Unlike the slick proof of this proposition ([CategoriesOfOptics, Prop. 2.0.4.], to which we refer the reader), here we take special care in unpacking the non-trivial action of this isomorphism on hom-sets. We first see how turning a lens into an optic reifies the internal state, allowing us to explicitly track and manipulate it. The reason we do this is Remark 4, which is a shadow of a higher-categorical construction we will see in Section 5.

###### Proposition 2 (Cartesian lenses → cartesian optics).

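For a cartesian category $\mathcal{C}$, and every pair of objects $\binom{A}{A'}$ and $\binom{B}{B'}$, there is a function $R$ reifying the residual of a lens, defined as

$$
\begin{aligned}
\mathbf{Lens}_{Cart}(\mathcal{C})\left(\binom{A}{A'}, \binom{B}{B'}\right) &\xrightarrow{R} \mathbf{Optic}(\mathcal{C})\left(\binom{A}{A'}, \binom{B}{B'}\right) \\
\binom{f}{f'} &\mapsto (A, \mathsf{graph}(f), f')
\end{aligned}
$$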

This finally gives us justification for the choice of the graphical language used to draw lenses, where lenses were previously drawn in their suggestive optic representation. Just as in Figure 1, we see that a) the residual of the resulting optic is set to $A$, and b) the input is copied before being sent down as the residual. Conversely, starting from an optic we can always erase the residual, and “normalise” the optic into its lens representation.

###### Proposition 3 (Cartesian optics → cartesian lenses).

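For a cartesian category $\mathcal{C}$, and every pair of objects $\binom{A}{A'}$ and $\binom{B}{B'}$, there is a function $E$ erasing the residual of an optic, defined as

$$
\begin{aligned}
\mathbf{Optic}(\mathcal{C})\left(\binom{A}{A'}, \binom{B}{B'}\right) &\xrightarrow{E} \mathbf{Lens}_{Cart}(\mathcal{C})\left(\binom{A}{A'}, \binom{B}{B'}\right) \\
(M, \mathsf{fw}, \mathsf{bw}) &\mapsto \binom{\mathsf{fw} ⨟ \pi_2}{((\mathsf{fw} ⨟ \pi_1) \times B') ⨟ \mathsf{bw}}
\end{aligned}
$$

###### Remark 4.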

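The proof that $R ⨟ E = \mathrm{id}$ is trivial, but going the other way it isn’t. Showing that $E ⨟ R = \mathrm{id}$ involves showing that $(M, \mathsf{fw}, \mathsf{bw})$ is equivalent to $(A, \mathsf{graph}(\mathsf{fw} ⨟ \pi_2), ((\mathsf{fw} ⨟ \pi_1) \times B') ⨟ \mathsf{bw})$, which involves exhibiting a non-trivial witness for the equivalence: the residual morphism $\mathsf{fw} ⨟ \pi_1 : A \to M$.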
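The two maps of Propositions 2 and 3 can be sketched in Python as follows. This is an illustration under the assumption that lenses and optics are pairs of plain functions; `reify`, `erase`, `graph` and the example optic are hypothetical names. The last line builds the witness discussed in Remark 4.

```python
def graph(f):
    """graph(f) : A -> A x B copies the input and applies f to one copy."""
    return lambda a: (a, f(a))


def reify(lens):
    """R: lens (get, put) -> optic (fw, bw) whose residual is the input A."""
    get, put = lens
    return graph(get), put


def erase(optic):
    """E: optic (fw, bw) -> lens (get, put), forgetting the residual type."""
    fw, bw = optic
    get = lambda a: fw(a)[1]                # fw ; pi_2
    put = lambda a, db: bw(fw(a)[0], db)    # ((fw ; pi_1) x B') ; bw
    return get, put


# Hypothetical optic for squaring whose residual stores 2x rather than x.
optic = (lambda x: (2 * x, x * x), lambda m, db: m * db)

# Going through a lens and back changes the residual from 2x to x.
fw2, bw2 = reify(erase(optic))

# Witness of the equivalence: fw ; pi_1 : A -> M.
witness = lambda a: optic[0](a)[0]
```

The round trip does not return the same triple, only an equivalent one: `fw2` stores `x`, and applying `witness` to that residual recovers what the original optic would have stored.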

### 3.2 How do optics compose?

As previously suggested, optic composition has a different definition than lens composition.

###### Definition 3 (Optic composition, [CategoriesOfOptics, page 5.]).

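Consider two optics:

$(M_1, \mathsf{fw}_1, \mathsf{bw}_1) : \binom{A}{A'} \to \binom{B}{B'}$ and $(M_2, \mathsf{fw}_2, \mathsf{bw}_2) : \binom{B}{B'} \to \binom{C}{C'}$.

We define their composite as

$$
M \coloneqq M_1 \otimes M_2 \tag{2}
$$

$$
\mathsf{fw} \coloneqq A \xrightarrow{\mathsf{fw}_1} M_1 \otimes B \xrightarrow{M_1 \otimes \mathsf{fw}_2} M_1 \otimes M_2 \otimes C \tag{3}
$$

$$
\mathsf{bw} \coloneqq M_1 \otimes M_2 \otimes C' \xrightarrow{M_1 \otimes \mathsf{bw}_2} M_1 \otimes B' \xrightarrow{\mathsf{bw}_1} A' \tag{4}
$$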

This is essentially a composition of coparameterised maps in the forward pass, and a composition of parameterised maps in the backward pass [FibreOptics]. The above symbolic description has a simple pictorial one: we simply draw a box around the individual optics (Figure 9). (Footnote 11: We invite the reader to also have a look at the animation of the optic composition, available at the author’s blog post Optics vs. Lenses, Operationally.)

As originally hinted, optic composition implements a different space-time tradeoff than lenses. We notice that the newfound liberty of choosing the type of internal state on a per-optic basis allows the composition of two optics with residuals $M_1$ and $M_2$ to pick the product $M_1 \otimes M_2$ as its internal state. This allows optics to break down the problem of saving intermediate state into smaller pieces: each optic takes care of storing its own data. In turn, this removes the need to recompute any information, at the expense of needing more memory. This tradeoff becomes tricky if memory is a limited resource, as composing optics in a sequence causes their residuals to be composed in parallel (Eq. 2). For cartesian optics, this means that the longer our chain of composition is, the more memory we need, something that is not true for lenses.
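The opposite tradeoff can be observed with a sketch of optic composition. As before, this is illustrative Python under the assumption that optics are pairs of plain functions; `make_optic`, `compose_optics` and `residual_size` are hypothetical names, and nested tuples stand in for the monoidal product (so associativity holds only up to re-nesting here, not strictly).

```python
from functools import reduce

counter = {"fw": 0}  # number of forward-map evaluations


def make_optic(k):
    """Hypothetical optic a -> a + k that stores its input as residual."""
    def fw(a):
        counter["fw"] += 1
        return a, a + k          # (residual, output)

    def bw(m, db):
        return db

    return fw, bw


def compose_optics(o1, o2):
    fw1, bw1 = o1
    fw2, bw2 = o2

    def fw(a):
        m1, b = fw1(a)
        m2, c = fw2(b)
        return (m1, m2), c               # residual is the pair M1 x M2

    def bw(m, dc):
        m1, m2 = m
        return bw1(m1, bw2(m2, dc))      # nothing is recomputed

    return fw, bw


def residual_size(m):
    """Count the values stored in a nested residual."""
    return sum(residual_size(x) for x in m) if isinstance(m, tuple) else 1


n = 10
fw, bw = reduce(compose_optics, [make_optic(k) for k in range(n)])
residual, out = fw(0)     # forward pass
bw(residual, 1)           # backward pass
stored = residual_size(residual)
```

Each forward map runs exactly once, but the residual now holds one value per optic in the chain, so memory grows linearly with the length of the composite.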

Now we have seen two different ways to compose these bidirectional gadgets. Can we formally establish a categorical connection?

### 3.3 Two ways to compose?

The following point motivates the rest of this paper. Say we start with two lenses: $\binom{\mathsf{get}_1}{\mathsf{put}_1} : \binom{A}{A'} \to \binom{B}{B'}$ and $\binom{\mathsf{get}_2}{\mathsf{put}_2} : \binom{B}{B'} \to \binom{C}{C'}$. There are two ways to obtain a composite optic, shown in Figure 10. We can either compose them as lenses, and then turn this composition into an optic, or we can first turn these lenses into optics, and then compose the optics.

If we first turn them into optics, and then compose, the result is an optic whose residual is of type $A \times B$. When implemented, this optic requires more memory, but less time, as it reuses computation. Alternatively, if we first compose them and then turn the resulting lens into an optic, we obtain an optic whose residual is equal to the type of the top-left input $A$. This optic requires less memory, but more time. The question that motivates the rest of this paper is: **are these optics equivalent?**

If we look at the existing categorical framework, we find that the answer is yes. This can be seen in a few ways. The most straightforward one is to notice that asking whether these results are equivalent is asking precisely whether the embedding of lenses into optics preserves composition (i.e. whether it’s a 1-functor). It’s been shown (Prop. 1) that the answer is yes. Another way to show this is to exhibit a witness for this equivalence, and indeed we can: it’s the reparameterisation $\mathsf{graph}(\mathsf{get}_1) : A \to A \times B$. (Footnote 13: Observe that it is crucial that the underlying category is cartesian, as to prove that the diagrams in Eq. 1 commute we need to slide the $\mathsf{get}_1$ map through the copy. This will be elaborated in detail in Remark 7.)
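Both routes, and the witness relating them, can be checked on sample inputs. The Python sketch below is illustrative only: lenses and optics are pairs of plain functions, all names (`reify`, `compose_lenses`, `compose_optics`, `related_by`, `square`, `triple`) are hypothetical, and checking finitely many samples is of course not a proof.

```python
def graph(f):
    """graph(f) : A -> A x B copies the input and applies f to one copy."""
    return lambda a: (a, f(a))


def reify(lens):
    get, put = lens
    return graph(get), put                      # optic with residual A


def compose_lenses(l1, l2):
    (get1, put1), (get2, put2) = l1, l2
    get = lambda a: get2(get1(a))
    put = lambda a, dc: put1(a, put2(get1(a), dc))   # get1 is recomputed
    return get, put


def compose_optics(o1, o2):
    (fw1, bw1), (fw2, bw2) = o1, o2

    def fw(a):
        m1, b = fw1(a)
        m2, c = fw2(b)
        return (m1, m2), c

    def bw(m, dc):
        return bw1(m[0], bw2(m[1], dc))

    return fw, bw


def related_by(r, small, large, inputs, responses):
    """Check on samples that r relates the optics as in Eq. 1."""
    for a in inputs:
        m, c = small[0](a)
        if (r(m), c) != large[0](a):
            return False
        if any(small[1](m, dc) != large[1](r(m), dc) for dc in responses):
            return False
    return True


square = (lambda x: x * x, lambda x, db: 2 * x * db)
triple = (lambda y: 3 * y, lambda y, dc: 3 * dc)

lens_first = reify(compose_lenses(square, triple))           # residual: A
optic_first = compose_optics(reify(square), reify(triple))   # residual: A x B
r = graph(square[0])                                         # A -> A x B

equivalent = related_by(r, lens_first, optic_first, [0.0, 2.0, 5.0], [1.0, -2.0])
```

On these samples the two optics agree through `r`, although `lens_first` stores only `x` while `optic_first` stores the pair `(x, x * x)`.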

We now observe that we have obtained an answer to the bolded question, but not the answer. We’ve obtained an answer of a particular denotational nature. This answer assumes that we’re observing these optics from the outside, therefore ignoring any matter of their internal setup. As there is no way to observe these optics’ internal state, there is no way for us to make a distinction between them. While this is a valid reference frame, it is not the only one.

In a modern, software oriented world, we’re not merely observing these optics from the outside in. We’re building them from the inside out. We’re choosing their particular internal states, and it is nature that’s actually making a distinction between them, by sometimes only computing the answer if we’ve chosen the efficient representation for our purposes.

As the current categorical framework doesn’t have a high-enough resolution to formally capture these differences, there is a rising need to provide one: one that is able to make a distinction between denotationally equivalent, but operationally different kinds of optics.

In the rest of this paper, we will see how this can be done by using higher category theory.

## 4 Categorical interlude: (1-) vs. oplax colimits

In Figure 10 we have seen that the current coend definition of optics identifies a lot of information that we would like to explicitly keep track of. This happens often in mathematics, where colimits and traditional quotients identify “too much”, leading mathematicians to study more refined versions thereof. In this interlude we explicitly show how to do this. We first show how to add an extra level of fidelity by computing oplax colimits instead of colimits, and then show that the same idea applies to coends, as they’re a special kind of colimit.

We start out with the well-known monoidal adjunction $\pi_0 \dashv \mathsf{discr}$ between $\mathbf{Cat}$ and $\mathbf{Set}$, where both categories are endowed with the cartesian product as the monoidal one. (Footnote 14: We note that the counit of this adjunction is the identity natural transformation, i.e. $\pi_0(\mathsf{discr}(X)) = X$ for every set $X$, making $\mathbf{Set}$ a reflective subcategory of $\mathbf{Cat}$.) The right adjoint functor $\mathsf{discr}$ sends a set to a discrete category, and the left adjoint $\pi_0$ sends a category to its set of connected components. (Footnote 15: Recall that the set of connected components of a category is a quotient set identifying any two objects connected by a sequence of arrows, where we ignore their direction.) (Footnote 16: The components of the unit of this adjunction are functors with the interesting characterisation that they send every morphism to an identity.) We show that this adjunction is instrumental in mediating the connection between 1-colimits and (op)lax colimits, the latter of which provide an extra level of fidelity necessary for our purposes. This guides us in redefining the hom-set of optics to a hom-category of optics, and the category of optics to a 2-category of optics.

As colimits we’re interested in are -colimits, the situation is straightforward. Colimits arise out of a canonical higher-dimensional version thereof: the connected components of the (op)lax colimit of the original functor.

###### Lemma 1.

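Let $F$ be a $\mathbf{Set}$-valued functor. Then there is an isomorphism

$$
\operatorname{colim}(F) \cong \pi_0(\operatorname{colim}_{oplax}(F ⨟ \mathsf{discr}))
$$

where $\pi_0 \dashv \mathsf{discr}$ is the adjunction between $\mathbf{Cat}$ and $\mathbf{Set}$.

This is precisely the well-known general formula for computing colimits in $\mathbf{Set}$ as described in [SevenSketches, Theorem 6.37.], where the equivalence relation described therein is the one arising as the image of $\pi_0$.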

If we are interested in having the equivalence relation be explicit as higher-categorical cells, all we have to do is compute the oplax colimit instead – which corresponds to taking the Grothendieck construction of our functor. This is precisely what we set out to do with the notion of a coend in the formulation of optics (note that in Remark 3 we’ve lost track of directionality of reparameterisation morphisms, precisely because connected components ignore directionality).
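The passage from the category of elements to the colimit can be made concrete for finite diagrams. The following Python sketch is an illustration under stated assumptions: the diagram is a hypothetical span $X \leftarrow Z \rightarrow Y$ of finite sets, the function `connected_components` is a plain union-find, and only generating arrows are listed since identities and composites do not change connectedness.

```python
def connected_components(objects, arrows):
    """pi_0 of a category presented by objects and generating arrows."""
    parent = {x: x for x in objects}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for src, tgt in arrows:
        parent[find(src)] = find(tgt)       # direction is ignored

    classes = {}
    for x in objects:
        classes.setdefault(find(x), set()).add(x)
    return {frozenset(c) for c in classes.values()}


# Hypothetical diagram X <- Z -> Y in Set, whose colimit is a pushout.
X, Y, Z = ["x0", "x1"], ["y0"], ["z"]
f = {"z": "x0"}   # Z -> X
g = {"z": "y0"}   # Z -> Y

# Category of elements: objects are (set name, element) pairs,
# arrows follow the action of f and g on elements.
elements = [("X", x) for x in X] + [("Y", y) for y in Y] + [("Z", z) for z in Z]
arrows = ([(("Z", z), ("X", f[z])) for z in Z]
          + [(("Z", z), ("Y", g[z])) for z in Z])

pushout = connected_components(elements, arrows)
```

The list `arrows` still remembers which element was sent where and in which direction; taking connected components keeps only the resulting two equivalence classes.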

One last thing remains: that is to recast the coend as a colimit:

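###### Proposition (Coends are colimits).

Let $F$ be a functor. Then we have that

$$
\int^{X} F(X, X) \cong \operatorname{colim}(\pi' ⨟ F)
$$

where $\pi'$ is the projection out of the category of elements of the hom functor. (Footnote 17: Equivalently, one may say that the coend of $F$ is the colimit of $F$ weighted by the hom functor. This is precisely what this proposition states: that the category of elements “absorbs” weights.)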

###### Proof.

Appendix. ∎

A higher-categorical version of this proposition can be shown to hold.

###### Proposition 4 (Oplax coends are oplax colimits).

Let be a category and an oplax functor. Then

$$
\sqint^{C} F(C, C) \simeq \operatorname{colim}_{oplax}(\pi' ⨟ F)
$$

This involves a routine but painstakingly tedious check of the corresponding universal properties, which we thus omit. With these two propositions in hand we can show that there is the following chain of isomorphisms

$$
\int^{C} F(C, C) \cong \operatorname{colim}(\pi' ⨟ F) \cong \pi_0(\operatorname{colim}_{oplax}(\pi' ⨟ F ⨟ \mathsf{discr})) \cong \pi_0\left(\sqint^{C} \mathsf{discr}(F(C, C))\right)
$$

motivating the definition of the hom-category of optics as an oplax coend in the following section.

## 5 2-optics

With the above ideas in mind, we begin to define the 2-category of optics. Throughout this section, we fix a symmetric strict monoidal category $\mathcal{C}$. (Footnote 18: This means we assume that associativity and unitality hold strictly, but not symmetry. This kind of monoidal category is sometimes referred to as a permutative category.) We start by defining its hom-categories.

###### Definition 4.

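We define the hom-category of 2-optics $\binom{A}{A'} \to \binom{B}{B'}$ as the following oplax coend

$$
\mathbf{2\text{-}Optic}(\mathcal{C})\left(\binom{A}{A'}, \binom{B}{B'}\right) \coloneqq \sqint^{M : \mathcal{C}} \mathcal{C}(A, M \otimes B) \times \mathcal{C}(M \otimes B', A')
$$

where $\mathcal{C}$ is implicitly treated as a locally discrete 2-category.

Explicitly, its objects are triples $(M, f, f')$ (as in Def. 2). A morphism $(M, f, f') \Rightarrow (N, g, g')$ is given by a map $r : M \to N$ such that the following diagrams commute (written here as equations):

$$
f ⨟ (r \otimes B) = g \qquad\qquad (r \otimes B') ⨟ g' = f' \tag{5}
$$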

Morphisms of optics are subject to the following axioms:

• an optic morphism induced by the identity residual morphism is equal to the identity optic morphism, and

• a composition of optic morphisms that are individually induced by residual morphisms $r : M \to N$ and $s : N \to P$ is equal to the optic morphism induced by the composite $r ⨟ s$ of those residual morphisms.

###### Remark 5.

Unlike with 1-optics (Remark 3), morphisms of optics are not quotiented out by an equivalence relation; instead, residual morphisms are reified as explicit 2-cells.

Directedness plays an important role here. A 2-cell can be interpreted operationally in a few ways:

• We say it moves the boundary down from the source residual $M$ to the target residual $N$.

• We say it moves the reparameterisation up from the backward pass to the forward pass.

• We think of the arrow as saying “can be optimised to”. The idea is that the transport of the reparameterisation to the forward pass allows us to statically simplify the resulting computation, potentially removing any redundancies.
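The operational reading of a 2-cell can be illustrated with a sample-based check. The sketch below is Python under several assumptions that are not fixed by the text: optics are pairs of plain functions, the direction of the 2-cell is taken to be from the optic that applies the reparameterisation in its backward pass to the one that applies it in its forward pass, and the names `is_2cell`, `source`, `target` and `r` are hypothetical.

```python
def is_2cell(r, source, target, inputs, responses):
    """Check on samples that r : M -> N is a 2-cell source => target."""
    (f, f_back), (g, g_back) = source, target
    for a in inputs:
        m, b = f(a)
        if (r(m), b) != g(a):                      # f ; (r x B) = g
            return False
        for db in responses:
            if g_back(r(m), db) != f_back(m, db):  # (r x B') ; g' = f'
                return False
    return True


# Source: stores x and doubles it in the backward pass.
source = (lambda x: (x, x * x), lambda m, db: 2 * m * db)
# Target: doubles in the forward pass and stores 2x.
target = (lambda x: (2 * x, x * x), lambda n, db: n * db)

r = lambda m: 2 * m   # the reparameterisation being moved

moved_up = is_2cell(r, source, target, [0.0, 1.5, 3.0], [1.0, -2.0])
not_a_cell = is_2cell(lambda m: m, source, target, [1.5, 3.0], [1.0])
```

Here `r` appears inside the backward map of `source` (as the factor `2 * m`) and inside the forward map of `target`, which matches the description of moving the reparameterisation up from the backward pass to the forward pass. The backward equation is only checked on residuals reachable from the sample inputs.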

We’re finally in a position to define the 2-category of optics.

###### Definition 5.

We define the 2-category $\mathbf{2\text{-}Optic}(\mathcal{C})$ of optics with the following data:

• Its objects are the same as those of $\mathbf{Optic}(\mathcal{C})$, i.e. pairs of objects in $\mathcal{C}$;

• The hom-category is defined as in Def. 4, i.e. morphisms are optics and 2-cells are optic reparameterisations.

###### Proof.

As this definition closely follows that of the 1-categorical $\mathbf{Optic}(\mathcal{C})$, all we have to check is that it behaves coherently with respect to strictness. For instance, composing three optics with residuals $M$, $N$ and $P$ respectively yields an optic with residual either $(M \otimes N) \otimes P$ or $M \otimes (N \otimes P)$. But as our starting monoidal category has strict associators, these are equal. A similar argument holds for unitality, making this a 2-category. (We observe that if our starting category were merely symmetric monoidal, $\mathbf{2\text{-}Optic}(\mathcal{C})$ would be a bicategory, which is a headache we want to avoid.)

This 2-categorical construction can always be turned back to the 1-categorical one by locally quotienting things out.

###### Proposition 5.

There is an isomorphism

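$$\pi_{0*}(\mathbf{2\text{-}Optic}(\mathcal{C})) \cong \mathbf{Optic}(\mathcal{C})$$

where $\pi_{0*}$ is the enriched base change of the connected components functor $\pi_0 : \mathbf{Cat} \to \mathbf{Set}$.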

###### Proof.

This is straightforward to show, as $\pi_{0*}$ is identity-on-objects, and on each hom-category it’s defined as the application of $\pi_0$. ∎

Having upgraded our optics to a 2-category, we ask whether the previously defined equivalence between cartesian lenses and cartesian optics described in Subsection 3.1 has a higher-dimensional counterpart. We will see that the answer is yes, and we proceed to unpack these much more involved constructions.

### 5.1 Cartesian 2-optics

In this subsection we assume the monoidal product of $\mathcal{C}$ is given by the cartesian one. Recall that in Prop. 1 we have shown that there is a local isomorphism between the hom-sets of lenses and optics. As we’ve upgraded our optics to a 2-category, we might suspect that the corresponding isomorphism is upgraded too. We see that it is – to an adjunction.

$$\mathbf{Lens}_{Cart}(\mathcal{C})\left(\binom{A}{A'}, \binom{B}{B'}\right) \;\underset{E}{\overset{R}{\rightleftarrows}}\; \mathbf{2\text{-}Optic}(\mathcal{C})\left(\binom{A}{A'}, \binom{B}{B'}\right), \qquad R \dashv E$$

given by

1. Residual reifier $R$, the left adjoint functor which chooses a canonical residual for every lens:

$$R\binom{f}{f'} \coloneqq (A, \mathrm{graph}(f), f')$$

As the domain is a set, the action of $R$ on morphisms is trivial.

2. Residual eraser $E$, the right adjoint which normalises the optic to its cartesian representation:

$$E(M, fw, bw) \coloneqq \binom{fw\,\pi_2}{(fw\,\pi_1 \times B')\,bw}$$

As the codomain is a set, every morphism must be sent to the identity one.

3. (Vacuously natural) identity transformation $\eta$, the unit of the adjunction.

4. The natural transformation $\epsilon$, the counit of the adjunction whose component at each optic is an optic morphism

$$\epsilon_{(M, fw, bw)} : R(E(M, fw, bw)) \Rightarrow (M, fw, bw)$$

defined by the reparameterisation $fw\,\pi_1 : A \to M$.

###### Proof.

See the Appendix. ∎
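As an illustration of the residual reifier and eraser, here is a minimal Python sketch in which $\mathcal{C}$ is modelled by ordinary functions and tuples. The names `Lens`, `Optic`, `reify`, `erase` and `counit` are hypothetical and introduced only for this example; equality of maps is tested on a finite set of samples rather than proved.

```python
from typing import Any, Callable, NamedTuple


class Lens(NamedTuple):
    get: Callable[[Any], Any]        # f  : A -> B
    put: Callable[[Any, Any], Any]   # f' : A x B' -> A'


class Optic(NamedTuple):
    fw: Callable[[Any], tuple]       # A -> M x B
    bw: Callable[[Any, Any], Any]    # M x B' -> A'


def reify(lens: Lens) -> Optic:
    """R: choose the input itself as residual, i.e. (A, graph(f), f')."""
    return Optic(fw=lambda a: (a, lens.get(a)), bw=lens.put)


def erase(optic: Optic) -> Lens:
    """E: forget the residual, i.e. (fw ; pi_2, (fw ; pi_1 x B') ; bw)."""
    return Lens(
        get=lambda a: optic.fw(a)[1],
        put=lambda a, fb: optic.bw(optic.fw(a)[0], fb),
    )


def counit(optic: Optic) -> Callable[[Any], Any]:
    """Reparameterisation fw ; pi_1 : A -> M underlying the counit at `optic`."""
    return lambda a: optic.fw(a)[0]


# Example lens: squaring, with its reverse derivative as the backward map.
square = Lens(get=lambda a: a * a, put=lambda a, fb: 2 * a * fb)
round_tripped = erase(reify(square))

samples = range(-3, 4)
unit_is_identity = all(
    round_tripped.get(a) == square.get(a)
    and round_tripped.put(a, fb) == square.put(a, fb)
    for a in samples
    for fb in samples
)
counit_on_reified_is_identity = all(counit(reify(square))(a) == a for a in samples)
print(unit_is_identity, counit_on_reified_is_identity)  # True True
```

Erasing a reified lens returns a lens that agrees with the original on every sample, which mirrors the unit being the identity. The counit at a reified lens is the identity reparameterisation, as expected from Prop. 8.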

###### Remark 6.

As the unit is the identity 2-cell (and thus an isomorphism), this means that $\mathbf{Lens}_{Cart}(\mathcal{C})$ is locally a coreflective subcategory of $\mathbf{2\text{-}Optic}(\mathcal{C})$. Moreover, this is sometimes called a rali or a lari adjunction [Lali, Def. 1.2.] as the unit is the identity 2-cell.

This defines a local adjunction on hom-categories. To show that there is some higher correspondence between $\mathbf{Lens}_{Cart}(\mathcal{C})$ and $\mathbf{2\text{-}Optic}(\mathcal{C})$ that is not just local, we need to check whether we can use it to define functors going both ways. As our codomain is now a 2-category, we can reasonably expect an (op)lax functor to appear. And having in mind the question posed in Subsection 3.3, we will see that our functor is indeed oplax, as it doesn’t preserve composition on the nose, instead distinguishing between the aforementioned optics.

###### Theorem 1.

There is an identity-on-objects, oplax 2-functor embedding the category of lenses into the 2-category of optics:

$$\iota : \mathbf{Lens}_{Cart}(\mathcal{C}) \to \mathbf{2\text{-}Optic}(\mathcal{C})$$

defined as follows. Its action on hom-sets of is defined in Thm. 5.1. Its oplaxator is the natural transformation

which to every pair of lenses and assigns the reparameterisation between the corresponding optics and . Its opunitor to each object assigns a natural transformation whose unique component 2-cell is given by the reparameterisation .

###### Proof.

We first need to prove that is a well-defined reparameterisation between and . These two unpack to two optics previously drawn in Figure 10:

$$\begin{aligned}
\iota\left(\binom{f}{f'}\binom{g}{g'}\right) &\rightsquigarrow \big(A,\ \mathrm{graph}(f g),\ (\mathrm{graph}(f) \times C')\,(A \times g')\,f'\big) \\
\iota\binom{f}{f'}\,\iota\binom{g}{g'} &\rightsquigarrow \big(A \times B,\ \mathrm{graph}(f)\,(A \times \mathrm{graph}(g)),\ (A \times g')\,f'\big)
\end{aligned}$$

Once reparameterised, we can see that the residuals and backward maps of these optics are equal on the nose. This leaves us with just proving that the forward parts are. We postpone this proof to Remark 7 as it has special meaning in terms of rewrites. From the naturality of the delete map it is straightforward to prove that the opunitor reparameterisation is well-defined too.

Lastly, what needs to be proven is that lax associativity and lax left and right unity are satisfied ([TwoDimensionalCategories, Eq. 4.1.19 and 4.1.20]). This becomes evident once the diagrams are properly unpacked, and simple axioms of a cartesian category (associativity and naturality of copy, interaction of copy and delete) are sufficient to prove it. ∎

###### Remark 7.

The attentive reader might have noticed a peculiarity regarding the two possible ways of composing lenses into optics. Namely, reparameterising the forward part of the optic with yields the forward part . To show that this morphism is indeed equal to the forward part of the optic we need to exhibit an additional proof.

This proof is rather simple, as the rewrite can be made by applying associativity and naturality of the copy map. But surprisingly, this proof is the main reason why optics can trade space for time. This is because we can compute $f$ once and then copy the result, instead of copying the input and applying $f$ twice, separately.
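The tradeoff can be made concrete with a small Python sketch, under the assumption that maps of $\mathcal{C}$ are modelled by ordinary functions. All names here (`Lens`, `Optic`, `reify`, `compose_lens`, `compose_optic`, `round_trip`) are hypothetical and exist only for illustration. The sketch counts how often `f` is evaluated during one forward-backward round trip: composing as lenses and then reifying keeps only the input as residual and recomputes `f`, whereas reifying first and composing as optics stores the intermediate value and evaluates `f` once.

```python
from typing import Any, Callable, NamedTuple


class Lens(NamedTuple):
    get: Callable[[Any], Any]        # f  : A -> B
    put: Callable[[Any, Any], Any]   # f' : A x B' -> A'


class Optic(NamedTuple):
    fw: Callable[[Any], tuple]       # A -> M x B
    bw: Callable[[Any, Any], Any]    # M x B' -> A'


def reify(lens: Lens) -> Optic:
    """Embed a lens as the optic (A, graph(f), f')."""
    return Optic(fw=lambda a: (a, lens.get(a)), bw=lens.put)


def compose_lens(l1: Lens, l2: Lens) -> Lens:
    """Lens composition: the backward pass recomputes l1.get(a)."""
    return Lens(
        get=lambda a: l2.get(l1.get(a)),
        put=lambda a, fb: l1.put(a, l2.put(l1.get(a), fb)),
    )


def compose_optic(o1: Optic, o2: Optic) -> Optic:
    """Optic composition: residuals are paired, nothing is recomputed."""
    def fw(a):
        m, b = o1.fw(a)
        n, c = o2.fw(b)
        return (m, n), c

    def bw(residual, fb):
        m, n = residual
        return o1.bw(m, o2.bw(n, fb))

    return Optic(fw, bw)


calls = {"f": 0}


def f(a):
    calls["f"] += 1          # count every evaluation of f
    return a * a


lens_f = Lens(get=f, put=lambda a, fb: 2 * a * fb)
lens_g = Lens(get=lambda b: b + 1, put=lambda b, fb: fb)


def round_trip(optic: Optic, a, fb):
    """Run forward then backward; report output, feedback, residual and f-call count."""
    calls["f"] = 0
    residual, out = optic.fw(a)
    back = optic.bw(residual, fb)
    return out, back, residual, calls["f"]


via_lenses = round_trip(reify(compose_lens(lens_f, lens_g)), 3, 1)
via_optics = round_trip(compose_optic(reify(lens_f), reify(lens_g)), 3, 1)
print(via_lenses)  # (10, 6, 3, 2)
print(via_optics)  # (10, 6, (3, 9), 1)
```

Both composites produce the same output and feedback. The lens route stores only the input `3` but evaluates `f` twice, while the optic route stores the pair `(3, 9)` and evaluates `f` once.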

###### Remark 8.

We do not provide a proof in this paper, but closed lenses exhibit lax structure when embedded into 2-optics. However, they happen not to permit any additional rewrites as a result of this. This means that their operational characteristics end up the same as those of optics. This might be the reason why essentially closed lenses are what’s used in [SimpleAD] and many automatic differentiation libraries, as function spaces are a feature in most languages, but dependent types, which are required for optics, are not.

Having defined the embedding of lenses into optics, we can also go the other way. This time, we have a strict 2-functor.

###### Proposition 6.

There is a strict identity-on-objects 2-functor normalising an optic to its lens representation whose action on hom-objects is defined in Thm 5.1.

### 5.2 What world do these constructions live in?

So far we’ve had an obstacle-free path retelling the story of optics in the 2-categorical language. We have defined the 2-category of optics and showed that, in the cartesian case, there’s an oplax functor going from lenses to optics, and a strict 2-functor going backwards. As their lower-dimensional version formed an isomorphism, the natural next step is to check whether these two weak functors form some kind of a weaker version of an isomorphism – such as an equivalence or an adjunction. This necessitates finding a suitable ambient 2-category, and is where peculiarities start appearing.

We first recall all the data defined so far, and some relevant properties:

• Our 0-cells are 2-categories;

• Our 1-cells are not all strict;

• Our 1-cells and are identity-on-objects;

• Thm. 5.1 establishes a local hom-category adjunction, and not a mere isomorphism.

Looking at only the first three bullet points, we would be led down a rabbit hole that eventually proves to be a dead end. As the second bullet point states that our 1-cells are not all strict, this rules out the 2-category of 2-categories, 2-functors and lax transformations as a suitable candidate. This means we need a 2-category whose 1-cells are lax functors. Famously, such a 2-category where 2-cells are any of the usual strict/pseudo/(op)lax transformations actually doesn’t exist [ProblemWithLaxFunctors]. What does exist is a 2-category of 2-categories, lax functors and a restricted kind of an oplax natural transformation called an icon ([Icon], [TwoDimensionalCategories, Theorem 4.6.13.]). Icons are oplax natural transformations that are defined only when the underlying lax functors agree on objects. However, we see from the third bullet point that this is indeed the case for us! We can indeed define these icons, and we will see that one of them is identity. This makes one of the triangle identities commute automatically. However, problems arise when we check the other one – we find that it doesn’t commute.

This is because of the fourth bullet point: an adjunction requires a local isomorphism, but we locally have an adjunction itself. This rules out the possibility of an adjunction internal to a 2-category – meaning we have to search for a weaker 2-categorical analogue thereof. This leads us to consider the notion of a lax 2-adjunction instead, defined internal to a 3-category. This is a concept weak enough for our purposes, describing exactly the setting of a local adjunction between hom-categories.

This is where our search, in its current form, stops. Even though there is a special restricted kind of 2-category whose 2-cells are icons, there truly is no 3-category whose 0-cells are 2-categories and 1-cells are lax functors. Thus the question as we’ve posed it indeed has no answers.

On the other hand, we might want to pose a different question. All of our constructions are restricted in some ways – 1-cells are identity on objects and 2-cells have identity components. Perhaps our approach is failing because 0-cells are also restricted, but we’re not looking at them from the right perspective?

This is indeed the case. It turns out that there is an embedding where is the 3-category of double categories, lax functors, lax transformations and modifications. It sends a 2-category to a vertically discrete double category, a lax functor to a lax functor between the corresponding double categories and an icon to vertical transformations.

This means that our 2-category of optics should really be thought of as a double category with only trivial vertical arrows. As is a 3-category this means that it is a plausible setting for defining a lax 2-adjunction. This leaves us with the conjecture that the well-known isomorphism between and is a shadow of the lax 2-adjunction in the 3-category between and appropriately thought of as double categories. The proof of this conjecture is something we leave to future work.

## Appendix A


###### Proof.

We first prove well-definedness of the residual eraser $E$. As $E$ maps every optic morphism to identity, we have to show that any two optics connected by an optic morphism are erased to the same lens. But here we use the existing equivalence between lenses and the 1-category of optics (Prop. 1), under which these lenses are equal if and only if the corresponding 1-optics are equivalent. That is indeed true, as the optic morphism is a witness to such an equivalence.

We now prove that both $\eta$ and $\epsilon$ are well-defined. For $\eta$ we notice that starting from a lens, reifying its residual and then erasing it lands us back exactly at the starting lens. It’s easy to see that this assignment is vacuously natural as the hom-set of lenses is discrete. For $\epsilon$, we can see that starting from an optic $(M, fw, bw)$, erasing the residual (yielding $E(M, fw, bw)$) and then reifying the residual results in an optic $R(E(M, fw, bw))$. We need to check whether $\epsilon_{(M, fw, bw)}$ is a well-defined optic 2-cell:

$$\big(A,\ \mathrm{graph}(fw\,\pi_2),\ (fw\,\pi_1 \times B')\,bw\big) \xRightarrow{\;fw\,\pi_1\;} (M, fw, bw) \tag{6}$$

But this is easy to verify. As optic 2-cells move reparameterisations backward, the place where we have to look for a suitable reparameterisation of type $A \to M$ is precisely in the backward pass of the left-hand side. And it is indeed there: it’s the morphism $fw\,\pi_1$. We next need to prove that $\epsilon$ is natural. This means that for every optic morphism $r : (M, fw, bw) \Rightarrow (N, fw', bw')$ the equation $R(E(r))\,\epsilon_{(N, fw', bw')} = \epsilon_{(M, fw, bw)}\,r$ needs to hold. As $R(E(r))$ is identity, this reduces to showing that $fw'\,\pi_1 = fw\,\pi_1\,r$, which follows from Eq. 5 (left).

Lastly, to prove this data indeed defines an adjunction, we need to show that the following diagrams commute.

As $\eta$ is the identity natural transformation, this means that $R\eta = \mathrm{id}_R$ and $\eta_E = \mathrm{id}_E$, reducing the proofs to $\epsilon_R = \mathrm{id}_R$ and $E\epsilon = \mathrm{id}_E$, respectively. For the former, we need to show that the counit from Eq. 6 at a reified lens $R\binom{f}{f'}$ is the identity morphism. This is easy to show, as the underlying reparameterisation morphism $\mathrm{graph}(f)\,\pi_1$ reduces to identity via Prop. 8. For the latter, we need to show that applying $E$ to the same counit yields identity. This is also straightforward, as $E$ maps every morphism to identity. ∎

###### Definition 6 (Ends as limits).

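Let $\mathcal{C}$ be a category. We call $\mathrm{tw}(\mathcal{C})$ the twisted arrow category of $\mathcal{C}$, defined as the category of elements of its hom functor.

$$\mathrm{tw}(\mathcal{C}) \coloneqq \mathrm{El}(\mathrm{Hom}_{\mathcal{C}})$$

It comes equipped with the projection $\pi_{\mathcal{C}} : \mathrm{tw}(\mathcal{C}) \to \mathcal{C}^{op} \times \mathcal{C}$.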

###### Proposition 7.

There is a canonical isomorphism .


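###### Proof.

$$\begin{aligned}
\int^{C} F(C,C)
&\cong \int_{C} F^{op}(C,C) && \text{(Duality of ends and coends)} \\
&\cong \lim\big(\mathrm{tw}(\mathcal{C}^{op}) \xrightarrow{\pi_{\mathcal{C}^{op}}} \mathcal{C} \times \mathcal{C}^{op} \xrightarrow{F^{op}} \mathbf{Set}^{op}\big) && \text{(Ends as limits)} \\
&\cong \mathrm{colim}\big(\mathrm{tw}(\mathcal{C}^{op})^{op} \xrightarrow{\pi^{op}_{\mathcal{C}^{op}}} \mathcal{C}^{op} \times \mathcal{C} \xrightarrow{F} \mathbf{Set}\big) && \text{(Duality of limits and colimits)} \\
&\cong \mathrm{colim}\big(\mathrm{tw}(\mathcal{C})^{op} \xrightarrow{\pi'_{\mathcal{C}}} \mathcal{C}^{op} \times \mathcal{C} \xrightarrow{F} \mathbf{Set}\big) && \text{(Prop. ???)}
\end{aligned}$$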

###### Definition 7.

Let $f : A \to B$ be a morphism in a cartesian category $\mathcal{C}$. Then we denote by $\mathrm{graph}(f)$ the composite

$$A \xrightarrow{\Delta_A} A \times A \xrightarrow{A \times f} A \times B$$

This is called the graph of $f$. (It’s called the graph of $f$ because its image is a set of pairs which we can think of as points in a coordinate plane to be graphed.)

###### Proposition 8.

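Let $f : A \to B$ be a morphism in a cartesian category $\mathcal{C}$. Then we have that

$$\mathrm{graph}(f)\,\pi_2 = f \quad \text{and} \quad \mathrm{graph}(f)\,\pi_1 = \mathrm{id}_A$$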
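The proposition is stated without proof. One possible derivation, with composition written in diagrammatic order and using only the definition of $\mathrm{graph}(f)$, naturality of the projections, and the fact that copying followed by a projection is the identity, could run as follows:

```latex
\begin{align*}
\mathrm{graph}(f)\,\pi_2
  &= \Delta_A\,(A \times f)\,\pi_2 && \text{(definition of } \mathrm{graph}(f)\text{)} \\
  &= \Delta_A\,\pi_2\,f            && \text{(naturality of } \pi_2\text{)} \\
  &= f                             && (\Delta_A\,\pi_2 = \mathrm{id}_A) \\[4pt]
\mathrm{graph}(f)\,\pi_1
  &= \Delta_A\,(A \times f)\,\pi_1 && \text{(definition of } \mathrm{graph}(f)\text{)} \\
  &= \Delta_A\,\pi_1               && \text{(naturality of } \pi_1\text{)} \\
  &= \mathrm{id}_A                 && (\Delta_A\,\pi_1 = \mathrm{id}_A)
\end{align*}
```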