1 Introduction
Lenses and optics have recently received a great deal of attention from the applied category theory community. They are abstract data structures that model systems exhibiting bidirectional data flow. They have been discovered in a number of disparate places: deep learning ([GradientBasedLearning, LensesAndLearners]), game theory ([DiegeticOpenGames, CompositionalGameTheory]), Bayesian learning ([Bayesianlearning, BayesianOG], [ValueIterationIsOpticComposition]), database theory ([FunctorialAggregation]), dynamical systems ([OpenDynamicalSystems]), data accessors ([ModularDataAccessors]), trading protocols ([Escrows]), server operations ([LensesForComposableServers]) and more ([LensResources]).
As is evident from the breadth of their applications, lenses and optics don't assume that the underlying systems are of any particular kind. Instead, they are defined parametrically for some base category $\mathcal{C}$, which is only required to satisfy a minimal set of axioms. By appropriately instantiating this category we can recover various kinds of systems: deterministic, probabilistic, differentiable, and so on. This makes it possible to treat the bidirectionality in an abstract way, proving theorems about whole classes of bidirectional processes that satisfy particular properties.
The two constructions we focus on in this paper, cartesian lenses and optics, are related, but have different requirements on the base category. [Footnote 1: We note that there is a whole zoo of bidirectional gadgets, each with its own kind of behavior, and its own abstract interface that the underlying world needs to satisfy. A lot of effort has been put towards representing all of these constructions in a unifying way (see any of [DependentOptics, FibreOptics, ProfunctorOptics, GeneralisedLensCategories]).] For optics to be defined, we require the base category to permit parallel composition of processes, i.e. a monoidal structure. For lenses we additionally require this structure to be cartesian, i.e. to provide the ability to coherently copy and delete information. These play well together: defining optics in a cartesian category gives us a category isomorphic to lenses, as is well established in the literature.
In this paper we observe that this isomorphism is denotational in nature and blind to operational concerns relevant to practical implementations. Namely, it treats optics extensionally, describing them as being observed from the outside. This means that any matters of their internal setup, especially ones relevant to making a distinction between an efficient and an inefficient implementation, are ignored. But in a modern, software-oriented world we're not merely observing these optics from the outside in: we're instead building them from the inside out. We're choosing their particular internal states, and in most cases we don't have the luxury of not distinguishing between an efficient and an inefficient representation, as often only the former can compute an answer for us. As the current categorical framework doesn't have a high enough resolution to formally capture these distinctions, we seek to provide one. We lift the existing 1-categorical formalism to a 2-categorical one. We show how to track and manipulate the internal state of optics, making a distinction between denotationally equivalent, but operationally different kinds of optics.
We start out by unpacking the category of lenses and emphasizing a seldom talked about perspective: that lenses are cartesian optics with one option removed, namely the option to choose the type of their internal state. We show how this lack of a tangible way to refer to this important notion has significant operational consequences when composing lenses. Namely, we'll see that lenses implement a particular kind of space-time tradeoff that in the deep learning literature is called gradient checkpointing. We move on to unpacking the category of optics and notice that they have a different composition rule than lenses, implementing a different space-time tradeoff. What motivates the rest of the paper is the observation that the difference is completely invisible to the categorical machinery.
We observe that optics are defined by a particular kind of colimit, suggesting an avenue forward by instead defining them as an oplax colimit. We do so, and thus define the 2-optics: a 2-category whose 2-cells now explicitly track their internal state. We show that they coherently reduce to 1-optics by locally computing their connected components.
We then go on to explore the isomorphism in this 2-categorical setting. We show that the 2-categorical setting now locally hosts an adjunction between the corresponding categories, and that the embedding of lenses into cartesian optics is upgraded from a functor to an oplax functor whose oplaxator now detects the different composition rule. We determine the difficulties in showing the oplax functor forms in any of the standard categories whose 1-cells are lax functors. We establish a conjecture that the well-known isomorphism between cartesian lenses and optics arises out of the lax 2-adjunction between their double-categorical counterparts, as an image under the local connected components quotient.
1.1 Acknowledgements.
We thank Igor Baković, Fosco Loregian, Mario Román, Matteo Capucci and Jules Hedges for helpful conversations.
1.2 Notation.
We write morphisms in diagrammatic order, so composition of $f : A \to B$ and $g : B \to C$ is written as $f ; g$. We write $\pi_2$ for the projection in the 2nd component.
2 Cartesian Lenses
Lenses come in many shapes and sizes. In this paper, we tell the story from the point of view of cartesian lenses ([BimorphicLenses, CategoriesOfOptics, ProfunctorOptics]). [Footnote 2: As opposed to closed.] This is a category which we denote by $\mathbf{Lens}(\mathcal{C})$. We proceed to unpack its contents and, as most of the content of this paper is motivated by its previously unnoticed operational aspects, we take special care in doing so. We look out for any potentially resource-relevant operations such as data copying or recomputation.
$\mathbf{Lens}(\mathcal{C})$ can be defined for any base category $\mathcal{C}$ which is cartesian monoidal. Its objects are pairs of objects in $\mathcal{C}$, denoted by $(A, A')$, where we interpret the value of type $A$ as going forward and the value of type $A'$ as going backward.
A morphism $(A, A') \to (B, B')$ is a cartesian lens. It consists of two morphisms in $\mathcal{C}$, $f : A \to B$ and $f' : A \times B' \to A'$, roughly thought of as the forward and the backward part of a lens. [Footnote 3: Sometimes the terminology get and put is used instead of $f$ and $f'$.] We can visualise lenses graphically using the formalism of string diagrams [Selinger] (Figure 1), an especially useful visual language for studying the flow of information in a lens.
The flow of information works as follows. Information starts at the input of type $A$. The lens takes in this input and produces two things: a copy of it (sent down the vertical wire, where the operation of copying is drawn as a black dot) and the output $B$ (via the $f$ map). This is the forward pass of the lens, and is drawn with the gray outline. Then, the environment takes this output and turns it into a response $B'$ (not drawn). This lands us in the backward pass of the lens. Here the lens, via the $f'$ map, consumes two things: the response $B'$ and the previously saved copy of the input $A$ on the vertical wire, turning them into $A'$.
A lens has an inside and an outside. The outside consists of the ports $(A, A')$ and $(B, B')$. These ports are the interface to which other lenses connect. The inside is the vertical wire whose type is $A$. The vertical wire is the internal state of the lens (sometimes also called the residual), mediating the transition between the forward and the backward pass.
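To make the data flow above concrete, here is a minimal sketch in Python, assuming the base category is that of ordinary functions between Python values. The names `Lens`, `run` and `square` are illustrative choices of this sketch and do not come from the lens literature.

```python
from typing import Any, Callable, NamedTuple


class Lens(NamedTuple):
    """A cartesian lens (A, A') -> (B, B') as a pair of plain functions."""
    fwd: Callable[[Any], Any]        # f  : A -> B
    bwd: Callable[[Any, Any], Any]   # f' : A x B' -> A'


def run(lens: Lens, a: Any, environment: Callable[[Any], Any]) -> Any:
    """Forward pass, environment response, then backward pass."""
    saved = a                            # the internal state is always the input itself
    b = lens.fwd(a)                      # forward pass: A -> B
    b_response = environment(b)          # the environment turns B into B'
    return lens.bwd(saved, b_response)   # backward pass: A x B' -> A'


# Hypothetical example: squaring, with its reverse derivative as the backward map.
square = Lens(fwd=lambda x: x * x, bwd=lambda x, dy: 2 * x * dy)
result = run(square, 3.0, lambda y: 1.0)
```

Note that `run` has no say in what gets saved between the two passes: the only value carried over is the input `a`.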
Here we are explicitly referring to the internal state because of what's to come, but we emphasize that in the lens literature this concept hasn't been reified. In the lens literature the internal state is not explicit data that can be manipulated, and is instead implicitly threaded through definitions and theorems, always being pegged to the forward part of the domain of a lens. More precisely, the type of the internal state of a lens $(A, A') \to (B, B')$ is always equal to $A$. In what follows, we will see how this lack of a tangible way to refer to this important notion has significant operational consequences when composing lenses.
"The simplicity of the presentation of lenses is balanced by the complexity of their composition."
Armed with the above motto we proceed to unpack the definition of lens composition. Suppose we have two lenses: $(f, f') : (A, A') \to (B, B')$ and $(g, g') : (B, B') \to (C, C')$, as drawn below in Figure 3.
Using the graphical formalism of the figure above, it seems reasonable to define the composite of these two lenses simply by plugging them along the two matching ports $(B, B')$. We draw the result of this in Figure 4. We invite the reader to ponder this definition before moving on. Is this composition well-defined?
The answer is no! What is drawn above is not a lens. It turns out that this elegant and plausible looking solution has an issue. Namely, if we look at the figure, we see that the internal state of this supposed lens is $A \times B$. But we've previously established that the type of the internal state of every lens with domain $(A, A')$ is pegged to $A$ itself, as internal state is not data available for manipulation. This means that what we've defined above is some kind of a bidirectional process, but not a lens. Another way to see this is to try to write down the forward and backward maps explicitly. Below we do so: we write out the correct definition of lens composition (forgetting the above image for a moment).
Definition 1 (Lens composition).
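Consider two lenses:
$(f, f') : (A, A') \to (B, B')$ and $(g, g') : (B, B') \to (C, C')$.
Their composite $(A, A') \to (C, C')$ is defined as:
$$(f, f') ; (g, g') := \big(f ; g,\ \ (a, c') \mapsto f'(a,\ g'(f(a),\ c'))\big)$$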
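As a minimal sketch of this definition, assuming lenses are modelled as pairs of Python functions (the names `Lens`, `compose`, `calls` and the example lenses are hypothetical), the composite below counts how often the forward map of the first lens is evaluated:

```python
from typing import Any, Callable, NamedTuple


class Lens(NamedTuple):
    fwd: Callable[[Any], Any]        # f  : A -> B
    bwd: Callable[[Any, Any], Any]   # f' : A x B' -> A'


def compose(l1: Lens, l2: Lens) -> Lens:
    """Lens composition: the composite may only save the original input A."""
    def bwd(a, c_resp):
        b = l1.fwd(a)                  # recompute B from the saved input
        b_resp = l2.bwd(b, c_resp)     # g' : B x C' -> B'
        return l1.bwd(a, b_resp)       # f' : A x B' -> A'
    return Lens(fwd=lambda a: l2.fwd(l1.fwd(a)), bwd=bwd)


calls = {"f": 0}


def f(x):
    calls["f"] += 1                    # count evaluations of the first forward map
    return x + 1


lens_f = Lens(f, lambda x, db: db)
lens_g = Lens(lambda y: 2 * y, lambda y, dc: 2 * dc)

composite = compose(lens_f, lens_g)
out = composite.fwd(3)                 # evaluates f once
grad = composite.bwd(3, 1)             # evaluates f a second time
```

After one forward and one backward pass, `calls["f"]` is 2: the backward map has no access to the value of `f(3)` computed during the forward pass.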
While the definition of the composite is simple, the composite itself is more complex. [Footnote 4: We observe that the backward map of the composite defines a generalised form of the chain rule [GradientBasedLearning, p. 11].] If we look at the backward map of the composite, we first see the map $a \mapsto (a, f(a))$ applied to $A$, which copies the input and applies $f$ to one of the copies. [Footnote 5: We write out the formal definition of this map in Def. 7.] That copy results in a $B$, which is used in $g'$. The map $g'$ gives us a $B'$, which is used together with the other copy of $A$ to obtain an $A'$ using $f'$. Observe that this is the only way lens composition can be defined. [Footnote 6: This can be seen by reasoning backwards: "the only thing that we can use to produce $A'$ is $f'$, and the only way to produce its inputs is by...".] Lens composition is shown graphically in Figure 5.
We can immediately observe that this is different than our original guess: 1) there are two $f$ maps, and 2) the input $A$ is copied twice, not once. With the hindsight that we're interested in implementing these lenses in software, the fact that some functions are computed twice raises some suspicions about the feasibility of such an implementation. To get a better sense of what's going on, we up the stakes and depict a composition of three lenses in Figure 6.
At this point things start to look crowded. There are now 6 forward maps. We are also copying $A$ three times in total. In general, it seems that composing more lenses only exacerbates the problem. What is going on?
If we look closely, we see that the backward pass of this composite independently computes from scratch what each backward map needs. For instance, for three lenses $(f, f')$, $(g, g')$ and $(h, h')$, the map $h'$ uses $f ; g$ to compute its internal state, and $g'$ uses $f$, while $f'$ uses just the already available $A$, but none of these computations share results with each other.
This strategy of recomputing every intermediate result from scratch might certainly seem disadvantageous, but we observe that it's a part of a tradeoff: this strategy uses less memory. Only the initial state $A$ needs to be preserved in memory, and everything needed for the backward pass can be computed from it. It is also never the case that both recomputations in the backward pass of Figure 6 need to run at the same time (which would require more memory): it's necessary to compute the output of the former before the output of the latter can be used. [Footnote 7: To help with intuition, we invite the reader to have a look at the animation of this process, available at the following link.]
This means that lens composition picks a particular space-time tradeoff when solving the issue of backpropagating information. It uses less space (as it doesn't need to save intermediate states of computation in memory), but more time (as it needs to recompute data).
Remark 1.
This kind of space-time tradeoff has a name in the deep learning and automatic differentiation community: it's called gradient checkpointing [Checkpointing, SublinearCost]. It is often used with very large neural networks where storing all the intermediate results is prohibitive memory-wise, or when the available computational resources are memory-constrained. While it is understood that lenses are intricately tied to the chain rule, to the best of our knowledge this is the first time the connection between lenses and gradient checkpointing has been established.
The explanation of why the structure of lenses ended up implementing this particular tradeoff can be seen in Figure 7, showing a composite of two lenses. Here we see the residual $A$ circled in blue, mediating the passage from the forward pass to the backward pass. Observe that all the data communicated between the forward and the backward pass has to be squeezed through this $A$-shaped hole.
This means that the intermediate state of type $B$ that is computed in the forward pass can't be communicated to the backward pass. Instead, there is no other way but for the backward pass to separately recompute this information from $A$. [Footnote 8: If we have a composite of 3, 4, or in general $n$ lenses, then there are $n$ levels of separate computation, and each level has a sequence of at most $n$ maps composed. The memory required to compute gradients in our graph is constant in the number of layers $n$, but the number of node evaluations scales with $n^2$.] And this itself arises precisely because in defining a lens we have no freedom to choose the type of data that will be communicated from the forward pass to the backward pass.
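The quadratic growth can be observed in a small experiment. The sketch below assumes lenses modelled as pairs of Python functions and a chain built by folding lens composition from the left; the names `counted_lens` and `backward_evaluations` are hypothetical. In this naive encoding the count depends on how the chain is bracketed, so the numbers below are specific to the left-nested fold.

```python
from functools import reduce
from typing import Any, Callable, NamedTuple


class Lens(NamedTuple):
    fwd: Callable[[Any], Any]        # A -> B
    bwd: Callable[[Any, Any], Any]   # A x B' -> A'


def compose(l1: Lens, l2: Lens) -> Lens:
    def bwd(a, c_resp):
        # The intermediate value is recomputed from the saved input.
        return l1.bwd(a, l2.bwd(l1.fwd(a), c_resp))
    return Lens(lambda a: l2.fwd(l1.fwd(a)), bwd)


def counted_lens(counter: list) -> Lens:
    """A lens whose forward map records every evaluation in counter[0]."""
    def fwd(x):
        counter[0] += 1
        return x + 1
    return Lens(fwd, lambda x, d: d)


def backward_evaluations(n: int) -> int:
    """Forward-map evaluations triggered by one backward pass of an n-lens chain."""
    counter = [0]
    chain = reduce(compose, [counted_lens(counter) for _ in range(n)])
    chain.bwd(0, 1)
    return counter[0]


counts = [backward_evaluations(n) for n in (1, 2, 3, 4)]
```

For this bracketing the backward pass alone triggers $n(n-1)/2$ forward-map evaluations, while the only value ever saved is the initial input.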
Remark 2.
This phenomenon seems to have first been observed in [SimpleAD, Section 3.1.] where the author described their initial attempts to efficiently compute reversemode derivatives with cartesian lenses, only to identify the aforementioned redundancy problems. He went on to propose a solution using closed lenses, something we touch upon in Remark 8. Interestingly, the author never used the term lens in the entire paper.
2.1 Where to?
We've established that lens composition implements a generalised form of the chain rule in a manner that uses less space but more time. This is a result of the absence of an explicit way to refer to the internal state of lenses. Two questions now become natural to ask: a) How can we recover other space-time tradeoffs? and b) How can we explicitly refer to and manipulate this internal state? In the next section we answer both of these questions with optics.
3 Optics
The category of optics is a generalisation of the category of lenses, and has been thoroughly studied in the literature [ProfunctorOptics, CategoriesOfOptics, StringDiagramsForOptics, ModularDataAccessors]. Unlike lenses, optics do not require a cartesian structure and can instead be defined for any base category $\mathcal{C}$ that is merely monoidal. [Footnote 9: This is not the most general definition of optics, see [DependentOptics, FibreOptics, ProfunctorOptics].] We denote this category by $\mathbf{Optic}(\mathcal{C})$ and proceed to unpack its contents. Its objects are pairs of objects in $\mathcal{C}$, just like with lenses. However, differences start appearing once we start looking at the morphisms.
Definition 2 ([CategoriesOfOptics, Def. 2.0.1.]).
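The set of optics $\mathbf{Optic}(\mathcal{C})((A, A'), (B, B'))$ is defined as the following coend
$$\int^{M : \mathcal{C}} \mathcal{C}(A, M \otimes B) \times \mathcal{C}(M \otimes B', A')$$
Its elements are equivalence classes of triples $(M, f, f')$, where $M : \mathcal{C}$, $f : A \to M \otimes B$ and $f' : M \otimes B' \to A'$. They're quotiented out by the equivalence relation where $(M, f, f') \sim (N, g, g')$ if there is a residual morphism $r : M \to N$ in $\mathcal{C}$ such that the following diagrams commute, which amounts to the equations
$$f ; (r \otimes B) = g \qquad \text{and} \qquad (r \otimes B') ; g' = f' \tag{1}$$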
This definition might look daunting, but it is the result of the dualisation of Motto 2, whose consequence will be a more straightforward definition of composition. Nonetheless, we will see that each part of the definition has intuitive meaning. An optic has three components: the object $M$, the type of the internal state; the forward map $f$; and the backward map $f'$. The shape of an optic is drawn in Figure 8, and it has a similar data flow to a lens. It takes in some $A$ in the forward pass, and using the map $f$ it produces the product $M \otimes B$, for the chosen type $M$. The environment then takes in the $B$ and responds with a $B'$, allowing the backward part $f'$ to use $M \otimes B'$ and produce $A'$.
The last component of the definition is the equivalence relation. It's a formal description of the idea that we think of optics as being observed from the outside. This means that the type of the internal state of an optic isn't externally available information. This in turn means that there is no way to distinguish between two optics that have the same extensional behavior, but different types of internal states. [Footnote 10: In other words, the only way it's possible to distinguish between two optics is if there is some input $A$ and some environment response $B'$ such that the two optics produce a different $A'$.]
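The following sketch, assuming the monoidal product is the cartesian product of Python values, illustrates two representatives of the same optic. The names `Optic`, `run`, `keep_x` and `keep_2x` are hypothetical; the two representatives store different residuals, yet no choice of input and environment response tells them apart.

```python
from typing import Any, Callable, NamedTuple, Tuple


class Optic(NamedTuple):
    """A representative (M, f, f') of an optic; the residual type M is implicit."""
    fwd: Callable[[Any], Tuple[Any, Any]]   # f  : A -> M x B
    bwd: Callable[[Any, Any], Any]          # f' : M x B' -> A'


def run(optic: Optic, a: Any, environment: Callable[[Any], Any]) -> Any:
    m, b = optic.fwd(a)                     # forward pass, saving the residual m
    return optic.bwd(m, environment(b))     # backward pass consumes the residual


# Two representatives of "squaring with its reverse derivative".
keep_x = Optic(lambda x: (x, x * x), lambda m, dy: 2 * m * dy)     # residual: x
keep_2x = Optic(lambda x: (2 * x, x * x), lambda m, dy: m * dy)    # residual: 2x

inputs = [0.0, 1.5, 3.0]
behaviour_a = [run(keep_x, a, lambda y: 1.0) for a in inputs]
behaviour_b = [run(keep_2x, a, lambda y: 1.0) for a in inputs]
```

Here the residual morphism relating the two representatives is doubling, `r = lambda m: 2 * m`.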
Remark 3.
The directionality of the residual morphism in the equivalence relation does not matter. Because equivalence relations are symmetric, given $(M, f, f')$ and $(N, g, g')$, both morphisms $M \to N$ and $N \to M$ (making the appropriate diagrams commute) induce an equivalence $(M, f, f') \sim (N, g, g')$.
Now we move on to describing the relationship between lenses and optics.
3.1 Cartesian Lens  Cartesian Optic isomorphism
If the base of optics is cartesian monoidal, the resulting cartesian optics are isomorphic to cartesian lenses.
Proposition 1.
When $\mathcal{C}$ is cartesian monoidal, we have $\mathbf{Lens}(\mathcal{C}) \cong \mathbf{Optic}(\mathcal{C})$.
Unlike the slick proof of this proposition ([CategoriesOfOptics, Prop. 2.0.4.], which we refer the reader to), here we take special care in unpacking the non-trivial action of this isomorphism on hom-sets. We first see how turning a lens into an optic reifies the internal state, allowing us to explicitly track and manipulate it. The reason we do this is Remark 4, which we will see is a shadow of a higher-categorical construction presented in Section 5.
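Proposition 2 (Cartesian lenses $\to$ cartesian optics).
For a cartesian category $\mathcal{C}$, and every pair of objects $(A, A')$ and $(B, B')$ there is a function $\mathbf{Lens}(\mathcal{C})((A, A'), (B, B')) \to \mathbf{Optic}(\mathcal{C})((A, A'), (B, B'))$ reifying the residual of a lens, defined as
$$(f, f') \mapsto \big(A,\ \Delta_A ; (A \times f),\ f'\big)$$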
This finally gives us justification for the choice of the graphical language used to draw lenses, where lenses were previously drawn in their suggestive optic representation. Just as in Figure 1, we see that a) the residual of the resulting optic is set to $A$, and b) the input is copied before being sent down as the residual. Conversely, starting from an optic we can always erase the residual, and "normalise" the optic into its lens representation.
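Proposition 3 (Cartesian optics $\to$ cartesian lenses).
For a cartesian category $\mathcal{C}$, and every pair of objects $(A, A')$ and $(B, B')$ there is a function $\mathbf{Optic}(\mathcal{C})((A, A'), (B, B')) \to \mathbf{Lens}(\mathcal{C})((A, A'), (B, B'))$ erasing the residual of an optic, defined as
$$(M, f, f') \mapsto \big(f ; \pi_2,\ ((f ; \pi_1) \times B') ; f'\big)$$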
Remark 4.
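The proof that reifying and then erasing the residual of a lens is the identity is trivial, but going the other way it isn't. Showing that erasing and then reifying the residual of an optic is the identity involves showing that $(M, f, f')$ is equivalent to $\big(A,\ \Delta_A ; (A \times (f ; \pi_2)),\ ((f ; \pi_1) \times B') ; f'\big)$, which involves exhibiting a non-trivial witness for the equivalence: the residual morphism $f ; \pi_1 : A \to M$.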
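The two functions of Propositions 2 and 3 can be sketched as follows, assuming lenses and optics are modelled as pairs of Python functions. The names `reify` and `erase` are labels chosen for this sketch rather than notation from the literature.

```python
from typing import Any, Callable, NamedTuple, Tuple


class Lens(NamedTuple):
    fwd: Callable[[Any], Any]               # A -> B
    bwd: Callable[[Any, Any], Any]          # A x B' -> A'


class Optic(NamedTuple):
    fwd: Callable[[Any], Tuple[Any, Any]]   # A -> M x B
    bwd: Callable[[Any, Any], Any]          # M x B' -> A'


def reify(lens: Lens) -> Optic:
    """Lens -> optic: copy the input and save it as the residual."""
    return Optic(fwd=lambda a: (a, lens.fwd(a)), bwd=lens.bwd)


def erase(optic: Optic) -> Lens:
    """Optic -> lens: the backward map re-runs the forward map to rebuild the residual."""
    return Lens(
        fwd=lambda a: optic.fwd(a)[1],
        bwd=lambda a, b_resp: optic.bwd(optic.fwd(a)[0], b_resp),
    )


optic = Optic(lambda x: (2 * x, x * x), lambda m, dy: m * dy)   # residual: 2x
normalised = reify(erase(optic))                                # residual: x

original_fwd = optic.fwd(3)
normalised_fwd = normalised.fwd(3)
original_bwd = optic.bwd(original_fwd[0], 1)
normalised_bwd = normalised.bwd(normalised_fwd[0], 1)
```

The round trip changes the residual from `6` to `3` while leaving the observable behaviour intact; the witness relating the two is the map sending the input to the residual that the original forward map would have produced.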
3.2 How do optics compose?
As previously suggested, optic composition has a different definition than lens composition.
Definition 3 (Optic composition, [CategoriesOfOptics, page 5.]).
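Consider two optics:
$(M, f, f') : (A, A') \to (B, B')$ and $(N, g, g') : (B, B') \to (C, C')$.
We define their composite $(A, A') \to (C, C')$ as the triple consisting of
$$M \otimes N \tag{2}$$
$$f ; (M \otimes g) \tag{3}$$
$$(M \otimes g') ; f' \tag{4}$$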
This is essentially a composition of coparameterised maps in the forward pass, and a composition of parameterised maps in the backward pass [FibreOptics]. The above symbolic description has a simple pictorial one: we simply draw a box around the individual optics (Figure 9). [Footnote 11: We invite the reader to also have a look at the animation of the optic composition, available at the author's blog post Optics vs. Lenses, Operationally.]
As originally hinted, optic composition implements a different space-time tradeoff than lenses. We notice that the newfound liberty of choosing the type of internal state on a per-optic basis allows the composition of two optics with residuals $M$ and $N$ to pick the product $M \otimes N$ as its internal state. This allows optics to break down the problem of saving intermediate state into smaller pieces: each optic takes care of storing its own data. In turn, this removes the need to recompute any information, at the expense of needing more memory. This tradeoff becomes tricky if memory is a limited resource, as composing optics in a sequence causes their residuals to be composed in parallel (Eq. 2). For cartesian optics, this means that the longer our chain of composition is, the more memory we need, something that is not true for lenses.
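A minimal sketch of optic composition, assuming the cartesian product of Python values as the monoidal product, shows the other side of the tradeoff. The helper names `counted_optic` and `leaves` are hypothetical.

```python
from functools import reduce
from typing import Any, Callable, NamedTuple, Tuple


class Optic(NamedTuple):
    fwd: Callable[[Any], Tuple[Any, Any]]   # f  : A -> M x B
    bwd: Callable[[Any, Any], Any]          # f' : M x B' -> A'


def compose(o1: Optic, o2: Optic) -> Optic:
    def fwd(a):
        m, b = o1.fwd(a)
        n, c = o2.fwd(b)
        return (m, n), c                    # the residual of the composite is the pair
    def bwd(mn, c_resp):
        m, n = mn
        return o1.bwd(m, o2.bwd(n, c_resp))   # nothing is recomputed
    return Optic(fwd, bwd)


def counted_optic(counter: list) -> Optic:
    def fwd(x):
        counter[0] += 1
        return x, x + 1                     # save the input, output x + 1
    return Optic(fwd, lambda m, d: d)


def leaves(residual: Any) -> int:
    """Number of individual values stored in a nested residual."""
    if isinstance(residual, tuple):
        return sum(leaves(r) for r in residual)
    return 1


counter = [0]
chain = reduce(compose, [counted_optic(counter) for _ in range(4)])
residual, output = chain.fwd(0)
forward_evaluations = counter[0]
chain.bwd(residual, 1)
backward_evaluations = counter[0] - forward_evaluations
saved_values = leaves(residual)
```

For a chain of four optics the backward pass triggers no forward-map evaluations, but the residual now holds four saved values, one per optic.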
Now we have seen two different ways to compose these bidirectional gadgets. Can we formally establish a categorical connection?
3.3 Two ways to compose?
The following point motivates the rest of this paper. Say we start with two lenses: $(f, f') : (A, A') \to (B, B')$ and $(g, g') : (B, B') \to (C, C')$. There are two ways to obtain a composite optic, shown in Figure 10. We can either compose them as lenses, and then turn this composition into an optic, or we can first turn these lenses into optics, and then compose the optics.
If we first turn them into optics, and then compose, the result is an optic whose residual is of type $A \times B$. When implemented, this optic requires more memory, but less time, as it reuses computation. Alternatively, if we first compose them and then turn the resulting lens into an optic, we obtain an optic whose residual is equal to the type of the top-left input $A$. This optic requires less memory, but more time. The question that motivates the rest of this paper is: are these optics equivalent?
If we look at the existing categorical framework, we find that the answer is yes. This can be seen in a few ways. The most straightforward one is to notice that asking whether these results are equivalent is asking precisely whether the embedding $\mathbf{Lens}(\mathcal{C}) \to \mathbf{Optic}(\mathcal{C})$ preserves composition (i.e. whether it's a 1-functor). It's been shown in Prop. 1 that the answer is yes. Another way to show this is to exhibit a witness for this equivalence, and indeed we can: it's the reparameterisation $\Delta_A ; (A \times f) : A \to A \times B$. [Footnote 13: Observe that it is crucial that the underlying category is cartesian, as to prove that the diagrams in Eq. 1 commute we need to slide the map $f$ through the copy. This will be elaborated in detail in Remark 7.]
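The two routes can be compared side by side in a small sketch, assuming lenses and optics are modelled as pairs of Python functions. All names (`compose_lens`, `compose_optic`, `reify`, `run`) are hypothetical, and the counter tracks only evaluations of the first forward map.

```python
from typing import Any, Callable, NamedTuple, Tuple


class Lens(NamedTuple):
    fwd: Callable[[Any], Any]
    bwd: Callable[[Any, Any], Any]


class Optic(NamedTuple):
    fwd: Callable[[Any], Tuple[Any, Any]]
    bwd: Callable[[Any, Any], Any]


def compose_lens(l1: Lens, l2: Lens) -> Lens:
    def bwd(a, c_resp):
        return l1.bwd(a, l2.bwd(l1.fwd(a), c_resp))   # recomputes l1.fwd
    return Lens(lambda a: l2.fwd(l1.fwd(a)), bwd)


def compose_optic(o1: Optic, o2: Optic) -> Optic:
    def fwd(a):
        m, b = o1.fwd(a)
        n, c = o2.fwd(b)
        return (m, n), c
    def bwd(mn, c_resp):
        m, n = mn
        return o1.bwd(m, o2.bwd(n, c_resp))
    return Optic(fwd, bwd)


def reify(lens: Lens) -> Optic:
    return Optic(lambda a: (a, lens.fwd(a)), lens.bwd)


calls = [0]


def f(x):
    calls[0] += 1
    return x + 1


lens_f = Lens(f, lambda x, db: db)
lens_g = Lens(lambda y: y * y, lambda y, dc: 2 * y * dc)


def run(optic: Optic, a: Any, c_resp: Any):
    """Return (residual, output, response, number of evaluations of f)."""
    calls[0] = 0
    residual, c = optic.fwd(a)
    a_resp = optic.bwd(residual, c_resp)
    return residual, c, a_resp, calls[0]


lens_first = run(reify(compose_lens(lens_f, lens_g)), 3, 1)
optic_first = run(compose_optic(reify(lens_f), reify(lens_g)), 3, 1)
```

Both routes produce the same output and the same response, but the first stores the residual `3` and evaluates `f` twice, while the second stores `(3, 4)` and evaluates `f` once.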
We now observe that we have obtained an answer to the bolded question, but not the answer. We've obtained an answer of a particular denotational nature. This answer assumes that we're observing these optics from the outside, therefore ignoring any matter of their internal setup. As there is no way to observe the internal state of these optics, there is no way for us to make a distinction between them. While this is a valid reference frame, it is not the only one.
In a modern, software-oriented world, we're not merely observing these optics from the outside in. We're building them from the inside out. We're choosing their particular internal states, and it is nature that's actually making a distinction between them, by sometimes only computing the answer if we've chosen the efficient representation for our purposes.
As the current categorical framework doesn't have a high enough resolution to formally capture these differences, there is a rising need to provide one: one that is able to make a distinction between denotationally equivalent, but operationally different kinds of optics.
In the rest of this paper, we will see how this can be done by using higher category theory.
4 Categorical interlude: (1-) vs. oplax colimits
In Figure 10 we have seen that the current coend definition of optics identifies a lot of information that we would like to explicitly keep track of. This happens often in mathematics, where colimits and traditional quotients identify "too much", leading mathematicians to study more refined versions thereof. In this interlude we explicitly show how to do this. We first show how to add an extra level of fidelity by computing oplax colimits instead of colimits, and then show that the same idea applies to coends, as they're a special kind of colimit.
We start out with the well-known monoidal adjunction $\pi_0 \dashv D$ between $\mathbf{Cat}$ and $\mathbf{Set}$, where both categories are endowed with the cartesian product as the monoidal one. [Footnote 14: We note that the counit of this adjunction is the identity natural transformation, i.e. $\pi_0(D(X)) = X$ for every set $X$, making $\mathbf{Set}$ a reflective subcategory of $\mathbf{Cat}$.] The right adjoint functor $D$ sends a set to a discrete category, and the left adjoint $\pi_0$ sends a category to its set of connected components. [Footnote 15: Recall that the set of connected components of a category is a quotient set identifying any two objects connected by a sequence of arrows, where we ignore their direction.] [Footnote 16: The components of the unit of this adjunction are functors with the interesting characterisation that they send every morphism to identity.] We show that this adjunction is instrumental in mediating the connection between 1-colimits and (op)lax colimits, the latter of which provide an extra level of fidelity necessary for our purposes. This guides us in redefining the hom-set of optics to a hom-category of optics, and the category of optics to a 2-category of optics.
As the colimits we're interested in are colimits in $\mathbf{Set}$, the situation is straightforward. Such colimits arise out of a canonical higher-dimensional version thereof: the connected components of the (op)lax colimit of the original functor.
Lemma 1.
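Let $F : \mathcal{J} \to \mathbf{Set}$ be a functor. Then there is an isomorphism
$$\mathrm{colim}\, F \cong \pi_0\big(\mathrm{oplaxcolim}\,(F ; D)\big)$$
where $\pi_0 \dashv D$ is the adjunction between $\mathbf{Cat}$ and $\mathbf{Set}$.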
This is precisely the well-known general formula for computing colimits in $\mathbf{Set}$ as described in [SevenSketches, Theorem 6.37.], where the equivalence relation described therein is the one arising as the image of $\pi_0$.
If we are interested in having the equivalence relation be explicit as higher-categorical cells, all we have to do is compute the oplax colimit instead, which corresponds to taking the Grothendieck construction of our functor. This is precisely what we set out to do with the notion of a coend in the formulation of optics (note that in Remark 3 we've lost track of the directionality of reparameterisation morphisms, precisely because connected components ignore directionality).
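For finite data, this recipe can be sketched directly: build the objects of the Grothendieck construction (pairs of an index and an element), add one arrow per generating morphism and element, and take connected components. The function name `colimit_via_components` and the pushout example below are illustrative assumptions of this sketch; it takes only generating arrows of the indexing category, which is enough for computing components.

```python
from typing import Callable, Dict, List, Tuple

Element = Tuple[str, str]                        # (object of J, element of F(object))
Arrow = Tuple[str, str, Callable[[str], str]]    # (source, target, action of F on it)


def colimit_via_components(sets: Dict[str, List[str]],
                           arrows: List[Arrow]) -> List[List[Element]]:
    """Connected components of the category of elements of F : J -> Set."""
    parent = {(j, x): (j, x) for j, xs in sets.items() for x in xs}

    def find(e: Element) -> Element:
        while parent[e] != e:
            parent[e] = parent[parent[e]]        # path compression
            e = parent[e]
        return e

    for src, tgt, fn in arrows:
        for x in sets[src]:
            # Each arrow (src, x) -> (tgt, fn(x)) merges two components;
            # its direction is forgotten at this point.
            parent[find((src, x))] = find((tgt, fn(x)))

    classes: Dict[Element, List[Element]] = {}
    for e in list(parent):
        classes.setdefault(find(e), []).append(e)
    return sorted(sorted(c) for c in classes.values())


# Hypothetical example: a pushout along the span b <- a -> c.
sets = {"a": ["a0"], "b": ["b0", "b1"], "c": ["c0"]}
arrows = [("a", "b", lambda x: "b0"), ("a", "c", lambda x: "c0")]
pushout = colimit_via_components(sets, arrows)
```

The arrows that get collapsed by the union-find step are exactly the data that an oplax colimit would keep as explicit morphisms.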
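One last thing remains: to recast the coend as a colimit.
Proposition (Coends are colimits). Let $F : \mathcal{C}^{op} \times \mathcal{C} \to \mathbf{Set}$ be a functor. Then the coend of $F$ is isomorphic to the colimit of $F$ precomposed with the projection out of the category of elements of the hom functor. [Footnote 17: Equivalently, one may say that the coend of $F$ is the colimit of $F$ weighted by the hom functor. This is precisely what this proposition states: that the category of elements "absorbs" weights.]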
Proof.
Appendix. ∎
A higher-categorical version of this proposition can be shown to hold.
Proposition 4 (Oplax coends are oplax colimits).
Let be a category and an oplax functor. Then
This involves a routine, but painstakingly tedious, checking of the corresponding universal properties, which we thus omit. With these two propositions in hand we can show that there is the following isomorphism
motivating the definition of the hom-category of optics as an oplax coend in the following section.
5 2-optics
With the above ideas in mind, we begin to define the 2-category of optics. Throughout this section, we fix a symmetric strict monoidal category $\mathcal{C}$. [Footnote 18: This means we assume that associativity and unitality hold strictly, but not symmetry. This kind of monoidal category is sometimes referred to as a permutative category.] We start by defining its hom-categories.
Definition 4.
We define the hom-category of 2-optics as the following oplax coend
where is implicitly treated as a locally discrete 2category.
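Explicitly, its objects are triples $(M, f, f')$ (as in Def. 2). A morphism $(M, f, f') \Rightarrow (N, g, g')$ is given by a map $r : M \to N$ such that the following diagrams commute, which amounts to the equations
$$f ; (r \otimes B) = g \qquad \text{and} \qquad (r \otimes B') ; g' = f' \tag{5}$$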
Morphisms of optics are subject to two axioms, stated for any composable residual morphisms $r : M \to N$ and $s : N \to P$.
They tell us that a) an optic morphism induced by the identity residual morphism is equal to the identity optic morphism, and b) a composition of optic morphisms that are individually induced by the residual morphisms $r$ and $s$ is equal to the optic morphism induced by the composite residual morphism $r ; s$.
Remark 5.
Unlike with 1-optics (Remark 3), morphisms of optics are not quotiented out by an equivalence relation, but instead residual morphisms are reified as explicit 2-cells.
Directedness plays an important role here. A 2-cell can be interpreted operationally in a few ways:

We say it moves the boundary down from to .

We say it moves the reparameterisation up from the backward pass to the forward pass.

We think of the arrow as saying "can be optimised to". The idea is that the transport of the reparameterisation to the forward pass allows us to statically simplify the resulting computation, potentially removing any redundancies.
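The two commuting conditions for a 2-cell can be spot-checked on sample data. The sketch below assumes optics modelled as pairs of Python functions; the name `is_reparameterisation` is hypothetical, and the check is a pointwise test on sampled inputs (and on the residuals those inputs produce), not a proof that the diagrams commute.

```python
from typing import Any, Callable, Iterable, NamedTuple, Tuple


class Optic(NamedTuple):
    fwd: Callable[[Any], Tuple[Any, Any]]   # A -> M x B
    bwd: Callable[[Any, Any], Any]          # M x B' -> A'


def is_reparameterisation(source: Optic, target: Optic,
                          r: Callable[[Any], Any],
                          inputs: Iterable[Any],
                          responses: Iterable[Any]) -> bool:
    """Test whether r behaves like a 2-cell source => target on the given samples."""
    responses = list(responses)
    for a in inputs:
        m, b = source.fwd(a)
        if target.fwd(a) != (r(m), b):        # forward condition: f ; (r x B) = g
            return False
        for b_resp in responses:
            # backward condition: (r x B') ; g' = f'
            if target.bwd(r(m), b_resp) != source.bwd(m, b_resp):
                return False
    return True


keep_x = Optic(lambda x: (x, x * x), lambda m, dy: 2 * m * dy)
keep_2x = Optic(lambda x: (2 * x, x * x), lambda m, dy: m * dy)

doubling_ok = is_reparameterisation(keep_x, keep_2x, lambda m: 2 * m,
                                    inputs=[0, 1, 5], responses=[1, -2])
identity_ok = is_reparameterisation(keep_x, keep_2x, lambda m: m,
                                    inputs=[0, 1, 5], responses=[1, -2])
```

Doubling passes the check as a 2-cell from `keep_x` to `keep_2x`, whereas the identity map does not, since the two forward maps store different residuals.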
We're finally in a position to define the 2-category of optics.
Definition 5.
We define the 2-category of optics $2\text{-}\mathbf{Optic}(\mathcal{C})$ with the following data:
a) Its objects are the same as those of $\mathbf{Optic}(\mathcal{C})$, i.e. pairs of objects in $\mathcal{C}$;
b) The hom-category is defined as in Def. 4, i.e. morphisms are optics and 2-cells are optic reparameterisations.
Proof.
As this definition closely follows that of the 1-categorical $\mathbf{Optic}(\mathcal{C})$, all we have to check is that this coherently behaves with respect to strictness. For instance, composing three optics with residuals $M$, $N$ and $P$ respectively yields an optic with residual either $(M \otimes N) \otimes P$ or $M \otimes (N \otimes P)$. But as our starting monoidal category has strict associators, these are equal. A similar argument holds for unitality, making this a 2-category. [Footnote 19: We observe that if our starting category was merely symmetric monoidal, $2\text{-}\mathbf{Optic}(\mathcal{C})$ would be a bicategory, which is a headache we want to avoid.] ∎
This 2-categorical construction can always be turned back into the 1-categorical one by locally quotienting things out.
Proposition 5.
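There is an isomorphism
$$(\pi_0)_*\big(2\text{-}\mathbf{Optic}(\mathcal{C})\big) \cong \mathbf{Optic}(\mathcal{C})$$
where $(\pi_0)_*$ is the enriched base change of the connected components functor $\pi_0 : \mathbf{Cat} \to \mathbf{Set}$.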
Proof.
This is straightforward to show, as $(\pi_0)_*$ is identity-on-objects, and on the hom-category it's defined as the application of $\pi_0$. ∎
Having upgraded our optics to a 2-category, we ask whether the previously defined equivalence between cartesian lenses and cartesian optics described in Subsection 3.1 has a higher-dimensional counterpart. We will see that the answer is yes, and we proceed to unpack these much more involved constructions.
5.1 Cartesian 2-optics
In this subsection we assume the monoidal product of $\mathcal{C}$ is given by the cartesian one. Recall that in Prop. 1 we have shown that there is a local isomorphism between the hom-sets of lenses and optics. As we've upgraded our optics to a 2-category, we might suspect that the corresponding isomorphism is upgraded too. We see that it is: to an adjunction.
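Theorem 5.1. There is an adjunction between the hom-set $\mathbf{Lens}(\mathcal{C})((A, A'), (B, B'))$, treated as a discrete category, and the hom-category $2\text{-}\mathbf{Optic}(\mathcal{C})((A, A'), (B, B'))$, given by
a) Residual reifier, the left adjoint functor which chooses a canonical residual for every lens: $(f, f') \mapsto (A,\ \Delta_A ; (A \times f),\ f')$. As the domain is a set, its action on morphisms is trivial.
b) Residual eraser, the right adjoint which normalises the optic to its cartesian representation: $(M, f, f') \mapsto (f ; \pi_2,\ ((f ; \pi_1) \times B') ; f')$. As the codomain is a set, every morphism must be sent to the identity one.
c) The (vacuously natural) identity transformation, the unit of the adjunction.
d) The natural transformation that is the counit of the adjunction, whose component at each optic $(M, f, f')$ is an optic morphism from its reified normal form to $(M, f, f')$ itself, defined by the reparameterisation $f ; \pi_1 : A \to M$ in $\mathcal{C}$.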
Proof.
Appendix. ∎
Remark 6.
As the unit is the identity 2-cell (and thus an isomorphism), this means that $\mathbf{Lens}(\mathcal{C})$ is locally a coreflective subcategory of $2\text{-}\mathbf{Optic}(\mathcal{C})$. Even more, this is sometimes called a rali or a lari adjunction [Lali, Def. 1.2.] as the unit is the identity 2-cell.
This defines a local adjunction on hom-categories. To show that there is some higher correspondence between $\mathbf{Lens}(\mathcal{C})$ and $2\text{-}\mathbf{Optic}(\mathcal{C})$ that is not just local, we need to check whether we can use it to define functors going both ways. As our codomain is now a 2-category, we can reasonably expect an (op)lax functor to appear. And having in mind the question posed in Subsection 3.3, we will see that our functor is indeed oplax, as it doesn't preserve composition on the nose, instead distinguishing between the aforementioned optics.
Theorem 1.
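There is an identity-on-objects, oplax 2-functor embedding the category of lenses into the 2-category of optics, $\mathbf{Lens}(\mathcal{C}) \to 2\text{-}\mathbf{Optic}(\mathcal{C})$,
defined as follows. Its action on hom-sets of $\mathbf{Lens}(\mathcal{C})$ is defined in Thm. 5.1. Its oplaxator is the natural transformation
which to every pair of lenses $(f, f') : (A, A') \to (B, B')$ and $(g, g') : (B, B') \to (C, C')$ assigns the reparameterisation $\Delta_A ; (A \times f) : A \to A \times B$ between the corresponding optics: the reification of the composite lens (with residual $A$) and the composite of the two reified lenses (with residual $A \times B$). Its opunitor to each object $(A, A')$ assigns a natural transformation whose unique component 2-cell is given by the reparameterisation $!_A : A \to 1$.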
Proof.
We first need to prove that is a well-defined reparameterisation between and . These two unpack to two optics previously drawn in Figure 10:
Once reparameterised, we can see that the residuals and backward maps of these optics are equal on the nose. This leaves us with just proving that the forward parts are equal as well. We postpone this proof to Remark 7 as it has special meaning in terms of rewrites. From the naturality of the delete map it is straightforward to prove that the opunitor reparameterisation is well-defined too.
Lastly, what needs to be proven is that lax associativity and lax left and right unity are satisfied ([TwoDimensionalCategories, Eq. 4.1.19 and 4.1.20]). This becomes evident once the diagrams are properly unpacked, and simple axioms of a cartesian category (associativity and naturality of copy, interaction of copy and delete) are sufficient to prove it. ∎
Remark 7.
The attentive reader might have noticed a peculiarity regarding the two possible ways of composing lenses into optics. Namely, reparameterising the forward part of the optic with yields the forward part . To show that this morphism is indeed equal to the forward part of the optic we need to exhibit an additional proof.
This proof is rather simple, as the rewrite can be made by applying associativity and naturality of the copy map. But surprisingly, this proof is the main reason why optics can trade space for time. This is because we can compute the forward map once and then copy the result, instead of copying the input and applying the forward map twice, separately.
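The space-for-time trade described in this remark can be observed in a small experiment. The sketch below is an assumption-laden illustration in sets and functions: lenses are plain `(get, put)` pairs, optics are `(forward, backward)` pairs, and all names (`reify`, `compose_lenses`, `compose_optics`, `run`, the call counter) are hypothetical. It composes the same two lenses in both orders of operations and counts how often the first forward map `f` is evaluated.

```python
calls = {"f": 0}  # counts evaluations of the first forward map


def f(a):
    calls["f"] += 1
    return a * a


def f_back(a, db):
    return 2 * a * db


def g(b):
    return b * b


def g_back(b, dc):
    return 2 * b * dc


def reify(get, put):
    """Turn a lens into an optic whose residual is the saved input."""
    return (lambda a: (a, get(a))), (lambda m, db: put(m, db))


def compose_lenses(lens1, lens2):
    get1, put1 = lens1
    get2, put2 = lens2
    get = lambda a: get2(get1(a))
    put = lambda a, dc: put1(a, put2(get1(a), dc))  # get1 is recomputed here
    return get, put


def compose_optics(optic1, optic2):
    fwd1, bwd1 = optic1
    fwd2, bwd2 = optic2

    def forward(a):
        m1, b = fwd1(a)
        m2, c = fwd2(b)
        return (m1, m2), c  # the residual grows: it stores both a and b

    def backward(m, dc):
        m1, m2 = m
        return bwd1(m1, bwd2(m2, dc))

    return forward, backward


def run(optic, a, dc):
    """Run forward then backward; return output, gradient and number of f calls."""
    forward, backward = optic
    calls["f"] = 0
    residual, c = forward(a)
    da = backward(residual, dc)
    return c, da, calls["f"]


lens_f, lens_g = (f, f_back), (g, g_back)
compose_then_reify = reify(*compose_lenses(lens_f, lens_g))
reify_then_compose = compose_optics(reify(*lens_f), reify(*lens_g))

result_a = run(compose_then_reify, 3, 1)  # residual is a alone
result_b = run(reify_then_compose, 3, 1)  # residual is the pair (a, f(a))
print(result_a, result_b)
```

Both composites return the same output 81 and the same backward value 108, but the first evaluates `f` twice while the second evaluates it once, at the cost of carrying the larger residual `(a, f(a))`.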
Remark 8.
We do not provide a proof in this paper, but closed lenses exhibit lax structure when embedded into 2-optics. However, they happen not to permit any additional rewrites as a result of this. This means that their operational characteristics end up the same as those of optics. This might be the reason why essentially closed lenses are what's used in [SimpleAD] and many automatic differentiation libraries, as function spaces are a feature in most languages, but dependent types, which are required for optics, are not.
Having defined the embedding of lenses into optics, we can also go the other way. This time, we have a strict 2-functor.
Proposition 6.
There is a strict identity-on-objects 2-functor normalising an optic to its lens representation whose action on hom-objects is defined in Thm. 5.1.
5.2 What world do these constructions live in?
So far we've had an obstacle-free path retelling the story of optics in the 2-categorical language. We have defined the 2-category of optics and showed that, in the cartesian case, there's an oplax functor going from lenses to optics, and a strict 2-functor going back. As their lower-dimensional version formed an isomorphism, the natural next step is to check whether these two weak functors form some kind of a weaker version of an isomorphism, such as an equivalence or an adjunction. This necessitates finding a suitable ambient 2-category, and is where peculiarities start appearing.
We first recall all the data defined so far, and some relevant properties:

Our 0-cells are 2-categories;

Our 1-cells are not all strict;

Our 1-cells and are identity-on-objects;

Thm. 5.1 establishes a local hom-category adjunction, and not a mere isomorphism.
Looking at only the first three bullet points, we would be led down a rabbit hole that eventually proves to be a dead end. As the second bullet point states that our 1-cells are not all strict, this rules out the 2-category of 2-categories, 2-functors and lax transformations as a suitable candidate. This means we need a 2-category whose 1-cells are lax functors. Famously, such a 2-category whose 2-cells are any of the usual strict/pseudo/(op)lax transformations does not exist [ProblemWithLaxFunctors]. What does exist is a 2-category of 2-categories, lax functors and a restricted kind of oplax natural transformation called an icon ([Icon], [TwoDimensionalCategories, Theorem 4.6.13]). Icons are oplax natural transformations that are defined only when the underlying lax functors agree on objects. However, we see from the third bullet point that this is indeed the case for us! We can indeed define these icons, and we will see that one of them is the identity. This makes one of the triangle identities commute automatically. However, problems arise when we check the other one: we find that it doesn't commute.
This is because of the fourth bullet point: an adjunction requires a local isomorphism, but locally we only have an adjunction. This rules out the possibility of an adjunction internal to a 2-category, meaning we have to search for a weaker 2-categorical analogue thereof. This leads us to consider the notion of a lax 2-adjunction instead, defined internal to a 3-category. This concept is weak enough for our purposes, describing exactly the setting of a local adjunction between hom-categories.
This is where our search, in its current form, stops. Even though there is a special restricted kind of 2-category whose 2-cells are icons, there truly is no 3-category whose 0-cells are 2-categories and 1-cells are lax functors. Thus the question as we've posed it indeed has no answers.
On the other hand, we might want to pose a different question. All of our constructions are restricted in some ways: 1-cells are identity on objects and 2-cells have identity components. Perhaps our approach is failing because 0-cells are also restricted, but we're not looking at them from the right perspective?
This is indeed the case. It turns out that there is an embedding where is the 3-category of double categories, lax functors, lax transformations and modifications. It sends a 2-category to a vertically discrete double category, a lax functor to a lax functor between the corresponding double categories and an icon to a vertical transformation.
This means that our 2-category of optics should really be thought of as a double category with only trivial vertical arrows. As is a 3-category this means that it is a plausible setting for defining a lax 2-adjunction. This leaves us with the conjecture that the well-known isomorphism between and is a shadow of the lax 2-adjunction in the 3-category between and appropriately thought of as double categories. The proof of this conjecture is something we leave to future work.
References
Appendix A Appendix
*
Proof.
We first prove well-definedness of the residual eraser . As maps every optic morphism to identity, we have to show that . But here we use the existing equivalence between lenses and the 1-category of optics (Prop. 1), where these lenses are equivalent if and only if the corresponding 1-optics are. That is indeed true, as is a witness to such equivalence.
We now prove that both and are well-defined. For we notice that starting from a lens, reifying its residual and then erasing it lands us back exactly with the starting lens. It's easy to see that this assignment is vacuously natural as is discrete. For , we can see that starting from an optic , erasing the residual (yielding ) and then reifying the residual results in an optic . We need to check whether is a well-defined optic 2-cell:
(6) 
But this is easy to verify. As optic 2-cells move reparameterisations backward, the place where we have to look for a suitable reparameterisation of type is precisely in the backward pass of the left-hand side. And it is indeed there: it's the morphism . We next need to prove that is natural. This means that for every optic morphism the equation needs to hold. As is identity, this reduces to showing that which follows from Eq. 5 (left).
Lastly, to prove this data indeed defines an adjunction, we need to show that the following diagrams commute.
As is the identity natural transformation, this means that and , reducing the proofs to and , respectively. For the former, we need to show that applying the counit from Eq. 6 on yields the identity morphism. This is easy to show as the underlying reparameterisation morphism in for via Prop. 8 reduces to identity. For the latter, we need to show that applying to the same counit yields identity. This is also straightforward as maps every morphism to identity. ∎
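The well-definedness of the counit can be checked numerically on a small example. The sketch below works in sets and functions and rests on an assumption: that the counit's reparameterisation at an optic is the residual component of its forward map, `a ↦ forward(a)[0]`. The names `erase`, `reify` and `r`, as well as the sample optic, are hypothetical and chosen only for illustration.

```python
def erase(forward, backward):
    """Residual eraser (assumed form): optic -> lens as a (get, put) pair."""
    get = lambda a: forward(a)[1]
    put = lambda a, db: backward(forward(a)[0], db)
    return get, put


def reify(get, put):
    """Residual reifier (assumed form): lens -> optic with residual A."""
    return (lambda a: (a, get(a))), (lambda m, db: put(m, db))


# A sample optic whose residual is NOT the input: it stores the derivative 2a.
def forward(a):
    return (2 * a, a * a)


def backward(m, db):
    return m * db


# Erase the residual, then reify it again: the new residual is the input a.
reified_forward, reified_backward = reify(*erase(forward, backward))


def r(a):
    """Assumed counit reparameterisation: from the reified residual A to M."""
    return forward(a)[0]


samples = [(0, 1), (3, 1), (-2, 5)]

# Reparameterising the reified forward pass recovers the original forward pass.
forward_agrees = all(
    (r(reified_forward(a)[0]), reified_forward(a)[1]) == forward(a)
    for a, _ in samples
)

# The reified backward pass equals the original one precomposed with r.
backward_agrees = all(
    reified_backward(a, db) == backward(r(a), db) for a, db in samples
)
print(forward_agrees, backward_agrees)
```

Both checks succeed on the samples, matching the observation in the proof that the required reparameterisation already sits in the backward pass of the reified optic.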
Definition 6 (Ends as limits).
Let be a category. We call the twisted arrow category of defined as the category of elements of its hom functor.
It comes equipped with the projection .
Proposition 7.
There is a canonical isomorphism .
*
Proof.
∎
Definition 7.
Let be a morphism in a cartesian category . Then we denote by the composite
This is called the graph of f (so named because its image is a set of pairs, which we can think of as points in a coordinate plane to be graphed).
Proposition 8.
Let in a cartesian category . Then we have that