0. Introduction and summary
The monograph [Ma99] surveyed various versions and applications of the notion of varieties $X$ whose tangent sheaf is endowed with a commutative, associative and $\mathcal{O}_X$–bilinear multiplication. By that time, such varieties had acquired the generic name F–manifolds. They attracted the attention of geometers and specialists in mathematical physics, in particular because many deformation spaces of various origins are endowed with natural F–structures.
It was also observed that the geometry of spaces of probability distributions on finite sets, the “geometry of information”, which had been developing independently for several decades, leads to some classes of F–manifolds as well.
In this paper, we study categorical encodings of geometries of classical and quantum information, including their F–manifold facets, which appear in Sec. 4 and are further developed in Sec. 5. Sec. 1 is a brief survey of both geometries. In the categorical encoding we stress the aspects related to the monoidal structures and dualities of the relevant categories, which we survey in Sec. 2 and 3. Finally, in Sec. 6 we introduce and study high level categorifications of information geometries, lifting them to the level of motives.
1. Classical and quantum probability distributions
1.1. Classical probability distributions on finite sets.
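Let $X$ be a finite set. Denote by $\mathbb{R}^X$ the $\mathbb{R}$–linear space of functions $(p_x)\colon X \to \mathbb{R}$.
By definition, a classical probability distribution on $X$ is a point of the simplex $\Delta_X$ in $\mathbb{R}^X$ spanned by the end–points of basic coordinate vectors in $\mathbb{R}^X$:
$$\Delta_X := \{ (p_x) \in \mathbb{R}^X \mid p_x \geq 0,\ \sum_{x \in X} p_x = 1 \}.$$
We denote by ${}^{\circ}\Delta_X$ its maximal open subset
$${}^{\circ}\Delta_X := \{ (p_x) \in \Delta_X \mid p_x > 0 \text{ for all } x \in X \}.$$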
Sometimes it is useful to replace by in the definition above. Spaces of distributions become subspaces of cones. For a general discussion of the geometry of cones, see Sec. 3 below.
1.2. Quantum probability distributions on finite sets.
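The least restrictive environment in which we can define quantum probability distributions on a finite set, according to Sec. 8 of [Mar19], involves an additional choice of a finite dimensional Hilbert space $\mathcal{V}$. This means that $\mathcal{V}$ is a finite dimensional vector space over the field of complex numbers $\mathbb{C}$, endowed with a scalar product $\langle v, v' \rangle \in \mathbb{C}$ such that for any $v, v' \in \mathcal{V}$ and $a \in \mathbb{C}$ we have
$$\langle av, v' \rangle = a\, \langle v, v' \rangle, \qquad \langle v, av' \rangle = \bar{a}\, \langle v, v' \rangle.$$
Here $a \mapsto \bar{a}$ means complex conjugation.
In particular, $\langle v, v' \rangle$ is $\mathbb{R}$–bilinear.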
Whenever is chosen, we can define for any finite set the finite dimensional Hilbert space , the direct sum of copies of .
Finally, a quantum probability distribution on $X$ is a linear operator $\rho_X$ on this Hilbert space such that $\rho_X^* = \rho_X$ and $\rho_X \geq 0$. Here $\rho_X^*$ is the Hermitian conjugate of $\rho_X$. Such an operator is also called a density matrix.
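As a small illustration of this definition, the following Python sketch tests whether a $2 \times 2$ complex matrix is a density matrix. It assumes, in addition to Hermiticity and positivity, the usual normalisation by trace one, and it uses the elementary fact that a Hermitian $2 \times 2$ matrix is positive exactly when its trace and determinant are non-negative. The function name `is_density_matrix_2x2` and the tolerance `tol` are ours and do not come from [Mar19].

```python
def is_density_matrix_2x2(rho, tol=1e-12):
    """Check Hermiticity, positivity and unit trace of a 2x2 matrix.

    `rho` is a nested list [[a, b], [c, d]] of real or complex numbers.
    Assumption: density matrices are normalised by trace one.
    """
    a, b = complex(rho[0][0]), complex(rho[0][1])
    c, d = complex(rho[1][0]), complex(rho[1][1])
    # Hermitian: real diagonal and b equal to the conjugate of c.
    hermitian = (abs(a.imag) <= tol and abs(d.imag) <= tol
                 and abs(b - c.conjugate()) <= tol)
    if not hermitian:
        return False
    trace = (a + d).real
    det = (a * d - b * c).real
    # With trace one, positivity of a Hermitian 2x2 matrix reduces to det >= 0.
    return abs(trace - 1.0) <= tol and det >= -tol


# Hypothetical sample matrices.
plus_state = [[0.5, 0.5], [0.5, 0.5]]        # pure state
maximally_mixed = [[0.5, 0], [0, 0.5]]       # mixed state
not_positive = [[1.5, 0], [0, -0.5]]         # trace one, negative eigenvalue
not_hermitian = [[0.5, 0.5], [0.1, 0.5]]
```

The first two sample matrices pass the test, the last two fail it. For larger matrices the determinant shortcut no longer suffices, and one would check the eigenvalues instead.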
Remarks. The chosen Hilbert space represents the quantum space of internal degrees of freedom of one point. Its choice may be motivated by physical considerations, if we model some physical systems. Mathematically, different choices of this space may be preferable when we pass to the study of categorifications: cf. below.
1.3. Categories of classical probability distributions
([Mar19], Sec. 2).
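Let $X, Y$ be two finite sets. Consider the real linear space consisting of maps $Y \times X \to \mathbb{R}$: $(y, x) \mapsto S_{yx}$.
Such a map is called a stochastic matrix, if
(i) $S_{yx} \geq 0$ for all $(y, x)$.
(ii) $\sum_{y \in Y} S_{yx} = 1$ for all $x \in X$.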
Consider pairs , consisting of a finite set and a probability distribution on (one point of the closed set of probability distributions, as above).
These pairs are objects of the category , morphisms in which are stochastic matrices
They are related to the distributions , by the formula , i. e.
where is the classical probability distribution, assigning to the probability .
Composition of morphisms is given by matrix multiplication.
Checking correctness of this definition is a rather straightforward task. In particular, for any and any stochastic matrix , , and
so that is a probability distribution.
Associativity of composition of morphisms follows from associativity of matrix multiplication.
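The verification above can be sketched in a few lines of Python with exact rational arithmetic. We assume the convention that a stochastic matrix has non-negative entries and columns summing to one, and that it acts on a distribution by matrix-vector multiplication; the helper names (`is_stochastic`, `push_forward`, `compose`) and the sample data are ours.

```python
from fractions import Fraction as F


def is_distribution(p):
    """Non-negative entries summing to one."""
    return all(v >= 0 for v in p) and sum(p) == 1


def is_stochastic(s):
    """s[y][x] >= 0 and every column sums to one (assumed convention)."""
    n_cols = len(s[0])
    non_negative = all(v >= 0 for row in s for v in row)
    columns_ok = all(sum(row[x] for row in s) == 1 for x in range(n_cols))
    return non_negative and columns_ok


def push_forward(s, p):
    """Q_y = sum_x S_yx P_x."""
    return [sum(row[x] * p[x] for x in range(len(p))) for row in s]


def compose(t, s):
    """Composition of morphisms as the matrix product T S."""
    return [[sum(t[z][y] * s[y][x] for y in range(len(s)))
             for x in range(len(s[0]))] for z in range(len(t))]


# Hypothetical data: |X| = 3, |Y| = 2, |Z| = 2.
P = [F(1, 2), F(1, 3), F(1, 6)]
S = [[F(1), F(1, 2), F(0)],
     [F(0), F(1, 2), F(1)]]
T = [[F(1, 4), F(1, 2)],
     [F(3, 4), F(1, 2)]]
Q = push_forward(S, P)
```

Here `Q` is again a probability distribution, `compose(T, S)` is again stochastic, and pushing `P` forward along the composite agrees with pushing `Q` forward along `T`.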
1.4. Categories of quantum probability distributions
([Mar19], Sec. 8).
We now pass to the quantum analogs of these notions.
Let again be a finite set, now endowed as above, with a finite dimensional Hilbert space and a density matrix .
Given a finite dimensional Hilbert space , consider an algebra of linear operators on this space, containing the convex set of density operators/matrices as in Sec.1.2 above, satisfying the additional condition .
A linear map is called positive if it maps positive elements in to positive elements and it is completely positive if for all the operator is positive on . Completely positive maps form a cone : for all relevant information regarding cones, see Sec. 3 below.
A quantum channel is a trace preserving completely positive map . Composition of quantum channels is clearly again a quantum channel. A quantum channel can be represented by a matrix, the Choi matrix which is obtained by writing in the form
The first pair of indices of defines the row of the matrix and the second pair the column. Because quantum channels behave well under composition, they can be used to define morphisms of a category of finite quantum probabilities.
A quantum analog of the respective statistical matrix is the so called stochastic Choi matrix (see [Mar19], (8.1), where we replaced Marcolli’s notation by our , with for quantum).
These triples are objects of the category , morphisms in which can be represented by stochastic Choi matrices so that
We have omitted here a description of the encoding of stochastic Choi matrices involving the choices of bases in appropriate vector spaces, and checking the compatibility of bases changes with composition of morphisms.
As soon as one accepts this, the formal justification of this definition can be done in the same way as that of Proposition 1.3.1.
1.5. Monoidal categories
Speaking about monoidal categories, we adopt basic definitions, axiomatics, and first results about categories, sites, sheaves, and their homological and homotopical properties developed in [KS06]. In particular, sets of objects and morphisms of a category will always be small sets ([KS06], p. 10).
Sometimes we have to slightly change terminology, starting with monoidal categories themselves. What we call a monoidal category here is the family of data called a tensor category throughout Chapter 4 of [KS06] and the rest of the book, with the exception of two lines in Remark 4.2.17, p. 102.
According to Definition 4.2.1 of [KS06] (p. 96), a monoidal category is a triple $(\mathcal{C}, \otimes, a)$, where $\mathcal{C}$ is a category, $\otimes$ a bifunctor $\mathcal{C} \times \mathcal{C} \to \mathcal{C}$, and $a$ is an “associativity” isomorphism of triple functors, constructed from $\otimes$ by two different bracketings.
The associativity isomorphism must fit into the commutative diagram (4.2.1) on p. 96 of [KS06].
According to Def. 4.2.5, p. 98 of [KS06], a unit object of a monoidal category is an object $\mathbf{1}$, endowed with an isomorphism $\mathbf{1} \otimes \mathbf{1} \to \mathbf{1}$ such that the functors $X \mapsto X \otimes \mathbf{1}$ and $X \mapsto \mathbf{1} \otimes X$ are fully faithful. By default, our monoidal categories, or their appropriate versions, will be endowed with unit objects.
Lemma 4.2.6 of [KS06], pp. 98–100, collects all natural compatibility relations between the monoidal multiplication and the unit object , categorifying the standard properties of units in set–theoretical monoids.
1.6. Duality in monoidal environments.
The remaining part of this Section contains a brief review, based upon [Ma17], of categorical aspects of monoidality related to dualities between monoidal categories with units.
They must be essential also for the understanding of quantum probability distributions, because generally the relevant constructions appeared during the study of various quantum models: see references in [Ma88], [Ma17], and a later development [MaVa20].
Let be a monoidal category, and let be an object of . (Notice that here we changed the notation of the monoidal product, earlier , and replaced it by ).
A functor is a duality functor, if it is an antiequivalence of categories, such that for each object of the functor
is representable by .
In this case is called a dualizing object.
(i) The data are called a Grothendieck–Verdier (or GV–)category.
(ii) is a monoidal category with unit object .
1.7. Example: quadratic algebras
Let $K$ be a field. A quadratic algebra is defined as an associative graded algebra $A = \oplus_{i \geq 0} A_i$ generated by $A_1$ over $A_0 = K$, and such that the ideal of all relations between generators is generated by the subspace of quadratic relations $R(A) \subset A_1^{\otimes 2}$.
Quadratic algebras form the objects of a category whose morphisms are homomorphisms of graded algebras that are the identity on terms of degree $0$. It follows that morphisms $f\colon A \to B$ are in canonical bijection with those linear maps $f_1\colon A_1 \to B_1$ for which $(f_1 \otimes f_1)(R(A)) \subset R(B)$.
The main motivation for this definition was a discovery, that if in the study of a large class of quantum groups we replace (formal) deformations of the universal enveloping algebras of the relevant Lie algebras by (algebraic) deformations of the respective algebras of functions, then in many cases we land in the category .
The monoidal product in can be introduced directly as a lift of the tensor product of linear spaces of generators: is generated by , and its space of quadratic relations is , where the permutation map sends to .
Remark. Slightly generalizing this definition, we may assume that our algebras are $\mathbb{Z}_2$–graded, that is supergraded. This will lead to the appearance of additional signs in various places. In particular, in the definition of there will be sign , if both and
This might become very essential in the study of quantum probability distributions where physical motivation comes from models of fermionic lattices.
We return now to our category .
Define the dualization in as a functor extending the linear dualization on the spaces of generators. The respective subspace of quadratic relations will be , the orthogonal complement to .
Consider as the object of , and put .
Then is a GV–category.
More precisely, in the respective dual category the “white product” is another lift of the tensor product of linear generators , with quadratic relations .
For a proof, see [Ma88], pp. 19–28.
2. Monoidal duality in categories of classical probability distributions
According to [KS06], (Examples 4.2.2 (vi) and (v), p.96), if a category admits finite products , resp. finite coproducts , then both of them define monoidal structures on this category.
Both products and coproducts are defined by their universal properties in the Def. 2.2.1, p. 43, of [KS06].
If admits finite inductive limits and finite projective limits, then it has an initial object , which is the unit object for the monoidal product . Any morphism is an isomorphism, and ([KS06], Exercise 2.26, p. 69).
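When studying monoidal dualities in various categories of structured sets, it is useful to keep in mind the following archetypal example. For any set $U$, denote by $\mathcal{P}(U)$ the set of all non–empty subsets of $U$. Then, for any two sets $X$, $Y$ without common elements, there exists a natural bijection
$$\mathcal{P}(X \sqcup Y) \to \mathcal{P}(X) \times \mathcal{P}(Y)\ \sqcup\ \mathcal{P}(X)\ \sqcup\ \mathcal{P}(Y). \qquad (2.1)$$
Namely, if a non–empty subset $Z \subset X \sqcup Y$ has empty intersection with $Y$, resp. $X$, it produces an element of one of the last two terms in the r.h.s. of (2.1). Otherwise, it produces a pair of non–empty subsets $Z \cap X \subset X$ and $Z \cap Y \subset Y$.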
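For finite sets, the bijection (2.1) can be checked by brute force. The Python sketch below enumerates all non-empty subsets of a disjoint union and sends each of them either to a pair of non-empty subsets or to a single non-empty subset of one of the two sets, exactly as described above. The function names and the tags `"pair"`, `"X"`, `"Y"` are ours and serve only to keep the three terms of the right hand side apart.

```python
from itertools import combinations


def nonempty_subsets(s):
    """All non-empty subsets of a finite set, as frozensets."""
    items = list(s)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def split(z, x, y):
    """Image of a non-empty subset z of the disjoint union of x and y."""
    zx, zy = z & x, z & y
    if zx and zy:
        return ("pair", zx, zy)   # term P(X) x P(Y)
    if zx:
        return ("X", zx)          # z lies entirely in x
    return ("Y", zy)              # z lies entirely in y


# Hypothetical sample sets without common elements.
X = frozenset({0, 1})
Y = frozenset({10, 11, 12})

images = [split(z, X, Y) for z in nonempty_subsets(X | Y)]
target = (
    {("pair", a, b) for a in nonempty_subsets(X) for b in nonempty_subsets(Y)}
    | {("X", a) for a in nonempty_subsets(X)}
    | {("Y", b) for b in nonempty_subsets(Y)}
)
```

On cardinalities, this is the identity $2^{m+n} - 1 = (2^m - 1)(2^n - 1) + (2^m - 1) + (2^n - 1)$ for $m = |X|$, $n = |Y|$.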
A more convenient version of (2.1) can be obtained, if one works in a category of pointed sets, and defines the set of subsets as previously, but adding to it the empty subset as the marked point. Then the union in (2.1) is replaced by the smash product , and the map (2.1) extended to a bifunctor of , becomes a “categorification” of the formula for an exponential map underlying transitions between classical models in physics and quantum ones. In particular, (2.1) connects the unit element for direct product with the unit/zero element for the coproduct/smash product.
We will now describe a version of these constructions applied to sets of probability distributions.
2.1.1. A warning
If we construct a functor or , sending to and exchanging their units and , it cannot be completed to a duality functor in the sense of Def.1.6.1 above: a simple count of cardinalities shows it.
However, we will still consider such functors as weaker versions of monoidal dualities, and will not warn the reader about it again.
2.2. Category of classical probability distributions.
If the cardinality of a finite set is one, there is only one classical probability distribution on , namely . In [Mar19], Sec.2, such objects are called singletons.
Morphisms that factor through zero objects are generally called zero morphisms. In , they are explicitly described as “target morphisms” by [Mar19], 2.1.2: they are such morphisms , for which .
Category is not large enough for us to be able to use essential constructions and results from [KS06], related to multiplicative/additive transitions sketched in Sec.2.1 above.
M. Marcolli somewhat enlarged it by replacing in the definition of objects of finite sets by pointed finite sets. In [Mar19], the resulting category is denoted , so that is embedded in it. Objects, morphisms sets in , their compositions etc. are explicitly described in Def. 2.8, 2.9, 2.10 of [Mar19]. Objects of are called probabilistic pointed sets in Def.2.8.
Besides embedding, there is also a forgetful functor ([Mar19], Remark 2.11).
2.2.1. Coproducts of probabilistic pointed sets and classical probability distributions.
Start with the usual smash product of pointed sets
where is the “smash” of the union of two coordinate axes . It induces on probabilistic pointed sets obtained from finite probability distributions the product of statistically independent probabilities.
([Mar19], Lemma 2.14) .
We will denote by any object of , consisting of a finite set and a point in it with prescribed probability 1. All these objects are isomorphic.
(i) The triple is a monoidal category with unit.
(ii) The triple is a monoidal category with unit.
For detailed proofs, see [Mar19], Sec. 2.
2.3. A generalization
Let be a monoidal category with unit/zero object.
Generalizing the passage from to , M. Marcolli defines the category , a probabilistic version of ([Mar19], Def. 2.18).
One object of is a formal finite linear combination , where are objects of .
One morphism is a pair , where is a stochastic matrix with , and are real numbers in such that .
As before, one can explicitly define a categorical coproduct in , so that it becomes a monoidal category with zero object.
3. Monoidal duality in categories of quantum probability distributions
3.1. Category of quantum probability distributions.
We now return to the category of quantum probability distributions.
We will be discussing the relevant versions of monoidal dualities for enrichments of , based upon variable categories .
([Mar19], Def. 8.2).
Given a category , its quantum probabilistic version is defined as follows.
One object of is a finite family , where
and is the Choi matrix of a quantum channel as in Sec. 1.4 above.
One morphism between two such objects, (source) and (target) is given by a Choi matrix, as in Sec. 1.4 above, entries of which now are morphisms in .
Composition of two morphisms is defined similarly to the classical case, so that the usual associativity diagrams lift to .
3.2. Monoidal structures.
Assume now that the underlying category is endowed with a monoidal structure with unit/zero object. Then this structure can be lifted to its quantum probabilistic version in the same way as in the classical case: see [Mar19], Proposition 8.5.
We can therefore extend the (weak) duality formalism of Sec. 2 to the case of quantum probability distributions, keeping in mind the warning stated in Sec. 2.1.1.
4. Convex cones and F–manifolds
4.1. Convex cones
Let be a finite–dimensional real linear space. A non–empty subset is called a cone, if
(i) is closed with respect to addition and multiplication by positive reals;
(ii) The topological closure of does not contain a real affine subspace of positive dimension.
Basic example. Let be a convex open subset of whose closure does not contain . Then the union of all half–lines in , connecting with a point of , is an open cone in .
Here convexity of means that any segment of real line connecting two different points of , is contained in .
4.2. Convex cones of probability distributions on finite sets.
Let be a finite set. As in 1.1 above, we start with the real linear space and denote by the union of all oriented real half–lines from zero to one of the points of , or else of .
Such cones are called open, resp. closed, cones of classical probability distributions on .
4.3. Characteristic functions of convex cones.
Given a convex cone in finite–dimensional real linear space , construct its characteristic function in the following way.
Let be the dual linear space, and the canonical scalar product between and . Choose also a volume form on invariant wrt translations in . Then put
Such a volume form and a characteristic function are defined up to constant positive factor.
This follows almost directly from the definition.
Now we focus on the differential geometry of convexity.
A cone is a smooth manifold, its tangent bundle can be canonically trivialised, , in the following way: is identified with by the parallel transport sending to . Choose an affine coordinate system in and put
The metric is a Riemannian metric on . The associated torsionless canonical connection on has components
where are defined by
Therefore, by putting
we define on a commutative –bilinear composition.
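A simple example may help to fix ideas. We assume here the standard conventions: the characteristic function of a cone is the integral of $e^{-\langle x, x' \rangle}$ over the dual cone, and the metric is $g_{ij} = \partial_i \partial_j \ln \varphi$. For the positive orthant in $\mathbb{R}^n$, whose dual cone is again the positive orthant, each coordinate contributes $\int_0^\infty e^{-x_i t}\, dt = 1/x_i$, so that $\varphi(x) = \prod_i x_i^{-1}$ and $g_{ij} = \delta_{ij} / x_i^2$. The Python sketch below compares this closed form with a finite difference Hessian; the function names and the step size `h` are ours.

```python
import math


def log_phi(x):
    """ln of the characteristic function of the positive orthant (assumed form)."""
    return -sum(math.log(t) for t in x)


def hessian(f, x, h=1e-4):
    """Central finite-difference approximation of the Hessian of f at x."""
    n = len(x)
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                y = list(x)
                y[i] += si * h
                y[j] += sj * h
                return f(y)
            result[i][j] = (shifted(1, 1) - shifted(1, -1)
                            - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return result


def exact_metric(x):
    """Closed form g_ij = delta_ij / x_i^2 for the positive orthant."""
    n = len(x)
    return [[(1.0 / x[i] ** 2 if i == j else 0.0) for j in range(n)]
            for i in range(n)]


point = [1.0, 2.0, 4.0]   # hypothetical point of the open cone
numeric = hessian(log_phi, point)
exact = exact_metric(point)
```

The two matrices agree up to discretisation error, and the closed form makes positive definiteness of the metric on the open orthant evident.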
4.4. Convex cones of stochastic matrices and categories of classical probability distributions
([Mar19], Sec. 2).
Our first main example are cones of stochastic matrices from Sec.1.3 above.
Let be two finite sets. Consider the real linear space consisting of maps : , where is a stochastic matrix. As explained above, such stochastic matrices can be considered as morphisms in the category .
The sets of morphisms are convex sets.
4.5. Convex cones of quantum probability distributions on finite sets.
([Mar19], Sec. 8 and others.) Using now stochastic Choi matrices, encoding morphisms between objects of the category of quantum probability distributions, as explained in Sec. 1.4 above, we state the following quantum analog of Proposition 4.4.1.
The sets of morphisms in are convex sets.
We leave both proofs as exercises for the reader.
4.5.1. Remarks: Comparison between classical and quantum probability distributions.
(i) The –linear spaces or from above correspond to –linear spaces or from [Mar19], Def. 8.1.
(ii) The –dual spaces correspond to the –antidual spaces from [Mar19].
The real duality pairing is replaced by the complex antiduality pairing: this means that, for and , we have
where is the complex conjugation map.
(iii) This implies that the direct sum of real spaces on the classical side which is replaced by on the quantum side, can be compared with the direct sum of also real subspaces of corresponding to the eigenvalues of the operator combining .
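A parametric variation of this structure should lead to paracomplex geometry, which entered the framework of geometry of quantum information in Sec. 4 of [CoMa20]. The algebra of paracomplex numbers (cf. [Ya68]) is defined as the real vector space $\mathbb{R} \oplus \mathbb{R}$ with the multiplication
$$(x, y) \cdot (x', y') = (xx' + yy',\ xy' + yx').$$
Put $\varepsilon := (0, 1)$. Then $\varepsilon^2 = 1$, and moreover every paracomplex number can be written uniquely as $z = x + \varepsilon y$ with $x, y \in \mathbb{R}$.
Given a paracomplex number $z = x + \varepsilon y$, its conjugate is defined by $\bar{z} := x - \varepsilon y$. The paracomplex numbers form a commutative ring of characteristic 0. This naturally gives rise to the notion of a paracomplex structure on a vector space.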
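The arithmetic of paracomplex numbers is easy to experiment with. The following Python sketch assumes the standard multiplication rule $(x, y) \cdot (x', y') = (xx' + yy', xy' + yx')$, so that $\varepsilon = (0, 1)$ squares to $1$, and the conjugation $x + \varepsilon y \mapsto x - \varepsilon y$. The class name `Paracomplex` is ours.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paracomplex:
    """Element x + eps*y with eps**2 = 1 (assumed standard convention)."""
    x: float
    y: float

    def __add__(self, other):
        return Paracomplex(self.x + other.x, self.y + other.y)

    def __mul__(self, other):
        # (x + eps y)(x' + eps y') = (xx' + yy') + eps (xy' + yx')
        return Paracomplex(self.x * other.x + self.y * other.y,
                           self.x * other.y + self.y * other.x)

    def conjugate(self):
        return Paracomplex(self.x, -self.y)


one = Paracomplex(1, 0)
eps = Paracomplex(0, 1)
z = Paracomplex(3, 2)
```

In contrast with the complex case, $z \bar{z} = x^2 - y^2$ is an indefinite quadratic form, and the ring has zero divisors: $(1 + \varepsilon)(1 - \varepsilon) = 0$.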
The paracomplex structure naturally enters the picture for the manifold of probability distributions over a finite set, and more generally for statistical manifolds related to exponential families. Indeed, these real manifolds are identified with a projective space over the algebra of paracomplex numbers (see Proposition 5.9 in [CoMa20]). It is well known that this manifold of probability distributions is endowed with a pair of dual affine connections. So, the underlying affine symmetric space is defined over a Jordan algebra which is generated by $1$ and an element $\varepsilon$ that verifies $\varepsilon^2 = 1$ or $\varepsilon^2 = -1$. This manifold not being complex, the paracomplex case remains the only possibility (Proposition 5.4 [CoMa20]).
There are interesting questions related to the occurrence of paracomplex F-manifolds in information geometry. For instance, there is the question of a classification of these paracomplex structures, along the lines of recent classification results for small dimensional F-manifolds by Hertling and collaborators, or of Dubrovin’s analytic theory.
5. Clifford algebras and Frobenius manifolds
5.1. Hilbert spaces over Frobenius algebras.
Let be a commutative algebra over of finite dimension , generated by linearly independent elements satisfying relations .
Moreover assume, that is endowed with a nondegenerate bilinear form (Frobenius form), satisfying the associativity property , and a homomorphism , whose kernel contains no non–zero left ideal of .
One can see that then , where .
As usual, in such cases we will omit summation over repeated indices and write the r.h.s. simply as .
Denote by the dual basis to with respect to .
Now consider a right free -module of rank . It has a natural structure of real –dimensional linear space. It can be also represented as the space of matrices over .
Generally, below we will be considering Hilbert spaces , endowed with a compatible action of .
Consider a particular case . One can see that there exist three different algebras (up to isomorphism): complex numbers, dual numbers, and paracomplex numbers.
The respective bases, denoted in Sec. 5.1 above, have traditional notations: , ; , ; and , .
5.2. General case: Clifford algebras
Let $K$ be a field of characteristic $\neq 2$; $V$ a finite dimensional linear space over $K$; $Q$ a non–degenerate quadratic form on $V$. It defines the symmetric scalar product $g$ on $V$: $g(u, v) = \frac{1}{2}\,[Q(u + v) - Q(u) - Q(v)]$.
A Clifford algebra over is an associative unital –algebra of finite dimension , endowed with generators satisfying relations
Notice that is an implicit part of the structure in this definition.
If , then is the exterior algebra of over , hence linear dimension of is , where over . This formula holds also for of arbitrary rank.
Finally, if and is non–degenerate, it has a signature , that is, in an appropriate basis has the standard form , hence . We will denote the respective Clifford algebras by .
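The structure of a real Clifford algebra of signature $(p, q)$ can be made concrete by a short computation. The Python sketch below assumes the normalisation $e_i e_j + e_j e_i = 2 g(e_i, e_j)$ in an orthogonal basis with $g(e_i, e_i) = \pm 1$ (other sign and factor conventions occur in the literature). Basis monomials $e_{i_1} \cdots e_{i_k}$ with $i_1 < \dots < i_k$ are encoded as bitmasks, so the algebra has $2^n$ basis elements for $n$ generators. All function names are ours.

```python
from itertools import product


def reorder_sign(a, b):
    """Sign produced by sorting the generators of monomial a followed by b."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1


def blade_product(a, b, squares):
    """Product of two basis monomials (bitmasks): (coefficient, bitmask)."""
    coeff = reorder_sign(a, b)
    common, i = a & b, 0
    while common:
        if common & 1:
            coeff *= squares[i]   # e_i * e_i = g(e_i, e_i)
        common >>= 1
        i += 1
    return coeff, a ^ b


def multiply(u, v, squares):
    """Product of elements given as {bitmask: coefficient} dictionaries."""
    out = {}
    for (a, ca), (b, cb) in product(u.items(), v.items()):
        sign, blade = blade_product(a, b, squares)
        out[blade] = out.get(blade, 0) + sign * ca * cb
    return {k: c for k, c in out.items() if c != 0}


def add(u, v):
    """Sum of two elements, dropping zero coefficients."""
    out = dict(u)
    for k, c in v.items():
        out[k] = out.get(k, 0) + c
    return {k: c for k, c in out.items() if c != 0}


squares = [1, 1, -1]                      # hypothetical signature (2, 1)
gens = [{1 << i: 1} for i in range(len(squares))]
basis = [{mask: 1} for mask in range(2 ** len(squares))]
```

With these conventions, `add(multiply(e_i, e_j, squares), multiply(e_j, e_i, squares))` is the scalar $2 g(e_i, e_j)$, and the list `basis` has $2^3 = 8$ elements.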
This gives a complete list of (isomorphism classes of) Frobenius –algebras of finite dimension.
Clifford algebras have properties implying the existence of a symmetric scalar product on the vector space . More precisely, where is the quadratic form associated to the Clifford algebra. Using this definition, we can obtain the characteristic function, which is defined in section 4.3. Recall that the characteristic function is explicitly given by
where is the dual linear space, and is the canonical scalar product between and , and is a volume form on invariant with respect to translations in . Therefore, one establishes a direct relation between those Clifford algebras and the characteristic functions defined in section 4.3, and hence a relation to the F-manifolds.
5.3. The splitting theorem
has a canonical splitting into subsectors that are irreducible modules over respective Clifford algebras.
Let be a manifold, endowed with an affine flat structure, a compatible metric , and an even symmetric rank 3 tensor . Define a multiplication operation on the tangent sheaf by . The manifold is Frobenius if it satisfies local potentiality condition for , i.e. locally everywhere there exists a potential function such that , where are flat tangent fields and an associativity condition: (see [Ma99]).
Clifford algebras can be considered under the angle of matrix algebras as Frobenius algebras. They are equipped with a symmetric bilinear form , such that . We consider a module over this Frobenius algebra and construct the real linear space to which it is identified, denoted .
Let us first discuss the rank 3 tensor . We construct it on , using the -tensor formula
in the adapted basis.
There exists a compatible metric, inherited from the non-degenerate, symmetric bilinear form defined on the algebra , given by , where . Call this metric .
Now that the rank 3 tensor and the metric have been introduced, we discuss the multiplication operation . This multiplication operation is inherited from the multiplication on the algebra and given by . It can be written explicitly by introducing a bilinear symmetric map , which in local coordinates is
Here . The multiplication is thus defined by
with and are local flat tangent fields.
Since we have defined the multiplication operation, we can verify the associativity property. Indeed, recall that the metric is inherited from the Frobenius form , which satisfies . Naturally, this property is inherited on , where this associativity relation is given by for flat tangent fields.
Finally, by using the relation between and , we can see that the potentiality property is satisfied. ∎
6. Motivic information geometry
This last section is dedicated to the construction of the highest (so far) floor of the Babel Tower of categorifications of probabilities.
We investigate possible extensions of some aspects of the formalism of information geometry to a motivic setting, represented by various types of Grothendieck rings.
A notion of motivic random variables was developed in [Howe19], [Howe20], based on relative Grothendieck rings of varieties. In the setting of motivic Poisson summation and motivic height zeta functions, as in [Bilu18], [ChamLoe15], [CluLoe10], [HruKaz09], one also considers other versions of the Grothendieck ring of varieties, in particular the Grothendieck ring of varieties with exponentials. A notion of information measures for Grothendieck rings of varieties was introduced in [Mar19b], where an analog of the Shannon entropy, based on zeta functions, is shown to satisfy a suitable version of the Khinchin axioms of information theory. We elaborate here some of these ideas with the goal of investigating motivic analogs of the Kullback–Leibler divergence and the Fisher–Rao information metric used in the context of information geometry (see [AmNag07]).
6.1. Grothendieck ring with exponentials and relative entropy
We show here that, in the motivic setting, it is possible to implement a version of Kullback–Leibler divergence based on zeta functions, using the Grothendieck ring of varieties with exponentials, defined in [ChamLoe15].
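The Grothendieck ring with exponentials, over a field $K$, is generated by isomorphism classes of pairs $(X, f)$, where $X$ is a $K$–variety, and $f\colon X \to \mathbb{A}^1$ a morphism. Two such pairs $(X_1, f_1)$ and $(X_2, f_2)$ are isomorphic, if there is an isomorphism $u\colon X_1 \to X_2$ of $K$–varieties such that $f_1 = f_2 \circ u$.
The relations are given by
$$[X, f] = [Y, f|_Y] + [U, f|_U]$$
for a closed subvariety $Y \subset X$ and its open complement $U = X \setminus Y$, and the additional relation
$$[X \times \mathbb{A}^1, \pi_{\mathbb{A}^1}] = 0,$$
where $\pi_{\mathbb{A}^1}\colon X \times \mathbb{A}^1 \to \mathbb{A}^1$ is the projection on the second factor.
The ring structure is given by the product
$$[X_1, f_1] \cdot [X_2, f_2] = [X_1 \times X_2,\ f_1 \circ \pi_{X_1} + f_2 \circ \pi_{X_2}].$$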
The original motivation for introducing the Grothendieck ring with exponentials was to provide a motivic version of exponential sums. Indeed, for a variety over a finite field , with a morphism , a choice of character determines an exponential sum
of which the class is the motivic counterpart. The relation corresponds to the property that, for any given character , one has .
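The exponential sums in question are easy to compute over a prime field. The Python sketch below assumes the field $\mathbb{F}_p$ with $p$ prime and the additive characters $\chi_k(a) = e^{2 \pi i k a / p}$; a variety is represented naively by the list of its $\mathbb{F}_p$–points. It illustrates that the sum over the affine line with $f = \mathrm{id}$ vanishes for a non-trivial character, which is the numerical counterpart of the additional relation; for the trivial character the same sum equals $p$. The function names are ours.

```python
import cmath
import math


def additive_character(p, k=1):
    """The character a -> exp(2 pi i k a / p) of the additive group of F_p."""
    return lambda a: cmath.exp(2j * cmath.pi * k * (a % p) / p)


def exponential_sum(points, f, chi):
    """Sum of chi(f(x)) over the given list of points."""
    return sum(chi(f(x)) for x in points)


p = 7                                   # hypothetical choice of prime
affine_line = list(range(p))            # F_p-points of the affine line
chi = additive_character(p, k=1)        # a non-trivial character
trivial = additive_character(p, k=0)

sum_identity = exponential_sum(affine_line, lambda x: x, chi)
sum_square = exponential_sum(affine_line, lambda x: x * x, chi)
sum_trivial = exponential_sum(affine_line, lambda x: x, trivial)
```

For $f(x) = x^2$ the result is a quadratic Gauss sum, of absolute value $\sqrt{p}$, which shows that the class of a pair genuinely depends on the morphism and not only on the variety.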
Here we interpret the classes with as pairs of a variety and a potential (or Hamiltonian) . A family of commuting Hamiltonians is represented in this setting by a class with , where for the function given by is our Hamiltonian .
Note that, for this interpretation of classes as varieties with a potential (Hamiltonian) we do not need to necessarily impose the relation . Thus, we can consider the following variant of the Grothendieck ring with exponentials.
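The coarse Grothendieck ring with exponentials is generated by isomorphism classes of pairs $(X, f)$ of a $K$–variety $X$ and a morphism $f\colon X \to \mathbb{A}^1$ as above.
The relations between them are generated by $[X, f] = [Y, f|_Y] + [U, f|_U]$, for a closed subvariety $Y \subset X$ and its open complement $U = X \setminus Y$.
The product is $[X_1, f_1] \cdot [X_2, f_2] = [X_1 \times X_2,\ f_1 \circ \pi_{X_1} + f_2 \circ \pi_{X_2}]$.
The Grothendieck ring with exponentials is the quotient of this coarse version by the ideal generated by the classes $[X \times \mathbb{A}^1, \pi_{\mathbb{A}^1}]$.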
We will be using motivic measures and zeta functions that come from a choice of character as above, for which the elements will be in the kernel. So all the motivic measures we will be considering on will factor through the Grothendieck ring of Definition 6.1.1
For a class , the symmetric products are defined as
with the symmetric product and given by
The analog of the Kapranov motivic zeta function in is given by
Given a motivic measure , for some commutative ring , one can consider the corresponding zeta function
As our basic example, consider the finite field case and the motivic measures and zeta functions discussed in Section 7.8 of [MaMar21], where the motivic measure is determined by a choice of character ,
and the associated zeta function is given by (see Proposition 7.8.1 of [MaMar21])
where for given and , we have for the level sets
We can then consider two possible variations, with respect to which we want to compute a relative entropy through a Kullback–Leibler divergence: the variation of the Hamiltonian, obtained through a change in the function , and the variation in the choice of the character in this motivic measure . We will show how to simultaneously account for both effects.
To mimic the thermodynamic setting described in the previous subsection, consider functions of the form for given morphisms , with acting on by multiplication and .
Given a motivic measure associated to the choice of a character , and a class in , we consider the “probability distribution”
for . We write when we need to emphasise the dependence on the morphism and the character . We leave the –dependence implicit. Of course, this is not a probability distribution in the usual sense, since it takes complex rather than positive real values, though it still satisfies the normalisation condition. We still treat it formally like a probability so that we can consider an associated notion of Kullback–Leibler divergence .
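For orientation, recall the classical formula $KL(P \| Q) = \sum_n P_n \log (P_n / Q_n)$ for finite probability distributions. The Python sketch below applies this formula formally to complex weights that still sum to one, using the principal branch of the logarithm. It is only meant to illustrate what treating complex normalised weights "like a probability" can mean; it is not the motivic divergence itself, and the function name `formal_kl` and the sample weights are ours.

```python
import cmath
import math


def formal_kl(p, q):
    """Formal Kullback-Leibler sum for complex weights (principal branch)."""
    return sum(pn * cmath.log(pn / qn) for pn, qn in zip(p, q) if pn != 0)


# Classical case: real positive distributions.
P = [0.5, 0.5]
Q = [0.25, 0.75]
classical = formal_kl(P, Q)

# Formal case: complex weights that still satisfy the normalisation condition.
P_complex = [0.5 + 0.5j, 0.5 - 0.5j]
formal = formal_kl(P_complex, P)
```

For real positive distributions the value is the usual non-negative divergence. For the complex weights above the value is real but negative, so positivity is lost in general and the choice of branch matters.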
Given a choice of a branch of the logarithm, to a character , with a locally compact abelian group, we can associate a group homomorphism .
Given , consider a class with and , and characters . The Kullback–Leibler divergence then is
where is the expectation value with respect to .
Similarly, with , we have
We write the zeta function as
so that, if we formally regard as above
as our “probability distribution”, we have