1.1 Commutative and noncommutative grammars
The best-known approach to categorial grammars is based on noncommutative variants of linear logic, most notably on the Lambek calculus and its variations and extensions, such as the non-associative Lambek calculus, various mixed multimodal systems, the displacement calculus, etc.
Abstract categorial grammars (ACG), as well as their close relatives λ-grammars or linear grammars, arise from an alternative or, rather, complementary approach, based on the ordinary implicational linear logic and linear λ-calculus. They can be called “commutative” in contrast to the “noncommutative” Lambek grammars. Commutative grammars are remarkably flexible and expressive (sometimes even too expressive for effective parsing). On the other hand, they are also remarkably simple because of the much more familiar and intuitive underlying logic.
However, as far as natural language modeling is concerned, it turns out that, in many situations, commutative grammars, such as ACG, behave very poorly compared to noncommutative variants. In fact, it has even been argued that ACG are descriptively inadequate. Simple and striking instances of this inadequacy arise, for example, when linguistic coordination is considered.
The reason is that, as an analysis shows, “commutative” types of ACG and its relatives are too coarse to distinguish actual linguistic categories. Thus, if we want to model important linguistic phenomena in the commutative setting, we need somehow to enrich the formalism with a finer structure of subtypes corresponding to the “noncommutative” types of Lambek calculus.
One proposed solution to this problem adds an explicit subtyping mechanism to the system. Unfortunately, this results, as it seems to us, in a considerable complication of the formalism. Another proposed direction is simply to enrich a commutative system with explicit noncommutative constructions. (This suggests a comparison with the long-known Abrusci-Ruet logic.) Hybrid type logical categorial grammars (HTLCG) have three kinds of implication on the level of types (one commutative implication of linear logic and two noncommutative slashes of Lambek calculus) and two kinds of application on the level of terms: the usual application of λ-terms and an additional operation of concatenation. Both approaches have led to at least partially successful developments. An apparent drawback, as it seems to us, is that the attractive simplicity of ACG gets somewhat lost.
An interesting perspective comes from considering first order logic. It turns out that different grammatical formalisms, including Lambek grammars, ACG and HTLCG, can be faithfully represented as fragments of first order multiplicative intuitionistic linear logic (MILL1). This suggests another approach to combining commutative and noncommutative features and provides some common ground on which different systems can be compared.
1.2 Content of this work
Tensor grammars of this work are an elaboration of the so called linear logic grammars (LLG) introduced in .
LLG are another example of commutative grammars, based on the classical, rather than intuitionistic, multiplicative linear logic (MLL). They were defined in terms of certain bipartite graphs (generalizing MLL proof-nets) with string-labeled edges.
LLG (as well as tensor grammars of this work) can be seen as a surface representation of ACG. Derivations of ACG translate to derivations of tensor grammars and this translation is isomorphic on the level of string languages (as well as tree languages). On the logical side, this is simply a reflection of the fact that implicational linear logic is a conservative fragment of classical MLL and linear λ-terms can be represented as proof-nets. An advantage of this representation, as it seems to us, is that the syntax becomes extremely simple and a direct geometric meaning is transparent.
In this work we introduce an in-line notation for edge-labeled graphs of LLG and reformulate the system in these new terms. Then we address the problem of encoding noncommutative operations of Lambek calculus in our setting. This turns out possible after enriching the system with new unary operators, which results in extended tensor grammars.
1.2.1 Tensor grammars
Tensor terms are, basically, tuples of words (written multiplicatively, as products) with labeled endpoints. We write words in square brackets and represent the endpoints as lower and upper indices, lower indices standing for left endpoints, and upper indices, for right ones. Thus, tensor terms can have the form
(where stands for the empty word) and so on. The index notation is taken directly from usual tensor algebra.
An index in a term can be repeated at most twice, once as an upper, and once as a lower one. A repeated index means that the corresponding words are concatenated along matching endpoints. For example, we have the term equality
(the product is commutative).
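As a concrete illustration of this convention (the letters a, b and the index names i, j, k are chosen here only for the example), a repeated index j glues the right endpoint of one word to the left endpoint of another, and the order of the factors does not matter:

```latex
\[
[a]_i^j \cdot [b]_j^k \;=\; [ab]_i^k \;=\; [b]_j^k \cdot [a]_i^j .
\]
```

Only the indices i and k remain available for further gluing.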
Tensor term calculus (TTC) equips tensor terms with types. Tensor types are MLL formulas decorated with indices, which should match indices in corresponding terms, with the convention that upper indices in types match lower indices in terms and vice versa. A tensor typing judgement looks, for example, as the following:
Typing rules, of course, are rules of MLL decorated with terms and indices.
A tensor grammar is defined then by a set of lexical axioms, which are tensor typing judgements as above, and the sentence type with exactly one upper and one lower index. Tensor terms of the sentence type are single words and they constitute the language of the grammar.
Of course, this is an oversimplified and inaccurate sketch rather than a consistent formalism. But we think that it is easy to believe that technical details can be worked out, as well as that the resulting formalism indeed provides a representation for ACG. Unfortunately, it does not allow a simple representation of the noncommutative operations of Lambek calculus.
Before approaching noncommutativity we note that indices in our formalism can be thought of as first order variables, which suggests a comparison with MILL1. The system of MILL1 has an extra degree of freedom because of binding operators in formulas, i.e. quantifiers. It is thanks to quantifiers that representation of noncommutative systems becomes possible.
This suggests that we need to extend tensor grammars with index binding operators on types.
1.2.2 Extended tensor types
Tensor representation makes very transparent how the “non-commutative” types of Lambek calculus look inside “coarse” implicational types of commutative grammars.
Atomic types of Lambek calculus correspond to tensor types with one upper and one lower index. If and are two such types, then, by the standard definition of linear implication in MLL, the implicational type is the type (where we denote linear negation as a bar). Thus, elements of the type are tensor terms with four indices.
We can single out two important subtypes of the tensor type . The first subtype consists of terms of the form
and the second one, of terms of the form
It is easily computed that terms of form (1) act on elements of by multiplication (concatenation) on the left, and elements of form (2), by multiplication on the right. Indeed, for a term of type , we have
Following this insight, we emulate noncommutative implications by means of a new type constructor , which binds indices in types.
If is a tensor type with a free upper index and lower index , we define the new type , in which the indices and are no longer free. The rule for introducing is
and noncommutative implications are encoded as
In fact, since we work in the setting of classical linear logic, we also have to introduce the dual operator of . Remarkably, this second binding operator serves to encode the product of Lambek calculus.
Along these lines we develop extended tensor type calculus (ETTC) of tensor terms and define extended tensor grammars. Both ACG and Lambek grammars embed in ETTC. We would expect that HTLCG do embed as well, but this requires a proof.
Again, it seems to us that the formalism of ETTC is rather simple and intuitive, and this makes the work interesting.
1.2.3 Background and organization
2 Tensor terms
2.1 Term expressions
Throughout the paper we assume that we are given an infinite set of indices. They will be used in all syntactic objects (terms, types, typing judgements) that we consider.
Now let be an alphabet of terminal symbols or, simply, terminals.
We will build tensor term expressions from terminal symbols, using elements of as upper and lower indices. The set of upper, respectively lower, indices occurring in a term expression will be denoted as , respectively . The set of all indices of will be denoted as .
Tensor terms will be defined as tensor term expressions quotiented by an appropriate equivalence.
Term expressions are defined by induction.
If and , then is a term expression;
if then is a term expression;
if are term expressions with
then is a term expression.
The multiplication symbol (dot) will usually be omitted, i.e., we will write for .
The definition implies that an index can occur in a term expression at most twice: once as an upper one and once as a lower one. We say that an index occurring in a term expression is free in , if it occurs in once. Otherwise we say that the index is bound.
We denote the set of free upper, respectively lower indices of as , respectively . We denote the set of all free indices of as . We say that a term expression is normal if it has no bound indices.
We say that a term expression is closed if it has no occurrence of a terminal symbol.
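The notions of free and bound indices can be sketched in a few lines of Python. The representation below (a term expression as a list of `Factor` records, with `None` marking a factor that carries no indices) and all function names are assumptions made for this illustration, not notation from the text.

```python
from typing import NamedTuple, Optional


class Factor(NamedTuple):
    """One elementary factor: a word with a lower (left) and an upper (right) index."""
    word: tuple            # sequence of terminal symbols
    lower: Optional[str]   # None for a factor without indices
    upper: Optional[str]   # None for a factor without indices


def index_lists(expr):
    """Return the lists of lower and upper indices occurring in the expression."""
    lowers = [f.lower for f in expr if f.lower is not None]
    uppers = [f.upper for f in expr if f.upper is not None]
    return lowers, uppers


def is_term_expression(expr):
    """An index may occur at most once as a lower and at most once as an upper one."""
    lowers, uppers = index_lists(expr)
    return len(set(lowers)) == len(lowers) and len(set(uppers)) == len(uppers)


def free_and_bound(expr):
    """Return (free upper indices, free lower indices, bound indices)."""
    lowers, uppers = index_lists(expr)
    bound = set(lowers) & set(uppers)
    return set(uppers) - bound, set(lowers) - bound, bound


def is_normal(expr):
    """A term expression is normal if it has no bound indices."""
    return not free_and_bound(expr)[2]
```

For the two-factor expression with words a (from i to j) and b (from j to k), the index j is bound, while i is a free lower index and k a free upper one.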
An elementary regular term expression is an expression of the form , where . An elementary singular term expression is an expression of the form or . Singular term expressions should be considered pathological (in the context of this work); we need them for consistency of definitions. We will discuss their meaning shortly.
We define congruence of term expressions as the smallest equivalence relation satisfying the conditions
Congruence has a simple geometric meaning, which we will discuss shortly.
Tensor terms (over a given alphabet of terminal symbols) are defined as equivalence classes for congruence of term expressions.
Multiplication of terms is associative, therefore we will usually omit brackets in term expressions in the sequel.
The sets of free upper and lower indices of a term expression are easily seen to be invariant under congruence. Thus they are well-defined for terms as well. For a term we write , for the sets of free upper, respectively lower, indices of and .
A crucial role will be played by the following closed constants (Kronecker deltas), familiar from linear algebra:
where stands for the empty word.
We have the relations
and these imply the following property.
Let be a tensor term expression, and
Let the term expression be obtained from by replacing the indices
Evidently, writing long sequences of indices as in the above proposition would be rather cumbersome. In the sequel, we adopt the convention that capital Latin letters stand for sequences of indices and small Latin letters, for individual indices.
2.2.1 Geometric representation and normalization
Regular tensor term expressions and terms have a simple geometric representation as bipartite graphs whose vertices are labeled with indices and edges, with words, and direction of edges is from lower indices to upper ones.
Thus, an elementary term expression , , corresponds to a single edge.
The product of two terms without common indices is represented as the disjoint union of the corresponding graphs; the term is represented as two edges.
A term with bound indices corresponds to a graph obtained by gluing edges along matching vertices. Thus, the term is represented as the following.
Obviously, congruent term expressions have the same geometric representation.
As for singular terms, they arise when edges are glued cyclically, for example, when the endpoints of the same edge are glued together as in , which is the same as .
Then a singular term , where , should be represented as a closed loop labeled with the cyclically ordered sequence . The ordering is cyclic, because there is no way to say which letter is first. This is no longer a graph (because there are no vertices), but it has an obvious geometric meaning (it is a topological space, even a manifold).
In general, a tensor term can be represented as a finite set of word-labeled edges with index-labeled vertices and a finite multiset of closed loops labeled with cyclically ordered words. These geometric objects were introduced in  under the name of word cobordisms or cowordisms for short, and linear logic grammars (LLG) were formulated directly in the geometric language. Tensor grammars of this work are a reformulation and an elaboration of constructions of .
The geometric representation makes especially obvious that any term expression is congruent to a normal one (which is unique up to associativity and commutativity of multiplication).
On the other hand, any normal term expression is the product of elementary regular and elementary singular term expressions. We say that is regular, if are regular. Otherwise we say that is singular.
We say that a tensor term is regular, respectively singular, if it is the congruence class of a regular, respectively, singular term expression.
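The geometric picture suggests a direct normalization procedure: follow glued edges from each free lower index until a free upper index is reached, and collect whatever remains as closed loops. The sketch below is one possible reading of this, under several assumptions of ours: factors are triples `(word, lower, upper)`, an index-free factor is written with `None` for both indices, a word is any sequence of terminals, and a cyclic word is represented by its lexicographically least rotation. The function names are hypothetical.

```python
def canonical_rotation(word):
    """Represent a cyclic word by its lexicographically least rotation."""
    if not word:
        return word
    return min(word[k:] + word[:k] for k in range(len(word)))


def normalize(expr):
    """Return (regular edges, loops) for a list of factors (word, lower, upper).

    Regular edges are triples (word, lower, upper) with both indices free;
    loops are cyclic words given by a canonical rotation.
    """
    edges = {}   # lower index -> (word, upper index)
    loops = []
    for word, lower, upper in expr:
        if lower is None and upper is None:
            loops.append(canonical_rotation(tuple(word)))
        else:
            edges[lower] = (tuple(word), upper)

    uppers = {upper for _, upper in edges.values()}
    regular = []
    # An open chain starts at a lower index that no upper index is glued to.
    for start in [low for low in edges if low not in uppers]:
        word, end = (), start
        while end in edges:
            piece, end = edges.pop(end)
            word += piece
        regular.append((word, start, end))

    # The remaining edges are glued cyclically and give closed loops.
    while edges:
        word, end = (), next(iter(edges))
        while end in edges:
            piece, end = edges.pop(end)
            word += piece
        loops.append(canonical_rotation(word))

    return sorted(regular), sorted(loops)
```

Since the result is sorted and loops are canonically rotated, two expressions that differ only by the order of factors or by the starting point of a loop produce the same output, which matches the expectation that congruent expressions share a geometric representation.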
3 Tensor type calculus
3.1 Tensor types
Our goal is to assign types to terms. Our types will be formulas of multiplicative linear logic (MLL), decorated with indices intended to match free indices of terms.
We assume that we are given a set of positive atomic type symbols, and every element is assigned a valency .
The set of atomic type symbols is defined as , where . Elements of , negative atomic type symbols, are assigned valencies by the rule: if and , then .
Types, similarly to terms, will have upper and lower indices. Accordingly, we will denote the set of upper, respectively lower indices occurring in a type as , respectively and the set of all indices occurring in as .
In a tensor type, each index occurs at most once, so there are no bound indices. However, later we will add more constructors, which allow binding indices in types as well. Therefore we will explicitly use the adjective “free” for type indices right from the start and use the tautological notation , , .
Given a set of positive atomic type symbols together with the valency function , the set of tensor types over is defined by the following rules.
If with and are pairwise distinct elements of then is a positive atomic type;
if with and are pairwise distinct elements of then is a negative atomic type;
if , are types with then , are types.
A tensor type symbol (over ) is a tensor type (over ) with all indices erased. We denote the set of tensor type symbols over as .
Type valency extends from atomic type symbols to all tensor type symbols in the obvious way:
where addition of elements of is defined componentwise: .
Tensor type symbols are formulas of MLL equipped with valencies. In order to obtain a type from a type symbol it is sufficient to specify the ordered sets of free upper and lower indices. Accordingly, we will use the notation to denote a tensor type whose symbol is , and whose ordered sets of free upper and lower indices are and respectively.
The dual type symbol of a tensor type symbol , is defined by induction, as in MLL.
The base case is , which has already been discussed. The induction steps are
The linear implication type symbol is defined by
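The definitions of valency, duality and implication on type symbols can be sketched as a small Python model. Everything named here is an assumption for illustration: the class names, the sample valency table (including the atom `x`), and the choice to keep the order of operands when dualizing a tensor or a par. Valencies are treated simply as pairs that are swapped by negation and added componentwise.

```python
from dataclasses import dataclass

# Hypothetical valencies of positive atomic type symbols.
VALENCY = {"np": (1, 1), "s": (1, 1), "x": (2, 1)}


@dataclass(frozen=True)
class Atom:
    name: str
    positive: bool = True


@dataclass(frozen=True)
class Tensor:
    left: object
    right: object


@dataclass(frozen=True)
class Par:
    left: object
    right: object


def valency(symbol):
    """Valency of a type symbol; negation swaps the two components."""
    if isinstance(symbol, Atom):
        n, m = VALENCY[symbol.name]
        return (n, m) if symbol.positive else (m, n)
    left, right = valency(symbol.left), valency(symbol.right)
    return (left[0] + right[0], left[1] + right[1])  # componentwise sum


def dual(symbol):
    """De Morgan dual; operand order is kept, which is an assumption of this sketch."""
    if isinstance(symbol, Atom):
        return Atom(symbol.name, not symbol.positive)
    if isinstance(symbol, Tensor):
        return Par(dual(symbol.left), dual(symbol.right))
    return Tensor(dual(symbol.left), dual(symbol.right))


def implies(a, b):
    """Linear implication as the par of the dual of a with b."""
    return Par(dual(a), b)
```

With these sample valencies, the implication from `np` to `s` has valency (2, 2), in line with the remark that elements of an implicational type between two string types are terms with four indices.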
3.2 Typing judgements
A tensor sequent (over a set ) is a finite multiset of tensor type symbols (over ). A tensor type sequent (over a set ) is a finite set of tensor types (over ) whose elements have no common free indices. If is a tensor sequent, then the notation stands for the sequent .
For a tensor type sequent we define the sets of free upper and lower indices of by
A syntactic tensor typing judgement (over sets and ) is a pair , denoted as , where is a tensor type sequent (over ), and is a term expression (over ), such that , .
We often will use the notation for typing judgements, implying that is a tensor sequent and is obtained as a decoration of with some indices:
We define congruence of syntactic typing judgements as the smallest equivalence relation on syntactic tensor typing judgements satisfying the conditions:
if , are syntactic typing judgements, , , and , respectively , are obtained by replacing the index in , respectively , with , then (α-conversion);
if , are syntactic typing judgements, and , then .
We define a tensor typing judgement as an equivalence class for congruence of syntactic tensor typing judgements.
We say that a typing judgement is regular if the term is regular.
3.3 Typing rules
Typing judgements are derived by the following rules of Tensor type calculus (TTC).
It is implicit in the rules above that all typing judgements are well defined, i.e. there are no index collisions. For example, syntactic representatives for the two premises of the (⊗) rule, respectively the Cut rule, must be chosen to have no common indices.
The rules of TTC are those of multiplicative linear logic (MLL), decorated with indices and terms. One might observe that the syntax of tensor typing judgements is, basically, an in-line notation for MLL proof-nets.
In MLL it is understood that commas in sequents are, essentially, erased ⅋-connectives, and the ⅋-connective itself is a symmetrized implication as in (5). We collect a couple of straightforward observations from MLL, for further reference.
A typing judgement is derivable in TTC iff the typing judgement is derivable.
If typing judgements
are derivable in TTC , then the typing judgement
is derivable as well.
Any typing judgement derivable in TTC can be derived without the Cut rule.
Proof in the Appendix.
Any typing judgement derivable in TTC is regular.
Proof by induction on a cut-free derivation.
3.4 Tensor signatures
Let be a finite multiset of typing judgements (over some sets and ).
Choose syntactic typing judgements
representing elements of so that different elements have no indices in common.
The following is straightforward.
In the setting as above, for any typing judgement derivable from elements of using each element exactly once, there exists a syntactic representation such that
where is a closed term.
Lemma 2 (“Deduction theorem”)
Let be a finite multiset of typing judgements.
of elements of so that different elements have no indices in common.
Let be a closed term expression.
The typing judgement
is derivable in TTC from elements of using each element exactly once iff the typing judgement
is derivable in TTC.
For the opposite direction use induction on derivation.
A tensor signature is a tuple ), where is a set of positive atomic type symbols, is an alphabet of terminal symbols and is a set of typing judgements over and , called axioms of .
A typing judgement is derivable in if it is derivable in TTC from the axioms of .
A tensor grammar is a tensor signature , where and are finite, together with a positive atomic type symbol of valency (1,1).
We say that generates a word if the typing judgement
is derivable in .
The language generated by (language of ) is the set of all words generated by .
4 Examples and inadequacy
The formalism of tensor grammars can be seen as a surface representation of abstract categorial grammars (ACG), which will be discussed in the next section. Derivations of ACG translate to derivations of tensor grammars and this translation is isomorphic on the level of string languages (and on the level of tree languages as well).
In this section we discuss toy examples, adapted, in fact, from ACG, and then analyse the notorious inadequacy  of commutative grammars (such as ACG), which becomes very transparent in tensor representation.
Consider the alphabet of atomic type symbols with .
Let the terminal alphabet be
and consider the tensor grammar defined by the following axioms
(If we agree that commas in a sequent are “hypocrisies” for and is a symmetrized implication, then the type of “loves” is, of course, .)
With this we can derive the typing judgement
and then, in a similar way,
which, after a straightforward computation, yields the standard sentence “John loves Mary”.
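The computation at the level of terms can be sketched as follows. Assume, purely for illustration, lexical terms $[\mathrm{John}]_x^y$ and $[\mathrm{Mary}]_z^t$ and a verb term made of three elementary factors, only one of which is nonempty; the index names are hypothetical, and the Cut rule is taken to identify the matching indices. Contracting the repeated indices then gives a single word:

```latex
\[
[\epsilon]_l^x \cdot [\mathrm{John}]_x^y \cdot [\mathrm{loves}]_y^z
\cdot [\mathrm{Mary}]_z^t \cdot [\epsilon]_t^r
\;=\; [\mathrm{John\ loves\ Mary}]_l^r .
\]
```

The two empty factors only rename endpoints, so the result has exactly one free lower and one free upper index, as required of a term of the sentence type.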
For a more elaborate example with medial extraction let us add terminal symbols
After a straightforward computation we derive the typing judgements
and, finally, get the sentence
It can be observed that representation of data in tensor grammars is rather non-economical.
For example, if we encode a transitive verb in a term of type , as is customary in categorial grammars, then we need to keep in memory three strings (elementary terms), although only one of them is nonempty. This contrasts with Lambek grammars, where English transitive verb, for example, is customarily represented as a single string of type . Of course, the contrast becomes even more striking when we consider more complicated grammatical categories.
The reason is that in tensor grammars (as well as in ACG) all information about positioning words in a sentence is contained in the corresponding term, while in Lambek grammars this information is stored in the corresponding type, once for all type elements. Apparently it would be desirable to have some finer structure on tensor types allowing similar economy.
These considerations become even more relevant when we consider complex linguistic phenomena, such as coordination. It was very convincingly explained in  that, in contrast to Lambek grammars, ACG are in a certain sense inadequate for modeling (non-constituent) coordination, at least, in the most direct, “naive” approach. This analysis applies to tensor grammars equally well.
In fact, tensor representation makes this “inadequacy” very transparent. Consider, as a very simple example, the sentence
|John loves and Jim hates Mary.|
The elements that are coordinated are the strings “John loves” and “Jim hates”, which are customarily modeled as terms of type . It follows that we need a coordinating operator of type
But the tensor type
is not elementary, i.e. its elements are not strings. Rather, they are ordered pairs of strings. Coordinating them means gluing two pairs of strings into one pair, and there are too many ways of doing this… But apparently none of these ways corresponds to actual coordination occurring in the language.
Staying with the toy grammar of the preceding subsection, we can generate at least three different kinds of terms in the type , namely:
Obviously, all three terms have direct linguistic meaning. At the same time they cannot be coordinated with each other.
On the other hand, we can note that terms of the first kind correspond to the Lambek grammar type and can be coordinated with each other. Similarly, the second kind corresponds to the type and its elements can be coordinated with each other equally well. As for the third kind, its elements cannot be represented as strings and cannot be coordinated in a simple way.
This simple analysis shows once again that the structure of tensor types (or linear implicational types in the case of ACG) is too coarse, at least for simple intuitive modeling of non-constituent coordination. It seems clear that we need type constructors capable of emulating Lambek calculus.
4.3 Towards Lambek types
The tensor representation makes very transparent how the “non-commutative” types of Lambek calculus look inside “commutative” types of tensor grammars and ACG.
Indeed, let , be types of Lambek grammar. Their elements are strings, so they can be emulated as tensor types of valency (1,1), say and . Then elements of the complex Lambek grammar types and can be represented as elements of the tensor type of the form, respectively,
It is easily computed that elements of the first form act (by means of the Cut rule) on elements of by multiplication (concatenation) on the left, and elements of the second form, by multiplication on the right.
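This computation can be sketched explicitly. Write $\delta_a^b = [\epsilon]_a^b$ and assume, for illustration only, that the two forms are $[w]_i^k \cdot \delta_l^j$ and $\delta_i^k \cdot [w]_l^j$; the index names and their placement are assumptions of this sketch and need not coincide with those of (8). For a term $[u]_k^l$ we get:

```latex
\begin{align*}
[w]_i^k \cdot \delta_l^j \cdot [u]_k^l
  &= [w]_i^k \cdot [u]_k^l \cdot [\epsilon]_l^j = [wu]_i^j, \\
\delta_i^k \cdot [w]_l^j \cdot [u]_k^l
  &= [\epsilon]_i^k \cdot [u]_k^l \cdot [w]_l^j = [uw]_i^j.
\end{align*}
```

In both cases the indices k and l are contracted away, and the fixed word w ends up on the left, respectively on the right, of the argument u.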
The two formats in (8) identify two subtypes of the implicational tensor type that correspond to two implicational types of Lambek calculus. Note that, if we restrict to either of these subtypes, then only two of the four available type indices become relevant, while two other indices are automatically connected with an empty string (Kronecker delta). This suggests that the new type constructors for emulating Lambek calculus should bind indices in types.
In the next section we accurately discuss tensor representation of ACG. After that we develop extended tensor type calculus for embedding Lambek grammars basing on the above considerations.
5 Representing abstract categorial grammars
In this section we assume that the reader is familiar with basic notions of λ-calculus, see  for a reference.
5.1 Linear λ-calculus
Given a set of variables and a set of constants, with , the set of linear λ-terms is defined by the following.
Any is in ;
if are linear λ-terms whose sets of free variables are disjoint then ;
if , and occurs freely in exactly once then .
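The three clauses can be checked mechanically. The following Python sketch uses a hypothetical encoding of terms (`Var`, `Const`, `App`, `Lam`) and a hypothetical function name; it returns the set of free variables of a term when the term is linear in the above sense and `None` otherwise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Lam:
    var: str
    body: object


def linear_free_vars(term):
    """Return the free variables of a linear term, or None if the term is not linear."""
    if isinstance(term, Var):
        return frozenset({term.name})
    if isinstance(term, Const):
        return frozenset()
    if isinstance(term, App):
        fun, arg = linear_free_vars(term.fun), linear_free_vars(term.arg)
        # Application requires both parts linear with disjoint free variables.
        if fun is None or arg is None or fun & arg:
            return None
        return fun | arg
    if isinstance(term, Lam):
        body = linear_free_vars(term.body)
        # Disjointness already rules out repeated occurrences, so
        # "exactly once" reduces to membership in the free variables.
        if body is None or term.var not in body:
            return None
        return body - {term.var}
    raise TypeError(f"not a term: {term!r}")
```

For instance, the identity abstraction is accepted, while an abstraction whose bound variable is duplicated or discarded in the body is rejected.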
We use common notational conventions such as omitting dots and outermost brackets and writing iterated applications as
Given a set of propositional symbols or atomic types, the set of linear implicational types over is defined by induction.
Any is in ;
if , then .
Since we are not going to discuss any non-linear fragment, the title “linear” in the context of λ-calculus will usually be omitted.
A λ-typing assumption is an expression of the form , where and . A (linear) λ-typing context is a finite set
of typing assumptions, where are pairwise distinct.
Writing for the sequence of types in (10), and
for the vector of variables, we denote typing context (10) as .
A (linear) λ-typing judgement is a sequent of the form , where , , and is a typing context.
λ-typing judgements are derived from the following type inference rules.
Let be a finite multiset of λ-typing judgements.
A typing judgement is derivable in from elements of using each element exactly once iff there exists a term and a typing judgement of the form
derivable in linear λ-calculus such that
Proof By induction on derivation.
A λ-signature is a triple , where is a finite set of atomic types, is a finite set of constants and is a function assigning to each constant a linear implicational type .
Typing judgements of the form
where , are called signature axioms of .
Given a λ-signature , we say that a typing judgement is derivable in if it is derivable from axioms of by rules of linear λ-calculus. We write in this case .
The following is standard.
For any λ-signature , the set of derivable typing judgements is closed under the substitution rule
Proof by induction on derivation.
5.2 Translating λ-signatures
Let be a λ-signature, and assume that every element is assigned a valency .
Valency defines an embedding
of implicational types to tensor type symbols by the following induction:
By abuse of notation, in the following we will denote an implicational type and the corresponding tensor type symbol the same.
Now let a finite alphabet of terminal symbols be given, and assume that each axiom of is assigned a tensor typing judgement over and , its tensor translation.
This defines a tensor signature , the tensor translation of .
We are going to extend tensor translation from axioms to all λ-typing judgements derivable in .
In order to avoid cumbersome notation, we will omit indices from tensor types and write only tensor type symbols, unless this leads to a confusion.
We translate a λ-typing judgement of form derivable in to a tensor typing judgement of the form , where is some tensor term, by the following induction on derivation.
The axiom translates as .
Let a λ-typing judgement derivable in have the form
Then it was obtained by the rule.
By the induction hypothesis, the premise
translates as a tensor typing judgement of the form .
Translate (13) as , which is derivable from the above by the rule.
Let the λ-typing judgement
be obtained from the premises
by the rule.
By the induction hypothesis, the premises translate as tensor typing judgements of the form, respectively
5.2.1 Preservation of -equivalence
Let be a λ-signature translated to a tensor signature as above.
Assume that λ-typing judgements
derivable in are translated, respectively, as tensor typing judgements
Proof in the Appendix.
Let be a λ-signature translated to a tensor signature .
Assume that λ-typing judgements
are derivable in , and .
Then tensor translations of typing judgements in (19) coincide.
Proof in the Appendix.
Let be a λ-signature translated to a tensor signature .
If λ-typing judgements (19) are derivable in , and , then their tensor translations coincide.
Proof in the Appendix.
In the setting as above, if λ-typing judgements (19) are derivable in , and , then their tensor translations coincide.
Assume that we have a translation of implicational types to tensor types as previously.
Let be implicational types, and let be a tensor term.
Then the tensor typing judgement