This short note describes the chart parser for multimodal categorial grammars developed in conjunction with the type-logical treebank for French, which is described in more detail in [moot10semi, moot12spatio, moot15tlgbank] and which is available at [tlgbank]. The chart parser itself can be downloaded as part of Grail light [graillight].
The chart parser is an instance of the “deductive parsing” technology of [dedpar], and the core parsing engine of their implementation has been retained in the source code with only minor modifications. I am grateful to the authors for having made their source code available.
The current chart parser was originally introduced as a preprocessing step for a proof net algorithm [moot17grail]. However, this preprocessing step turned out to be so effective that it soon handled a bit under 98% of the complete French Type-Logical Treebank, and it therefore made sense to add additional chart rules to handle the remaining few percent as well (these are briefly sketched in Section 2.4; the rest of this paper focuses on the basic rules of the chart parser).
This paper presupposes the reader has at least a basic familiarity with multimodal categorial grammars [m10sep, Moo11, mr12lcg] and with chart parsing [dedpar].
2 Chart rules
In this section, I will discuss the inference rules used by the chart parser. I will start with the simplest rules and gradually introduce more detail.
2.1 AB rules
The elimination rules for / and \ already appear in [dedpar]. For AB grammars, the chart rules are very simple; they are shown in Figure 1. Chart items are tuples ⟨Γ, C, i, j⟩, where Γ is an antecedent term, C is a formula, and i and j are integers representing the leftmost and rightmost string positions respectively. The meaning of such a tuple is that we have derived the formula C, using the hypotheses in Γ, spanning exactly the positions from i on the left to j on the right. (The use of pairs of string positions to represent substrings of an input string is standard in parsing algorithms; see for example [PS87, dedpar, jm].)
With this in mind, the chart rule for \ indicates that if we have derived a formula A spanning string positions i, j and a formula A\B spanning string positions j, k (that is, the two are adjacent, with A\B immediately to the right of A), then we can conclude that we can derive a constituent B from positions i to k (that is, over the concatenation of the strings assigned to A and A\B).
Given these rules, proving an AB sequent corresponds to starting from lexical axioms and deriving the goal formula over the full span of the sentence. To facilitate inspection of the chart items, the antecedent will not be a binary tree of formulas, but a binary tree of the corresponding words. Therefore, a lexical entry for the verb “dort” (sleeps) with formula np\s at positions 1–2 will correspond not to the item ⟨np\s, np\s, 1, 2⟩ but to the item ⟨dort, np\s, 1, 2⟩.
As an example, the table below shows how the chart is filled for “Le marché financier de Paris” (the financial market of Paris).
|Item|Antecedent|Formula|Span|Justification|
|1|le|np/n|0–1|Lexicon|
|2|marché|n|1–2|Lexicon|
|3|financier|n\n|2–3|Lexicon|
|4|de|(n\n)/np|3–4|Lexicon|
|5|Paris|np|4–5|Lexicon|
|6|le ∘ marché|np|0–2|From 1,2 by /E|
|7|marché ∘ financier|n|1–3|From 2,3 by \E|
|8|de ∘ Paris|n\n|3–5|From 4,5 by /E|
|9|le ∘ (marché ∘ financier)|np|0–3|From 1,7 by /E|
|10|(marché ∘ financier) ∘ (de ∘ Paris)|n|1–5|From 7,8 by \E|
|11|le ∘ ((marché ∘ financier) ∘ (de ∘ Paris))|np|0–5|From 1,10 by /E|
The chart items are labeled from 1 to 11, indicating the order in which they are entered in the chart. We use a general chart parser of the type explained in [dedpar], so we start with an agenda containing items 1–5 (the lexical lookup for the words in the sentence) and then successively add the items of the agenda to the chart. When we add an item from the agenda to the chart, we compute all consequences, according to the rules of the grammar, of this item with all items already in the chart. So once item 2 is added to the chart, item 6 is added to the agenda, since it is the combination of item 2 with item 1 (already in the chart) by means of the /E rule. Similarly, item 7 is added to the agenda when item 3 is added to the chart, item 8 is added to the agenda when item 5 is added to the chart, and so on.
We complete the parse when item 11 is added to the chart. If desired, we can recover the proof by recursively following the justification of each item, going back from 11 to 1 and 10, from 10 to 7 and 8 (1 is in the lexicon and therefore an axiom of the proof), until we have reached all the axioms, which are justified by their respective lexical entries. Chart items 6 and 9 do not contribute to the proof of 11.
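To make the agenda-driven control loop concrete, here is a minimal AB chart parser in Python parsing the example above. This is purely an illustrative sketch, not the actual implementation (which is a Prolog program): the formula syntax, item layout and function names are my own.

```python
from collections import deque

def split_main(formula):
    """Split a formula at its main connective ('/' or '\\'),
    respecting parentheses; returns (left, op, right) or None."""
    depth = 0
    for k, ch in enumerate(formula):
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
        elif ch in '/\\' and depth == 0:
            return formula[:k], ch, formula[k + 1:]
    return None

def strip_parens(f):
    """Remove one pair of outer parentheses from a complex formula."""
    if f.startswith('(') and f.endswith(')') and split_main(f[1:-1]):
        return f[1:-1]
    return f

def consequences(left, right):
    """Items derivable by /E and \\E with `left` adjacent to `right`."""
    (la, lf, i, j), (ra, rf, j2, k) = left, right
    if j != j2:
        return []                                # spans must be adjacent
    out = []
    s = split_main(lf)
    if s and s[1] == '/' and strip_parens(s[2]) == rf:
        out.append(((la, ra), strip_parens(s[0]), i, k))  # /E: B/A, A => B
    s = split_main(rf)
    if s and s[1] == '\\' and strip_parens(s[0]) == lf:
        out.append(((la, ra), strip_parens(s[2]), i, k))  # \E: A, A\B => B
    return out

def parse(lexical_items, goal_formula, n):
    """Agenda-driven chart parsing: move items from the agenda to the
    chart, computing all consequences with items already in the chart."""
    agenda, chart = deque(lexical_items), []
    while agenda:
        item = agenda.popleft()
        if any(old[1:] == item[1:] for old in chart):
            continue                             # formula/span already known
        chart.append(item)
        for other in chart:
            agenda.extend(consequences(item, other))
            agenda.extend(consequences(other, item))
    return [it for it in chart if it[1:] == (goal_formula, 0, n)]
```

Running the example, `parse` with the five lexical items for “Le marché financier de Paris” and goal formula `np` over span 0–5 produces exactly the final chart item of the table.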
The actual implementation keeps track of several types of additional information: it computes the semantics of the derivation and there is also a mechanism for computing the (log-)probabilities of the rules.
The implementation also uses an important simplification: once we have computed a chart item for a formula C over a given span, we will treat this result as known and reject any further derivations of the same formula over the same span (if probabilities are used, only the most probable derivation of C is kept). This can throw away alternative semantic readings for a phrase, but it reduces the size of the chart. If desired, this behavior can easily be changed by replacing the don’t care variables _ in the predicate subsumes_data by a test for equivalence of the lambda-terms.
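The subsumption check can be sketched as follows in Python. The item layout and the names (`log_prob`, `semantics`) are illustrative assumptions, not the actual subsumes_data predicate; the point is only that a new item is entered when no item with the same formula and span is known, or when its derivation is more probable.

```python
def add_to_chart(chart, formula, span, log_prob, semantics):
    """chart maps (formula, span) to the best (log_prob, semantics)
    found so far. Returns True if the new item was kept."""
    key = (formula, span)
    if key in chart and chart[key][0] >= log_prob:
        return False      # already known with equal or higher probability
    chart[key] = (log_prob, semantics)
    return True
```

Replacing the key-only comparison by a comparison that also inspects `semantics` would correspond to the equivalence test on lambda-terms mentioned above.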
2.2 Hypothetical Reasoning
Hypothetical reasoning is implemented using a strategy very similar to “gap threading” in the parsing literature. Chart items are now of the form ⟨Γ, C, i, j, E⟩, where E is a set of pairs ⟨p, A⟩, with p a position integer and A a formula; E is the set of “extracted” constituents which have been used to compute C. The rules for extraction (hypothetical reasoning) are shown in Figure 2. The /E and \E rules of Figure 1 have been updated to include the new set of extracted items.
The set union of two such sets is defined only if their intersection is empty; this reflects the fact that a hypothesis to be discharged later can only be used once.
The e_start rule states that if we have a formula with rightmost position j and a formula spanning positions to its right, then we can conclude there is a formula spanning those positions depending on an extracted element. The underscores indicate we do not care about this value for the chart item. So for the leftmost premiss of e_start, we do not care about the antecedent, about the leftmost position, or about the set of extractions: the formula functions as a sort of “trigger” allowing extraction of a formula to take place to its right.
The rule e_start has two side conditions: a constraint on the string positions, and the requirement that the extracted element is not already a member of the extraction set (the latter is a general consequence of the disjoint set union used).
The e_start rule amounts to using an axiomatic hypothesis in combination with a previous proof to derive the conclusion “at a distance”, with the condition that the hypothesis must be discharged at the indicated position by the formula which licensed this rule. This discharge is taken care of by the e_end rule.
The e_end rule states that if we have derived a constituent using a hypothetical element to the immediate right of the licensing formula, then we can derive the result spanning the total positions, removing the discharged formula from the set of extracted elements; the notation indicates that the constituent was derived using the hypothetical formula exactly once (plus some additional, possibly empty, set of items).
A chart item is coherent if all of its extracted elements are positioned to its right. This is because the licensing formulas are looking to their right for a constituent missing an extracted element somewhere.
We initialize all lexical entries with the empty set and, at the end of a derivation, we require that the set of traces is empty. That is, our lexical entries now carry the empty extraction set, and our goal is a formula spanning the whole sentence whose extraction set is likewise empty.
Typical instantiations of the formula are (for relativizers) and (for clitics).
The chart rules for extraction/hypothetical reasoning are perhaps the easiest to understand by seeing them in action. We can derive the sentence fragment “qu’on emprunte” (that we borrow) to be of type as follows.
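The gap-threading bookkeeping can be sketched in a few lines of Python. The frozensets of (position, formula) pairs stand in for the set E of extracted constituents; the function names and the simplified rule shapes are my own, illustrative assumptions rather than the actual rules.

```python
def disjoint_union(e1, e2):
    """Union of two extraction sets, defined only when they are
    disjoint: a hypothesis to be discharged can only be used once."""
    if e1 & e2:
        return None           # hypothesis would be used twice: blocked
    return e1 | e2

def e_end(item, hypothesis):
    """Discharge one extraction hypothesis from an item's set,
    as in the e_end rule."""
    antecedent, formula, i, j, extractions = item
    if hypothesis not in extractions:
        return None
    return (antecedent, formula, i, j, extractions - {hypothesis})
```

At the end of a derivation we then simply check that the goal item's extraction set is empty.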
Incompleteness of the rules
As can be seen from the rules, they are incomplete. The extraction start rule can apply only to formulas with a fixed combination of implications (excluding other orderings of the implications) and only when the extracted formula is an argument, since the extraction start rule is essentially the elimination rule applied to a hypothesis “at a distance”. Another restriction is that each combination of rightmost position and extracted formula can introduce only one hypothetical item. We would need additional chart rules to treat these other cases. The treatment of gapping, discussed briefly in Section 2.4, allows the extracted element to be the functor of an elimination rule.
Though this restriction on formulas and the resulting incompleteness are unfortunate, since they require us to be careful in case the algorithm does not find a proof, the rule captures most of the occurrences of mixed associativity/commutativity rather nicely.
The actual implementation also keeps track of the rightmost position used for the e_start rule. The set of extracted items then consists of triples, recording both the rightmost position of the licensor formula and the rightmost position of the extracted formula. This allows us to use a single rule schema both for a combination of mixed associativity and mixed commutativity (the rules shown) and for formulas which only allow mixed associativity (or “right-node raising”). The e_end rule in the latter case requires that the rightmost position of the constituent is also the rightmost position of the extracted formula. This right-node raising analysis also has a rule for the corresponding formulas and can therefore treat lexical formulas such as a transitive verb conjunction type, which allows combinations such as the following.
This is useful for patterns like “has read and might implement Dijkstra’s algorithm”, where both “has read” and “might implement” require the derivation pattern shown above.
2.3 Head wrap
French adverbs can occur at the start of the sentence, at the end of the sentence, and before the verb, where we can assign them the formulas s/s, s\s and (np\s)/(np\s) respectively. (We have chosen an event semantics in the style of Davidson for adverbs, which means that we can treat many adverbs as sentence modifiers. Some subject-oriented adverbs, such as “ensemble” (together), need both the subject and the sentence for their semantics and are assigned different formulas instead.) In addition, French adverbs can occur directly after the verb but also between a verb and its arguments. In order to avoid unnecessary duplication in the lexicon, we assign such adverbs a single wrapping type and use structural rules to move the verb to a sentence-final position.
In Figure 3 we see how this idea translates into chart rules. In addition to the set of extracted items, our chart items now contain a stack of head-wrapped elements. We have chosen a stack instead of a set here to avoid generating readings which would correspond to permutations of the adverbs: with few exceptions, adverbs take scope from left to right. In the chart rules, one piece of notation denotes stack concatenation, another a stack with a given first element and a rest (which is itself a valid stack), and another the empty stack. We both start and end our proof with empty stacks and empty sets of traces. That is, our lexical entries carry an empty stack and an empty set of extractions, and so does the goal.
The wr rule wraps a chart entry with an adverb formula into its correct syntactic position, but also pushes it onto the stack. As can be seen from the rule, the existing stack is prefixed to this new stack, thereby keeping all the stack elements in the desired order: the elements of the first stack before the new item and the elements of the second after it.
Finally, the wpop rule simply allows us to pop a stack element whenever the current chart item containing the stack is of type .
The wrapping rules are best illustrated by example. The sentence “il occupera ensuite diverses fonctions” (he will occupy various functions afterwards) is analysed as follows.
The parse first combines the transitive verb “occupera” (will occupy, chart item 2) with the adverb “ensuite” (afterwards, chart item 3) by pushing the adverb on the stack and combining the lexical strings, producing chart item 6. We continue the proof with elimination rules until we derive a sentence from positions 0 to 5, but with the adverb still on the stack. Since the item and the adverb match the formulas of a wpop rule, we pop the adverb from the stack and produce the final item 10.
The example below shows the interaction of the head wrap and the extraction rules.
Using chart items 2 and 8 above, we could have applied the rule to produce a chart item which would otherwise be identical to item 9. Therefore, according to the implementation note discussed at the end of Section 2.1, this entry is treated as “already known” and not entered in the chart. Other chart items have multiple equivalent derivations (even with identical antecedent terms): for example, as shown in the table above, chart item 8 has been derived from 4 and 5 using wr, but it has an alternative derivation from 1 and 6 using e_start; there are two equivalent ways of applying e_start and wr to the transitive verb to produce chart item 8.
Since the e_end rule requires an empty stack to apply, we cannot apply the e_end rule to chart item 9 and need to pop the stack first using wpop, producing chart item 10, which is the proper configuration for an application of e_end.
The implementation allows us to pop elements from the stack at the infinitive level as well. This allows infinitival arguments to take wrapped adverbs.
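The stack discipline of the wr and wpop rules can be rendered schematically as follows. The item layout and function shapes are again my own illustrative assumptions; in particular, I represent adverb formulas as opaque strings and omit the extraction sets.

```python
def wr(verb_item, adverb_item):
    """Wrap an adverb next to the verb, concatenating the strings and
    pushing the adverb formula onto the stack; stacks are concatenated
    so that left-to-right scope order is preserved."""
    (va, vf, i, j, vstack) = verb_item
    (aa, af, j2, k, astack) = adverb_item
    if j != j2:
        return None                      # items must be adjacent
    return ((va, aa), vf, i, k, vstack + [af] + astack)

def wpop(item):
    """Pop one head-wrapped element; returns (popped, new_item)."""
    antecedent, formula, i, j, stack = item
    if not stack:
        return None
    return stack[0], (antecedent, formula, i, j, stack[1:])
```

For the example sentence, wrapping “ensuite” around the transitive verb “occupera” produces an item covering positions 1–3 with the adverb on the stack, which a later wpop discharges.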
2.4 Other chart rules
In newspaper articles, quoted speech is rather frequent. Most frequently, this takes the form of a tag like “said the Prime Minister”, and this does not necessarily occur at the end of a sentence. To complicate matters, we even find sentences like the following.
. [sl Les conservateurs], a ajouté le premier ministre …, [sr “ne sont pas des
opportunistes qui virevoltent d’une politique à l’autre ]
[sl The Conservatives], has added the Prime Minister …, [sr “ are not opportunists who flip-flop from one policy to another ]
In this sentence the quoted sentence is split into two parts (marked sl and sr), and these two parts together are the arguments of the past participle “ajouté” (added), which is itself the argument of the auxiliary verb form “a” (has) (the elided material “…” includes an adverb modifying the past participle).
As a solution, the additional chart rules treat these combinations much like complex adverbs. For example, we can derive “a ajouté” (has added) to be of the required type as follows.
Gapping includes cases like those shown below.
. Le véhicule pourrait être immobilisé et la carte grise retenue
The car could be immobilised and the registration certificate retained
This sentence can be paraphrased along the lines of “the car could be immobilised and the registration certificate could be retained”: the verb group “pourrait être” (could be) occurs only in the first conjunct syntactically, but semantically it fills the same role in both. This type of sentence is treated along the lines of [cgellipsis], though recast in the framework of [mac10]. The central idea of this analysis is that the verb group is extracted from both conjuncts and then infixed (at the place of the original verb group) in the first.
Some conjunctions have the simplest analysis when we use the product formula. Look, for example, at the following sentence.
.augmenter [np ses fonds propres ] [pp de 90 millions de
francs ] et [np les quasi-fonds propres ] [pp de 30 millions ]
increase [np its equity ] [pp by 90 million francs ] and [np its quasi-equity ] [pp by 30 million ]
Here the verb “augmenter” (to augment) takes both an np and a pp argument.
We can derive these cases by assigning “et” the following formula.
The product introduction rule is easy to add to the chart parser. The implementation is careful to use the product introduction rule only when an adjacent chart item requires a product argument (a naive implementation would conclude a product from any two adjacent chart items).
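The guard on product introduction can be sketched as follows. The product notation and the idea of passing in the set of product arguments sought by adjacent items are illustrative assumptions, not the actual implementation.

```python
def product_intro(left, right, required_products):
    """Build a product from two adjacent chart items, but only when some
    adjacent item actually selects for that product as its argument.
    A naive version would form products from every adjacent pair."""
    (la, lf, i, j), (ra, rf, j2, k) = left, right
    if j != j2:
        return None                      # items must be adjacent
    product = "(" + lf + "*" + rf + ")"
    if product not in required_products:
        return None                      # nothing wants this product
    return ((la, ra), product, i, k)
```

This keeps the chart small: products are only hypothesized on demand, when a formula like the conjunction type above asks for one.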
The elimination rules are more delicate and involve patterns such as the following (these are easy to show valid using associativity).
Together, these allow us to combine with as follows.
Very rarely, for a total of nine times in the entire corpus, we need left-node raising, the symmetric operation of right-node raising. In the example below, we have a conjunction of two combinations of two noun post-modifiers: “français Aérospatiale” and “italien Alenia”.
…. des groupes français Aérospatiale et
italien Alenia …
… of the groups French Aérospatiale and Italian Alenia …
… of the French group Aérospatiale and the Italian (group) Alenia …
By analysing “et” (and) as we can use the derivability of as follows.
Final implementation notes
Since the final chart parser has many inference rules which apply only in specific situations (essentially all rules except the basic AB rules), and since the chart parser has a fair amount of overhead trying (and failing) to match each of these rules, there is a separate mechanism which verifies whether the formulas contain any patterns which trigger rules beyond the AB rules and, if so, activates all potentially useful rules. Therefore, the product rules are only active if there is a formula containing a product, the wrapping rules only if there is a wrapping formula, and so on.
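This activation mechanism can be sketched as a simple pre-scan over the lexical formulas. The trigger characters below (`*` for a product, `^` for a wrapping mode) are illustrative stand-ins for the actual connective patterns.

```python
def active_rule_groups(formulas):
    """Scan the lexical formulas once and return the names of the rule
    groups to activate; the basic AB rules always run."""
    groups = {"ab"}
    for f in formulas:
        if "*" in f:                 # a product connective occurs
            groups.add("product")
        if "^" in f:                 # a wrapping mode occurs
            groups.add("wrap")
    return groups
```

A sentence whose formulas contain no products or wrapping modes is then parsed with the AB rules alone, avoiding the failed matches against the other rules.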
We have given a fairly high-level description of the multimodal chart parser which is part of the type-logical treebank for French. The source code, issued under the GNU Lesser General Public License, contains much more detail.
-  [cgellipsis] Hendriks, P. 1995, Ellipsis and multimodal categorial type logic, in G. Morrill & R. T. Oehrle, eds, ‘Proceedings of Formal Grammar 1995’, Barcelona, Spain, pp. 107–122.
-  [jm] Jurafsky, D. & Martin, J. H. 2009, Speech and Language Processing, 2nd edn, Pearson.
-  [mac10] Moortgat, M. 1996, In situ binding: A modal analysis, in P. Dekker & M. Stokhof, eds, ‘Proceedings 10th Amsterdam Colloquium’, ILLC, Amsterdam, pp. 539–549.
-  [m10sep] Moortgat, M. 2010, ‘Typelogical grammar’, Stanford Encyclopedia of Philosophy Website. http://plato.stanford.edu/entries/typelogical-grammar/.
-  [Moo11] Moortgat, M. 2011, Categorial type logics, in J. van Benthem & A. ter Meulen, eds, ‘Handbook of Logic and Language’, North-Holland Elsevier, Amsterdam, chapter 2, pp. 95–179.
-  [moot10semi] Moot, R. 2010, Semi-automated extraction of a wide-coverage type-logical grammar for French, in ‘Proceedings of Traitement Automatique des Langues Naturelles (TALN)’, Montreal.
-  [moot12spatio] Moot, R. 2012, ‘Wide-coverage semantics for spatio-temporal reasoning’, Traitement Automatique des Langues 53(2), 115–142.
-  [tlgbank] Moot, R. 2015a, ‘TLGbank: A type-logical treebank for French’, http://richardmoot.github.io/TLGbank/.
-  [moot15tlgbank] Moot, R. 2015b, ‘A type-logical treebank for French’, Journal of Language Modelling 3(1), 229–264.
-  [moot17grail] Moot, R. 2017, The Grail theorem prover: Type theory for syntax and semantics, in Z. Luo & S. Chatzikyriakidis, eds, ‘Modern Perspectives in Type Theoretical Semantics’, Studies in Linguistics and Philosophy, Springer, pp. 247–277.
-  [graillight] Moot, R. 2018, ‘Grail light’, https://github.com/RichardMoot/GrailLight. Chart-based parser for type-logical grammars.
-  [mr12lcg] Moot, R. & Retoré, C. 2012, The Logic of Categorial Grammars: A Deductive Account of Natural Language Syntax and Semantics, number 6850 in ‘Lecture Notes in Artificial Intelligence’, Springer.
-  [PS87] Pereira, F. & Shieber, S. 1987, Prolog and Natural Language Analysis, CSLI, Stanford.
-  [dedpar] Shieber, S., Schabes, Y. & Pereira, F. 1995, ‘Principles and implementation of deductive parsing’, Journal of Logic Programming 24(1–2), 3–36.