Efficient Semiring-Weighted Earley Parsing

by Andreas Opedal et al.

This paper provides a reference description, in the form of a deduction system, of Earley's (1970) context-free parsing algorithm with various speed-ups. Our presentation includes a known worst-case runtime improvement from Earley's O(N^3|G||R|), which is unworkable for the large grammars that arise in natural language processing, to O(N^3|G|), which matches the runtime of CKY on a binarized version of the grammar G. Here N is the length of the sentence, |R| is the number of productions in G, and |G| is the total length of those productions. We also provide a version that achieves runtime of O(N^3|M|) with |M| ≤ |G| when the grammar is represented compactly as a single finite-state automaton M (this is partly novel). We carefully treat the generalization to semiring-weighted deduction, preprocessing the grammar like Stolcke (1995) to eliminate deduction cycles, and further generalize Stolcke's method to compute the weights of sentence prefixes. We also provide implementation details for efficient execution, ensuring that on a preprocessed grammar, the semiring-weighted versions of our methods have the same asymptotic runtime and space requirements as the unweighted methods, including sub-cubic runtime on some grammars.
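
To make the deduction-system view concrete, the following is a minimal sketch of a plain, unweighted Earley recognizer driven by an agenda. It is not the paper's optimized algorithm: it includes none of the speed-ups, the finite-state grammar representation, or the semiring weights described above. The function name `earley_recognize`, the dictionary-based grammar encoding, and the item layout `(lhs, rhs, dot, origin)` are assumptions chosen for illustration.

```python
from collections import defaultdict


def earley_recognize(grammar, start, words):
    """Return True if `words` can be derived from `start`.

    Illustrative encoding (an assumption, not taken from the paper):
    `grammar` maps each nonterminal to a list of right-hand sides (tuples);
    a symbol counts as a nonterminal iff it is a key of `grammar`.
    """
    n = len(words)
    seen = set()                  # all (end, item) pairs derived so far
    agenda = []                   # derived items not yet processed
    waiting = defaultdict(set)    # (k, B) -> processed items ending at k with B after the dot
    done = defaultdict(set)       # (j, B) -> end positions k of processed complete B items over [j, k]

    def add(end, item):
        if (end, item) not in seen:
            seen.add((end, item))
            agenda.append((end, item))

    # Axioms: every start production, dot at the left edge, spanning [0, 0].
    for rhs in grammar[start]:
        add(0, (start, rhs, 0, 0))

    while agenda:
        k, (lhs, rhs, dot, origin) = agenda.pop()
        if dot == len(rhs):
            # COMPLETE: a finished lhs over [origin, k] advances its customers.
            done[(origin, lhs)].add(k)
            for (l2, r2, d2, o2) in waiting[(origin, lhs)]:
                add(k, (l2, r2, d2 + 1, o2))
        elif rhs[dot] in grammar:
            sym = rhs[dot]
            waiting[(k, sym)].add((lhs, rhs, dot, origin))
            # PREDICT: start every production of the expected nonterminal at k.
            for r in grammar[sym]:
                add(k, (sym, r, 0, k))
            # Also combine with complete items that were processed earlier.
            for end in done[(k, sym)]:
                add(end, (lhs, rhs, dot + 1, origin))
        elif k < n and words[k] == rhs[dot]:
            # SCAN: consume the next input word.
            add(k + 1, (lhs, rhs, dot + 1, origin))

    # Goal: some start production is complete over the whole input [0, n].
    return any((n, (start, rhs, len(rhs), 0)) in seen for rhs in grammar[start])


# Toy grammar, for illustration only: S -> S + S | a
toy = {"S": [("S", "+", "S"), ("a",)]}
print(earley_recognize(toy, "S", ["a", "+", "a"]))
print(earley_recognize(toy, "S", ["a", "+"]))
```

Because the agenda may process items in any order, the sketch pairs each incomplete item with each complete item in both directions (via `waiting` and `done`), which also lets it handle empty productions. A weighted variant would additionally need the cycle-eliminating preprocessing that the abstract attributes to Stolcke (1995).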




A Fast Algorithm for Computing Prefix Probabilities

Multiple algorithms are known for efficiently calculating the prefix pro...

On the Complexity of CCG Parsing

We study the parsing complexity of Combinatory Categorial Grammar (CCG) ...

Algorithms for Weighted Pushdown Automata

Weighted pushdown automata (WPDAs) are at the core of many natural langu...

Fast and Space-Efficient Construction of AVL Grammars from the LZ77 Parsing

Grammar compression is, next to Lempel-Ziv (LZ77) and run-length Burrows...

Parsing Combinatory Categorial Grammar with Answer Set Programming: Preliminary Report

Combinatory categorial grammar (CCG) is a grammar formalism used for nat...

Roadmap Enhanced Improvement to the VSIMM Tracker via a Constrained Stochastic Context Free Grammar

The aim of syntactic tracking is to classify spatio-temporal patterns of...

Approximating CKY with Transformers

We investigate the ability of transformer models to approximate the CKY ...
