flap: A Deterministic Parser with Fused Lexing

04/11/2023
by Jeremy Yallop, et al.

Lexers and parsers are typically defined separately and connected by a token stream. This separate definition is important for modularity and reduces the potential for parsing ambiguity. However, materializing tokens as data structures and case-switching on tokens come at a cost. We show how to fuse separately defined lexers and parsers, drastically improving performance without compromising modularity or increasing ambiguity. We propose a deterministic variant of Greibach Normal Form that ensures deterministic parsing with a single token of lookahead and makes fusion strikingly simple, and we prove that normalizing context-free expressions into the deterministic normal form is semantics-preserving. Our staged parser combinator library, flap, provides a standard interface but generates specialized token-free code that runs two to six times faster than ocamlyacc on a range of benchmarks.
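To make the cost described above concrete, here is a minimal hand-written sketch in OCaml. It is not flap's API and not code generated by flap; the toy grammar (`expr ::= INT ('+' INT)*`, with optional spaces) and all names (`token`, `lex`, `parse_unfused`, `parse_fused`) are assumptions chosen for illustration. The first version materializes a token list and case-switches on it; the second does the same work in one pass over characters without building any token values, which is roughly the shape of code that fusion aims for.

```ocaml
(* Hypothetical toy grammar, not taken from flap:
   expr ::= INT ('+' INT)*, with spaces allowed between tokens.
   Both parsers return the sum of the integers. *)

type token = INT of int | PLUS

let is_digit c = c >= '0' && c <= '9'

(* Read a run of digits starting at i; return (value, index after the run). *)
let read_int s i =
  let n = String.length s in
  let rec go j acc =
    if j < n && is_digit s.[j]
    then go (j + 1) (acc * 10 + Char.code s.[j] - Char.code '0')
    else (acc, j)
  in
  go i 0

(* Unfused, step 1: the lexer materializes a list of tokens. *)
let lex s =
  let n = String.length s in
  let rec go i acc =
    if i >= n then List.rev acc
    else match s.[i] with
      | ' ' -> go (i + 1) acc
      | '+' -> go (i + 1) (PLUS :: acc)
      | c when is_digit c ->
        let (v, j) = read_int s i in
        go j (INT v :: acc)
      | _ -> failwith "lex error"
  in
  go 0 []

(* Unfused, step 2: the parser case-switches on those tokens. *)
let parse_tokens toks =
  let rec rest acc = function
    | [] -> acc
    | PLUS :: INT v :: tl -> rest (acc + v) tl
    | _ -> failwith "parse error"
  in
  match toks with
  | INT v :: tl -> rest v tl
  | _ -> failwith "parse error"

let parse_unfused s = parse_tokens (lex s)

(* Fused: a single pass that branches directly on characters.
   No token values are allocated or matched. *)
let parse_fused s =
  let n = String.length s in
  let rec skip i = if i < n && s.[i] = ' ' then skip (i + 1) else i in
  let int_at i =
    let i = skip i in
    if i < n && is_digit s.[i] then read_int s i
    else failwith "parse error"
  in
  let rec rest acc i =
    let i = skip i in
    if i >= n then acc
    else if s.[i] = '+' then
      let (v, j) = int_at (i + 1) in
      rest (acc + v) j
    else failwith "parse error"
  in
  let (v, j) = int_at 0 in
  rest v j
```

Both functions accept the same inputs and compute the same result; the fused version was written by hand here, whereas the abstract describes deriving such code automatically from separately defined lexer and parser specifications.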
