Idris TyRE: a dependently typed regex parser

05/08/2023
by Ohad Kammar, et al.

Regular expressions (regexes) are widely used not only for validating textual data but also for parsing it. Regex parsers generally output a loose structure, e.g. an unstructured list of matches, leaving it to the user to validate the output's properties and transform it into the desired structure. Since the regex itself already carries information about that structure, this design leads to unnecessary repetition. Radanne introduced typed regexes (TyRE), a type-indexed combinator layer that can be added on top of an existing regex engine. We extend Radanne's design and implement a parser that maintains type safety throughout all layers: the user-facing regexes; their internal, desugared representation; the compiled finite-state automaton; and the automaton's associated instruction set for constructing the parse trees. We implemented TyRE in the dependently typed language Idris 2.
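To make the contrast between a loose match list and a type-indexed regex concrete, here is a minimal sketch in Python. It is not the Idris 2 TyRE API and not Radanne's library: the names `TyRE`, `pred`, `seq`, `star`, `fmap` and `parse` are hypothetical, and a naive backtracking matcher stands in for the existing regex engine (or compiled automaton) that the real designs use. Python's type hints are also only checked by external tools, so the guarantees are far weaker than in a dependently typed setting.

```python
import re
from dataclasses import dataclass
from typing import Callable, Generic, Iterator, List, Optional, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")

# Untyped baseline: the engine returns strings, so conversion and shape
# checks have to be repeated by hand after the match.
loose = re.fullmatch(r"(\d+):(\d+)", "12:45")
loose_groups = loose.groups() if loose else None  # a tuple of str


@dataclass(frozen=True)
class TyRE(Generic[A]):
    """Hypothetical typed regex: A is the type of value a match produces."""
    # run(s, i) yields every (value, next_index) for a match starting at i.
    run: Callable[[str, int], Iterator[Tuple[A, int]]]


def pred(p: Callable[[str], bool]) -> TyRE[str]:
    """Match one character satisfying p; the result is that character."""
    def run(s: str, i: int) -> Iterator[Tuple[str, int]]:
        if i < len(s) and p(s[i]):
            yield s[i], i + 1
    return TyRE(run)


def seq(a: TyRE[A], b: TyRE[B]) -> TyRE[Tuple[A, B]]:
    """Concatenation; the result type is the pair of both result types."""
    def run(s: str, i: int) -> Iterator[Tuple[Tuple[A, B], int]]:
        for x, j in a.run(s, i):
            for y, k in b.run(s, j):
                yield (x, y), k
    return TyRE(run)


def star(a: TyRE[A]) -> TyRE[List[A]]:
    """Greedy repetition; the result type is a list of the inner results."""
    def run(s: str, i: int) -> Iterator[Tuple[List[A], int]]:
        for x, j in a.run(s, i):
            if j > i:  # skip empty matches so the recursion terminates
                for xs, k in run(s, j):
                    yield [x] + xs, k
        yield [], i
    return TyRE(run)


def fmap(f: Callable[[A], B], a: TyRE[A]) -> TyRE[B]:
    """Transform the parse result without changing what is matched."""
    def run(s: str, i: int) -> Iterator[Tuple[B, int]]:
        for x, j in a.run(s, i):
            yield f(x), j
    return TyRE(run)


def parse(r: TyRE[A], s: str) -> Optional[A]:
    """Return the first result that consumes the whole input, else None."""
    for x, j in r.run(s, 0):
        if j == len(s):
            return x
    return None


digit = pred(lambda c: c in "0123456789")
number: TyRE[int] = fmap(lambda p: int(p[0] + "".join(p[1])), seq(digit, star(digit)))
colon = pred(lambda c: c == ":")
# The shape of the result (a pair of ints) follows from how the regex is built.
clock: TyRE[Tuple[int, int]] = fmap(lambda t: (t[0][0], t[1]), seq(seq(number, colon), number))
```

With these definitions, `parse(clock, "12:45")` yields a pair of integers directly, whereas the `re` baseline yields a tuple of strings that the caller must convert and validate separately. The result type of each combinator is determined by its arguments, which is the repetition-avoiding idea the abstract describes.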


