Overview of Libmarpa (Libmarpa 11.0.10)

4 Overview of Libmarpa

This chapter contains a quick overview of Libmarpa, using standard parsing terminology. It is intended to help prospective readers of the whole document know what to expect. Details and careful definitions will be provided in later chapters.

Libmarpa implements the Marpa parsing algorithm. Marpa is named after the legendary 11th century Tibetan translator, Marpa Lotsawa. In creating Marpa, we depended heavily on previous work by Jay Earley, Joop Leo, John Aycock and Nigel Horspool.

Marpa parses any language whose grammar can be written in BNF. That includes recursive grammars, ambiguous grammars, infinitely ambiguous grammars and grammars with useless or empty productions. Marpa handles both left- and right-recursion in linear time. In fact, if a grammar is in any class currently in practical use, Marpa will parse it in linear time. See Marpa theory paper.

Libmarpa implements the entire Marpa algorithm. This library does the necessary grammar preprocessing, recognizes the input, and produces a “bocage”, which is an optimized parse forest. Libmarpa also supports the ordering, iteration and evaluation of the parse trees in the bocage.

Libmarpa is very low-level. For example, it has no strings. Rules, symbols, and token values are all represented by integers. This, of course, will not suffice for many applications. Users will very often want names for the symbols, non-integer values for tokens, or both. Typically, applications will use arrays to translate Libmarpa’s integer IDs to strings or other values as required.
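As a rough illustration of the array-based translation just described, the following sketch keeps symbol names in an application-side table indexed by integer ID. It does not call Libmarpa at all: the names SymbolTable, symtab_set, symtab_name and MAX_SYMBOLS are hypothetical, and the sketch assumes that IDs are small non-negative integers, so a fixed-size array is enough.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical application-side table. Nothing here is part of Libmarpa;
   the library itself only ever sees the integer IDs. */
#define MAX_SYMBOLS 16

typedef struct {
    const char *names[MAX_SYMBOLS]; /* indexed by integer symbol ID */
    int count;                      /* one past the highest ID recorded */
} SymbolTable;

/* Record a name for an ID. The table must start zero-initialized.
   Returns 0 on success, -1 if the ID does not fit in the table. */
static int symtab_set(SymbolTable *t, int id, const char *name) {
    assert(t != NULL);
    if (id < 0 || id >= MAX_SYMBOLS) return -1;
    t->names[id] = name;
    if (id >= t->count) t->count = id + 1;
    return 0;
}

/* Translate an ID back to its name, with a placeholder for IDs
   that were never recorded. */
static const char *symtab_name(const SymbolTable *t, int id) {
    assert(t != NULL);
    if (id < 0 || id >= t->count || t->names[id] == NULL) return "<unknown>";
    return t->names[id];
}
```

An application would typically call something like symtab_set() each time it creates a symbol and is handed an ID, and symtab_name() whenever it needs to print a diagnostic. The same pattern works for token values: the integer given to the library can be an index into an application array of richer values.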

Libmarpa also does not implement most of the semantics. Libmarpa does have an evaluator (called a “valuator”), but it does not manipulate the stack directly. Instead, based on its traversal of the parse tree, Libmarpa passes optimized, step-by-step stack manipulation instructions to the upper layer. These instructions indicate the token or rule involved, and the proper location for the true token value or the result of the rule evaluation. For rule evaluations, the instructions include the stack location of the arguments.
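To make the division of labor concrete, here is a minimal sketch of an upper layer that owns the stack and carries out instructions of the kind described above. It is a mock-up, not Libmarpa's interface: the Step struct, its field names, the rule IDs RULE_ADD and RULE_MUL, and the functions apply_rule and run_steps are all hypothetical, and the semantics (integer addition and multiplication) are chosen only for illustration. The valuator's real calls are defined in later chapters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the instructions described above.
   These are not Libmarpa's actual types or names. */
typedef enum { STEP_TOKEN, STEP_RULE } StepKind;

typedef struct {
    StepKind kind;
    int id;          /* token or rule ID */
    int token_value; /* STEP_TOKEN only: integer handle for the token's value */
    int result;      /* stack slot that receives the value */
    int arg_first;   /* STEP_RULE only: first argument slot */
    int arg_last;    /* STEP_RULE only: last argument slot */
} Step;

enum { RULE_ADD = 0, RULE_MUL = 1 }; /* hypothetical rule IDs */

#define STACK_SIZE 32

/* Application-owned semantics: the caller decides what each rule means. */
static long apply_rule(int rule_id, const long *args, int n) {
    long acc = (rule_id == RULE_MUL) ? 1 : 0;
    for (int i = 0; i < n; i++) {
        if (rule_id == RULE_MUL) acc *= args[i];
        else acc += args[i];
    }
    return acc;
}

/* Execute the steps in order and return the value left in slot 0.
   token_values maps each integer handle to the true token value. */
static long run_steps(const Step *steps, size_t n_steps,
                      const long *token_values) {
    long stack[STACK_SIZE] = {0};
    for (size_t i = 0; i < n_steps; i++) {
        const Step *s = &steps[i];
        assert(s->result >= 0 && s->result < STACK_SIZE);
        if (s->kind == STEP_TOKEN) {
            /* Place the true token value where the instruction says. */
            stack[s->result] = token_values[s->token_value];
        } else {
            assert(s->arg_first >= 0 && s->arg_first <= s->arg_last);
            assert(s->arg_last < STACK_SIZE);
            /* Arguments are read before the result slot is overwritten. */
            stack[s->result] = apply_rule(s->id, &stack[s->arg_first],
                                          s->arg_last - s->arg_first + 1);
        }
    }
    return stack[0];
}
```

For example, with token values {2, 3, 4}, three token steps writing to slots 0, 1 and 2, a RULE_MUL step over slots 1 to 2 with its result in slot 1, and a RULE_ADD step over slots 0 to 1 with its result in slot 0, run_steps() leaves 14 in slot 0. The instruction stream never touches the values themselves, so the application is free to store anything it likes in the stack.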

Marpa requires most semantics to be implemented in the application. This allows the application total flexibility. It also puts the application in a much better position to prevent errors, to catch errors at runtime or, failing all else, to successfully debug the logic.